News

03 MAR 2022

Coreon Wins €16m EU Semantic Web Consulting Contract

The framework contract Semantic Web Consultancy and Support (procurement OP/LUX/2021/OP/0006) for the Publications Office of the European Union, at a value of 16 million euros, was won by a consortium of Infeurope S.A., INTRASOFT International S.A., Cognizone BV, and Coreon GmbH. The consortium combines vast knowledge and experience in providing software and services in AI, NLP, data, and semantic technologies. The consortium members have already worked together in several projects.

The main tasks concern the elaboration of studies, technical specifications, and prototypes for the improvement of the current implementation and configuration of Ceres, CELLAR, and other systems using semantic technology. In addition, the consortium is expected to provide technical assistance for the preparation and execution of tests demonstrating that the developed systems conform to technical specifications, including the production of test reports and data curation.

Semantic data assets are deployed to improve operations, especially when it comes to supporting users with information needs across languages and domains. The rich multilingual knowledge resources of the Publications Office and the European Commission, namely EuroVoc combined with IATE, will underpin these efforts nicely.
03 JUL 2020

Coreon MKS as LLOD is European Language Grid top funded project

Coreon’s proposal for using the European Language Grid (ELG) as a platform for making multilingual interoperability assets discoverable and retrievable has been awarded. This will be achieved by complementing Multilingual Knowledge Systems with a SPARQL interface. The ELG Open Call 1 received 121 proposals, of which 110 were eligible and 10 were selected. Coreon’s proposal “MKS as Linguistic Linked Open Data” was amongst the three winning proposals from industry and received the highest funding.

The goals of the project are a) to enable Semantic Web systems to query Coreon’s richly elaborated multilingual terminologies stored in concept systems and knowledge graphs, and b) to prove how to overcome the limits of RDF/knowledge graph editors, which are usually fine for modelling concept relations but weak at capturing linguistic information. When deployed on the ELG in March 2021, the innovation will enable the Semantic Web community to query rich multilingual data with a familiar, industry-standard syntax.
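
To make this concrete, here is a minimal sketch of how a Semantic Web client could query such an endpoint once it is live. The endpoint URL is a placeholder and the SKOS modelling is an assumption; the post itself does not specify the vocabulary or API.

```python
# Querying a multilingual terminology over SPARQL with Python's SPARQLWrapper.
# The endpoint URL is hypothetical; SKOS is assumed as the modelling vocabulary.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://example.org/coreon/sparql")  # placeholder endpoint
sparql.setQuery("""
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
    SELECT ?concept ?label WHERE {
        ?concept skos:prefLabel ?label .
        FILTER (lang(?label) = "de")     # fetch the German label of each concept
    } LIMIT 10
""")
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

for row in results["results"]["bindings"]:
    print(row["concept"]["value"], "->", row["label"]["value"])
```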
07 NOV 2019

CEFAT4Cities Action Gets Funding

The CEFAT4Cities Action, to be executed by a multinational consortium of five partners, led by CrossLang, has received funding. The action starts in April 2020 and runs up to March 2022.
The main objective of the CEFAT4Cities Action is to develop a “Smart cities natural language context”, providing multilingual interoperability of the Context Broker DSI and making public “smart city” services multilingual, with pilots in Vienna and Brussels.
The language resources that will be created will be committed to the ELRC repository and the following languages will be developed: Dutch, English, French, German, Italian, Slovenian, Croatian and Norwegian.

Coreon's role in the consortium is to provide the appropriate technology: to turn vocabularies into multilingual knowledge graphs, and to curate and extend them to model the domain of smart cities.
20 SEP 2022

Semantics 2022: Coreon & Partners Win Best Industry Contribution Award

Smart Chatbots Talk At Semantics Vienna - Knowledge Graphs Improve Chatbots

From 13-15 September, the Semantics 2022 conference took place in Vienna, together with the LTInnovate LI@Work conference. For me and many of my fellow participants, it was the first in-person event after two years of online conferencing. There were around 350 people on-site and 150 online participants. It was a real pleasure to talk to so many industry experts and to spend two evenings with them in Vienna!

The event closed with a very nice surprise for us. Together with the Vienna Business Agency, we received the 'Best Industry Contribution Award' for the topic of 'Smartening Up Chatbots with Language-Agnostic Knowledge Graphs'. We showed how, with the help of Machine Translation and a Multilingual Knowledge System, a chatbot can be trained in another language at the push of a button!

How Can Knowledge Graphs Improve Chatbots?

The project, part of the CEFAT4Cities Action, was to build a chatbot to guide users through the jungle of public grants the Vienna Business Agency works with. Our key findings were:

  • How to best teach chatbots, i.e. conversational agents such as Rasa, domain language (terms, synonyms) and knowledge (entities and their relations).
  • No-code: Subject matter experts easily capture and share their knowledge with chatbot developers through the Coreon Multilingual Knowledge System.
  • Entities are queried by Rasa in the MKS during run-time in any language.
  • Through concept relations in the MKS, the chatbot becomes semantically tolerant.
  • Expanding to another language is efficient. Leverage the existing stories, translate/curate entities in the multilingual knowledge graph, MT the dialogs, train the chatbot, et voilà!
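
To make the last point a little more tangible, here is a rough sketch of such an expansion pipeline. The MKS and MT helpers are hypothetical stand-ins for steps the post only names; `rasa train` is the real Rasa CLI command.

```python
# Hypothetical sketch of push-button language expansion: reuse the stories,
# pull curated entities from the knowledge graph, MT the dialogs, retrain.
import subprocess

def fetch_entity_terms_from_mks(lang: str) -> list:
    """Hypothetical helper: curated entity terms for `lang` from the MKS."""
    return [{"entity": "funding", "examples": ["Förderung", "Subvention"]}]

def machine_translate_training_data(src: str, tgt: str) -> None:
    """Hypothetical helper: MT the annotated NLU utterances into the new language."""
    pass

machine_translate_training_data("en", "de")
entities_de = fetch_entity_terms_from_mks("de")
# Stories stay untouched: they reference language-independent concept IDs.
subprocess.run(["rasa", "train"], check=True)  # real Rasa CLI call to train the model
```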

If you would like to learn more, my colleague Alena has already shared in two previous blog posts how to make chatbots both smart and polyglot.

We are very proud and happy to have received an award for proving how multilingual knowledge graphs improve chatbots and other applications for Multilingual AI.


08 JUL 2022

Multilingual Chatbot: Behind The Scenes

Multilingual Chatbot Challenges

Together with partners, Coreon is part of the CEFAT4Cities Action, a project co-financed by the Connecting Europe Facility of the European Union. It targets the interaction of EU residents and businesses with public services. Two of its outcomes are an open multilingual linked-data repository and a pilot chatbot project for the Vienna Business Agency, which leverages the created resource.

A Chatbot, But Make It Smart

If you've ever scratched the surface of conversational AI development, you know that there are innumerable ways to develop a chatbot.

In this post, we’ll talk about the technical challenges and the solutions we’ve come up with when developing a multilingual chatbot for our partner, the Vienna Business Agency (VBA), within the scope of the CEFAT4Cities project.

At a high level, the CEFAT4Cities initiative aims to speed up the development and adoption of multilingual cross-border eGovernment services, converting natural-language administrative procedures into machine-readable data and integrating them into a variety of software solutions.

The goal of SmartBot, in turn, is to make VBA's services discoverable in a user-friendly and interactive way. The services include assistance with starting a business, finding relevant counseling, or drawing up a shortlist of grants from dozens of available funding opportunities for companies of various scales.

Summary

We share our bot-building experience, demonstrating how to overcome the language gap of local public services on a European scale and reduce red tape for citizens and businesses.

Our solution is driven by multilingual AI and leverages a language-agnostic knowledge graph in the Coreon Multilingual Knowledge System (MKS). The graph contains VBA-specific domain knowledge, an integrated ISA2 interoperability layer, and vocabulary-shaped results of the CEFAT4Cities workflows.

This blog post covers the following aspects:

  • Rasa, SmartBot's Skeleton
  • The Quest of Domain Knowledge
  • Drafting Domain Knowledge
  • Going Multilingual
  • Relations for Semantic Search
  • Homonymy and Terms Unseen by Model
  • Conclusion

Rasa, SmartBot's Skeleton

SmartBot's natural language understanding component (NLU) and dialogue management are powered by Rasa Open Source, a framework for building conversational AI.

Of course, there is a wide variety of frameworks for building chatbots out there. We've chosen Rasa because it's adjustable, modular, open source, and ticks all the boxes on our feature requirements. It can be run in containers and hosted anywhere, provides the most popular connectors built in, and lets you build custom ones. Rasa's blog offers a variety of tutorials, and there is an active community of fellow developers at the Rasa Community Forum.

However, there is a catch: expect to work up a sweat when setting things up, and maybe look elsewhere if your engineering resources are scarce or you'd rather go for an easier no-code solution that can be supported by a less ‘tech-y’ part of the team.

But back to Rasa. To briefly summarize the main points (for the architecture components, see https://rasa.com/docs/rasa/img/architecture.png): Rasa NLU handles intent classification (aka "what the user is implying") and entity identification in a user's input; dialogue management predicts the next action in a conversation based on the context. These actions can be simple responses, serving strings of text/links/pics/buttons to the user, as well as custom actions containing arbitrary functions.

In SmartBot's case, we also heavily rely on Rasa SDK -- it handles all our custom code, organized as custom actions (e.g., search a database, make an API call, trigger a handover of the conversation to a human, etc.).
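
For readers new to Rasa, a custom action is just a Python class. The sketch below shows the general shape, assuming a hypothetical grant lookup; the `Action`, `Tracker`, and `CollectingDispatcher` interfaces are Rasa SDK's real ones.

```python
# A minimal Rasa SDK custom action: read an extracted entity, query a
# (hypothetical) grant database, answer the user.
from typing import Any, Dict, List, Text

from rasa_sdk import Action, Tracker
from rasa_sdk.executor import CollectingDispatcher


def query_grant_database(funding_type):
    """Hypothetical stand-in for a lookup against the VBA grant database."""
    return [{"name": "Demo grant", "type": funding_type}]


class ActionSearchGrants(Action):
    def name(self) -> Text:
        return "action_search_grants"  # the name referenced in the domain file

    def run(self, dispatcher: CollectingDispatcher, tracker: Tracker,
            domain: Dict[Text, Any]) -> List[Dict[Text, Any]]:
        # Pick up an entity the NLU model extracted from the latest message.
        funding_type = next(tracker.get_latest_entity_values("funding"), None)
        grants = query_grant_database(funding_type)
        dispatcher.utter_message(text=f"I found {len(grants)} matching grants.")
        return []  # no conversation events to append
```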

The Quest Of Domain Knowledge

So, let’s add knowledge graphs to this chatbot business. By now you probably have a feeling that a big chunk of the implementation work is associated with domain knowledge. The challenge here is that experts in the proprietary domain and NLP engineers are rarely the same people, so coming up with a process that ensures smooth cooperation in knowledge transfer can save a lot of time and hassle.

That’s why our architecture features an integration with the Coreon MKS that allows domain experts to easily and intuitively draft their domain knowledge as a graph.

In a nutshell, the MKS is a semantic knowledge repository, comprised of concepts linked via relations. It caters for discovery, access, drafting, and re-usability of any assets, organized in language-agnostic knowledge graphs.

Since linking is performed at the concept level, we can abstract away from language-specific terms and model structured knowledge for phenomena that reflect the non-deterministic nature of human language (e.g., word-sense ambiguity, synonymy, homonymy, multilingualism).

This linking 'per concept' also ensures smooth maintenance of relations without additional data clutter and helps us exchange information among acting systems so that precise meaning is understood and preserved among all parties, in any language.
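
In code, concept-level linking might be pictured roughly as follows; the field names are illustrative, not Coreon's actual schema.

```python
# Terms in any language hang off one language-agnostic concept; relations
# connect concept IDs, never language-specific strings.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Concept:
    concept_id: str                                              # language-agnostic ID
    terms: Dict[str, List[str]] = field(default_factory=dict)    # language -> terms
    broader: List[str] = field(default_factory=list)             # parent concept IDs
    related: List[str] = field(default_factory=list)             # associative links

screen = Concept("c-screen", terms={
    "en": ["screen", "monitor", "display"],
    "de": ["Bildschirm"],
    "fr": ["écran"],
})
hdmi = Concept("c-hdmi", terms={"en": ["HDMI input port"]}, broader=["c-screen"])

# Synonyms and translations resolve to the same concept ID, so systems
# exchange meaning rather than strings.
```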

If this still sounds like more work rather than a challenge-overcoming solution, please carry on reading. We’ll now go through concrete cases where we scored multiple points relying on Coreon’s functionality.

Drafting Domain Knowledge

We asked the Vienna Business Agency to draft their knowledge graph and, based on their graph input, we sketched core entities and identified intents that must be recognized in user queries by SmartBot.   

In this screenshot you see how VBA’s domain knowledge is represented and curated in Coreon, and how it is reflected in the chatbot’s UI.

In essence, to determine relevant business grants, SmartBot guides the user through a series of questions, narrowing down the selection to most suitable funding programs and fetching back all applicable grant records. While traversing trees of questions, we collect relevant information from the user and further search the VBA grant database or make API calls to Coreon if the NLU model gets confused.

Aside from the VBA domain knowledge, we also use the MKS to curate the interoperability layer and public service multilingual vocabularies, the result of applying the CEFAT4Cities workflows.

Drafting domain knowledge to boost your multilingual chatbot

Going Multilingual

We see multilinguality as one of our biggest assets, but this feature also poses a big conceptual challenge: you need a very clear idea of the kind of multilinguality you want to serve, because it will most likely shape, or at least heavily influence, the architecture and scalability of the final solution.

Do you need the chatbot to detect the language of the user's message and carry on the conversation in the detected language? Or would you prefer the user selects the language manually in the UI, from the limited language options on offer? Should the bot 'understand' all languages but reply only in one? Or reply in the one specified by the user? Or maybe reply in the detected language of the last message? Are you working on the solution for an audience that tends to mix/speak several languages?

Serving the same info in different languages?

  • Language detection vs manual selection in UI
  • Should Bot 'understand' all languages but reply in one?
  • Or rather reply in the language of the last message? 

While sticking all languages into one chatbot deployment appears to save resources, you would most likely have to deal with a lot of mess in the data and potential confusion among languages. That hardly sounds like a sustainable and robust option.

We decided to go with individual NLU models per language, i.e. keeping them language-specific, while making dialogue management/Stories universal for them all, adding an additional layer of abstraction to maintain consistency in the bot’s behavior.

Our approach

  • NLU models per language
  • One universal Core model/Stories

For all its beauty, this approach brings along another challenge: the core model, responsible for dialogue management, shouldn’t contain a single language-specific string in its training data. We therefore need an abstract, language-independent representation for entities, i.e. the keywords of various types found in the user’s input.

Challenge:

  • Stories contain entities, aka language-specific keywords
  • thus, should be mapped to some language-independent IDs

Luckily, we don’t have to look any further.

In this screenshot you see the output from the EN NLU model, given the sample input, “hi, I’m interested in subsidies”.

You get the recognized intent of this message, want_funding, and below that are the entity type, funding, and the value, the string subsidies.

While the NLU digests the user's input and neatly extracts entities, recognizing their types, the entity records in the VBA's domain knowledge are constantly being updated, and the bot needs to keep up. We therefore abstract away from entity maintenance in distinct languages and replace language-specific terms with their unique Coreon concept IDs.
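
The result of that substitution looks roughly like the sketch below; the parse structure mirrors Rasa's message format, while the confidence value, the lookup table, and the concept ID are hypothetical examples.

```python
# NLU parse for "hi, I'm interested in subsidies" (illustrative values).
parse = {
    "intent": {"name": "want_funding", "confidence": 0.97},
    "entities": [{"entity": "funding", "value": "subsidies"}],
}

# Hypothetical lookup against the MKS: language-specific term -> concept ID.
TERM_TO_CONCEPT_ID = {"subsidies": "coreon-4711"}

for entity in parse["entities"]:
    entity["value"] = TERM_TO_CONCEPT_ID.get(entity["value"], entity["value"])

print(parse["entities"])  # [{'entity': 'funding', 'value': 'coreon-4711'}]
```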

Again, we do this because maintaining entities in each language separately would be tedious and inconsistent, particularly since the VBA domain knowledge is not static. Agnostic entities are also crucial for keeping the Rasa Core module language-agnostic, abstracted from entity names in any specific language.

Once VBA decides to expand SmartBot's language capabilities with a new language, 'universal entities' will ensure smooth model development and minimize the labeling effort – the entities are already there.

Relations For Semantic Search

A few words about another handy feature that is brought into this project by the mighty knowledge graph.

By now it is clear that SmartBot serves the user relevant grant recommendations based on the input they provide. Of course, it implies that at some point the bot will query some data source that contains records on VBA grants and other funding opportunities.

To retrieve appropriate records, we need to match all information that influences the funding outcome. Since SmartBot gives users the freedom of free-text queries rather than just fixed button-clicking, keyword extraction followed by plain string matching alone would often fail to fetch relevant records from the database.

This is where we can leverage relations between concepts stored in Coreon. For example, in their query the user may use a term that is synonymous or closely related to the one accepted by the VBA database.

When asked what kind of company they represent, the user inputs 'Einzelkaufmann', German for 'registered merchant'. This entry doesn’t show up in the database, and computers are still notoriously bad at semantics.

However, with a knowledge graph at hand, the bot can navigate the parental and associative relations of the entity 'registered merchant' in Coreon and infer that it is semantically close and connected to 'Founder' (see below).
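
One simple way to realize this behavior is a bounded walk over a concept's broader and associative links; the sketch below is such a traversal over an illustrative mini-graph, not necessarily SmartBot's exact implementation.

```python
# Breadth-first reachability over parental/associative relations.
from collections import deque

# concept -> (broader concepts, related concepts); contents are illustrative
GRAPH = {
    "registered merchant": (["founder"], []),
    "founder": ([], ["funding programs"]),
}

def semantically_close(start: str, target: str, max_hops: int = 2) -> bool:
    """True if `target` is reachable from `start` within `max_hops` relations."""
    queue, seen = deque([(start, 0)]), {start}
    while queue:
        node, depth = queue.popleft()
        if node == target:
            return True
        if depth < max_hops:
            broader, related = GRAPH.get(node, ([], []))
            for nxt in broader + related:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, depth + 1))
    return False

print(semantically_close("registered merchant", "founder"))  # True
```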

Homonymy & Terms Unseen By The Model

Another interesting use case where a knowledge graph can help us is dealing with unseen terms and homonymy. If the user chooses to use slang or domain-specific terminology previously unknown to the model, SmartBot will first try to get its meaning via the connector to Coreon, rather than taking the standard fallback at the first opportunity.

In the bot UI screenshot, the user enquires in German about the amount of money that can be received from the VBA. They refer to money as 'Kohle', a slang term that is homonymous with the word for 'coal', the fossil fuel (think 'dough' in English).

Our training data did not contain this term, but the bot makes an API request and searches for it in the knowledge repository. There are two hits in the domain knowledge of the VBA: checking parent concepts and relations of both instances, we see that they belong to two distinct concepts.

The first instance belongs to the CO2 concept in the 'smart city' sub-tree, which hosts concepts related to resource-saving, renewable energy, and sustainability. The second instance is found among the synonyms of the 'financial means' concept, which denotes financial funds and has the more generic parent 'money', a well-known instance in our NLU model.

Since the context of the conversation matches, the meaning of 'Kohle' is disambiguated for the chatbot, and SmartBot informs the user about the amount of money they can qualify for.
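
In code, the disambiguation logic boils down to comparing each hit's position in the graph against the conversation context; the concept data below is an illustrative paraphrase of the two 'Kohle' hits.

```python
# Two candidate concepts for the term 'Kohle', as described above.
HITS = [
    {"concept": "CO2", "subtree": "smart city"},
    {"concept": "financial means", "parent": "money", "subtree": "funding"},
]

def disambiguate(hits, conversation_topic: str) -> dict:
    """Pick the hit whose sub-tree matches the current dialogue context."""
    for hit in hits:
        if hit["subtree"] == conversation_topic:
            return hit
    return hits[0]  # fall back to the first hit

# The user is asking about funding amounts, so the financial reading wins.
print(disambiguate(HITS, "funding")["concept"])  # financial means
```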

Conclusion

To sum it all up: if you are considering developing a chatbot and are looking for ways to enhance its performance, try injecting a knowledge graph into it.

Aside from acquiring a solution that is technically robust and easily scalable, you’d get the chance to reuse numerous open linked-data resources. Your solution would profit from carefully structured knowledge that is sitting there waiting to be used.

The Coreon knowledge graph unlocks multilinguality, a super-power in the European context, and it also allows maintainers to manage data consistently across languages, in a visual and user-friendly way. And of course, it reduces red tape, making business processes more rational and services more discoverable.

*Feature Image: Photo by Alex Knight from Pexels


19 APR 2022

So, You Think You Want A Chatbot?

Challenges Of Chatbot Development

Coreon is part of the CEFAT4Cities Action, a project co-financed by the Connecting Europe Facility of the European Union that targets the interaction of EU residents and businesses with smart city services. Among its outcomes are an open multilingual linked-data repository and a pilot chatbot project for the Vienna Business Agency, which leverages the created resource.

In a small series of blog posts, we will share our experiences building a multilingual chatbot. We will demonstrate how to overcome the language gap of local public services on a European scale, thus reducing red tape for citizens and businesses. In this opening article we steer clear of concrete frameworks and instead focus on the challenges of chatbot development.

Before You Summon The Engineers...

Dialogue is a natural way for humans to interact -- we express ideas and share information, convey mood and engage in debates via conversational exchange. 

Chatbots, also known as ‘conversational agents’ or ‘conversational assistants’, are designed to mimic this behavior, letting us interact with digital services as if they were a real person. With the latest tech advances, we regularly come across conversational agents -- booking tickets, ordering food, obtaining directions, managing bank accounts, receiving assistance from Siri and Alexa -- the list goes on!

Here are some key (although they may appear trivial at first) questions you should ask your project manager before starting out on the development of a new chatbot:

  • What do you want to achieve with the bot? What are the limits of its capabilities? What tasks is it supposed to be tackling? Will it be of a transactional nature or should it rather provide advice, assist with search and the retrieval of information, or combine all of these capabilities in one?
  • What audience is this bot targeting? Are you interested in deploying it internally for a limited audience, or will it be used broadly, e.g., as part of customer care? Which languages does the audience speak, or which are they comfortable working with?
  • How do you capture and maintain the knowledge the bot is supposed to provide to the user?
  • What features does your Minimum Viable Product incorporate, regardless of the framework? Is the bot’s appearance important? Do you need it to have a specific personality, or would you rather keep it neutral?

You get the idea. Once the conceptual foundation is there, it’s time to dive deeper into the sea of frameworks and features.

What Are Chatbots Made Of?

Depending on the technology used, chatbots can vary from simple interactive FAQ-like programs to vastly adaptive, sophisticated digital assistants that can handle complex scenarios and offer a high degree of personalization due to their ability to learn and evolve.

There are a few key concepts that engineers rely on when developing a chatbot. As humans, we barely pay conscious attention to the sub-processes firing in our brain when we engage in a conversation. As NLP engineers building a solution that aims to imitate human behavior, however, we need a clear division of these sub-processes. A chatbot should not only be able to ‘digest’ text or voice input from a user, but also find a way to ‘understand’ it by inferring the semantics of the user's intent and generating an appropriate answer.

To recognise the needs and goals of the user, modern chatbot frameworks rely on Natural Language Processing (NLP), Natural Language Understanding (NLU) and Natural Language Generation (NLG).

NLP is responsible for parsing the user's input, or utterances; NLU identifies and extracts intents (goals) and entities (specific concepts like locations, product names, companies -- anything really), while the NLG component generates relevant responses to the user.
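
A toy example makes the division of labor visible; here a crude keyword matcher stands in for the NLU and NLG models, which in real frameworks are learned from annotated data.

```python
# Toy NLU/NLG split: 'understand' an utterance, then 'generate' a response.
def nlu(utterance: str) -> dict:
    """Map an utterance to an intent and entities (crude keyword matching)."""
    intent = "book_ticket" if "ticket" in utterance.lower() else "greet"
    entities = [w for w in utterance.split() if w.istitle() and len(w) > 1]
    return {"intent": intent, "entities": entities}

def nlg(intent: str, entities: list) -> str:
    """Produce a response appropriate to the recognized goal."""
    if intent == "book_ticket":
        destination = entities[0] if entities else "your destination"
        return f"Sure, booking a ticket to {destination}."
    return "Hello! How can I help?"

parsed = nlu("I need a ticket to Vienna")
print(nlg(parsed["intent"], parsed["entities"]))  # Sure, booking a ticket to Vienna.
```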

So You Think You Want A Chatbot

When it comes to bot development activities, there is a variety of closed- and open-source tools and frameworks to choose from. This choice is already a challenge of its own.

Before settling for a specific solution that can accomplish your primary goal, it's helpful to analyze both the short- and long-term project objectives:

  • Have you chosen your features wisely? Are the essential ones mature enough for the project?
  • Are there framework-bound scalability limitations in case of project growth?
  • Can you easily integrate additional building blocks offered by third-party providers?
  • Does the framework satisfy your deployment and security requirements? Are they compliant with your client's demands? Have these demands been clearly defined (e.g. on-premise vs. cloud-based solution, specific feature requirements, containerization)?
  • What kind of staff and how much of their capacity is needed to deploy and maintain the deployed solution?

Another chunk of pre-implementation work is associated with domain knowledge -- experts in the proprietary domain and NLP engineers are rarely the same people, so coming up with a process that ensures smooth cooperation in knowledge transfer can save a lot of time and hassle. It is also a good chance to clarify the conversational limitations of the future chatbot: does it need to cover chit-chat or support human handover, make API calls and pull up information on request? Does it need to be context-aware, or is it enough to support one conversational turn? You don't want to fully draft your bot's behavior, but rather establish its foundation and mark the boundaries.

Once these questions are clarified, you can roll up your sleeves and start coding. The process tends to be highly iterative -- despite all the buzz around AI, you need conversational data from real users from the early stages onwards if you want to design a human-like conversation flow and build a robust virtual assistant.

The rule of thumb is to share your prototype with test users early and to collect their data continuously, so you can adapt the bot's logic, annotate utterances, and retrain the NLU models with updated real data. And once your assistant is mature enough, you are ready to deploy! 🚀

Stay tuned for implementation details on the multilingual abilities of SmartBot, our chatbot for the Vienna Business Agency, as well as some nifty tricks to ensure consistency in a bot’s language abilities, no matter your language of choice.

*Feature Image: Robot Hand Photo created by rawpixel.com - www.freepik.com


02 MAR 2022

Maintaining Concept Maps: A Time-Saver For Terminologists

Maintaining concept maps makes the handling of terminology data more efficient.

Maintaining concept maps involves curating terminology data in a knowledge graph that visually displays the relations between concepts. The benefits include faster definition writing, consistent term choices across related concepts, and clean data free of redundant entries.

All desirable features for terminologists and their organizations, yet we often hear the following reaction when discussing the use of the Coreon Multilingual Knowledge System (MKS):

I am already so busy with other daily duties. I have no time to maintain a concept map…

This is a common misconception, leading many to stick to the old method of storing an endless number of concepts in all required languages, while also writing lengthy definitions to explain and illustrate what each is about.

For a terminologist in particular, working with a 'concept map' (such as the one visualised in the Coreon MKS) makes life significantly easier and more efficient. So, let’s address a few concerns around the perceived burden of maintaining concept maps and look at exactly why they are in fact worth your attention!


Concept maps sound great but, as a terminologist, this just means additional work

Not if you make concept maps an integral part of your terminology management.

Consider your steps when adding a new term to your data set – let’s say HDMI input port. Firstly, you check if it already exists in your repository – you may just search for it (although read below why I think this is a dangerous method). However, if you have an already-developed concept system you can simply navigate to where you would expect concepts linked to 'screens' and other output devices to be stored. You may identify the concept 'frame', which is stored underneath the broader concept 'screen'. You also see semantically similar concepts with terms such as on/off button, power cable, VGA port, or USB port.

You also need to decide – is the new HDMI input port simply the English term for an already-existing concept? Perhaps a colleague has already added it using the German term HDMI-Eingang, or using another synonym? In that case, you’d simply add HDMI input port as a term to the existing concept.

If the concept is indeed missing, however, you will want to add it. Now comes the key point – you are already at the location in your concept system where HDMI input port needs to be inserted. With the Coreon MKS, you would simply click ‘Insert new concept here’ and the relation between HDMI input port and its broader concept 'frame' is created in the background.

Simply insert a new concept in the right place and the relation is created automatically.

No additional work, then, but rather a welcome side-effect of adding new concepts in a systematic fashion. It’s comparable to the basic decision of where to save a new Word document in a file system, nothing more. And we do that every day, don’t we?


I see little value besides the nice visualization

Well, the value is in fact inherent in the visualization!

Let’s say you are faced with the task of illustrating and documenting your concepts by writing a definition.

How do you usually craft a definition? An established way is to explicitly differentiate the characteristics of a concept from its general, broader concepts as well as semantically closer concepts. An HDMI input port is part of the frame, and is also somewhat complementary to the VGA input.

In the concept map you see all related concepts at hand in one view, so you can write a clear text definition much faster. This is not a hypothesis – users of the Coreon MKS have confirmed that having the concept map at hand enables faster writing of the definition.

You also benefit when crafting and rating terms. Say you’d like to add a concept with a term such as LCD screen. Is this phrasing correct, or do we prefer LCD monitor? Luckily, through the concept map we have the broader concept with its term screen in view as well as two variants, TFT screen and LED screen. In all these cases the component screen was favored over monitor, so it’s a quick and easy decision for LCD screen over LCD monitor.

All in all you benefit from the linguistic work put into related concepts, enabling consistency across concepts!


I won’t click through maps, I prefer searching

At times we all use search, but it basically means taking a guess with a keyword and hoping that this string is present in the repository.

Search only displays concepts where the search term, or slight linguistic variations of it, occurs. So, if you search for 'screen' you would find TFT screen, LCD screen etc., but what if you queried for monitor or display? Or in another language, say the French écran or the German Bildschirm? Your search would miss the concept screen!

Consequently you decide to create a new concept triggered by your new term monitor – and you have unfortunately just created a duplicate: one concept for screen and one for monitor, even though these are synonyms. This is also a reason why I am not a big fan of duplicate recognition – a cool-sounding feature, but one that only checks for homonyms, not redundant concepts…but that’s a topic for another blog post.

Navigating and interacting through a concept map is therefore the key best practice when it comes to updating or maintaining a repository. It keeps the data clean, allowing you to identify gaps, avoid redundancies, and achieve high-quality data that your users and audiences can rely on as a trustworthy resource.


Your arguments are convincing – but I have no time to post-edit my existing large terminology collections

Doing this manually would indeed be time-consuming, but there is a solution.

We’ve produced a way to automate the process by using advanced AI and NLP methods to ‘draft’ a knowledge graph and speed up its creation dramatically.

If you own a ‘flat’ terminology collection of several thousand concepts – available in formats such as ISO TBX, SDL MultiTerm, or MS Excel – this auto-taxonomization method can now elevate the data into a knowledge graph faster, getting the most tedious and time-consuming part done before you apply your own expert knowledge manually.
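
The post doesn't spell out the method, but to give a flavor of what 'drafting' a structure from a flat term list can mean, here is one plausible approach: embed the terms and cluster them bottom-up. The library choices are illustrative assumptions, not a description of Coreon's implementation.

```python
# Draft term groupings by hierarchical clustering over term embeddings.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import AgglomerativeClustering

terms = ["screen", "monitor", "LCD screen", "HDMI input port",
         "VGA port", "power cable", "on/off button"]

embeddings = SentenceTransformer("all-MiniLM-L6-v2").encode(terms)

# Merge terms bottom-up; the resulting grouping is only a *draft* taxonomy
# for the terminologist to review and curate.
clustering = AgglomerativeClustering(n_clusters=None, distance_threshold=1.0)
labels = clustering.fit_predict(embeddings)

for label, term in sorted(zip(labels, terms)):
    print(label, term)
```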


A boost, not a burden.

So, there it is. Maintaining concept maps doesn't just add ‘additional’ illustrative flavors to your terminology data. In fact it should be an integral part of your process as you look to maximize efficiency.

Want to read more on this? Have a look at Malcolm Chisholm's post where he discusses Types of Concept System: "Concepts do not exist as isolated units of knowledge but always in relation to each other." (ISO 704)

Get in touch with us here to learn more about maintaining concept maps and how they can revolutionize your workflow.

The post Maintaining Concept Maps: A Time-Saver For Terminologists appeared first on .

29 SEP 2021

Winning with LangOps

Winning with Language Operations (LangOps)

In a recent Forbes Technology article, council member Joao Graca states that Language Operations should be the new paradigm in globalization. He hits the nail on the head by saying that serving global markets is no longer about broadcasting translated content, but rather enabling businesses to communicate with stakeholders no matter what language they speak. LangOps is an enterprise function formed of cross-functional and multidisciplinary teams which efficiently operationalize the management of textual data. Neural machine translation (NMT) and multilingual knowledge management are indispensable tools to win, understand, and support global customers.

Release the Machine Translation Handbrake

NMT is approaching human parity for many domains and language pairs thanks to algorithmic progress, computing power, and the availability of data. Yet executives are still asking themselves why these breakthroughs have so far had only marginal effects on translation costs, lag, and quality.

The main reasons for this are a price model still based on translation memory (TM) match categories and the use of the timeworn formula IF Fuzzy < x% THEN MT. In addition, terminology – which is crucial for quality, process, and analytics – often leads a pitiful existence in Excel columns or sidelined term bases. While most focus on how to squeeze the last, rather meaningless drop of BLEU score out of the NMT black box, the real benefits will only be delivered by a LangOps strategy carried out by an automated workflow and reliable resource management.
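
Spelled out in code, that timeworn routing rule is nothing more than a threshold check; the 75% value below is just an example for the 'x%'.

```python
# The 'IF Fuzzy < x% THEN MT' rule as a one-line router.
FUZZY_THRESHOLD = 0.75  # the 'x%'; illustrative value

def route(best_tm_match: float) -> str:
    """Send a segment to MT only when the best TM fuzzy match is below x%."""
    return "MT" if best_tm_match < FUZZY_THRESHOLD else "TM"

print(route(0.60))  # MT
print(route(0.92))  # TM
```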

Language Operations

LangOps is built on software that automates translation and language management. AI and Machine Learning have revolutionized the process, but for many tasks a rule-based approach is still superior. As always in engineering, it’s a question of piecing things together smartly and pragmatically. For example, while NMT is replacing segment-based translation memories, the cheapest and best method will always be the recycling of previously translated content. Terminology is baked into both NMT and TM, and thus easily overlooked. LangOps, on the other hand, elevates terminology to multilingual knowledge. It is used not only for quality estimation and assurance, but also as the key metadata to drive processes. LangOps builds a multilingual data factory optimized for cost, time, and quality needs.

AI with Experts-in-the-Loop

The efficiency of LangOps needs to be complemented by the part of the process which involves humans. LangOps classifies linguistic assets, human resources, workflow rules, and projects in a unified system which is expandable, dynamic, and provides fallback paths. For example, the workflow knows who has carried out a similar project before, who has expertise in a particular domain, or how many hours an expert will typically need for a specific task. LangOps will enable the building of scalable language factories that leave the outdated price-per-word business model in the dust of transactional translations, and will power a move towards cloud-based service levels.

Cut Costs, then Drive the Top-Line

LangOps typically starts with translation because that’s where enterprises have created their linguistic assets. While cutting globalization costs is important, executives are more interested in how LangOps can drive growth.

Machine translation allows enterprises to communicate instantly with their customers. Terminology databases can be upgraded to multilingual knowledge systems (MKS), which allow companies to not only broadcast localized content to global customers, but also actually understand them when they talk back. An MKS not only enables e-Commerce players to deploy language-neutral product search, but is also a proven solution to make data repositories, systems, organizations, and even countries interoperable. It also crucially provides the unified semantics for the Internet of Things. All of these benefits boost LangOps, which owns the normalized enterprise knowledge and is the basis for many critical customer-facing activities such as customer support, chatbots, text analytics, spare part ordering, compliance, and sales.

Get in touch with us here to learn more about how LangOps can also grow your top-line.


21 SEP 2021

Building a Chatbot with Coreon

A chatbot informs or guides human users on a specific topic, but a machine can only ‘know’ what it is taught by humans. This means the chatbot must ‘know’ about the topic and – above all – how to relate this information to the request of the user. Ontologies are a very helpful data source for this because their purpose is to represent knowledge in context.

What is an Ontology?

An ontology is defined as the ‘shared and formal modelling of knowledge about a domain’ (IEC 62656-5:2017-06). It consists of classes (or concepts), relations, instances, and axioms (http://www.cs.man.ac.uk/~stevensr/onto/node3.html). Classes refer to a ‘set […] of entities or 'things' within a domain’. Relations represent the ‘interactions between concepts or a concept's properties’, and instances are ‘the 'things' represented by a concept’. Moreover, axioms are ‘used to constrain values for classes or instances’. This means that axioms define what values a class or instance can or cannot have. Axioms are used to represent additional knowledge that cannot be derived from the classes or instances themselves (e.g., ‘there can be no train connection between Europe and America’).

An ontology can be summarized as a knowledge base consisting of concepts, as well as relations between the concepts and additional information. Ontologies are made machine-readable through standardized ontology languages such as OWL (Web Ontology Language) or RDF. They make it possible for the knowledge represented in an ontology to be understood by machines and programs, such as chatbots.
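
As a small illustration, the rdflib sketch below expresses two related concepts as machine-readable RDF, assuming a SKOS-style modelling; the namespace URI is a placeholder.

```python
# Two concepts and a broader-relation as RDF triples, serialized as Turtle.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, SKOS

BLC = Namespace("https://example.org/blc/")  # placeholder namespace

g = Graph()
mt = BLC["machine-translation"]
pre = BLC["preprocessing"]

g.add((mt, RDF.type, SKOS.Concept))
g.add((mt, SKOS.prefLabel, Literal("Machine Translation", lang="en")))
g.add((pre, RDF.type, SKOS.Concept))
g.add((pre, SKOS.prefLabel, Literal("Preprocessing", lang="en")))
g.add((pre, SKOS.broader, mt))  # 'preprocessing' sits under the MT service

print(g.serialize(format="turtle"))
```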

The Role of Concept Maps

In the Coreon Multilingual Knowledge System, concept maps are built as part of the terminology work. This is a very helpful addition to terminology management because the relations between concepts are captured and displayed next to the concept information. Thanks to this feature, terminologists and experts can define concepts more precisely, as the relationships with and differences to neighboring concepts are crucial factors when settling on a definition. Furthermore, users can access the terminology more easily when they see a concept in context.

Concept maps in Coreon are the perfect base for an ontology, and this is an important advantage when re-using terminology for machine applications. The information stored in concept maps can be exported, analyzed, and used by various machine applications, including chatbots. For exports from Coreon, the proprietary language coreon.xml or the standardized ontology language RDF can be used.

Use Case: A Chatbot for Company Services

In our use case we created a prototype for a chatbot to represent the services of our company, berns language consulting (blc). Its purpose was to lead users on the company website from ‘utterances’ (i.e., questions or messages typed by users) to solutions. If, for example, a customer asks: ‘How do I get a fast translation into 10 languages?’, they are led to the company service ‘Machine Translation’. Not only do customers immediately learn the name of the service, but also additional information about the advantages and different aspects of machine translation. A call to action is also displayed, e.g., an offer to speak directly to a human expert.

We created the chatbot with the programming language Python. As a database we used the export of a concept map we had created in Coreon beforehand. By doing this, we were able to use the concept map as an ontology. In the concept map, we displayed the following concept classes:

  • company service, e.g., machine translation
  • solutions (part of the service, but also concrete solutions for customers’ problems), e.g., preprocessing
  • customer experience, e.g., translation too expensive
  • other concepts, e.g., MT engine

A concept map in Coreon, focusing on blc services and solutions

The aim of the chatbot is to lead customers from their ‘utterance’ to possible solutions and company services. For this, the chatbot extracts keywords from the customer’s typed enquiry and maps them to concepts in the concept map. It then follows the paths in the concept map until it reaches solutions and/or specific company services. The extracted solutions and services determine the chatbot’s answer. To enable the chatbot to understand as many utterances as possible, there should be a large number of concepts related to the range of customer services.
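
As a sketch of that path-following idea, consider the toy graph below: keywords found in the utterance are mapped to concepts, and relations are then followed until solution or service concepts are reached. The mini-ontology is illustrative, not blc's real data.

```python
# Follow ontology paths from matched keywords to solutions and services.
ONTOLOGY = {
    "fast translation": {"leads_to": ["preprocessing"], "class": "customer experience"},
    "preprocessing": {"leads_to": ["machine translation"], "class": "solution"},
    "machine translation": {"leads_to": [], "class": "company service"},
}

def answer(utterance: str) -> list:
    """Map keywords to concepts, then walk 'leads_to' paths to offerings."""
    matched = [c for c in ONTOLOGY if c in utterance.lower()]
    found, stack = [], list(matched)
    while stack:
        concept = stack.pop()
        if ONTOLOGY[concept]["class"] in ("solution", "company service"):
            found.append(concept)
        stack.extend(ONTOLOGY[concept]["leads_to"])
    return found

print(answer("How do I get a fast translation into 10 languages?"))
# -> ['preprocessing', 'machine translation']
```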

A Smarter, More Advanced Chatbot

The major advantage of using an ontology as a database for a chatbot is that it helps the machine to understand the relationships between concepts. Users’ utterances are easily analyzed and mapped to concepts in the ontology, and once an entry point into the ontology is found, related concepts are retrieved and proposed to the user. Another crucial benefit is that a concept map is a controlled database: the administrator can decide which utterances lead to which solutions. Of course, building a concept map as the base for a chatbot entails some manual effort. However, automatic procedures can be included to speed up the terminological work.

A third big advantage is that the ontology can not only be used in this particular use case: in theory, it can be re-used in practically every case where a machine is trying to ‘understand’ a human. Such scenarios include language assistance systems, text generators, classifiers, and intelligent search engines.

Do you have a good use case for starting an ontology, or would you like to start one but don’t know how? Do you need help building an ontology? Contact us, we are happy to help!

https://berns-language-consulting.de/language/en/terminology-ontology/

Jenny Seidel is responsible for terminology management and language quality at berns language consulting (blc). She helps customers set up terminology processes and implement terminology tools for specific use cases. Her recent focus has been the potential of ontologies as a base for Machine Learning.
