MAIA: Multilingual AI Agent Assistants

MAIA (Multilingual AI Agent Assistants) is a collaborative research project led by Unbabel and funded by the Agência Nacional de Inovação (ANI), the Fundação para a Ciência e Tecnologia (FCT), and the CMU-Portugal partnership program.*


Online chat is one of the fastest-growing customer support channels, especially with Millennials. In today’s world, supporting international customers via chat requires hiring native-speaking agents for multiple languages, a scarce and costly resource. With recent advances in language technology, in particular machine translation (MT) and dialogue systems, much of this work could in principle be automated.

Figure 1: Example of a conversation between a Portuguese customer and an English agent where the context is key to translating the phrase “your boss” correctly. To infer that the correct translation is “a sua chefe” (a female boss), the system needs to take into account the first question posed by the customer, which indicates that the boss is female.

Figure 4: Mockup of the user interface to be used by the human agent. Illustrated are the conversation history (in the agent’s language), a list of answer suggestions, a message box supporting auto-completion where the agent can type the response, and an indicator of the customer’s sentiment throughout the conversation.

However, current MT systems are still too brittle and impractical. First, they require too much data and computing power, failing for domains or languages where labeled data is scarce. Second, they do not capture contextual information (e.g., current MT systems work on a sentence-by-sentence basis, ignoring the conversational context). Third, fully automatic systems lack human empathy and fail in unexpected scenarios, often leading to low customer satisfaction.

With MAIA, we’re developing a multilingual conversational platform where AI agents will assist human agents. This approach will overcome the above limitations through:

  • New memory-efficient neural models for context-aware machine translation, suitable for online and real-time translation. These models will retain key aspects of a conversation (e.g., the gender of the customer) and recall them whenever they are needed to translate a message.
  • New answer generation techniques where the human agent (e.g., a tourism officer) will receive suggestions that reduce effort and increase the satisfaction of the customer (e.g., a tourist).
  • New techniques for conversational quality estimation and sentiment analysis to assess how well the conversation is addressing the customer’s needs, while simultaneously increasing “human empathy”.
  • Integration of the scientific advances above into a full end-to-end product.
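
To make the first point concrete, the toy sketch below illustrates the idea of retaining a fact from earlier in a conversation and using it at translation time, following the “your boss” example from Figure 1. It is not MAIA’s model: the ConversationMemory class, the translate_your_boss function, and the lookup table are hypothetical names introduced purely for illustration, and the masculine fallback when no context is available is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ConversationMemory:
    """Toy store for facts picked up earlier in a conversation (hypothetical)."""
    facts: dict = field(default_factory=dict)

    def remember(self, entity: str, attribute: str, value: str) -> None:
        # Record a fact such as ("boss", "gender") -> "female".
        self.facts[(entity, attribute)] = value

    def recall(self, entity: str, attribute: str, default: str = "unknown") -> str:
        # Return the stored fact, or the default if nothing was recorded.
        return self.facts.get((entity, attribute), default)


# Hypothetical lookup table standing in for a real MT model.
YOUR_BOSS_PT = {
    "female": "a sua chefe",
    "male": "o seu chefe",
    "unknown": "o seu chefe",  # assumption: masculine fallback without context
}


def translate_your_boss(memory: ConversationMemory) -> str:
    """Pick the Portuguese rendering of "your boss" using remembered context."""
    gender = memory.recall("boss", "gender")
    return YOUR_BOSS_PT[gender]


memory = ConversationMemory()

# Sentence-by-sentence behaviour: no context is available yet.
without_context = translate_your_boss(memory)

# An earlier customer message revealed that the boss is female.
memory.remember("boss", "gender", "female")
with_context = translate_your_boss(memory)

print(without_context)
print(with_context)
```

A real context-aware system would have to learn which facts to keep and how to use them, rather than rely on a hand-written table. The sketch only shows why a translation made without the conversation history can differ from one made with it.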

*This research is filed under the P2020 program, supervised by ANI under contract number 045909. It runs from 2020 to 2023 within Unbabel, Instituto de Telecomunicações, INESC-ID, and Carnegie Mellon University.