The “why” behind the Multilingual AI Agent Assistants (MAIA) project: A conversation with Dr. Graham Neubig

September 2, 2020

Unbabel recently announced that the company’s AI research team is partnering with Carnegie Mellon University, INESC-ID, and Instituto de Telecomunicações. The goal? To reduce language barriers by making multilingual chat work better.

The MAIA (Multilingual AI Agent Assistants) large-scale research project will augment customer service agents with AI, making it more efficient for enterprises to deliver chat in 30 languages while improving customer satisfaction through human empathy.

We at Unbabel are working closely with universities and researchers like Dr. Graham Neubig on MAIA. We know customer satisfaction requires human empathy, and speaking the customer’s language is a powerful way to demonstrate this.

Recently, we sat down with Dr. Neubig to learn more about his background, what led him to his field of research, why he’s excited about MAIA, and the biggest challenges on the road ahead. Read more below.

Interview with Dr. Graham Neubig of Carnegie Mellon University

Unbabel: How many languages do you speak?

Dr. Graham Neubig: I speak two languages. English is my native language. I learned Japanese by studying it in university and then living in Japan for 11 years. I also can read a bit of Chinese, Korean, and Spanish.

Unbabel: How has your experience led you to the research you do?

Dr. Graham Neubig: When I started university, I was most interested in music processing. I didn’t really get interested in natural language processing until I studied abroad in Japan my junior year and started learning the language. That really sparked my interest, and it has grown from there.

Unbabel: Your areas of research include machine learning approaches that are both linguistically motivated and tailored to applications (such as machine translation and natural language understanding). What does “linguistically motivated” mean to you?

Dr. Graham Neubig: Basically, it means taking into consideration what we know about various aspects of linguistics: phonology, syntax, semantics. For example, we know that all humans produce speech using the articulators (anatomical parts of their mouths and throats). So all speech in different languages is going to be similar in some ways. We also know that all languages have some variety of recursive structure. And we know that languages evolve from each other and naturally share many similarities. We can use all of this linguistic knowledge to build more effective machine learning models.

Unbabel: What drove your interest in the MAIA research project specifically?

Dr. Graham Neubig: I’m very interested in communication in the conversational context. Conversation poses some unique challenges and opportunities. For example, conversations are highly context-dependent. It may not be clear how to translate a particular sentence unless you have a model that can handle the whole, broader context.

Also, you need to hit the appropriate register. In other words, you don’t want to be talking to a business customer using slang or impolite expressions. Plus, conversations have active participants. We may be able to leverage this fact to inform the model about how it should be translating some terms or to provide guidance about when it’s doing well or doing poorly.
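To make the context point concrete, here is a minimal sketch of conversation-aware translation. The translate() helper and the <sep> marker are hypothetical stand-ins, not MAIA’s actual models; the idea is simply that recent turns travel along with the sentence being translated.

```python
# A minimal sketch of context-aware chat translation. translate() is a
# hypothetical placeholder, not MAIA's actual system.

def translate(text: str, src: str, tgt: str) -> str:
    # Stand-in for a real MT model; it just tags the text so the
    # sketch runs end to end.
    return f"[{src}->{tgt}] {text}"

def translate_turn(history: list[str], turn: str, src: str, tgt: str) -> str:
    """Translate the latest turn with recent turns prepended as context.

    "It was supposed to be here Monday" is ambiguous in isolation;
    prepending earlier turns lets a context-aware model resolve pronouns,
    terminology, and register.
    """
    # Join the last few turns with a separator token, a common trick for
    # feeding conversation-level context to a sentence-level MT model.
    context = " <sep> ".join(history[-3:])
    return translate(f"{context} <sep> {turn}", src, tgt)

history = ["Hi, my package never arrived.", "The order number is 12345."]
print(translate_turn(history, "It was supposed to be here Monday.", "en", "pt"))
```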

Unbabel: Why do you think it’s important for academia and industry to partner up when it comes to this type of research?

Dr. Graham Neubig: It’s great to be able to partner with Unbabel! I know and greatly respect many of the technical people they have on the team, and it’s great to see all the success they’ve had through innovation on the business side as well.

On a broader level, joint projects between academia and industry are interesting because industry often has complex, real-world problems that beg to be solved on a faster timeline. In academia, we can often take a slightly more deliberate pace to break problems down to their essence, formulate them cleanly, and come up with long-term solutions. It’s nice to partner up, because it forces us to strike a balance.

Unbabel: Speaking of balance, how do you think about balancing the theoretical with the practical when it comes to your research?

Dr. Graham Neubig: My personal research style is often to start with a practical problem, get a good understanding of the problem, and then try to generalize or simplify the problem setting. Then we can run carefully designed experiments that may allow us to test general theories. In that way, I think I’m on the more practical side as far as academic researchers go. I really like when we can solve big problems that affect many practical situations in one fell swoop.

Unbabel: What are the hottest frontiers in natural language processing right now, in your opinion?

Dr. Graham Neubig: We’re in a very exciting time for NLP right now, with large-scale neural models consistently improving accuracy and expanding the boundaries of what we can do. Long-tail languages are certainly an issue close to my heart, as well as models that are more specific to different topical domains or dialects. In addition, bias, model interpretability, and further expanding the application scenarios of NLP technology are all fascinating topics to me.

Unbabel: What are some of the most difficult languages to build NLP models for? Why?

Dr. Graham Neubig: Given the current technology, languages with fewer resources are hardest to build models for, especially when they are not very similar to other languages that have more resources. For languages that have few resources but are lexically or syntactically similar to another language with more resources, we now have reasonably good tools: we can use data from the higher-resourced language to improve accuracy on the lower-resourced one. There are certainly some features of languages that make them a bit harder to model (e.g., complicated morphology, or information left implicit). However, it seems that, with our current models, the presence or absence of lots of data really is the overwhelming factor.
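As an illustration of that transfer recipe, here is a toy sketch: pretrain on a related high-resource language pair, then fine-tune on the low-resource pair. The train() and fine_tune() helpers (and the corpora) are hypothetical stand-ins, not a real NMT toolkit.

```python
# A toy sketch of cross-lingual transfer for low-resource MT.
# train() and fine_tune() are hypothetical stand-ins for a real toolkit.

def train(parallel_data: list[tuple[str, str]], epochs: int) -> dict:
    # Stand-in for training an MT model from scratch.
    return {"seen_pairs": len(parallel_data) * epochs}

def fine_tune(model: dict, parallel_data: list[tuple[str, str]], epochs: int) -> dict:
    # Stand-in for continuing training of an existing model on new data.
    model["fine_tuned_pairs"] = len(parallel_data) * epochs
    return model

# Plenty of data for a high-resource pair, e.g. Spanish-English
# (toy placeholder corpus) ...
high_resource = [("hola", "hello")] * 1_000_000
# ... and very little for a related low-resource pair, e.g. Galician-English.
low_resource = [("ola", "hello")] * 10_000

# 1) Pretrain on the high-resource pair: the model picks up lexical and
#    syntactic regularities shared with the related language.
model = train(high_resource, epochs=10)

# 2) Fine-tune on the low-resource pair: far less data is needed because
#    much of what was learned transfers.
model = fine_tune(model, low_resource, epochs=3)
```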

Unbabel: What real-world applications of NLP do you find the most interesting or important and why?

Dr. Graham Neubig: In terms of importance, there are the widely used ones, such as machine translation, speech recognition, chatbots, and question answering. But I’m also interested in some more esoteric or emerging applications, such as natural language programming (where humans command computers in natural language rather than code), and language learning applications for low-resource and/or endangered languages.

The MAIA Project: Borderless customer conversations

Over the course of 36 months, the research team proposes to build a toolbox of machine learning technologies for online multilingual customer service, which includes context-aware machine translation, automatic answer generation, and conversational quality estimation.
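To picture how those three components might fit together in a single agent-assistant turn, here is a minimal sketch. Every name and interface below is a hypothetical illustration, not Unbabel’s actual design.

```python
# A minimal sketch of one agent-assistant step composing the three MAIA
# components; all names and interfaces here are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Conversation:
    turns: list[str] = field(default_factory=list)

def translate_with_context(conv: Conversation, text: str, tgt: str) -> str:
    # Context-aware MT: translate a message using the dialogue so far (stub).
    return f"[{tgt}] {text}"

def suggest_answers(conv: Conversation) -> list[str]:
    # Automatic answer generation: draft replies for the human agent (stub).
    return ["Thanks for reaching out! Could you share your order number?"]

def estimate_quality(conv: Conversation) -> float:
    # Conversational quality estimation: score the whole dialogue (stub).
    return 0.9

# One step of the loop: the customer writes in Portuguese
# ("Hi, my order hasn't arrived."), and the components assist the agent.
conv = Conversation()
customer_msg = translate_with_context(conv, "Olá, o meu pedido não chegou.", "en")
conv.turns.append(customer_msg)
drafts = suggest_answers(conv)                 # agent picks and edits a draft
reply = translate_with_context(conv, drafts[0], "pt")
conv.turns.append(reply)
print(estimate_quality(conv))                  # flag low-quality conversations
```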

This will significantly expand the current portfolio of Unbabel AI technologies, creating a new model: an agent assistant that will facilitate communication between human agents and international customers, making live chat customer service platforms more multilingual, scalable, and capable of ensuring higher customer satisfaction.

It will also help agents with new dialogue-oriented productivity tools, and expand Unbabel’s renowned quality estimation technology to assess conversational quality.

We look forward to reporting back on our progress together along the way.


About the Author

André Martins

André Martins is the VP of AI Research at Unbabel, an Associate Professor at IST, and a researcher at IT. He received a dual-degree PhD (2012) in Language Technologies from Carnegie Mellon University and IST. His PhD thesis received an Honorable Mention in CMU's SCS Dissertation Award and the Portuguese IBM Scientific Prize. His research interests include natural language processing (NLP), machine learning, structured prediction, and sparse modeling, in particular the use of sparse attention mechanisms to induce interpretability in deep learning systems. He co-founded and co-organizes the Lisbon Machine Learning School (LxMLS, 2011–2019). He received a best paper award at ACL 2009 and a best system demonstration paper award at ACL 2019. He recently won an ERC Starting Grant for his DeepSPIN project (2018–2023), whose goal is to develop new deep learning models and algorithms for structured prediction in NLP applications.