Research at Unbabel

Where machine learning meets human ingenuity

Our research team is advancing the state of the art — and changing the way humans work for the better.

Leading voices in the field

At Unbabel, we’ve built up a team of area experts from around the world, with particular strengths in natural language processing.

João Graça

Founder and CTO

André Martins

VP of Artificial Intelligence Research

Paulo Dimas

VP of Product Innovation

Alon Lavie

VP of Language Technologies

Recent publications

We support our researchers as they place their innovative work in top publications. You can see some recent highlights from our team below.

2020

COMET: A Neural Framework for MT Evaluation

  • Ricardo Rei
  • Craig Stewart
  • Ana C Farinha
  • Alon Lavie

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) 2020

Document-level Neural MT: A Systematic Comparison

Proceedings of the 22nd Annual Conference of the European Association for Machine Translation (EAMT) 2020

Learning Non-Monotonic Automatic Post-Editing of Translations from Human Orderings

Proceedings of the 22nd Annual Conference of the European Association for Machine Translation (EAMT) 2020

Automatic Truecasing of Video Subtitles Using BERT: A Multilingual Adaptable Approach

  • Ricardo Rei
  • Nuno Miguel Guerreiro
  • Fernando Batista

Information Processing and Management of Uncertainty in Knowledge-Based Systems. IPMU 2020. CCIS, volume 1237 2020

A Post-Editing Dataset in the Legal Domain: Do we Underestimate Neural Machine Translation Quality?

  • Julia Ive
  • Lucia Specia
  • Sara Szoc
  • Tom Vanallemeersch
  • Joachim Van den Bogaert
  • Eduardo Farah
  • Christine Maroti
  • Artur Ventura
  • Maxim Khalilov

Proceedings of the 12th Language Resources and Evaluation Conference (LREC) 2020

Projects

All our research is designed for maximum impact. For our contributions to natural language processing and applied AI, we’ve received numerous awards.

COMET

COMET (Crosslingual Optimized Metric for Evaluation of Translation) is a new neural framework for training multilingual machine translation (MT) evaluation models. COMET predicts human judgments of MT quality. A ready-to-use trained COMET model is available as open source to benefit the wider MT R&D community.

MAIA

MAIA will employ cutting-edge machine learning and natural language processing technologies to build multilingual AI agent assistants, eliminating language barriers. MAIA's 'translation layer' will empower human agents to provide customer support in real-time, in any language, with human quality.

User-Focused Marian

This project improves the existing neural machine translation toolkit “Marian” to address the needs of CEF eTranslation and to broaden its user base (H2020 Co-Funded Project). The focus areas in this Marian iteration are terminology, on-the-fly domain adaptation, better documentation, and GPU optimization.

MT4ALL

MT4ALL aims to build data for under-resourced languages in fields of public interest, such as health and justice. It will contribute to the CEF Automated Translation building block by extending its coverage to language pairs and domains for which parallel data does not exist (H2020 Co-Funded Project).

Unbabel4EU

We’re working on advancing European language engines for borderless business communication. Specifically, the goal is to create Europe’s Translation Layer by enabling seamless, human-quality translation between any pairing of the 24 official languages of the EU, across content types such as email, chat, and listings (P2020 Co-Funded Project).

OpenKiwi

Quality estimation (QE) is one of the challenges in MT: it assesses the quality of a system’s output without access to reference translations. We released OpenKiwi, a PyTorch-based open-source framework that implements the best QE systems from the WMT 2015-18 shared tasks. The accompanying paper won the best system paper award at ACL 2019.

APE-QUEST

APE-QUEST sets up a quality gate and crowdsourcing workflow to improve translation quality in specific domains. It boosts CEF eTranslation with Automated Post-Editing (APE) and Quality Estimation (QE) for the Electronic Exchange of Social Security Information (EESSI) and Online Dispute Resolution (ODR) DSIs and related national services (H2020 Co-Funded Project).

INTERACT - International Network On Crisis Translation

Timely and accurate communication is essential for crisis management, but what if the only information available to you is in a language you cannot understand? Created to answer the need for quality translation in health-crisis scenarios, INTERACT is an interdisciplinary European project.

Unbabel Scribe

Transcription can be a big piece of translation flows, especially when it comes to audiovisual content. This project aims to research & develop a technical solution for automatic transcription and translation of audiovisual content by leveraging a community of human translators (P2020 Co-Funded Project).

DeepSPIN

Deep learning is revolutionizing the field of Natural Language Processing (NLP), with breakthroughs in machine translation, speech recognition, and question answering. New language interfaces (digital assistants, messenger apps, customer service bots) are emerging as the next technologies for seamless, multilingual communication among humans and machines.

Most Innovative Company

2017, 2015

Best Global Machine Translation Quality Estimation System

2019, 2016

Blending Human & Artificial Intelligence, in partnership with Concentrix, UK

2019

CBInsights AI List of Most Innovative Artificial Intelligence Startups for Disruptive Technology of the Year

2019

Want to do research with us?

Our AI team is growing — and we’re looking for candidates with unique skills and stories.