Translational Research

Unmatched technology + unparalleled research

We’ve assembled a diverse team of experts in the AI and NLP fields. Their unparalleled research and award-winning breakthroughs continue to set industry standards, taking us closer to our vision: creating a world without language barriers.

Leading voices in the field

Our research team is changing the way humans communicate

João Graça
Co-Founder and Chief Technology Officer
André Martins
VP of Research
Paulo Dimas
VP of Product Innovation

José Souza
Staff AI Research Scientist
M. Amin Farajian
Senior AI Research Scientist
Fabio Kepler
Senior AI Research Scientist
Ricardo Rei
Senior AI Research Scientist
Catarina Farinha
AI Research Manager
João Alves
AI Research Scientist
Daan van Stigt
Senior AI Research Scientist
Maria Ana Henriques
R&D Project Manager
Nuno André
Senior Grants Coordinator
Vera Cabarrão
Senior AI Quality Manager
Marianna Buchicchio
Senior Manager AI Quality
João Godinho
Senior AI Research Engineer
Pedro Mota
Senior AI Research Engineer
Nuno Guerreiro
AI Research Scientist
José Pombal
AI Research Scientist
Pedro Martins
Senior AI Research Scientist
Muhammad Bilal
Senior Backend Engineering Manager
Ana Oliveira António
Junior Communications and Project Manager
António Novais
Junior Grants Coordinator

Projects

Center for Responsible AI
The Center for Responsible AI is one of the largest centers dedicated to Responsible AI. It brings together ten startups, eight research centers, a law firm, and five industry leaders, which will collaborate to develop 21 innovative AI products grounded in Responsible AI principles such as equity, explainability, and sustainability. The center is co-funded by the Portuguese PRR.
UTTER
UTTER – Unified Transcription and Translation for Extended Reality – is a collaborative Research and Innovation project funded under Horizon Europe that aims to leverage large language models to build the next generation of multimodal eXtended reality (XR) technologies for transcription, translation, summarisation, and minuting. UTTER’s use-case prototypes will cover (i) a personal assistant for meetings that can improve communication in the online world and (ii) an advanced customer service assistant to support global markets.
QUARTZ
QUARTZ (“Quality-Aware Machine Translation”) is a cutting-edge research project funded by the ELISE Open Call to build Responsible MT for conversational data: high-quality MT that unlocks new markets where critical MT errors cannot be tolerated.
MAIA
MAIA will employ cutting-edge machine learning and natural language processing technologies to build multilingual AI agent assistants, eliminating language barriers. MAIA’s ‘translation layer’ will empower human agents to provide customer support in real-time, in any language, with human quality.
User-Focused Marian
This project improves the existing neural machine translation toolkit “Marian” to address the needs of CEF eTranslation and to broaden its user base (H2020 Co-Funded Project). This Marian iteration focuses on terminology, on-the-fly domain adaptation, better documentation, and GPU optimization.
MT4ALL
MT4ALL aims to build data for under-resourced languages in fields of public interest, such as Health and Justice. It will contribute to the CEF Automated Translation building block by extending its coverage to language pairs and domains for which parallel data does not exist (H2020 Co-Funded Project).
Unbabel4EU
We’re working on advancing European language engines for borderless business communication. Specifically, the project aims to create Europe’s Translation Layer by enabling seamless human-quality translation between any pair of the 24 official languages of the EU, across content types such as email, chat, and listings (P2020 Co-Funded Project).
APE-QUEST
APE-QUEST sets up a quality gate and crowdsourcing workflow to improve translation quality in specific domains. It boosts CEF eTranslation with Automated Post-Editing (APE) and Quality Estimation (QE) for the Electronic Exchange of Social Security Information (EESSI) and Online Dispute Resolution (ODR) DSIs and related national services (H2020 Co-Funded Project).
INTERACT
Timely and accurate communication is essential for crisis management, but what if the only information available to you is in a language you cannot understand? Created to answer the need for quality translation in health-crisis scenarios, INTERACT is an interdisciplinary European project.
Unbabel Scribe
Transcription can be a big piece of translation flows, especially when it comes to audiovisual content. This project aims to research & develop a technical solution for automatic transcription and translation of audiovisual content by leveraging a community of human translators (P2020 Co-Funded Project).
DeepSPIN
Deep learning is revolutionizing the field of Natural Language Processing (NLP), with breakthroughs in machine translation, speech recognition, and question answering. New language interfaces (digital assistants, messenger apps, customer service bots) are emerging as the next technologies for seamless, multilingual communication among humans and machines.
Unbabel’s Internationalization Plan
Unbabel’s Internationalization Plan (“Unbabel 2017-2019: Plano de Internacionalização”) is a project led by Unbabel and co-funded by Portugal 2020 – Sistema de Incentivos à Internacionalização das PME.
Unbabel 2017: A new ecosystem of Machine + Crowd Translation
“Unbabel 2017: A new ecosystem of Machine + Crowd Translation” is a project led by Unbabel and co-funded by Portugal 2020 – Sistema de Incentivos à Investigação e Desenvolvimento Tecnológico (SI I&DT).

Open-source tools

XCOMET
XCOMET is a cutting-edge, open-source metric designed to be more interpretable and better aligned with MQM human evaluations. XCOMET combines sentence-level evaluation, similar to neural metrics such as COMET and BLEURT, and error span detection capabilities.
TowerLLM
TowerLLM is a suite of multilingual large language models (LLMs) optimized for translation-related tasks, from pre-translation through translation to evaluation, such as machine translation (MT), automatic post-editing (APE), and translation ranking. Tower is built on top of LLaMA2 [1], comes in two sizes (7B and 13B parameters), and currently supports 10 languages.
CometKiwi XL
CometKiwi XL is a large language model (LLM) specialized in predicting the quality of a translation. CometKiwi XL (3.5B) and CometKiwi XXL (10.7B) are the open-sourced versions of our state-of-the-art Quality Estimation model.
MT-Telescope
MT-Telescope provides a fine-grained, visual comparison of the quality performance of two machine translation (MT) systems. It lifts the hood on the automatic quality score, allowing users to filter quality performance by keywords, terminology, and segment length. MT-Telescope is available as open-source to benefit the wider MT R&D community.
COMET
COMET (Crosslingual Optimized Metric for Evaluation of Translation) is a new neural framework for training multilingual machine translation (MT) evaluation models. COMET predicts human judgments of MT quality. This “ready to use” trained COMET model is available as open-source to benefit the wider MT R&D community.
OpenKiwi
Quality estimation (QE) is one of the challenges in MT: it evaluates a system’s quality without access to reference translations. We released OpenKiwi, a PyTorch-based open-source framework that implements the best QE systems from the WMT 2015-18 shared tasks. The accompanying paper won the Best System Demonstration Award at ACL 2019.

Awards

Eighth Conference on Machine Translation
Winners of the WMT 2023 QE Shared Task
Eighth Conference on Machine Translation
Winners of the WMT 2023 Metrics Shared Task
COMET22
Winning submission for the Chinese-English language pair and second-best submission for the other two language pairs in the WMT 2022 Metrics shared task
CometKiwi
Winning submission of the WMT 2022 Quality Estimation (QE) shared task
Best Presentation award for the Users and Providers track
AMTA Conference 2022
Best Paper Award
EAMT Conference 2022 (Title: Searching for Cometinho: The Little Metric That Could)
Most Innovative Company
Most Innovative Company (at the Game Changer Innovation Contest), TAUS (Translation Automation User Society), 2015, 2017
Best Global Machine Translation Quality Estimation System
WMT – Conference on Machine Translation, 2016, 2019
Best Global Machine Translation Automatic Post-Editing System
WMT – Conference on Machine Translation, 2019
Best System Demonstration Award
Association for Computational Linguistics, 2019
Blending Human & Artificial Intelligence
Blending Human & Artificial Intelligence, in partnership with Concentrix, UK, National Innovation Awards, 2019
Best Innovation in Customer Service
Best Innovation in Customer Service, in partnership with Concentrix, ECCCSA – European Contact Center and Customer Service Awards, 2019
Best use of AI and associated technologies
Best use of AI and associated technologies, in partnership with Microsoft, ECCCSA – European Contact Center and Customer Service Awards, 2019
Most Innovative Artificial Intelligence Startups for Disruptive Technology
List of Most Innovative Artificial Intelligence Startups for Disruptive Technology of the Year, CBInsights, 2019
Most Innovative Companies
Fast Company’s Annual List of the World’s Most Innovative Companies for 2020, Fast Company, 2020
Product of the Year Award Winner
Product of the Year Award Winners, presented by CUSTOMER magazine, 2021
Best Explainability Approach Award
Workshop on Evaluation & Comparison of NLP Systems, Co-located at EMNLP, 2021

Publications

Tower: An Open Multilingual Large Language Model for Translation-Related Tasks

Duarte M. Alves, José Pombal, Nuno M. Guerreiro, Pedro H. Martins, João Alves, Amin Farajian, Ben Peters, Ricardo Rei, Patrick Fernandes, Sweta Agrawal, Pierre Colombo, José G.C. de Souza, André F.T. Martins | COLM

Steering Large Language Models for Machine Translation with Finetuning and In-context Learning

Duarte M. Alves, Nuno M. Guerreiro, João Alves, José Pombal, Ricardo Rei, José G. C. de Souza, Pierre Colombo, André F. T. Martins | Findings of EMNLP 2023

Scaling up CometKiwi: Unbabel-IST 2023 Submission for the Quality Estimation Shared Task

Ricardo Rei, Nuno M. Guerreiro, José Pombal, Daan van Stigt, Marcos Treviso, Luisa Coheur, José G. C. de Souza, André Martins | WMT 2023

xCOMET: Transparent Machine Translation Evaluation through Fine-grained Error Detection

Nuno M. Guerreiro, Ricardo Rei, Daan van Stigt, Luisa Coheur, Pierre Colombo, André F.T. Martins | Association for Computational Linguistics

CroissantLLM: A Truly Bilingual French-English Language Model

Manuel Faysse, Patrick Fernandes, Nuno M. Guerreiro, António Loison, Duarte M. Alves, Caio Corro, Nicolas Boizard, João Alves, Ricardo Rei, Pedro H. Martins, Antoni Bigata Casademunt, François Yvon, André F.T. Martins, Gautier Viaud, Céline Hudelot, Pierre Colombo | TMLR

Hallucinations in Large Multilingual Translation Models

Nuno M. Guerreiro, Duarte Alves, Jonas Waldendorf, Barry Haddow, Alexandra Birch, Pierre Colombo, André F. T. Martins | Transactions of the Association for Computational Linguistics (TACL)

Uncertainty-Aware Machine Translation Evaluation

Taisiya Glushkova, Chryssa Zerva, Ricardo Rei, André Martins | Findings of the Association for Computational Linguistics: EMNLP 2021

Measuring and Increasing Context Usage in Context-Aware Machine Translation

Patrick Fernandes, Kayo Yin, Graham Neubig, André Martins | 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Smoothing and Shrinking the Sparse Seq2Seq Search Space

Ben Peters, André Martins | 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Document-level Neural MT: A Systematic Comparison

António Lopes, Amin Farajian, Rachel Bawden, Michael Zhang, André Martins | 22nd Annual Conference of the European Association for Machine Translation

Translator2Vec: Understanding and Representing Human Post-Editors

António Góis, André Martins | Machine Translation Summit XVII: Research Track
