Lisbon, Portugal and San Francisco, USA: Unbabel, the leading translation scaleup, announces the release of the open-source edition of its award-winning quality estimation and automatic post-editing suite to the global technology community.
Since 2016, Unbabel’s AI team has been focused on continually improving Quality Estimation (QE). An active part of the QE research community, Unbabel’s teams have participated in, won, and co-organized various QE shared tasks at the Conference on Machine Translation (WMT), and last year organized the first workshop on QE and Automatic Post-Editing at AMTA to discuss the future of the field.
In an effort to make AI research reproducible, Unbabel decided to make its QE system available to external researchers in the form of OpenKiwi, a PyTorch-based open-source framework that implements the best Quality Estimation systems from the WMT 2015–18 shared tasks, with added improvements.
The AI-human framework will be used as a baseline system, enabling businesses to provide fast, accurate translations at the scale of machine translation.
“Quality estimation has already proven itself in terms of reducing the time and costs associated with post-editing, and we want to share this toolset with the rest of the world so that teams can contribute to the global development of QE,” says Unbabel co-founder and CTO João Graça. “We’re excited to be examining the new issues presented to QE and automatic post-editing by neural machine translation, and we look forward to feedback from the global QE community.”
OpenKiwi is implemented in Python using PyTorch as its deep learning framework, and has a user-friendly API that can be imported as a package in other projects or run from the command line. With this release, teams taking part in the shared tasks of WMT19, the Fourth Conference on Machine Translation, can use OpenKiwi to examine automatic methods for estimating the quality of machine translation output at run-time, covering estimation at various levels and studying the performance of quality estimation approaches on the output of neural machine translation systems.
“Over the last decade, artificial intelligence applications such as machine translation have helped break down language barriers, both for consumers but also for enterprises,” says Christian Federmann, Senior Data Scientist, Microsoft Translator and Research Director, Association for Machine Translation in the Americas (AMTA). “Faced with an increasing amount of machine translated content, there is a growing need for quality estimation to identify which content may be ready for publication, and which may still need human refinements. This process is at the core of Unbabel’s business and has resulted in the creation of OpenKiwi which will now be released publicly, under an open-source license. This release will benefit both machine translation researchers and translation business alike, enabling them to integrate more machine translation into their workflows, at a higher quality, further expanding personal and professional communication capabilities.”
OpenKiwi is an open-source project hosted on GitHub at https://github.com/Unbabel/OpenKiwi. Its features include:
- Implementation of state-of-the-art QE systems which won WMT shared tasks in 2015–2018.
- Implemented in Python using PyTorch as the deep learning framework.
- Easy-to-use API: can be imported as a package in other projects or run from the command line.
- Ability to train new QE models on new data.
- Ability to use pre-trained QE models on data from the WMT 2018 campaign.
- Easy to track and reproduce experiments via YAML configuration files.
- Open-source license.
Unbabel enables modern enterprises to serve customers in their native languages, with always-on, scalable translation across digital channels.
Powered by AI and refined by a global community of translators, Unbabel combines the speed and scale of machine translation with the authenticity that can come only from a native speaker.
Unbabel has raised over $31M in funding and has over 200 employees across its Lisbon headquarters and offices in San Francisco, New York, and Pittsburgh. Leading brands like Facebook, Microsoft, Booking.com, and easyJet use Unbabel to make their customers happier and their support operations vastly more efficient.
For interview and commentary requests, please contact the following Unbabel media contacts:
UK & Europe: Edward Clark: +44 (0)203 697 6680, firstname.lastname@example.org
US & Canada: Jennifer Reid: +1 778-772-0754, email@example.com