Lights, camera, captions: the making of Unbabel Video

July 4, 2019

As audiences and the content they consume have become more international, subtitles have become an essential part of good user experience. 80% of YouTube views are from outside the US, and only 25% of internet users speak English as a first language. Not only that, 85% of video on Facebook is watched without any sound, which makes subtitles critical in meeting your audience where they are. If you only make content available in a single language, you’ll leave viewers — along with revenue from advertising or subscriptions — on the table for someone else to take.

Not only do subtitles mean viewers can watch your content, they also make it much easier for them to find it. YouTube’s search algorithm hugely favors videos with subtitles because it is much more efficient at analyzing and categorizing text than video. This means more viewers and more revenue across languages for content creators.

The ease of analyzing text compared to video doesn’t only matter for SEO. It’s impossible to mine video for data and insights in the same way as text, so relying on low-quality captions and subtitles makes it impossible for humans or AI to draw insights from interviews, focus groups or video diaries.

Current methods for subtitling video aren’t just slowing down marketing intelligence firms. Even media companies that are already localization experts struggle to transcribe and translate video content efficiently, relying on unreliable machine-only translation or professional translation teams that are difficult to scale. This often means multiple agencies or teams, along with longer turnaround times. More time translating also means less time making content.

The challenges of delivering multilingual subtitles

Daavid Kahn, Founder of Instapanel, highlights: “The five percent error rate that characterizes most transcription services might not seem like a big deal at first. But that error multiplies and cascades through all steps, and it gets out of hand.”

This risk of multiplying and cascading errors is a major one facing multilingual subtitles. There are free, entirely automated options for creating subtitles in English. YouTube even has one integrated into its player. However, academic research has found that it has high error rates, which are even worse when an accent is not American, and that it also has higher error rates for women’s voices than men’s. (I have a Glaswegian colleague who is particularly outraged that Scottish accents have the highest error rate of all.) A bigger indicator of the issues with these captions is that Google and YouTube won’t use them for search indexing because the quality isn’t high enough.

If the original language captions are bad, even the best translation will be fatally flawed. And if you use pure Machine Translation (MT), then the problems will be compounded, leaving your audience with unreadable subtitles — making the whole exercise pointless.

The alternative is human transcription and translation, which provides a high quality result, but is very expensive and is often a huge challenge to manage at scale.

Translation is what we do at Unbabel, and when we realized the obstacles to creating native-quality multilingual subtitles, we saw a clear challenge we could use our experience and technology to solve.

Introducing Unbabel Video

That solution is Unbabel Video — bringing the power of Unbabel’s human+AI translation pipeline to deliver native-quality captions and multilingual subtitles across all kinds of video content.

Unbabel Video allows users to quickly and easily add high-quality multilingual subtitles to their videos. Simply paste a link to a video from YouTube or your own storage and select the languages that you need. We then create the subtitles using our world-class machine translation engines and our global community of multilingual translators, and soon they’ll be ready for you to share your content in all the languages that you need.

Unbabel Video brings native quality multilingual subtitles to your video quickly, easily and at the scale you need.

Rewind: how we got here

Unbabel Video is the culmination of years of testing and research into optimizing our AI+human translation model for video. The project began with my team in Labs, Unbabel’s in-house innovation team whose projects test the limits of what’s possible.

We built Unbabel Cast — an app that allowed you to record yourself, immediately translate what you said into multiple languages, and upload it to social media. The app worked really well — and our CEO and co-founder Vasco Pedro proudly showed it off to the world on stage at Pioneers 17.

As much as we loved using Cast, we decided that building a platform for creating high-quality captions across languages at enterprise scale was a much more exciting challenge for us to solve – so we got to work.

The Video team

To bring this concept to reality, Unbabel Video left Labs to become its own product. We created a dedicated team for this product, bringing together Product Management, Applied AI, Community Management, Linguistic Services, Product Design, Product Developers and Product Marketing into a single cohesive team focused on building, maintaining, and expanding the product. The team came to Lisbon from all over the world, with roots in Canada, Brazil and the US in the west, and as far east as Armenia.

The team has built new tools and interfaces that let our translator community work on videos as quickly and accurately as possible, and has found new ways to help manage video translation at scale.

The Unbabel Video team in June 2018

How it works behind the scenes

Like Unbabel’s Customer Service Solution, Unbabel Video combines cutting-edge Neural Machine Translation (NMT) with a human community of freelance editors to achieve fast, native-quality translations.

When we create captions, the first draft is made by our industry-leading Neural Machine Translation. Then our community corrects any errors against the video to ensure it’s as accurate as possible. Not only does this create very high-quality captions, it also gives us the best possible base for translations by eliminating errors that would be exacerbated later in the process.

In the translation stage, our MT is presented with the original language captions and the video to ensure that the translations are accurate and remain synched in the new language.
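
To make the two-stage flow concrete, here is a minimal sketch of the idea in Python. It is only an illustration of the pattern described above, not Unbabel’s actual pipeline; the mt_draft and human_review functions are hypothetical placeholders for the NMT engine and the community post-editing step, and segment timings are carried through unchanged so the subtitles stay in sync.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds
    end: float    # seconds
    text: str     # caption text for this time span


def mt_draft(text: str, target_lang: str) -> str:
    # Placeholder for a neural MT call; a real system would invoke an NMT engine here.
    return f"[{target_lang}] {text}"


def human_review(segment: Segment, draft: str) -> str:
    # Placeholder for the community post-editing step, where an editor
    # corrects the machine draft against the video.
    return draft


def translate_captions(captions: list[Segment], target_lang: str) -> list[Segment]:
    # Stage 1 (upstream, not shown): the source captions have already been
    # machine-drafted and human-corrected.
    # Stage 2: machine-translate each corrected segment, then have a human
    # editor fix any errors while keeping the original timing.
    translated = []
    for seg in captions:
        draft = mt_draft(seg.text, target_lang)
        final = human_review(seg, draft)
        translated.append(Segment(seg.start, seg.end, final))
    return translated


if __name__ == "__main__":
    captions = [
        Segment(0.0, 2.5, "Welcome to the show."),
        Segment(2.5, 5.0, "Today we talk about subtitles."),
    ]
    for seg in translate_captions(captions, "pt"):
        print(f"{seg.start:5.1f}-{seg.end:5.1f}  {seg.text}")
```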

To improve speed and accuracy, we have built entirely new tools that blend the productivity mechanisms of a text editor with an intuitive video player, which allows our editors to focus on their work, not learning to use the tools. We have tested and iterated these interfaces repeatedly, and we’ve made incredible gains in reducing the time it takes our community to caption or translate each minute of footage. This means faster and better results for our clients, and happier translators.

Using Unbabel Video

Unbabel Video was a chance to build a video translation platform from the ground up to be easy to use at any volume. This is why we have integrated with YouTube to allow users to simply paste a link to their videos rather than upload directly to our servers. It also works if the video is hosted on the client’s own servers.

In addition, we have built a robust and intuitive API and are open to integrating with any digital asset management platform that our customers use in order to make their workflows as easy as possible.
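
As a rough sketch of what such an integration might look like from the client side, the Python snippet below submits a video link and a set of target languages over HTTP. The endpoint URL, payload fields and authentication scheme are hypothetical placeholders invented for illustration; they are not the documented Unbabel Video API.

```python
import requests

# Hypothetical endpoint and payload, for illustration only; the real
# Unbabel Video API may use different routes, field names, and auth.
API_URL = "https://api.example.com/video/jobs"
API_KEY = "your-api-key"

payload = {
    "video_url": "https://www.youtube.com/watch?v=VIDEO_ID",  # YouTube link or your own hosting
    "source_language": "en",
    "target_languages": ["pt", "es", "de"],
    "output_format": "srt",
}

response = requests.post(
    API_URL,
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
response.raise_for_status()
print(response.json())  # e.g. a job id you could poll for the finished subtitle files
```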

We’ve also developed archivable project files that allow our clients to manage their current projects while also having access to every translation they’ve requested. This was based on feedback from our development clients, who started to have issues managing large numbers of videos.

Quality control

As with Unbabel’s customer service products, we regularly assess the quality of translations to ensure that we are delivering high quality work to our clients. However, moving from purely text translation to video also presented challenges for how we assess translation quality.

To ensure consistent quality, we worked with a community of linguists to better understand the errors that are more likely to occur when translating subtitles compared to other kinds of content. This helps us efficiently flag segments for review, assign them to our human translators, and more accurately assess their work. We conduct regular assessments of every translator in our community, and regular quality audits for every client, to ensure we always have the fullest possible picture of our translation quality.
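
As a toy illustration of segment-level flagging, the sketch below routes low-confidence segments to human review based on a per-segment quality-estimation score. The scoring field and the threshold are assumptions made for the example, not Unbabel’s actual criteria.

```python
from typing import NamedTuple


class ScoredSegment(NamedTuple):
    index: int
    text: str
    qe_score: float  # automatic quality estimate in [0, 1]; higher means more confident


def flag_for_review(segments: list[ScoredSegment], threshold: float = 0.75) -> list[int]:
    # Return the indices of segments whose estimated quality is too low,
    # so they can be prioritized for a human translator to review.
    return [s.index for s in segments if s.qe_score < threshold]


segments = [
    ScoredSegment(0, "Bem-vindo ao programa.", 0.93),
    ScoredSegment(1, "Hoje falamos sobre legendas.", 0.62),
]
print(flag_for_review(segments))  # -> [1]
```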

Fast forward to now

Unbabel Video isn’t a crazy idea for an app or an experiment anymore — it’s a highly effective product that has become a key tool for companies including Living Lens and Instapanel, who have combined our translation with their innovative machine learning to provide global insights to their clients. Great Big Story has been able to bring their amazing videos to audiences across Latin America at a much greater scale than they had been able to before.

That’s why Unbabel Video is now in an Early Access Program (EAP): to find a further select group of clients who are eager to make their mark on the product and work with real-world use cases and volumes.

We’re particularly looking to work with media companies who either want to expand the reach of their content by offering multilingual subtitles, or those who already do this with a localization team and could use Unbabel Video to drive efficiency and increase output.

The other key use case for the EAP is market research. Unbabel Video allows researchers to draw insight from testimony in every language and give their clients a truly global perspective on what customers think about their products.

However, if you’re in another sector and think that Unbabel Video would work for you, please don’t hesitate to apply to be part of the program. We’re always excited to talk about new use cases and bringing multilingual video to new areas.

Being part of the EAP means you’ll be able to work closely with our product team to shape the product and its features to suit your workflow and use case, as well as benefit from pricing discounts. You also get a head start on the competition.

If you’re interested, you can apply to be part of the Early Access Program here, and our team will be in touch to talk more. But be warned, we’re really excited about Unbabel Video so we may be talking for a while…


About the Author

Paulo Dimas

Paulo Dimas is the VP of Product Innovation at Unbabel, helping to build the world’s translation layer by combining AI with a global community of human translators. Joining when Unbabel was a 12-person team, Paulo has helped Unbabel grow through 3 series of funding, totalling $88 million, by creating new game-changing AI products. His passion for startups and products led him to co-found two startups and, at 14 years of age, develop and launch his first commercial product.