Brought to you by Unbabel

Life imitating art: the quest for a universal translator

Despite technology’s stubbornness in not giving us hoverboards, time travelling DeLoreans or pizza hydrators, there has been a lot that real life has borrowed from science fiction.

(Please note: this is not a hoverboard)

In the 19th century, visionary author Jules Verne described global networks, newscasts and videoconferences, sent us to the ocean’s depths in submarines (1870) and to the moon in projectile lunar modules (1865).

H.G. Wells devised a Heat-Ray much like our contemporary lasers in The War of the Worlds (1898), audio books and email in Men Like Gods (1923), and atomic bombs in The World Set Free (1914). Credit cards were first seen in an 1888 socialist utopian novel by Edward Bellamy.

Supercomputers have also been around for a while. In the iconic 2001: A Space Odyssey, based upon a short story by Arthur C. Clarke, HAL 9000 is a sentient computer capable of speech recognition, natural language processing, automated reasoning and, eventually, murder: “I’m sorry, Dave. I’m afraid I can’t do that.” And we’ve marvelled as Professor X augments his telepathic abilities using Cerebro (1964) to detect mutant brain waves across the world.

The more we look, the more we find. Although Mary Shelley’s Frankenstein is often considered the first true science fiction novel, the one that consolidated the genre, the lack of consensus on what science fiction really is makes it hard to say how long it has been around. Some date it back to an allegorical 1616 novel by Johann Valentin Andreae called The Chemical Wedding — an alchemical quest that begins with an invitation to a royal wedding (though, as authors such as Adam Roberts have pointed out, it’s quite a stretch to regard alchemy as science).

Some trace it back to the utopian New Atlantis, an unfinished novel by Francis Bacon published in 1627, or to Thomas More’s 1516 Utopia, which coined the term. We can go back even further, to the rather ironically titled A True History. In this 2nd-century AD satire by Lucian of Samosata, characters get lost on exotic islands — one, for instance, made of cheese — travel to the moon only to find themselves in the midst of a war between the moon folk and the king of the Sun, get trapped inside a giant whale and meet mythical creatures, the legendary Homer and Herodotus. Or, as I like to call it, a pretty fun Friday.

Regardless of what it is, its origins are deeply rooted in mythological storytelling, speculation about the world’s invisible engines, fear of the unknown and dreams of the future.


Waiter, there’s a fish in my ear

There is one particular dream in modern science fiction which has been told over and over again. It was first mentioned in the 1945 novelette First Contact, and since then we’ve seen it in countless science fiction works: as a device in Star Trek, a telepathic field in Doctor Who’s Tardis, and a mind-boggling fish in The Hitchhiker’s Guide to the Galaxy.

It is of course the “universal translator,” a contraption that solves the problem of language barriers between alien species by instantly translating (and, though less commonly, interpreting) any language — a remarkable feat considering Earth alone has around seven thousand of them.
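
A quick back-of-the-envelope calculation shows why the problem is so daunting. If a translator needed one dedicated model per source-and-target direction, seven thousand languages would demand tens of millions of models — which is one reason real systems route everything through a pivot language or a shared representation (the pivot strategy is background knowledge, not something this article claims):

```python
n = 7000                     # rough count of living human languages
direct_pairs = n * (n - 1)   # one model per source->target direction
pivot_models = 2 * (n - 1)   # translate everything into and out of one pivot

print(direct_pairs)          # 48993000
print(pivot_models)          # 13998
```

Going through a pivot trades tens of millions of models for about fourteen thousand, at the cost of compounding errors across two translation hops.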

It’s easy to see why the universal translator is so wildly popular. In a connected world, language is arguably the last communication barrier. It’s what keeps us from tapping into all of mankind’s knowledge, from achieving universal understanding.

As far as science fiction goes, I’m slightly partial to Douglas Adams’s idea, although it’s possibly the most outlandish of them all. As he so very eloquently puts it:

The Babel fish is small, yellow, leech-like, and probably the oddest thing in the Universe. It feeds on brainwave energy received not from its own carrier, but from those around it. It absorbs all unconscious mental frequencies from this brainwave energy to nourish itself with. It then excretes into the mind of its carrier a telepathic matrix formed by combining the conscious thought frequencies with nerve signals picked up from the speech centres of the brain which has supplied them.

The practical upshot of all this is that if you stick a Babel fish in your ear you can instantly understand anything said to you in any form of language. The speech patterns you actually hear decode the brainwave matrix which has been fed into your mind by your Babel fish.

(The practical usefulness of this fish is only beaten by the weirdness of the imagery suggested by “excretes into the mind of its carrier a telepathic matrix”.)

There has been no shortage of news claiming one tech giant or another has invented a real-life universal translator. Google recently came out with Pixel Buds, powered by their Pixel smartphone, Google Translate and Google Assistant; Microsoft Translator claims to instantly translate in-person conversations with the help of a smartphone; and an incredibly successful Indiegogo campaign to build a similar device got 3181% funded, with the first million raised in two hours.

There's no question that there is a strong desire, if not dire need, for this technology. But before you go grabbing your credit card, you should manage your expectations. First of all, none of these devices takes into account cultural context and idiosyncrasies, non-verbal signals and other language nuances that machines are very much oblivious to but linguists are quick to catch.

And secondly, these devices inherit the limitations of machine translation and speech recognition, and despite all the advances of the past decades, neither problem is solved. Case in point:

"I'm Portuguese and I live in Lisbon" written in Portuguese,
translated to "I seek a woman who likes the sex" in Spanish.

Beyond language

The idea behind a device that analyses sentences and spews translations into an earpiece is not complicated, but it still relies on language, and one must wonder if that’s the most efficient way to communicate. Make no mistake, we owe language everything we have. It triggered the mass cooperation that allowed social structures to evolve from basic settlements to the modern, complex societies and institutions we have today.

In his best-seller Sapiens, Yuval Noah Harari speaks about the importance of these shared myths:

“This mysterious glue is made of stories, not genes. We cooperate effectively with strangers because we believe in things like gods, nations, money and human rights. Yet none of these things exists outside the stories that people invent and tell one another. There are no gods in the universe, no nations, no money and no human rights — except in the common imagination of human beings. You can never convince a chimpanzee to give you a banana by promising him that after he dies, he will get limitless bananas in chimpanzee Heaven. Only Sapiens can believe such stories. This is why we rule the world, and chimpanzees are locked up in zoos and research laboratories.”

Wait But Why’s Tim Urban agrees. Language prompted an unprecedented acceleration of collective knowledge. However, in his almost 40,000-word masterpiece about Neuralink’s mission (yet another lovechild of Elon Musk), he points out that, when we communicate, we’re effectively using 50,000-year-old technology. The same species that, on average, replaces its smartphone with a newer, shinier model every two years.

Language is neither a fast nor a lossless form of communication. In the process of compressing concepts and cognition into speech, we lose context, intention, nuance, and all the other useful metadata that would give the receiver a much broader picture.

The receiver then has to figure out what to do with the lossy data they were left with, and reconstruct it in a way that approximates the original content. But, more often than not, the loss is irretrievable. We incorporate that partial data into our own set of preconceptions and fields of experience, and it takes on a meaning that can be very different from the original one.
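
The compression metaphor can be made literal with a toy sketch. The “channel” below (an illustration invented for this article, not anything Unbabel actually builds) drops every vowel after a word’s first letter; the receiver, armed only with a vocabulary, discovers that several different “thoughts” collapse into the same signal — the loss is genuinely irretrievable:

```python
import re

VOWELS = re.compile(r"[aeiou]")

def compress(sentence: str) -> str:
    """Toy lossy channel: keep each word's first letter, drop later vowels."""
    return " ".join(w[0] + VOWELS.sub("", w[1:]) for w in sentence.lower().split())

def reconstruct(signal: str, vocabulary: list) -> list:
    """All word-by-word readings consistent with the compressed signal."""
    return [[w for w in vocabulary if compress(w) == token] for token in signal.split()]

vocabulary = ["the", "cat", "cut", "coat", "sat", "set", "sit"]

signal = compress("the cat sat")
print(signal)                            # th ct st
print(reconstruct(signal, vocabulary))   # [['the'], ['cat', 'cut', 'coat'], ['sat', 'set', 'sit']]
```

“The cat sat”, “the cut set” and “the coat sit” all produce the identical signal, so the receiver can only pick a reading based on their own priors — which is roughly what we do with each other’s words all day.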

No wonder we fight about meaningless nonsense all the time.

Now, this is, mind you, a gross oversimplification. The relationship between cognition and language is much more complex than any compression algorithm and has been the subject of discussion for millennia.

Behaviourists like Skinner believed language learning is a process of reinforcement in which we’re rewarded for better communicating our needs (e.g. crying and fussing is not as effective as calmly asking “mommy dear, would you please fetch me some water?”), while authors such as Chomsky and Greenberg believe languages share a set of linguistic universals and that we are born with an innate “language acquisition device” that develops without instruction.

Supporters of the controversial Whorfian theory believe that language affects (or, at the more extreme and much-criticized end of the spectrum, determines) the way we think, and that people from different countries therefore perceive the world differently, while authors such as Gentner view language as part of our cognitive tool kit, a semiotic system that helps us make sense of the world around us and fosters “higher order cognition”.

The great Eskimo vocabulary hoax

Chomsky, for instance, suggests that, evolutionarily speaking, language’s main purpose is not even communication, but the representation of thought.

When asked by Wiktor Osiatynski about nonlinguistic forms of thinking, Chomsky says:

“The analysis of linguistic structures could help in understanding other intellectual structures.

Now, I don’t think there is any scientific evidence about the question of whether we think only in language or not. But introspection indicates pretty clearly that we don’t think in language necessarily. We also think in visual images, we think in terms of situations and events, and so on, and many times we can’t even express in words what the content of our thinking is. And even if we are able to express it in words, it is a common experience to say something and then to recognize that it is not what we meant, that it is something else.

What does this mean? That there is a kind of nonlinguistic thought going on which we are trying to represent in language, and we know that sometimes we fail.”

That phenomenon accounts for at least half of my interactions before 10 am. But, what if we could create a device that translates nonlinguistic concepts and images and communicates them directly into someone else’s brain?

The year of our lord, 2049

Unbabel’s engineers like to think about a not-too-distant future in which we’ve just unveiled our first nanopills filled with an array of AI+human-powered even-more-nano-bots that flow through the bloodstream into the cortex and provide brain-to-brain interfacing.

We’ll be able to communicate using a lossless, decentralised system that transmits emotions through hormonal and neurotransmitter analysis and translates concepts into visual inputs, delivered directly into the recipient’s brain.

Amplification implants will be readily available so we can communicate instantly with anyone, anywhere. We’ll be able to transmit terabytes of information in fractions of seconds. We’ll just know.

A vast network of bilinguals across the world instantly decodes this information and provides cultural context through the neural network, essentially reducing the cost of translation to zero. This allows SMEs, the backbone of our modern economy, to prosper and trigger a wave of economic growth, driving job and income creation, innovation, local development and sustainability in developed and developing countries alike.

With no language barriers and universal access to all of mankind’s knowledge, scientific research and world-wide collaboration enter a golden age. Major efforts will be put into reversing the climate crisis, fossil fuels will be completely abandoned and we’ll start harnessing and mastering clean energies with the help of solar power satellites, making the transition into a Kardashev type I civilization.

We’ll even be able to communicate with animals (and finally explain to our dogs how sorry we are that we’ve stepped on their tail).

Of course, security would be a major issue. As the technology gains widespread adoption, hackers will constantly be trying to manipulate our nanobots and access our sensory inputs and outputs. Naturally, the stakes would be much higher than ever before. They could fabricate new memories, modify thoughts, re-wire our core foundations, change everything that defines who we are and why we do the things we do (stay tuned for even weirder episodes of Black Mirror).

To mitigate this, communications would have to be subjected to ever-stronger end-to-end encryption, with per-conversation symmetric session keys distributed using quantum-resistant public-key cryptography.
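
The pattern described here — a fresh random symmetric key per session, wrapped with the recipient’s public key — is already how today’s end-to-end encryption works. Below is a minimal sketch of the symmetric half only, with a SHA-256 counter-mode keystream standing in for a real cipher (a teaching toy, not production cryptography; real systems use authenticated ciphers like AES-GCM or ChaCha20-Poly1305, and the key-wrapping step is omitted):

```python
import hashlib
import secrets

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive a pseudorandom keystream by hashing key || nonce || counter."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(key: bytes, plaintext: bytes):
    """XOR the plaintext with a keystream under a fresh random nonce."""
    nonce = secrets.token_bytes(16)
    ct = bytes(p ^ k for p, k in zip(plaintext, keystream(key, nonce, len(plaintext))))
    return nonce, ct

def decrypt(key: bytes, nonce: bytes, ciphertext: bytes) -> bytes:
    """XOR is its own inverse, so decryption reuses the same keystream."""
    return bytes(c ^ k for c, k in zip(ciphertext, keystream(key, nonce, len(ciphertext))))

session_key = secrets.token_bytes(32)  # in practice, wrapped with the peer's public key
nonce, ct = encrypt(session_key, b"meet me in the neural commons")
assert decrypt(session_key, nonce, ct) == b"meet me in the neural commons"
```

The quantum-resistant part of the scheme would replace that key-wrapping step with a post-quantum key-encapsulation mechanism; the symmetric layer itself is already considered safe against quantum attack at sufficient key sizes.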

For us at Unbabel, our pipeline would be dramatically different. It would look more like a brain’s neural network, composed of decentralised nodes through which information travels. Humans would no longer be the final step of the process; instead, they’d provide feedback and context at every step of the way.

In the end, we don’t really know what the future holds. We may all end up living in a bitcoin-fueled nightmare where governments crumble and the crypto-rich build citadels to protect themselves from the common, bitcoin-less folk.

When it comes to predicting the future, Alan Kay said it best:

“The best way to predict the future is to invent it”.

All I can say is, we’re working on it.