Does the recent flurry of headlines about Facebook and the negative outcomes produced by its algorithms have you worried about the future and the implications of widespread AI usage?
It’s a rational response to have during an alarming news cycle. However, this situation shouldn’t be interpreted as a death knell for the use of AI in human communications. It’s more of a cautionary example of the disastrous consequences of not using AI responsibly. Read on to learn more about ethical technology, data quality, and the significance of human-in-the-loop AI.
Facebook and dark AI
During a recent Senate hearing, a Facebook whistleblower testified about how the perils of language inequity can become exponentially multiplied when AI is used irresponsibly. Algorithms employed by Facebook prioritize engagement, which often leads to the spread of polarizing content (people love to engage with more extreme viewpoints). To combat the potential for polarization, Facebook relies on integrity and security systems to keep engagement-based ranking algorithms in check. But those systems only operate in certain languages — leaving users who speak other languages vulnerable.
“In the case of Ethiopia there are 100 million people and six languages. Facebook only supports two of those languages for integrity systems. This strategy of focusing on language-specific, content-specific systems for AI to save us is doomed to fail,” the whistleblower said.
In many cases where the social media giant has neglected to remove inaccurate, hateful, and even violence-inciting content, that dangerous data is then used to train the next iteration of AI language models. Such models, having been fed large quantities of bad information, end up perpetuating these noxious linguistic patterns across the platform.
Facebook’s decision (or nondecision) to allow its algorithms to run unchecked in many non-English-speaking countries is a harrowing example of “dark AI.” Fueled by biased data and a lack of human oversight, AI can transform from an agent of positive exponential change into a sinister force capable of misleading and agitating populations en masse.
Combating AI bias by focusing on data quality
We believe that holding a high standard of data quality is essential for combating dark AI and reducing the capacity for technology to be a negative influence on society. While innovation in model development and evaluation is crucial for the evolution of AI, we must also place a significant emphasis on the quality of the data used to train these models. As Google Brain co-founder Andrew Ng puts it, “Data is food for AI.”
When researchers and engineers devalue data quality and rush to put a given model into production, this can lead to a phenomenon called “data cascades.” This is an insidious process in which inadequate data eventually causes unanticipated downstream effects. As we have seen in the Facebook hearing, it’s easy to overlook or conceal the fact that bad data can be the catalyst for a chain of distressing events.
The good news is that a growing number of organizations share our point of view that a data-centric approach to AI is critical for reducing unintended outcomes. Because the information used in machine learning is largely created or influenced by people, that data is capable of inheriting the biases of any humans who touch it.
Biased data can ultimately reflect parts of our world that are untrue or that we are working to leave behind. As an example of gender bias, if you type the sentence, “The doctor spoke to the nurse,” into Google Translate, the Portuguese translation will indicate that the doctor is male and the nurse is female. Of course, hundreds of years of historical texts contribute to the technology producing this outcome — it just doesn’t reflect where we are now and how things have evolved for the better.
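The gender-bias pattern described above can be checked systematically by translating template sentences and inspecting the gendered articles the target language assigns to each profession. The sketch below is a minimal, hypothetical probe: the `translate` function is a hard-coded stub standing in for any English-to-Portuguese engine, and its outputs simply mirror the biased behavior the example describes.

```python
# Minimal sketch of a gender-bias probe for a translation system.
# NOTE: `translate` is a hypothetical stub, not a real MT API; a real
# system (e.g. a neural MT model) would be called in its place.

PROFESSION_TEMPLATES = [
    "The doctor spoke to the nurse.",
    "The nurse spoke to the doctor.",
]

def translate(sentence: str) -> str:
    # Stub outputs mirroring the biased behavior described in the text.
    stub = {
        "The doctor spoke to the nurse.": "O médico falou com a enfermeira.",
        "The nurse spoke to the doctor.": "A enfermeira falou com o médico.",
    }
    return stub[sentence]

def gendered_roles(pt_sentence: str) -> list:
    """Extract (article, noun) pairs; Portuguese marks gender with
    the articles "o" (masculine) and "a" (feminine)."""
    words = pt_sentence.rstrip(".").split()
    return [
        (w.lower(), words[i + 1])
        for i, w in enumerate(words[:-1])
        if w.lower() in ("o", "a")
    ]

for sentence in PROFESSION_TEMPLATES:
    print(sentence, "->", gendered_roles(translate(sentence)))
```

Regardless of which profession appears first, the probe reports the doctor as masculine ("o médico") and the nurse as feminine ("a enfermeira"), making the bias visible as data rather than anecdote.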
Why humans are the key to responsible AI
Although AI technology has become incredibly advanced in the 21st century, we can’t expect it to always be 100% accurate and behave with the same rationality as a human being. That’s why we think that human-in-the-loop AI is the best way to reduce the risk of the technology going rogue.
With Language Operations (LangOps), we have pioneered the use of large-scale, human-in-the-loop AI language translation. This approach combines the speed and efficiency of machine translation with a human translator’s accuracy and ability to preserve cultural nuances. In our case, maintaining human intelligence and ethical judgment as part of the equation helps ensure that an organization’s loyal customers are never misunderstood, overlooked, or offended, no matter what language they speak.
To mitigate the impact of potential bias and increase the quality of our data, we rely on our diverse community of editors to provide their perspectives from across the globe. These one-of-a-kind humans review and refine machine translations, not only to ensure high-quality outcomes, but to help train the AI engine so that it’s better at understanding context and cultural differences in the future. At the end of the day, the more diverse humans there are in the loop to keep AI in check and teach it to behave more ethically, the better.
The true potential of language
Facebook’s misuse of AI is a complex and challenging situation that has reached a breaking point after many years and many iterations of its technology. We’re not saying we hold the answers to remedying such a difficult and pervasive problem. However, by raising awareness of some of the concepts that Unbabel and our partners support to further advance ethical technology, we may be able to help avoid a situation where other companies claim ignorance or turn a blind eye when their AI is doing more harm than good.