This year, there's a new trendy technology in town.
Artificial intelligence (AI) has emerged as the "me too" feature that everyone is scrambling to incorporate into their products. Cloud and Big Data and analytics? Meh – they're so last year.
The funny thing is, AI isn't new. As a discipline it can be traced back to the mid-1950s, believe it or not, and the ideas behind it predate its formal definition by centuries. But like many concepts whose ambitions outstripped the ability to execute them, it faded into the background, a victim of frustration among the Powers That Be, who cut off funding when research progress slowed.
A brief resurgence occurred in the 1980s thanks to the development of expert systems, but by the end of the decade, AI stalled again, relegated to the realm of science fiction.
Roughly ten years later, in 1997, IBM's chess-playing computer Deep Blue beat world champion Garry Kasparov, and AI began to progress again.
Of course, there are as many visions of what constitutes AI as there are people trying to create it. One criterion was passing the Turing Test, in which a person interacting (via keyboard) with two hidden participants had to figure out which was the human and which was the computer. The test doesn't evaluate the accuracy of the responses, just the extent to which they appear to have come from a human. In the mid-1960s, MIT's Joseph Weizenbaum came up with a little program called Eliza that managed to fool quite a few people, even though the techniques behind it relied as much on psychology as on technology. If you're curious, Eliza is still available online.
Today, on the consumer front, we have much more sophisticated software like Microsoft's Cortana, Apple's Siri, and Amazon's Alexa that, while they might or might not pass the Turing Test, interact in a much more natural manner than Eliza ever did.
The dream of having sentient computers is still a long way off, though. No matter how clever an AI appears to be, it's still just a program. A very sophisticated one, to be sure, but a program nonetheless. And like any program, its function is regulated in part by the biases of its creators, and in part by data – Big Data. Just as a human learns by assimilating information from their surroundings, an AI "learns" from the data that's fed to it. The quality of that data defines the quality of the AI. AIs need a lot of data from multiple sources so their algorithms can properly learn.
Why? Consider the adage "when all you have is a hammer, every problem looks like a nail." If an AI is trained with insufficient or biased data, it will be limited to deductions of possibly dubious quality (an old friend once called this "artificial stupidity"). Its world view will be restricted – every problem will look like a nail. But feed it extensive and varied data, and it can find unexpected connections and solve problems that humans may not have the bandwidth to manage.
That's already been proven by IBM Watson, which helped doctors treat obscure cancers by correctly identifying them after humans had misdiagnosed them.
However, there are more examples of inadequate systems than good ones. ProPublica, for example, reported that a program used in the criminal justice system to estimate a defendant's risk of re-offending discriminated against people of colour. An AI beauty contest was biased towards white contestants. And in 2015, Google Photos' obviously insufficiently trained image-recognition AI tagged photos of two black users as gorillas. Oops.
All of these faux pas can be traced back to bias – either that of the person designing the system, or in the training data provided, or both. If, for example, the beauty pageant program was trained primarily with images of white winners, it would learn to incorrectly equate white skin with beauty. The gorilla error likewise stemmed from training data that under-represented non-white people, leading the program to decide that a dark-skinned hominid must be one of the great apes it had in abundance in its training set – artificial stupidity at its best.
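To see how skewed training data produces skewed decisions, here's a deliberately tiny sketch in Python. Everything in it is invented for illustration – the two "features" and all the numbers – but the mechanism is real: a nearest-neighbour "judge" whose training examples label only light-skinned faces as winners will treat skin tone as a predictor, no matter how irrelevant it is.

```python
import numpy as np

# Toy demonstration of training-data bias. Each "face" is reduced to
# two made-up features: [skin_tone, symmetry], with skin_tone running
# from 0 (dark) to 1 (light). The training labels are deliberately
# skewed: only light-skinned examples are marked as winners.
train_X = np.array([
    [0.90, 0.80], [0.80, 0.60], [0.95, 0.70],   # labelled "winner"
    [0.20, 0.90], [0.10, 0.80], [0.15, 0.70],   # labelled "not"
])
train_y = ["winner", "winner", "winner", "not", "not", "not"]

def predict(x):
    """1-nearest-neighbour classifier: copy the label of the
    closest training example."""
    distances = np.linalg.norm(train_X - x, axis=1)
    return train_y[int(np.argmin(distances))]

# A dark-skinned face with the highest symmetry score is still
# rejected, because skin tone dominates the learned pattern.
print(predict(np.array([0.20, 0.95])))  # not
```

The classifier isn't "racist" in any intentional sense; it has simply generalised from the only pattern its data contained.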
Word embeddings can also come back to bite us. They're representations of text data used in many machine learning algorithms to show relationships in meanings, for example, "man is to king as woman is to queen", or "Paris is to France as Tokyo is to Japan." The trouble is, many are horribly biased. They assume that all doctors are men, and all nurses are women, for example, or that homemakers are all women.
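The analogy arithmetic behind those relationships is plain vector maths: the word whose vector is closest to king − man + woman should be queen. Here's a minimal sketch using made-up two-dimensional vectors; real embeddings such as word2vec have hundreds of dimensions learned from text, but the mechanics are identical.

```python
import numpy as np

# Toy 2-d "embeddings" invented for illustration: one axis loosely
# encodes gender, the other royalty.
emb = {
    "man":   np.array([ 1.0, 0.0]),
    "woman": np.array([-1.0, 0.0]),
    "king":  np.array([ 1.0, 1.0]),
    "queen": np.array([-1.0, 1.0]),
}

def analogy(a, b, c):
    """Answer 'a is to b as c is to ?' by finding the word whose
    vector is most similar (by cosine) to b - a + c."""
    target = emb[b] - emb[a] + emb[c]
    def cos(u, v):
        return float(np.dot(u, v) /
                     (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))
    # Exclude the three input words from the candidates.
    candidates = {w: v for w, v in emb.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cos(candidates[w], target))

print(analogy("man", "king", "woman"))  # queen
```

The same arithmetic is what surfaces the bias: in embeddings trained on real text, "man − doctor + woman" can land uncomfortably close to "nurse".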
It's not unsalvageable, though. In a paper titled "Man is to Computer Programmer as Woman is to Homemaker?", researchers explain how they created a dataset without gender bias, keeping correlations that make sense (king: man, queen: woman) and removing those exhibiting bias. They created debiasing algorithms and changed the gender associations of words that should be neutral ("doctor" and "nurse" became equally male and female, for example).
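The core "neutralize" step of that kind of debiasing can be sketched in a few lines: identify a gender direction in the vector space, then remove a neutral word's component along it. The vectors below are invented for illustration, and the paper itself derives the gender direction from several definitional pairs via principal component analysis rather than a single he/she difference, but the projection idea is the same.

```python
import numpy as np

# Hypothetical 3-d vectors, invented for this sketch. The first
# coordinate loosely encodes gender; "doctor" leans male here only
# because we made it so.
he     = np.array([ 1.0, 0.3, 0.2])
she    = np.array([-1.0, 0.3, 0.2])
doctor = np.array([ 0.6, 0.8, 0.1])

# Gender direction: the (normalised) difference of a definitional pair.
g = he - she
g = g / np.linalg.norm(g)

# Neutralize: subtract doctor's projection onto the gender direction,
# then renormalise so the vector stays unit length.
debiased = doctor - np.dot(doctor, g) * g
debiased = debiased / np.linalg.norm(debiased)

# The debiased vector now has zero component along the gender axis.
print(round(float(np.dot(debiased, g)), 6))  # 0.0
```

After this step, "doctor" is equidistant from "he" and "she" along the gender direction, while its non-gender content (the other coordinates) is preserved.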
It's not a universal cure, the paper noted, pointing out:
One perspective on bias in word embeddings is that it merely reflects bias in society, and therefore one should attempt to debias society rather than word embeddings. However, by reducing the bias in today’s computer systems (or at least not amplifying the bias), which is increasingly reliant on word embeddings, in a small way debiased word embeddings can hopefully contribute to reducing gender bias in society. At the very least, machine learning should not be used to inadvertently amplify these biases, as we have seen can naturally happen.
Next, the team is planning a dataset that removes racial biases as well.
The industry is slowly acknowledging the issue. In its 2018 predictions, Dell Technologies says "bias check will be the next spell check," enabling screening for conscious and unconscious bias in functions like hiring and promotions. Dell thinks that AI can help.
“AI and VR capabilities will be key to dismantling human predisposition," said Brian Reaves, chief diversity & inclusion officer, in a post. "In the short-term, we’ll apply VR in job interview settings to remove identifying traits, machine learning to indicate key points of discrimination or AI to program positive bias into our processes.”
But that, of course, assumes that we can eliminate the bias in AI.