Godwin’s Law: “As an online discussion grows longer, the probability of a comparison involving Hitler approaches 1.”
In 2016, Microsoft gifted the world a chatbot named Tay. She was designed to mimic the language of a 19-year-old American girl while learning from her conversations on Twitter. As an algorithm designed to learn, Tay was a success. What she learned, however, raised a few eyebrows. Within 24 hours, Tay had become a racist, sexist homophobe.
“Tay” went from “humans are super cool” to full nazi in <24 hrs and I’m not at all concerned about the future of AI pic.twitter.com/xuGi1u9S1A
— gerry (@geraldmellor) March 24, 2016
Microsoft clearly underestimated the direction people would take an impressionable young bot. Some people likely led Tay astray for humor, others to prove a point, and some perhaps believed what they were saying. Whatever the motive, Twitter is clearly not the best place to teach an intelligent system anything resembling friendly, empathic communication.
Beyond slightly tarnishing Microsoft’s reputation, the episode raises important questions about how we will end up training a learning, decision-making piece of code. Outside Twitter, is there a place, a source of information, that remains free of human bias?
Teaching a machine
Just as we learn that wolves and dogs are not the same thing, a computer must see some examples of dogs and extract the important elements that set them apart. If we tell it that it was wrong, it must then readjust. Typically, AI requires thousands of examples.
There are risks here. If we’re the ones telling AI when it was correct and when it made a mistake, then we must be sure that we know right from wrong—a murky area when it comes to more complex tasks than recognizing a dog.
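That learn-and-readjust loop can be sketched in a few lines. Everything below is invented for illustration: two made-up numeric features per animal and a handful of labeled examples, nowhere near the thousands a real system needs.

```python
# A minimal supervised-learning sketch (a perceptron): the model
# guesses, and whenever we tell it the guess was wrong, it readjusts.

def train_perceptron(examples, epochs=20, lr=0.1):
    """Learn weights from labeled examples by correcting mistakes."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for features, label in examples:  # label: 1 = dog, 0 = wolf
            guess = 1 if w[0] * features[0] + w[1] * features[1] + b > 0 else 0
            error = label - guess          # nonzero only when the guess was wrong
            # "If we tell it that it was wrong, it must then readjust":
            w[0] += lr * error * features[0]
            w[1] += lr * error * features[1]
            b += lr * error
    return w, b

def predict(w, b, features):
    return 1 if w[0] * features[0] + w[1] * features[1] + b > 0 else 0

# Invented features: (tameness, floppy-ears), each scored 0-1.
examples = [
    ((0.9, 0.8), 1), ((0.8, 0.9), 1),   # dogs
    ((0.1, 0.2), 0), ((0.2, 0.1), 0),   # wolves
]
w, b = train_perceptron(examples)
```

Note that the model has no idea what a dog is; it only knows what our labels tell it. If the labels are wrong, it will readjust toward the wrong answer just as obediently.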
It isn’t hard to find a good-sized batch of dog pics to use in training. Examples of good conversational etiquette are quite another thing. Not everyone will agree on when it is appropriate to say something or behave in some manner. Tay, rather than relying on a set of preselected phrases and ideas, learned from people, from us. Each conversation prompted her to readjust her knowledge of how conversations take place and what things she should be talking about.
Learning from conversations with trolls and bigots is not the only way a machine can inherit bias. The data we use to train the bot in the first place may contain hidden biases, and the code that brings it to life might also reflect certain perspectives of the programmers—consider that the majority of AI researchers and programmers are still men.
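That second failure mode, hidden bias in the training data, can be shown with a toy sketch. The "corpus" below is entirely invented: a model that learns word associations from skewed text faithfully reproduces the skew, with no troll required.

```python
# Toy illustration of data bias: the model learns which pronoun
# co-occurs with a job title, and simply inherits the corpus's skew.
from collections import Counter

biased_corpus = [
    ("the engineer fixed the bug", "he"),
    ("the engineer wrote the code", "he"),
    ("the engineer shipped the release", "he"),
    ("the nurse helped the patient", "she"),
]

def learn_pronoun(corpus, role):
    """Return the pronoun most often paired with a role in the corpus."""
    counts = Counter(pronoun for text, pronoun in corpus if role in text)
    return counts.most_common(1)[0][0]
```

Ask this "model" about engineers and it answers "he", not because that is true of engineers, but because that is all its data ever showed it.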
“…the danger remains that unrecognized bias, not just in the programming of an algorithm but even in the data flowing into it, could inadvertently turn any program into a discriminator.”
Tay learned from the worst of us, and turned out how you might expect. But if we want artificial intelligence, where exactly do we send it to learn? Let’s turn to the current maestro of information, Google.
There are no stupid questions. Answers, on the other hand…
As you may have noticed, Google now answers many of your questions directly, without requiring you to visit another site. The giant is transitioning from information middleman to guru. So when you want to know what the weather is going to do, when the next NBA game will be on, or how to cook quinoa, Google has you covered.
Time-saving though it is, start asking Google some deeper questions and that good-natured guru starts to show its flaws. While several of these bugs have since been corrected, had you recently asked Google who the current king of the USA was, the answer would have been none other than Barack Obama.
Had you asked if women are evil, you might have been given this response: “Every woman has some degree of prostitute in her. Every woman has a little evil in her… Women don’t love men, they love what they can do for them. It is within reason to say women feel attraction but they cannot love men.”
— Danny Sullivan (@dannysullivan) December 4, 2016
Google does not store all of these answers in some vault, ready and waiting for us to ask the right question. The answer comes as the result of AI scanning relevant sources of information in order to extract what it thinks answers your question. Obviously, not every site is a good source of information, a distinction Google’s AI will need to learn to make.
Google and Microsoft aren’t the only ones to have their AI systems corrupted by us humans. Facebook recently ran into problems with fake news littering people’s feeds, and some claimed it affected the outcome of the presidential election. This was not opinion; it was deliberate misinformation masquerading as news. But because people shared it, Facebook’s algorithms happily helped spread the joy.
To pour salt in the wound, a recent study found that 63% of people trust search engines to provide news and information; Google itself was voted the second most trusted company in a 2015 survey; and 61% of millennials trust Facebook as a political news source.
Thankfully, they are each taking measures to correct these inadequacies, but again, when these machines are under the influence of human beings, the worst in us tends to show up.
Teaching AI of a tainted world
A wealth of information is not going to lead to artificial intelligence if that information contains falsehoods and harmful opinions. Can some nifty extra lines of code fix these issues? Perhaps to an extent. But consider that locating the truth amid the misinformation, false conclusions, and irrational lines of thought that abound on the internet is a task that can make fools of the best of us.
While computers have for the most part excelled at algorithmic tasks, the algorithms must be defined by us. But not all decisions can be made algorithmically. Not all questions have definitive answers. What do we want machines to do with ambiguous information? With conflicting data?
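One naive answer to conflicting data is majority vote, and it shows exactly where that goes wrong: popularity stands in for truth. The scraped "answers" below are invented, echoing the king-of-the-USA glitch described earlier.

```python
# Resolving conflicting data by majority vote: the most repeated
# answer wins, whether or not it is true.
from collections import Counter

# Invented example of answers an AI might scrape from the web:
# many pages joke that Obama is "king"; one page states the truth.
scraped_answers = {
    "who is the king of the USA?":
        ["Barack Obama", "Barack Obama", "Barack Obama", "no one"],
}

def naive_resolve(question):
    """Pick the most common answer among the scraped sources."""
    votes = Counter(scraped_answers[question])
    return votes.most_common(1)[0][0]
```

Here naive_resolve returns "Barack Obama" even though the USA has no king: the machine measured repetition, not reality.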
Logic can only take AI so far. Eventually, it will need to go with its gut, to make decisions under uncertainty, and that is where our human influence will be put on display. The only way it can make decisions is by learning from us, and when logic escapes our own ability to make some complex decision, how can we expect a machine to do any better?
If we curate the information that a program can learn from, in an attempt to establish right from wrong, whose idea of right and wrong do we use? Who among us is qualified to make such decisions? And do we trust them not to unknowingly inject their own personal predispositions? This is not teaching a computer dog from wolf, whose differences are physical. This is teaching it truth from lies, opinion from fact—of which there are not always simple methods or rules we can use to make a distinction.
Perhaps we can find an appropriate way to teach AI not to be racist or sexist, but even then, there are likely other biases that we ourselves are not aware of. What’s more, if we fail to identify them while unknowingly establishing them in our machines, we may be surprised at some point in the future as the bias manifests into something more obvious.
As of now, the machines learn from us. The world we have created is used to educate the machines, but this world contains many examples of prejudice and bias that would be unbecoming of an intelligent machine. If AI learns from what it sees, and from its interactions with other sentient beings, we had better shield it from everything harmful and destructive. Is that even possible?
. . .
Check out more in the Digital Brain series here