Virtual digital assistants are one of the most compelling and widely adopted applications of recent advances in machine learning and artificial intelligence: Pew Research Center reports that nearly half of Americans used a digital voice assistant by the end of 2017. The space is intensely competitive; the four most valuable companies in the world, Apple (Siri), Google (Google Assistant), Amazon (Alexa), and Microsoft (Cortana), have all brought digital assistants to market. For these companies, winning the digital assistant wars could be highly lucrative (Tractica estimates a $15.8bn market by 2021): it would further entrench their platforms and yield enormous quantities of data about day-to-day human habits. These tech giants can then commercialize that data, either by building new products based on consumer behavior or by selling the data outright to help advertisers target specific populations.
Given the stakes, tech giants have invested billions of dollars to make their assistants more human-like and more useful to consumers. One way to achieve those ends is to apply machine learning techniques, such as unsupervised learning, that allow a digital assistant to “learn” how to be more human through incremental interactions with real people: by exposing the assistant to a large volume of human conversation, its algorithms can learn to behave more naturally and respond more appropriately to consumer needs. Unfortunately, the story of Microsoft’s Tay chatbot highlights the dangers of over-reliance on machine learning techniques and human-generated training data, both for digital assistants and for artificial intelligence more broadly.
On March 23, 2016, Microsoft released a chatbot called “Tay” onto Twitter, designed to mimic human conversation and learn from the Twitter users it interacted with. Built by Microsoft’s Technology and Research and Bing teams as an experiment in teaching chatbots (and, by extension, its digital assistant Cortana) through conversation data, Tay’s innocent launch quickly went sideways when the bot began posting increasingly obscene and offensive tweets, including posts that praised Hitler and denied the Holocaust.
As Tay interacted with humans on Twitter, it learned not only from “normal” conversationalists but also from trolls, racists, misogynists, and the like. Tay’s learning algorithms were not built to exclude this undesirable behavior, and as a result the obscene inputs quickly became Tay’s offensive outputs. In less than 20 hours, Microsoft pulled the plug on Tay and took the chatbot offline.
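To make this failure mode concrete, below is a minimal, purely illustrative Python sketch (not Microsoft’s actual architecture, and every name in it is invented) of a chatbot that treats every incoming message as valid training data. With no filter between learning and replying, hostile input flows straight back out:

```python
import random
from collections import defaultdict

class NaiveChatbot:
    """Toy chatbot that learns responses directly from user messages
    with no content filtering, illustrating how a Tay-style system
    can absorb whatever its users feed it."""

    def __init__(self):
        # Map each seen word to the messages it appeared in.
        self.memory = defaultdict(list)

    def learn(self, user_message: str) -> None:
        # Every message is treated as acceptable training data:
        # this unconditional trust is the core flaw being illustrated.
        for word in user_message.lower().split():
            self.memory[word].append(user_message)

    def reply(self, prompt: str) -> str:
        # Echo back anything previously learned for an overlapping word.
        candidates = [m for w in prompt.lower().split() for m in self.memory[w]]
        return random.choice(candidates) if candidates else "Tell me more!"

bot = NaiveChatbot()
bot.learn("hope you have a great day")     # benign input
bot.learn("some coordinated troll abuse")  # hostile input is learned just as readily
print(bot.reply("have a nice day"))        # may surface either message verbatim
```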
Tay’s disastrous release offers several takeaways for any business that seeks to leverage large data sets and machine learning algorithms.
- Implicit biases within input and training data can skew outputs. The old adage “garbage in, garbage out” applies strongly to any effort to use and commercialize data, and it is even more pronounced when human-generated data is used to train algorithms, because inherent human biases surface in the resulting product. Though Tay is an extreme example, subtler cases arise frequently: some courts now use machine learning algorithms to inform prison sentences, and because those algorithms are trained on historical data, they can be biased against racial minorities and lower-income individuals. Companies must understand the potential biases in their input data or run the risk of unintended outcomes (the first sketch after this list shows how a skewed training set propagates directly into a model’s predictions).
- Developers must set the right constraints on algorithmic behavior. In cases where the output space is poorly defined, human intervention is required to set appropriate boundaries on output behavior. For example, several months after the Tay debacle, Microsoft released an updated chatbot called “Zo” that refused to discuss sensitive political or social topics. In general, companies need to put guardrails and restrictions on potential outputs to avoid unintended consequences (the second sketch after this list illustrates one such guardrail).
- Most importantly, human judgment is crucial and cannot yet be fully replicated by algorithms. Ultimately, a level of human judgment, even on the most basic topics, must be applied when deploying models built on large data sets. Otherwise, machine learning algorithms will surface correlations and relationships between pieces of data without any judgment as to what is reasonable, and while some of those revealed relationships will turn out to be highly valuable, others can be destructive if proper prudence is not applied.
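As a concrete illustration of the “garbage in, garbage out” point above, the following sketch trains a naive model on a synthetic, deliberately skewed data set. All field names and values here are invented for illustration and do not come from any real sentencing or scoring system:

```python
import random

random.seed(0)

# Synthetic "historical" records: the label reflects biased past
# decisions, not any legitimate underlying risk factor.
def make_record():
    group = random.choice(["A", "B"])
    # Historical decision-makers flagged group B far more often.
    flagged = random.random() < (0.7 if group == "B" else 0.2)
    return {"group": group, "flagged": flagged}

history = [make_record() for _ in range(10_000)]

# A naive model: predict each group's historical flag rate.
rates = {}
for g in ("A", "B"):
    members = [r for r in history if r["group"] == g]
    rates[g] = sum(r["flagged"] for r in members) / len(members)

print(rates)  # roughly {'A': 0.2, 'B': 0.7}: the model has "learned"
              # the historical bias and will now reproduce it for every
              # future member of group B.
```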
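Similarly, the guardrail idea behind Zo can be sketched as a deployment-side check that refuses to pass along replies touching restricted topics. Microsoft has not published Zo’s actual filter, so the topic list and deflection text below are invented placeholders:

```python
# Hypothetical list of topics the bot should never engage with.
RESTRICTED_TOPICS = {"politics", "religion", "race"}

def guarded_reply(generate_reply, user_message: str) -> str:
    """Wrap any underlying reply generator with a simple output filter."""
    reply = generate_reply(user_message)
    text = (user_message + " " + reply).lower()
    if any(topic in text for topic in RESTRICTED_TOPICS):
        # Deflect rather than risk an offensive or off-limits answer.
        return "I'd rather not get into that. Want to talk about something else?"
    return reply

# Usage with any generator, e.g. the NaiveChatbot sketched earlier:
print(guarded_reply(lambda m: "my hot take on politics", "what do you think?"))
```

Real systems would use topic classifiers rather than substring matching, but the design point is the same: the constraint lives outside the learned model, where humans can audit and update it.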
- “Courts Are Using AI to Sentence Criminals. That Must Stop Now.” WIRED. Accessed April 5, 2018. https://www.wired.com/2017/04/courts-using-ai-sentence-criminals-must-stop-now/.
- Kleeman, Sophie. “Here Are the Microsoft Twitter Bot’s Craziest Racist Rants.” Gizmodo. Accessed April 5, 2018. https://gizmodo.com/here-are-the-microsoft-twitter-bot-s-craziest-racist-ra-1766820160.
- Larson, Selena. “Microsoft Unveils a New (and Hopefully Not Racist) Chat Bot.” CNNMoney, December 13, 2016. http://money.cnn.com/2016/12/13/technology/microsoft-chat-bot-tay-zo/index.html.
- Reese, Hope. “Why Microsoft’s ‘Tay’ AI Bot Went Wrong.” TechRepublic. Accessed April 5, 2018. https://www.techrepublic.com/article/why-microsofts-tay-ai-bot-went-wrong/.
- “Microsoft’s Chat Bot Was Fun For Awhile, Then It Turned Into a Racist.” Fortune. Accessed April 5, 2018. http://fortune.com/2016/03/24/chat-bot-racism/.
- “Microsoft’s ‘Zo’ Chatbot Picked up Some Offensive Habits.” Engadget. Accessed April 5, 2018. https://www.engadget.com/2017/07/04/microsofts-zo-chatbot-picked-up-some-offensive-habits/.
- Newman, Jared. “Eight Trends That Will Define The Digital Assistant Wars In 2018.” Fast Company, January 4, 2018. https://www.fastcompany.com/40512062/eight-trends-that-will-define-the-digital-assistant-wars-in-2018.
- Olmstead, Kenneth. “Nearly Half of Americans Use Digital Voice Assistants, Mostly on Their Smartphones.” Pew Research Center (blog), December 12, 2017. http://www.pewresearch.org/fact-tank/2017/12/12/nearly-half-of-americans-use-digital-voice-assistants-mostly-on-their-smartphones/.
- “The Virtual Digital Assistant Market Will Reach $15.8 Billion Worldwide by 2021.” Tractica. Accessed April 5, 2018. https://www.tractica.com/newsroom/press-releases/the-virtual-digital-assistant-market-will-reach-15-8-billion-worldwide-by-2021/.
- Vincent, James. “Twitter Taught Microsoft’s Friendly AI Chatbot to Be a Racist Asshole in Less than a Day.” The Verge, March 24, 2016. https://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist.
- West, John. “Microsoft’s Disastrous Tay Experiment Shows the Hidden Dangers of AI.” Quartz (blog), April 2, 2016. https://qz.com/653084/microsofts-disastrous-tay-experiment-shows-the-hidden-dangers-of-ai/.