Duolingo: Machine Learning Our Forgetfulness
Luis von Ahn and Severin Hacker launched Duolingo in 2012 with the goal of revolutionizing language learning. In a 2014 interview with The Guardian, von Ahn explained his motivation:
“What I wanted to do was create a way to learn languages for free. If you look at language learning in the world, there are 1.2 billion people learning a foreign language and two thirds of those people are learning English so they can get a better job and earn more. The problem is that they don’t have equity and most language courses cost a lot of money.”
While there were language learning apps on the market before Duolingo, they tended to struggle with retaining users long-term. Von Ahn estimated that Busuu, a leading online language learning service that included a free option, kept only 5% of its users long-term. Busuu, Babbel, and other competitors were good ways to get started, but for most users they didn’t fully replace more expensive in-person language learning classes.
Duolingo’s solution was to create a personalized, fun, and effective experience using machine learning (ML) to understand and challenge users of the product. Their aim is readily apparent in their playful and game-like app design, but under the hood ML may be the true innovation. For example, their “half-life regression” (HLR) algorithm uses data from over 300 million users worldwide to construct and maintain a personalized model that predicts how likely you are to remember a word at any given time. Relying on theory dating back to 1885, which says that the chance of remembering decays exponentially with time, the team came up with HLR to determine the best time to show a user a word again. Duolingo rolled out HLR to all users after preliminary A/B tests showed that it increased overall user activity by 12%.
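To make the exponential-decay idea concrete, here is a minimal Python sketch. It assumes the commonly described form of the model, in which recall probability halves every "half-life" and the half-life is 2 raised to a weighted sum of features about a learner's history with a word. The function names, the three features, and the weights are hypothetical and chosen only for illustration; they are not taken from Duolingo's code, and a real model would learn its weights from practice data.

```python
import math

def predicted_recall(days_since_practice, half_life_days):
    """Forgetting curve: recall probability halves every half_life_days."""
    return 2.0 ** (-days_since_practice / half_life_days)

def estimated_half_life(weights, features):
    """Half-life in days, modeled as 2 raised to a weighted sum of features."""
    return 2.0 ** sum(w * x for w, x in zip(weights, features))

def days_until_recall_drops_to(target_probability, half_life_days):
    """Invert the forgetting curve: days until recall falls to the target."""
    return -half_life_days * math.log2(target_probability)

# Hypothetical weights and features (assumptions, not learned values):
# [bias term, times answered correctly, times answered incorrectly]
weights = [1.0, 0.5, -0.5]
features = [1.0, 4.0, 2.0]

half_life = estimated_half_life(weights, features)
recall_now = predicted_recall(4.0, half_life)
review_in = days_until_recall_drops_to(0.25, half_life)
```

With these made-up numbers the half-life works out to 4 days, so a word last practiced 4 days ago has a predicted recall of 50%. A scheduler built on this idea could pick the next review by choosing a target recall probability and solving for the delay, as the last function does.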
Von Ahn had experience leveraging large populations to teach digital systems new information, having previously invented reCAPTCHA, which cleverly allows people on the web to verify they are not robots while helping digitize books at scale. reCAPTCHA was acquired by Google in 2009 and now includes image recognition and other uses beyond text.
reCAPTCHA example: the user teaches the computer that the left image is “specific” while verifying another image the computer already knows, “Donanne”.
While ML is a near-term differentiator for Duolingo, it also underpins the longer-term strategy of the seven-year-old firm, now valued at over $700M. More than any direct competitor, Duolingo’s primary adversary may always be attrition. Beyond in-game tactics like HLR, Duolingo has also begun administering an ML-based placement test in order to better challenge and retain more advanced learners. This competency could also enable them to compete in standardized testing and language certification markets.
Machine learning allows Duolingo to clearly quantify and steadily improve retention on its platform, but it remains unclear whether their effort is best spent further optimizing these quantifiable metrics. Perhaps the journey of learning a language cannot be simplified into perfectly personalized exercises such as multiple choice questions and drag-and-drop games. These may be the best way of getting people in the door and keeping them on the platform for some time, but unless they can recreate the experience of living with native speakers and having dozens of casual conversations each day, can they really revolutionize language learning? For Duolingo to meaningfully help as many people learn new languages as possible, they may need to consider branching out into video coaching, immersion trip planning, and other efforts that support the overarching mission, similar to how Airbnb now organizes “Experiences” as part of its mission to “create a world where people can belong through healthy travel that is local, authentic, diverse, inclusive, and sustainable”.
Finally, Duolingo must consider the business opportunity from ML along with the user opportunity. Beyond what it feeds into HLR and other algorithms, Duolingo is amassing a considerable trove of personalized language learning data. How should they use it beyond algorithms such as HLR and placement tests? Should they branch out into other areas of education beyond language? Should they recommend other experiences within and outside of the product based on personal data? More controversially, should they consider selling aggregated or individualized data for potentially lucrative opportunities in advertising or recruiting?
 O’Conor, L. (2018). Duolingo creator: ‘I wanted to create a way to learn languages for free’. The Guardian. https://www.theguardian.com/education/2014/aug/27/luis-von-ahn-ceo-duolingo-interview.
 Konrad, A. (2018). Language App Duolingo Raises $20M In Race To Teach English. Forbes. https://www.forbes.com/sites/alexkonrad/2014/02/18/language-learning-app-duolingo-raises-20m-in-race-to-teach-english/.
 Lardinois, F. (2018). Duolingo hires its first chief marketing officer as active user numbers stagnate but revenue grows. TechCrunch. https://techcrunch.com/2018/08/01/duolingo-hires-its-first-chief-marketing-officer-as-active-user-numbers-stagnate/.
 Murre JMJ, Dros J (2015) Replication and Analysis of Ebbinghaus’ Forgetting Curve. PLoS ONE 10(7): e0120644. https://doi.org/10.1371/journal.pone.0120644.
 Code for HLR is open sourced at https://github.com/duolingo/halflife-regression.
 Goedegebuure, D. (2018). You Are Helping Google AI Image Recognition – Dennis Goedegebuure – Medium. https://medium.com/@thenextcorner/you-are-helping-google-ai-image-recognition-b24d89372b7e.
 Licensed under Creative Commons Attribution-ShareAlike 3.0 License.
 Lardinois, F. (2018). Duolingo raises $25M at a $700M valuation. TechCrunch. https://techcrunch.com/2017/07/25/duolingo-raises-25m-at-a-700m-valuation/.
 Although Duolingo doesn’t share exact statistics, they do say on their website that “only a fraction of those who start a Duolingo course make it to the end.” https://forum.duolingo.com/comment/19245570/What-proportion-of-users-who-start-a-course-finish-it
 Gagliordi, N. (2018). How Duolingo uses AI to disrupt the language learning market. ZDNet. https://www.zdnet.com/article/how-duolingo-uses-ai-to-disrupt-the-language-learning-market/.
 Airbnb Press Room. (2018). About Us – Airbnb Press Room. https://press.airbnb.com/about-us/.