23andMe… and ~10 million other DNA samples

23andMe has transformed its massive database into a research and revenue engine.

23andMe is a well-recognized brand for consumers who want to understand their ancestry. From a small amount of saliva, 23andMe gives users a detailed breakdown of their ancestry across more than 2,000 regions. Users can also opt in to find distant relatives, receive a breakdown of interesting traits (such as a propensity to sneeze in sunlight), and see their estimated likelihood of developing diabetes, breast cancer, and celiac disease. 23andMe does this by extracting DNA from saliva samples and processing it on a genotyping chip to identify variants. More than 99% of human DNA is identical from person to person, and genotyping is the analysis that identifies the variants that differ, which helps 23andMe paint a picture of who its individual users are.

While the product is impressive for its ability to rapidly process genetic data and deliver insights to customers, what is particularly interesting is how the company has used its collection of big data to develop novel drugs and further scientific research. When users send in their saliva kits, they can opt in or out of having their data used for research. If a user opts out, the data is discarded after 30 days. It is unclear whether the data is used for research within that 30-day window, but 80% of users choose to opt in. With ~10 million samples collected as of 2019, that 80% opt-in rate has left 23andMe with at least 8 million DNA samples available for deep research. The data set is even more powerful because the samples come from individuals who are “re-contactable,” so follow-up surveys and questions can be asked. This immense amount of data has proven hugely lucrative for the company. In 2018, pharma giant GlaxoSmithKline acquired a $300M stake in 23andMe in exchange for access to the genetic data. The data will be used both to develop new drugs and to better identify patients in order to speed up clinical trials.

23andMe has also actively transformed its huge database into a discovery machine, and it developed a novel drug in 2020. The company looked at traits across its user base to develop a molecule that blocks signals that lead to autoimmune diseases. The drug was tested on animals internally before being sold to Almirall for human testing. This drug is just the beginning for 23andMe, which has a pipeline of more than 30 therapeutic programs spanning oncology, respiratory and cardiovascular diseases, and more.

While the applications for this data and analysis are powerful, ethical concerns have arisen over the privacy and use of people’s genetic data. First, there is the concern about privacy. While 23andMe emphasizes that users’ data and information are securely protected, both hacking and police warrants for data pose threats to the privacy of that information. Second, upon sign-up, users who opt in agree to give up their data for research with the knowledge that they will never financially benefit from the results. This raises the question of how 23andMe will price prospective drugs, given this huge free input of data for which participants are not compensated.




Student comments on 23andMe… and ~10 million other DNA samples

  1. Thanks for this post, Julia! Really interesting. I have found the use of ancestry sites like this in policing to be fascinating. I think that privacy standards have changed a lot since sites like 23andMe and Ancestry.com first started, but it definitely makes you think you should read the fine print on what rights you are signing away by selecting the “I agree to these terms” box. Given that these sites triangulate data from multiple sources, I wonder if there will be privacy concerns for people who never even used the site or consented to sharing their data at all, like the Golden State Killer (who I obviously have no sympathy for, to be clear!!).

    This article on how ancestry data was used to catch the Golden State Killer was super fascinating: https://www.theatlantic.com/science/archive/2018/04/golden-state-killer-east-area-rapist-dna-genealogy/559070/

  2. This piece was one of my favorites to read this week, Julia! Thanks for posting. Certainly the privacy policy is dubious. To go off of what Julia M wrote, perhaps the silver lining, if you will, is the vast pool of cold cases from the 1970s and 1980s that data from genealogy companies can help law enforcement solve. Who knew that an unassuming pursuit to build a family tree could help to fight crime?

  3. Very interesting article, Julia! I did not know this company, and I find the business model fascinating, despite the ethical concerns that naturally come with the collection of health/genetic data. I simply wonder how biased the collected data might be, given that people who are aware of such services and can afford them might not be very representative of the entire population.
    Moreover, as genetic testing is a one-time thing, it might be an interesting opportunity for the company to explore ways to incentivize its customers to make multiple transactions over time, maybe by testing for some diseases or offering more comprehensive services that require multiple samples.

  4. Very interesting post, Julia.

    Just saw on their website that they have a subscription model. It’s impressive how valuable it is to have this ongoing relationship with customers: they can collect further data from sources other than DNA while customers keep paying them at the same time.

  5. Very interesting, yet slightly disturbing article. You have to ask whether the same 80% of people would be willing to opt in to research if the name on it were GlaxoSmithKline instead of 23andMe. It certainly seems unethical to sell data that people have volunteered to you. I would be curious to see regulation catch up to programs like this, and which aspects of the process would be altered.
