It cost $2.7 billion and took almost fifteen years to sequence the first whole human genome; as a result of enhanced digitization of the genome, the same task now costs approximately $1,000 and takes just a few days. The rapid pace of development in sequencing technology suggests that sequencing cost and time will no longer be prohibitive to using genomic data on a large scale. This is good news for pharmaceutical companies that rely upon patient genetic data to guide the selection of potential novel drug targets. However, as Figure 1 suggests, the reduction in sequencing cost and time has merely moved the discovery bottleneck downstream: regardless of how fast a genome is sequenced, the data must be processed before it can be interpreted by a scientist. To shorten these analytical timelines, pharmaceutical companies are building collaborations with external technology companies into their operating models.
AstraZeneca initiated a collaboration with Bina Technologies in 2015 to develop faster and more scalable next-generation sequencing (NGS) processing capabilities. Bina’s RAVE software platform enabled AstraZeneca to cut its deep-sequencing processing time for exome sequencing (protein-coding DNA sequences) from 16 hours to 45 minutes, and for whole-genome sequencing (protein-coding and noncoding DNA sequences) from 72 hours to 6 hours. The data processing steps also rely upon a set of 331 NGS tools, the majority of which are developed in academia and updated approximately every month; each tool also has its own set of adaptable command lines. Bina’s ability to update, validate, and quality check each of these tools as part of its Genomic Management Solution therefore provides immense value.
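The timeline reductions quoted above for exome and whole-genome processing are easier to compare as speedup factors. The short Python sketch below uses only those four timings; the function and variable names are illustrative and do not correspond to any Bina or AstraZeneca tooling.

```python
# Speedup factors implied by the processing times quoted in the text.
# Names here are illustrative only; no Bina or AstraZeneca API is used.

def speedup(before_hours: float, after_hours: float) -> float:
    """Return how many times faster the new pipeline is."""
    return before_hours / after_hours

exome_speedup = speedup(16, 45 / 60)   # 16 hours -> 45 minutes
genome_speedup = speedup(72, 6)        # 72 hours -> 6 hours

print(f"Exome: {exome_speedup:.1f}x faster")
print(f"Whole genome: {genome_speedup:.1f}x faster")
```

On these figures, exome processing became roughly 21 times faster and whole-genome processing 12 times faster.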
However, processing large amounts of data quickly means nothing if the results do not yield key pieces of biological insight. Therefore, in addition to improving the speed at which genomic data is processed, AstraZeneca has also focused its efforts on improving the algorithms used in the processing steps. Commonly used variant callers miss complex genetic mutations, including large genomic insertions and deletions. Additionally, they do not perform well with ultra-deep sequencing, which is necessary for detecting genetic variants in circulating tumor DNA. AstraZeneca has addressed these issues through the development of VarDict, a novel variant caller.
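To see why sequencing depth matters for circulating tumor DNA, consider how many reads would be expected to carry a rare variant. The sketch below is a simplified binomial model and is not VarDict's algorithm; the allele fraction, the two depths, and the five-read threshold are assumed values chosen purely for illustration, and sequencing error is ignored.

```python
from math import comb

def prob_at_least(min_alt_reads: int, depth: int, allele_fraction: float) -> float:
    """P(at least min_alt_reads of depth reads carry the variant), binomial model."""
    p_fewer = sum(
        comb(depth, k) * allele_fraction**k * (1 - allele_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_fewer

# Assumed, illustrative values (not taken from the source).
ALLELE_FRACTION = 0.005   # variant present in 0.5% of DNA fragments
MIN_ALT_READS = 5         # hypothetical evidence threshold

p_standard = prob_at_least(MIN_ALT_READS, 30, ALLELE_FRACTION)      # 30x depth
p_ultra_deep = prob_at_least(MIN_ALT_READS, 5000, ALLELE_FRACTION)  # 5000x depth

print(f"30x depth:   {p_standard:.6f}")
print(f"5000x depth: {p_ultra_deep:.6f}")
```

Under these assumptions, a variant at this fraction almost never reaches the threshold at 30x depth but almost always does at 5000x, which is why a caller needs to behave well at very high depth.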
Given the rapid pace of development in genomics technology and the increasing data storage and analytics requirements, AstraZeneca has turned to Amazon Web Services (AWS). AWS has enabled AstraZeneca to build a private cloud into which the Bina platforms, as well as high-performance computing clusters, have been integrated.
Finally, all of this data would be useless without scientists who can interpret it. However, mining the raw data files generated by processing requires a particular skillset; a whole-genome sequencing data file for a single patient is on the order of 10² GB in size. Analysis of these files requires an understanding of both disease biology and programming/big data analytics, a combination of skills that is still scarce in the research community. To help biologists explore large datasets, AstraZeneca has spearheaded a “Bioinformatics for the Bench” development initiative. To deliver it, AstraZeneca collaborated with Bina Technologies, using Bina’s Annotation and Analytics Intelligence Module Software (AAiM) to provide biologists with software and guided user interfaces that make large genomic datasets more accessible and interpretable.
The dramatic increase in the availability of genomic data has given AstraZeneca the opportunity to shift its business strategy towards personalized healthcare. For example, after olaparib failed in 2011 and 2012 clinical trials in triple-negative breast and ovarian cancers, AstraZeneca halted development of the drug. However, further exploration of the biology, aided by genomic datasets from patients in these trials, led to the conclusion that ovarian cancer patients carrying BRCA mutations responded well. AstraZeneca subsequently achieved approval of olaparib in this patient population, and was the first company to release a drug with a companion diagnostic to identify appropriate patients. Moving forward, and with a general push in the healthcare space towards outcomes, AstraZeneca will be able to harness the increasing amounts of genetic data to understand which patients will benefit from a given therapy.
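The kind of retrospective subgroup analysis described above can be sketched in a few lines of Python. The records below are entirely synthetic placeholders, not trial data, and the field names are hypothetical; the point is only to show how grouping by a genomic marker can surface a responding subpopulation that an overall response rate hides.

```python
from collections import defaultdict

# Synthetic placeholder records (NOT real trial data): (brca_mutated, responded)
patients = [
    (True, True), (True, True), (True, False),
    (False, False), (False, False), (False, True),
    (False, False), (False, False),
]

def response_rates(records):
    """Return the overall response rate and the rate per BRCA status."""
    counts = defaultdict(lambda: [0, 0])  # status -> [responders, total]
    for brca_mutated, responded in records:
        counts[brca_mutated][0] += int(responded)
        counts[brca_mutated][1] += 1
    by_status = {status: r / n for status, (r, n) in counts.items()}
    overall = sum(r for r, _ in counts.values()) / len(records)
    return overall, by_status

overall, by_status = response_rates(patients)
print(f"Overall: {overall:.2f}")
print(f"BRCA-mutated: {by_status[True]:.2f}")
print(f"BRCA wild-type: {by_status[False]:.2f}")
```

In this toy dataset the overall rate looks modest, while the BRCA-mutated subgroup responds at a much higher rate than the wild-type subgroup.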
In addition to the initiatives previously discussed, I think there is an opportunity for AstraZeneca to further explore collaborations with technology companies focused on healthcare. Specifically, integrating genomic data with clinical metadata, which might be easily collected through the Apple ResearchKit, may open avenues for a systems biology approach to understanding disease and identifying drug targets. Flatiron Health also provides a unique opportunity to integrate clinical and genomic data. Additionally, working with experts in machine learning and data analytics at tech companies such as Google to understand novel methods of analyzing big data may provide enhanced insights.
Tirrell, Meg. Unlocking my genome: Was it worth it? December 2015. <http://www.cnbc.com/2015/12/10/unlocking-my-genome-was-it-worth-it.html>.
BM Good, BJ Ainscough, JF McMichael, AI Su, and OL Griffith. “Organizing knowledge to enable personalization of medicine in cancer.” Genome Biology 15 (2014): 438.
Bina Technologies. Solving NGS Bottlenecks with a Globally Distributed Genomic Data Management Solution. June 2015. <http://blog.bina.com/read/solving-ngs-bottlenecks-with-a-globally-distributed-genomic-data-management-solution>.
ZL Lai, A Markovets, M Ahdesmaki, B Chapman, O Hofmann, R McEwen, J Johnson, B Dougherty, JC Barrett, and JR Dry. “VarDict: a novel and versatile variant caller for next-generation sequencing in cancer research.” Nucleic Acids Research (2016).
Garber, Ken. “PARP inhibitors bounce back.” Nature Reviews Drug Discovery (2013): 725-727.
FDA. FDA approves Lynparza to treat advanced ovarian cancer. December 2014. <http://www.fda.gov/NewsEvents/Newsroom/PressAnnouncements/ucm427554.htm>.
Flatiron Health. n.d. <https://flatiron.com/>.