Digital Disease Detection: Can We Predict the Next Ebola Outbreak?

Bitscopic is a startup that uses big data, machine learning, and analytics to collate data from a variety of sources in order to predict and track infectious disease outbreaks.

Bitscopic is the next big company in digital disease detection (DDD), which uses internet data for public or global health biosurveillance. In essence, DDD combines big data and crowdsourcing to track disease outbreaks and other public health issues more quickly and with finer geographical granularity than traditional disease surveillance systems, which have significant lag time because they depend on official reports made by physicians and health departments.

Tools and applications for digital disease detection have been around for more than a decade. Google Flu Trends (GFT) is one of the most public and well-known examples, as it leverages the geographical locations of search queries to track outbreaks of the flu in the US. GFT’s big data approach tracked flu outbreaks in real time, which was a great improvement over the 1-2 week lag it takes for the Centers for Disease Control and Prevention (CDC) to collect and process this data. The Global Public Health Intelligence Network, developed by Canadian public health authorities in collaboration with the World Health Organization, has been tracking disease outbreaks by trawling the Web for disease-related news stories since 1997. Newer tools leverage recent growth in social media platforms. For example, HealthMap made the news earlier this year when it became the first website to note the Ebola outbreak. On March 14, about nine days before government authorities in Guinea had even informed the World Health Organization of the first case, HealthMap had picked up mentions of the first few infections from a local newspaper in Guinea. Google Flu Trends also now has a competitor: a Twitter-based program designed to forecast influenza patterns.

Although these tools have been around for a while and new digital disease detection tools often sprout up, very few of them are for-profit or able to capture the value they create. Most are academic or research-based applications or are proprietary to large multilateral NGOs. Enter Bitscopic. Bitscopic is a startup that specializes in applying the latest advances in distributed computing and machine learning to biosurveillance and public health. The company works with large national health organizations to implement systems that detect potential biological threats early and contain their spread. Its flagship product, Praedico, is a next-generation big data biosurveillance application that incorporates cloud computing technology, big data platforms, machine-learning algorithms, geospatial and advanced graphical tools, multiple electronic health record domains, and customizable social media streaming from public health-related sources to predict and track infectious disease outbreaks, all within a user-friendly interface.

Bitscopic’s team is composed largely of former Microsoft/Bing employees, who are skilled at extracting meaningful insights from very large datasets, including the electronic health records (EHRs) of federal hospitals. They work with hospitals or governments interested in tracking disease digitally, and leverage best-in-class analytics to collate different sources and track infections and outbreaks nationally as well as down to the patient level. Bitscopic offers a competitive advantage over Google Flu Trends because it combines actual EHR data, epidemiological data, and advanced geospatial tools with social media tracking for disease detection. Google Flu Trends has proved successful at reproducing historical outbreaks and trends from the data that was used to build the system, but it has proved significantly less successful at predicting behavior from new data. Google Flu Trends fit flu activity in 2009 well in retrospect, but it missed two large swine flu outbreaks later that year as internet search terms changed. It also over-predicted the severity of the 2011-2012 flu season by 50%, the kind of error that can lead to misallocated resources and incorrect personnel deployment.
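The gap between fitting past data and predicting new data can be shown with a toy example. The sketch below is not Google's model: it fits a simple straight line from search volume to flu rate, and every number in it is made up for illustration. The only assumption it encodes is that, in a later season, people search about 50% more for the same amount of actual illness (for example, because of news coverage).

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x."""
    mx, my = mean(xs), mean(ys)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

def mean_abs_pct_error(actual, predicted):
    """Average of |predicted - actual| / actual across all points."""
    return mean(abs(p - a) / a for a, p in zip(actual, predicted))

# Hypothetical training season: flu rate tracks search volume closely.
train_search = [10, 20, 30, 40, 50]
train_flu = [1.0, 2.0, 3.0, 4.0, 5.0]

# Hypothetical later season: same kind of illness levels, but search
# volume is inflated by 50% relative to the training relationship.
test_flu = [2.0, 3.0, 4.0]
test_search = [30, 45, 60]

a, b = fit_line(train_search, train_flu)
in_sample_error = mean_abs_pct_error(train_flu, [a + b * x for x in train_search])
out_of_sample_error = mean_abs_pct_error(test_flu, [a + b * x for x in test_search])

print(f"in-sample error:     {in_sample_error:.0%}")
print(f"out-of-sample error: {out_of_sample_error:.0%}")
```

The model looks nearly perfect on the season it was built from and badly over-predicts once search behavior shifts, which is the general failure described above. A system that also sees confirmed cases can, in principle, notice that drift and recalibrate.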

Because Bitscopic leverages actual health records as well as search terms and other sources in a constantly adapting model, it is far better placed for disease surveillance and outcome prediction. Not only can it help localize and pinpoint hospital-acquired infections and local outbreaks of unfamiliar diseases (think outbreaks of the relatively unknown Chikungunya virus in the US last year), but it can also facilitate automatic communication between healthcare providers and local, national, and international health authorities in real time, with no lag.
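To make "pinpointing a local outbreak" concrete, here is a minimal sketch of one simple way a surveillance system might flag an unusual day at a single hospital: compare each day's case count with the mean and standard deviation of the preceding week. This is a generic threshold rule, not Bitscopic's or Praedico's actual method; the function name, the 7-day window, the multiplier `k`, and the case counts are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

def flag_aberrations(counts, window=7, k=3.0):
    """Return indices of days whose count exceeds
    mean + k * standard deviation of the preceding `window` days."""
    flagged = []
    for i in range(window, len(counts)):
        baseline = counts[i - window:i]
        threshold = mean(baseline) + k * stdev(baseline)
        if counts[i] > threshold:
            flagged.append(i)
    return flagged

# Hypothetical daily counts of one infection at one hospital.
daily_cases = [4, 5, 3, 4, 6, 5, 4, 5, 4, 15, 5]
flagged = flag_aberrations(daily_cases)
print(flagged)  # day index 9 (the spike to 15) is the only one flagged
```

Run per hospital or per region on data that arrives daily, a rule like this is what lets an alert go out the same day rather than after weekly reports are compiled. The choice of `k` is the trade-off between missing outbreaks and raising false alarms.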

The potential for Bitscopic not only to create tremendous value but also to capture that value is high. Changes under the Affordable Care Act place increasing pressure on hospitals for good outcomes, reduced readmissions, and preventive care. Hospitals or government institutions working with Bitscopic will be much better equipped to conduct contact tracing, track where an infection was acquired, and stop infection transmission before it starts. According to the McKinsey Global Institute, using data to better predict disease outbreaks in the U.S. population could save between $300 billion and $450 billion. With possible savings of 10% of the entire U.S. medical bill, as well as the potential to predict, track, and stop epidemics, insights from big data could be the prescription for better care, lower costs, and lives saved. And that’s only in the US. If Bitscopic could work with international Ministries of Health to leverage what national data they have available, the potential to pinpoint and stop outbreaks of polio, measles, dengue fever, yellow fever, hemorrhagic fevers (e.g., Ebola), and a whole host of other infectious diseases would be greatly enhanced. This tool could contribute to improved global health security.

Of course, there are the inevitable challenges. Using big data from EHRs and social media websites, no matter how sanitized, poses some ethical dilemmas. Do any organizations (hospitals, insurance companies, private sector companies) have an obligation to provide patient or employee data for public health purposes? Additionally, it is hard to know how accurate Bitscopic's predictions will really be compared to those of its competitors. In an international context, under-predictions could let Ebola or another infectious disease spread catastrophically for weeks before being noticed. Over-predictions could cause panic or misallocations of supplies in a resource-constrained environment, or unfairly stigmatize some communities. And of course, developing countries do not have the most robust datasets or health records available for use by Bitscopic.

Despite these concerns, any additional bit of data helps when national or international public health security is on the line. If Bitscopic can demonstrate its value to private sector organizations and governments looking to track national disease outbreaks, and capture that value as well, the world will be much better off.


Student comments on Digital Disease Detection: Can We Predict the Next Ebola Outbreak?

  1. This is a really great post. I agree the biggest challenge here is data security, particularly when it has to do with health. Not only that, but in healthcare, where data has been collected and used for decades but for different purposes and by different institutions, the question is how do you leverage old data too? It will take a lot of time and money to sort through all this data and get it to an integrated and working state. Or is it better just to start from scratch and collect it from the beginning, with individuals opting into the program? This will take time though, and it could be several years until there is enough volume for this to be of any use.

    The health industry needs to better use its data. Yet with data security and infrastructure problems, I am afraid the industry will probably not use data to its full potential for many years to come.

  2. I definitely agree that we need to leverage data better to predict health outbreaks. I am curious how much initial data Bitscopic needs in order to prove out their concept. It seems like the real value from this comes from having huge scale. But as in the typical network problem, how do you convince governments and hospitals to join the network and share EHR data before you’ve been able to show that you can identify an outbreak faster and more accurately, when you can't show that without scale? If they are just using social media and search in the beginning, then it seems like it’s too similar to the Google Flu Trends methodology.

  3. Thanks for this post. I hadn’t heard of Bitscopic but it’s a great example of leveraging big data to try to solve a truly important problem. I have no background in healthcare so this may be far off, but with regard to whether hospitals and insurance co’s will / should provide data, I wonder if that’s where legislation could be beneficial. Of course, privacy of individuals should be of highest importance, but it seems like this type of information in aggregate, anonymized format is beneficial to society as a whole and as such should be shared.
