As Jon Snow so often reminds us, “winter is coming.” And with it comes increased precipitation, freezing temperatures, and millions of your friends’ Facebook and Twitter posts complaining about their cold and flu symptoms. Although these posts are not particularly newsworthy, they are beloved by health-savvy technology companies, academic institutions, and the US Centers for Disease Control and Prevention (CDC).
For decades, population health statistics across America were compiled from periodic reports submitted by doctors’ offices, hospitals and public health departments. This information was used to inform and better prepare public health workers across the country for the coughing, sneezing masses they serve. Unfortunately, this data was often two weeks out of date by the time it reached interested parties.
Thanks to the wonders of social media, the CDC and disease-loving research institutions like Johns Hopkins have been using your tweets to gather real-time information about population health. In fact, rather than just tracking existing outbreaks, they are now looking to use time- and location-stamped tweets to better predict where and when illness will occur. This advance warning could enable hospitals, schools, pharmacies and local community centers to stock additional antibiotics and medical supplies, free up patient beds, and optimize nurse and doctor schedules.
Over the past several years, many public, nonprofit and private groups have put great effort into tracking both common illnesses and rare infectious diseases. In 2011, a team of epidemiologists from the University of Iowa demonstrated Twitter’s public health value by correctly tracking both the levels of the 2009 swine flu (Influenza A H1N1) outbreak and public concern about it using simple keywords. Other research groups have created web applications like Germ Tracker (pictured below) and Sick Weather that pull data from multiple social media sites and use real-time analytics to display colorful, animated maps that track health problems ranging from allergies to chickenpox to depression.
In 2009, Google partnered with the CDC to develop Google Flu Trends, which used anonymized Google searches related to flu symptoms and remedies to better map and track worldwide disease outbreaks. The internet giant also developed Google Dengue Trends in an attempt to track the global path of dengue fever. Although Google no longer actively supports these applications, it has made its historical estimates publicly available and is actively seeking academic research partners to explore this area further.
The CDC has also released a downloadable mobile phone app called FluView (pictured below), in which users can see current data on the number of flu-like illnesses per state. While this application only provides basic, high-level data, CDC epidemiologists are interested in predicting even more detailed disease information. The ability to find detailed predictive information regarding outbreak severity, timing and strain type is particularly important for understanding whether and how vaccines must be altered to work effectively.
Social media data may also be key in informing the public and health response teams about both local and global disease outbreaks. Initiated a decade ago by a team of researchers and developers at Boston Children’s Hospital, HealthMap (pictured below) is a leading web and mobile app that monitors disease outbreaks by combining social media data with authoritative news and health information sources. HealthMap mines, integrates and analyzes data from social media networks, eyewitness reports, official health agencies (e.g., World Health Organization, World Organization for Animal Health), online news sources (e.g., Google News, SOSO Info), and communicable disease surveillance (e.g., EuroSurveillance). In fact, HealthMap was able to report the recent Ebola epidemic nine days before any public statement from the World Health Organization.
Although it may appear seamless, this data integration and analysis does not come without inherent limitations and systemic challenges. Because many Twitter postings are reactions to current news, perfecting algorithms that separate meaningful disease indicators from related but insignificant noise (e.g., tweets commenting on a national flu outbreak) continues to be problematic. Additionally, since scientists often trust only verified, authoritative data sources, convincing public health researchers and government officials of the value of nonscientific, unverified social media data remains a challenge.
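To make the signal-versus-noise problem concrete, here is a minimal sketch of a keyword filter that tries to keep first-person symptom reports and discard tweets that merely comment on flu news. The keyword lists and the function name `looks_like_self_report` are hypothetical and chosen only for illustration; the systems described above are not documented here, and real classifiers are considerably more sophisticated.

```python
import re

# Hypothetical keyword lists, for illustration only.
SYMPTOM_PATTERN = re.compile(
    r"\b(flu|fever|cough|coughing|sore throat|chills)\b", re.IGNORECASE
)
FIRST_PERSON_PATTERN = re.compile(r"\b(i|my|me)\b", re.IGNORECASE)
# Words and links that suggest the tweet is reacting to news coverage.
NEWS_PATTERN = re.compile(
    r"\b(outbreak|cdc|officials|reports?|news|season)\b|https?://", re.IGNORECASE
)


def looks_like_self_report(tweet: str) -> bool:
    """Return True if the tweet reads like a personal symptom report."""
    if not SYMPTOM_PATTERN.search(tweet):
        return False  # no symptom keyword at all
    if NEWS_PATTERN.search(tweet):
        return False  # likely commentary on news, not a personal report
    return bool(FIRST_PERSON_PATTERN.search(tweet))


if __name__ == "__main__":
    samples = [
        "I think I have the flu, staying home today",
        "CDC reports a national flu outbreak",
        "My cough is getting worse",
    ]
    for text in samples:
        print(looks_like_self_report(text), "-", text)
```

Even this toy filter shows the trade-off: a rule strict enough to drop news chatter will also drop some genuine reports, such as a sick person who happens to mention the outbreak.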
In recent years, public health organizations have become more comfortable with using social media platforms to disseminate information widely and immediately, but an effective, direct information flow from the public to these agencies is still in its infancy. Overall, I would encourage organizations like the CDC to work more closely with technologically advanced and innovative academic and private partners to improve data aggregation and real-time analysis. Although there may be challenges around the sharing of unpublished population health data, an 86% baseline correlation between weekly flu tweets and national flu data is too significant to ignore.
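For readers curious about what a correlation between weekly flu tweets and national flu data measures, the sketch below computes a Pearson correlation coefficient between two weekly series. The numbers are invented placeholders, not real Twitter or CDC figures, and the source of the 86% figure does not say which correlation measure was used, so Pearson is an assumption here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented placeholder values, one entry per week (NOT real data).
    weekly_flu_tweets = [120, 150, 210, 300, 280, 190]
    weekly_flu_rate = [1.1, 1.4, 2.0, 2.9, 3.1, 2.2]
    r = pearson(weekly_flu_tweets, weekly_flu_rate)
    print(f"correlation: {r:.2f}")
```

A value near 1 means the two series rise and fall together week by week; it says nothing on its own about whether tweets lead or lag the official numbers.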