How open data defends against disease outbreaks

In May 2015, a 68-year-old South Korean businessman was diagnosed with Middle East respiratory syndrome (MERS). In the days that followed, the South Korean government issued no statement about which hospitals he had visited. As reports of secondary cases started to come in, the silence continued.

Rumours filled the information vacuum, and citizens turned to social media. Which hospitals were infected? What was taking the government so long to release the hospital names? Can I take my sick child to hospital?

Two and a half weeks elapsed before a list was released of the 24 hospitals visited by people with confirmed cases of MERS. In the weeks that followed, case reporting improved dramatically; by July the whole country was declared MERS-free.

In an ever more connected world, diseases travel as easily as airplane passengers. At the same time, technological advances have created new surveillance opportunities. A lot of big data is locked away in hospital billing records, pharmaceutical sales numbers or airline reservation systems. But not all of it is inaccessible. Social media and online news reports are available to anyone with access to the internet, and there is other data that is either publicly available or can be constructed from public information. Open data is changing the surveillance of, and response to, outbreaks of infectious disease.

Open data can aid early detection

Traditional surveillance tends to involve passive reporting, whereby clinicians and health practitioners take the initiative to report cases to the appropriate health agencies. Detection of a potential outbreak is often delayed until a laboratory diagnosis is confirmed. While this approach increases the reliability of data, the process takes time and is hidden from the public. Open data offers opportunities to recognize an outbreak earlier than current methods allow.

Foodborne illness strikes an estimated 55-105 million Americans a year. Traditional surveillance requires the sick to seek healthcare; as a result, people who don’t visit a doctor are never counted, likely leading to substantial underreporting of cases of foodborne illness and limiting identification of the specific pathogens involved.

Open data is revolutionizing the way health departments detect foodborne outbreaks and improve food safety. For example, the Chicago Department of Public Health and its partners launched Foodborne Chicago in 2013 to aid in detection of, and response to, cases of illness. By mining Twitter and messages submitted through a form on its website, the agency began collecting complaints from around the Chicago area. It’s not just health departments' own channels: online restaurant reviews can also contribute to a revolution in public health surveillance.

While sceptics worry that open data isn’t 100% on the mark, it’s generally agreed that it can still be accurate enough to provide a public good. When the Chicago health department followed up on the complaints, 21 restaurants and 33 other facilities were closed on grounds of grave violations.
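To make the idea of mining public messages for complaints concrete, here is a minimal sketch in Python. It is not Foodborne Chicago's actual method: the keyword list, the function name flag_possible_complaints and the sample messages are all hypothetical, and a real system would more likely use a trained classifier followed by human review.

```python
import re

# Hypothetical symptom keywords (an assumption for illustration only).
ILLNESS_PATTERN = re.compile(
    r"\b(food poisoning|stomach (?:bug|ache|cramps)|vomiting|threw up|diarrhea)\b",
    re.IGNORECASE,
)


def flag_possible_complaints(messages):
    """Return the messages that mention at least one symptom keyword."""
    return [m for m in messages if ILLNESS_PATTERN.search(m)]


# Invented sample messages, not real data.
sample = [
    "Got food poisoning after lunch downtown, never again",
    "Best deep-dish pizza in Chicago!",
    "Threw up all night after that burrito",
]

flagged = flag_possible_complaints(sample)
for message in flagged:
    print(message)
```

A filter this crude will produce false positives and miss many genuine complaints, which is why flagged messages would still need follow-up by health department staff before any inspection is ordered.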

Open data can improve collaboration

It was through the use of publicly available data that epidemiologists and virologists were able to study the South Korean MERS outbreak in real time and share their thoughts on social media. Subsequently, the South Korean Ministry of Health and Welfare released detailed case information, and this enabled remarkable breakthroughs to take place.

The public release of data allowed experts to collaborate with one another and to identify the “superspreading” nature of the outbreak early on. Previously, MERS had not been seen as contagious enough to cause alarm, as it required close contact with infected individuals.

Two weeks after he was diagnosed, the index case was found to have infected more than 20 individuals, according to publicly available data sources. It was a conclusion later confirmed by a peer-reviewed assessment published in Eurosurveillance about a month and a half after the outbreak began.
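As a rough illustration of how such a finding can be drawn from a public case list, the sketch below counts how many secondary cases are attributed to each source case. The record layout, the case identifiers and the threshold are hypothetical; it does not reproduce the South Korean data or the method used in the Eurosurveillance assessment.

```python
from collections import Counter

# Hypothetical line list: (case_id, id of the presumed source case, or None
# when the source is unknown or the case is the index case).
line_list = [
    ("case01", None),
    ("case02", "case01"),
    ("case03", "case01"),
    ("case04", "case01"),
    ("case05", "case03"),
]


def secondary_case_counts(records):
    """Count how many cases are attributed to each source case."""
    return Counter(source for _, source in records if source is not None)


def likely_superspreaders(records, threshold):
    """Return source cases linked to at least `threshold` secondary cases."""
    counts = secondary_case_counts(records)
    return {case_id: n for case_id, n in counts.items() if n >= threshold}


counts = secondary_case_counts(line_list)
print(likely_superspreaders(line_list, threshold=3))
```

The threshold is a judgment call rather than a standard definition, and the result is only as good as the attribution of each case to its presumed source.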

The early and informal recognition of the superspreading nature of the outbreak led to appropriate infection control measures. These included healthcare workers wearing protective clothing, hospitals enforcing quarantine and isolation protocols, and public health officials properly communicating the risks to the general public.

This chart shows how disease can spread from a single person to the wider community:

[Chart not reproduced]

Open data can improve transparency

Open data can foster public trust by allowing individuals to examine and analyse the evidence for themselves. It allows for accurate reporting from ministries of health and shows up weaknesses in systems that fail to identify threats early on. It can also bring to light deliberate obfuscation, as when one Chinese hospital’s attempt to hide an early H7N9 case was exposed by social media.

Open data presents unprecedented opportunities to improve the surveillance of infectious-disease outbreaks. It has already demonstrated its potential to improve detection, collaboration and transparency. With more data sources and continued innovation, open data can continue to support new breakthroughs.

The Summit on the Global Agenda 2015 takes place in Abu Dhabi from 25 to 27 October.

Author: John Brownstein is an Associate Professor at the Harvard Medical School.

Image: A girl wearing a mask to prevent contracting Middle East Respiratory Syndrome (MERS) sits on luggage as others walk past her at Gimpo International Airport in Seoul, South Korea, June 17, 2015. REUTERS/Kim Hong-Ji
