Open-Data Advocate Says Health Information Must Be Shared

A leading and longtime advocate of open access to scientific data delivered a stern message to physicians recently: “You are not ready” for the flood of data that is coming, and you need to take the lead on setting privacy guidelines for that data before someone else does.

John Wilbanks, a self-proclaimed agitator and contrarian, warned that if privacy advocates drive policymaking on the use of health data, society and science will lose out.

In an era of “big data,” when personal health information can be derived from sources as diverse as credit card records and GPS, and when individuals can acquire a genome sequence without consulting a doctor, Wilbanks urged the medical research and entrepreneurial community to take the lead in integrating and applying these various data in useful ways. He also urged a rethinking of the paternalistic approach to protecting patients and study participants by withholding data from them.

Formerly the VP for Science Commons and an Ewing Marion Kauffman Foundation fellow, Wilbanks has donated his own genome sequence and blood work to science. He recently founded Consent to Research to encourage others to do the same. His talk was part of the recent Living Well Through Data conference at Mt. Sinai Hospital.

Wilbanks said that the days when physicians served as gatekeepers of medical records are over. Even transferring those paper files into digital format is not progressive enough. “The EMR (electronic medical record) is an incremental approach to the health record in a world where we’re increasingly getting data outside the system,” Wilbanks said. “The reality is that we are all carrying instruments in our pockets and on our computers that allow far more high-resolution pictures of us as people.”

As confident as Wilbanks is in the benefits of integrating these data, he is skeptical that health information can or should be kept private. Citing a statistic shared by an earlier speaker, Wilbanks noted that some 500,000 data points exist for any individual in the health system. How many anonymized data points are needed to link data back to the individual they came from? Between 3 and 100, he said.

He pointed to a few examples of formerly private health information that can be gathered outside of the clinic: We can infer that a 16-year-old girl is pregnant based on her credit card purchases at Target, or detect symptoms of Alzheimer’s or alcoholism based on errors made on a keyboard at work, he said. These methods for assessing health “have nothing to do with the traditional health system and are not covered by HIPAA.”

Anonymization efforts are practically futile in a world of big data, he argued: “The technical capacity to re-identify people is rapidly outstripping the reality of privacy laws.” Furthermore, he said, “Pretending that we can de-identify and anonymize data and still make it useful is not useful. We’re either going to have anonymized data or we’re going to have useful data,” Wilbanks told the room full of researchers practiced in data anonymization.

In fact, Wilbanks foresees circumstances in which consumers will be incentivized to share more of their health information. Progressive Insurance is already offering discounted rates to customers who agree to install a device that tracks how often their vehicle is driven after midnight, how frequently they slam on the brakes, and other risk factors for auto insurance payouts. “It’s only a small leap,” Wilbanks said, to imagine the day when health insurance companies incentivize customers to enable cell phone GPS tracking that shares how often they are at McDonald’s or the gym.

Wilbanks also argued that it is only fair that physicians and researchers, as well as companies that collect health data from individuals, share that data with those from whom they’ve collected it. “There is enormous value in the medical industrial system,” he said. “But as long as it says, ‘we’re not going to give you back your data because we want to protect you and we’re worried about what you’d do with it so we’ve de-identified you for your protection,’ we’ll end up separating these systems in a way that won’t be good for society.”

He pointed to the growing quantified-self movement whose members track their own vital statistics with a plethora of gadgets. One participant Wilbanks knows wants to be able to compare information from his pacemaker to other data he tracks. But neither Medtronic nor his physician will permit him access to data generated by a machine in his own body. “That is for his own good, supposedly,” Wilbanks said, “but this is a major block to the creation of apps using data coming off pacemakers.”

He warned the Mt. Sinai audience, “If you don’t fix this, someone will.” He foreshadowed a scenario in which privacy advocates get HIPAA rules extended to all electronic data because “we need to protect those 500,000 points of data about John, or John is going to get screwed.”

Wilbanks suggested that medical professionals and commercial aggregators of data could secure a future in which diverse health data will be used effectively to change clinical outcomes by adopting some practical ideas. He urged them to “be honest” about how consumers’ data will be used. “Ninety-page consent documents and 55-page terms-of-use are not honest ways of dealing with users,” he said. “Consider consent forms that are clear about what you plan to do with the data … That sort of honesty is considered risky in the mobile and social world, but it’s going to have to be the core of health data.”

He also urged researchers and physicians to commit to giving participants usable copies of their data. Consumers have a right to their healthcare data, even if the content is generated by a private company, he said. “It’s not just about a copy, but a usable copy. If you’re giving them results from a microarray analysis, you’ve got to disclose your schema to the participants. Their bodies are generating that data,” he said.

And he stressed the importance of obtaining informed consent—a practice that he acknowledged is routine in medicine, but said is often overlooked by entrepreneurs in the health data industry.

“At some point,” Wilbanks said, “honesty with users and the reusability, portability, and integration of data will change clinical outcomes. If you build models that lock up data and worry about getting it perfect before you give it back to people, it’s not accessible for clinical decisions.”

Finally, he warned, “If we let all the reasons not to share get in the way of sharing we’ll guarantee that the only way we’ll affect clinical decisions is through the absence of data.”

Anna Azvolinsky is a PhD in molecular biology turned science writer. Her work appears in Nature Medicine and the Journal of the National Cancer Institute, as well as Vitals on NBCNews.com and Yahoo Health, among other publications. She regularly contributes to the Princeton Alumni Weekly magazine and Cancer Network, the online site of the journal Oncology.

© 2012 New York Genome Center. All Rights Reserved.