Google Wants Guinea Pigs for a New Medical Study. Here’s Why I’d Volunteer.

With health data, the default should be to look for safe ways to share.

After six months of covering the merger of technology and medicine for Re/code, I’ve come to believe one thing very strongly: The next great insights into health and disease, and the resulting breakthroughs in diagnostics and treatments, are likely to emerge at the intersection of these disciplines.

I’ve become convinced because nearly every researcher I speak with drives home this point in words and deeds, with advanced medical research increasingly relying on genomic sequencing and other forms of big-data analysis.

But it also simply makes sense: These tools and techniques are allowing scientists to understand biology at a more basic and fundamental level than has ever been possible in the past. They’re steadily unlocking the programming code of life itself.

These approaches are especially promising for devising personalized cancer treatments, based on the specific mutations within a person’s particular tumor.

This realization has shifted my thinking on online privacy. I’ve been a frequent critic of the policies and blunders of various Internet players, and will always believe that we should be thoughtful and deliberate about how we manage personal data in this Information Age.

As I pointed out recently:

Supposedly “de-identified” data has proven to be anything but on several notable occasions in the past (including here, here and here). And electronic medical records have been compromised already.

But in the context of health care, I’ve come to believe that we need good and specific reasons to cling to our data. The default should be to look for safe ways to share.

We can’t afford to mindlessly indulge our abstract fears about privacy, and generalized resentment of big-tech businesses, when there is so much to be gained for society.

Health data is, as one researcher put it to me, the “grist for the mill” — and as it is, far too much of it is locked away in paper filing cabinets of clinics, isolated by well-meaning but out-of-date laws, or jealously guarded by corporations.

This all came to mind late last week, when Google revealed plans to conduct a “Baseline Study” to “establish a basic understanding of a healthy physiology at this most fundamental level.”

The Mountain View, Calif., technology giant’s research division plans to begin with a small pilot program surveying 175 healthy people, then will collaborate with researchers at Duke and Stanford on a far broader study.

Participants will provide blood, saliva and other samples, and will undergo full genomic sequencing and other tests. Google will analyze the data using its sophisticated algorithms and powerful computer network.

“This could become a reference tool that could inspire even more research studies,” Google said in a press release. “And in the long run, we hope this could be a small contribution toward helping the medical profession find new, proactive ways to keep us healthy.”

The company stresses that the effort is strictly for science, and says it’s taking pains to protect patient confidentiality. The study will be overseen by an institutional review board, samples will be collected by the health institutions, and the data will only be given to Google once the names and Social Security numbers have been scrubbed.

But one point did initially give me pause: The information handed over will include full genome sequences of individual participants.

Curious about the implications of that, I contacted Hank Greely, a Stanford law professor focused on the ethical and legal issues associated with biomedical technologies. He said that a full genome sequence, the three billion DNA base pairs that make you you and me me, can only be anonymous if you define “anonymity” in a narrow way.

“I’m not saying people shouldn’t sign up for this,” he said in an interview. “But they need to know going into it that nobody can honestly promise you anonymity or confidentiality.”

That’s because once someone has the sequence, they could theoretically match it against any other place where that data lives, such as heredity sites like Ancestry.com, 23andMe and Family Tree DNA. In fact, several dozen adoptees reportedly used DNA tests on the last of those sites to figure out the likely surname of their biological fathers, the BBC reported in 2008.

As these tests become cheaper and more popular (the cost of full genome sequencing has plummeted a millionfold in the last decade, and simpler SNP tests already cost less than $100), there are likely to be more places where this sort of data is available.

The privacy issue was a persistent theme in the coverage of the Baseline Study last week, and that’s probably a good thing. It’s always a fair question to ask.

But with that all said, after several exchanges with Google, the risks in this very specific circumstance seem tiny to me.

The company isn’t hosting this data publicly, so the only worrisome scenarios are that someone hacks into it, or that a rogue Google X employee decides to abuse it, for reasons that would be difficult to fathom.

Down the road, if the study produces useful insights, Google may share some information with outside researchers, but only those working on formal studies also approved by institutional review boards. It won’t ever hand it out to the public, the company says.

I have two tests that I try to apply when thinking about appropriate privacy boundaries: Do consumers have choice, and do they have transparency?

In this case, the answer appears to be “yes” to both. The study is purely voluntary; no one will be compelled to participate. And I’m assured that the consent form explicitly describes the possible risks associated with sharing genomic data, precisely as Greely advocates.

Given these precautions, I’m prepared to say that I’d be comfortable participating in this study — at least if I qualified as healthy, which, unfortunately, I probably don’t.

There’s no way of knowing whether Google’s study will actually produce any genuine scientific leaps, but there’s every reason to believe that one analysis of this sort soon will.

This article originally appeared on Recode.net.
