
Metaphors of Big Data


What do bacon, oil, tsunamis, exhaust, deluges, nuclear waste and teenage sex have in common? They are all things to which “Big Data” has been likened.

Many excellent essays have addressed Big Data metaphors. They include “Data Is the New ‘___’,” by Sara Watson; “Big Data Metaphors We Live By,” by Kailash Awati and Simon Buckingham Shum; “Big Data, Big Questions: Metaphors of Big Data,” by Cornelius Puschmann and Jean Burgess; and “Swimming or Drowning in the Data Ocean? Thoughts on the Metaphors of Big Data,” by Deborah Lupton. Those articles, however, discuss the “metaphors of Big Data” as if they’re all efforts to describe the same thing. But they are not.

The metaphors and similes cited above refer to at least three distinct things. The “tsunami” and “deluge” are attempts to illustrate the challenges of handling vast and ever-changing datasets. The teenage-sex simile is a comment on the hype surrounding the notion of Big Data: In 2013, Dan Ariely said that “Big data is like teenage sex: Everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it …” The better-known “oil” and “bacon” metaphors refer to the large datasets themselves, which are being collected by various entities these days, regarded as assets, and mined or chewed up for insights.

It might be more useful to treat those three groups as distinct, and to address separately the metaphors commonly applied to each. In particular, I want to focus on the metaphors squarely aimed at big datasets, at collections of information, as opposed to Big Data-related processes or hype. Even big datasets are not all alike, and our metaphors should reflect that; we can't discuss this one subset of the Big Data phenomenon as if we all knew what we were talking about. The kinds of data matter.

Take, for example, the metaphor of Big Data as nuclear waste. This metaphor has been offered as a response, a corrective, to the much better-known mantra that "data is the new oil." The nuclear waste metaphor, however, refers to a particular kind of Big Data: personal data about individual human beings. (Privacy professionals talk a lot about "PII," personally identifiable information, which is a broader concept. They/we have long discussions about what constitutes PII. This is not one of those discussions.)

There are many large datasets — about atmospheric or oceanic conditions, say, or companies' production outputs, or energy consumption by particular vehicles — that probably would not be described, even by Big Data critics, as "radioactive material." Let's separate those out. Let's clarify that there is a distinct problem when intimate personal data about individual human beings is what's being described as "the new oil" or "the new bacon" and treated like an ordinary asset.

Technology critic Evgeny Morozov has argued that the commodification of personal details is not a matter of property rights. In a New Republic article titled “Selling Your Bulk Online Data Really Means Selling Your Autonomy,” he writes:

“Our data constitutes our very humanity. To voluntarily treat it as an ‘asset class’ is to agree to the fate of an interactive billboard. We shouldn’t unquestionably accept the argument that personal data is just like any other commodity and that most of our digital problems would disappear if only, instead of gigantic data monopolists like Google and Facebook, we had an army of smaller data entrepreneurs. We don’t let people practice their right to autonomy in order to surrender that very right by selling themselves into slavery. Why make an exception for those who want to sell a slice of their intellect and privacy rather than their bodies?”

Is that true for any personal data, though? Should we draw even finer distinctions? Strangers have long had access to some details about most of us — our names, phone numbers and even addresses have been fairly easy to find, even before the advent of the Internet. And marketers have long created, bought and sold lists that grouped customers based on various differentiating criteria. But marketers didn’t use to have access to, say, our search topics, back when we were searching in libraries, not Googling. The post office didn’t ask us to agree that it was allowed to open our letters and scan them for keywords that would then be sold to marketers that wanted to reach us with more accurately personalized offers. We would have balked. We should balk now.

Maybe some personal data can be sold without undermining our autonomy, and some can’t. Access to a person’s name and phone number is not the same as access to his or her Social Security number, or search topics, or communications with his or her coworkers, friends, family or lovers. The intimate details of our lives, and in particular our communications (including those on any social media that does not clearly describe itself as “public”) should be differentiated from “the new oil” or “the new bacon.” They should, indeed, be off the market.

At the same time, we should acknowledge that not all Big Data is radioactive. We need to separate our metaphors, and maybe come up with some new ones, too, in order to give clarity to the issues we now face in the new data economy.


Irina Raicu is the Director of the Internet Ethics program at the Markkula Center for Applied Ethics, Santa Clara University. Follow the Internet Ethics program on Twitter at @IEthics.

This article originally appeared on Recode.net.
