Social media and technology researcher Danah Boyd likes to tell this story as an illustration of the power of mass data gathering:
A couple of years ago, a 16-year-old Minneapolis girl received an advertising mailer from Target. The ad congratulated her on her pregnancy and suggested she buy some infant care products.
The girl’s father was incensed and chewed out the manager of a local Target store. A few days later, the father called the manager to apologize — his daughter was pregnant, and Target’s corporate marketers and statisticians had found out first. They had learned to predict — based on consumer buying habits — when shoppers were expecting children, and the girl’s recent purchases had raised a red flag.
What Target failed to do, Boyd told an audience at the University of Virginia on Friday, was to understand the implications of using the data.
“What’s happening is an analytic correlation … without information for what that correlation will do as it’s shared with other people,” said Boyd, who legally changed her first and last names to lowercase.
The author and researcher was the keynote speaker at the first National Conference on Big Data Ethics, Law and Policy at UVa. The university recently started a Data Science Institute, and one area the institute hopes to tackle is the ethical and legal implications of mass data collection.
The keynote focused on privacy in the age of big data. The technology for gathering and analyzing data is growing exponentially, but the law, ethics and analysis surrounding it are lagging behind.
Although some corporations have been eerily successful at predicting consumer buying habits, many misinterpret the data they collect. Young people are particularly savvy at using marketing algorithms for their own purposes. For example, Boyd talked about teenagers who tag brand names in their Facebook statuses, which pushes a status to the top of a newsfeed.
Some high school students Boyd interviewed talked about using algorithms to prank each other. They might put certain keywords in the text of an email, for example, so their friends would get advertisements for adult diapers.
In other words, metrics on “likes,” “tags” or “mentions” can easily be misinterpreted. It is important for computer scientists and data technicians to think about the context of the information they’re collecting and how to ethically use it, Boyd said.
“Because of these technologies we can see into the lives of many more people,” she said. “It becomes really difficult to think about how to make sense of what we see; how to think about the ramifications as we are all connected in ways that are fundamentally unprecedented.”
That’s something the field hasn’t thought about enough, said Alfred Weaver, a computer science professor and a member of the Data Science Institute. Weaver said one of the institute’s goals is to incorporate more coursework on ethics into the curriculum.
It’s particularly important, Weaver said, for students to think hard about all the ways their platform or technology might be used. A lot of his students, he said, learn the hard way that their platforms can be abused.
“The way we teach technology [now], we make students think they understand the way it’s going to be used because they’re the architect,” Weaver said. “That’s wrong.”
The institute will also be looking at data collection in surveillance and national security investigations.
Siva Vaidhyanathan, who chairs UVa’s Department of Media Studies, said balancing security concerns, data gathering and privacy is one of the major problems the institute is hoping to tackle.
For now, Vaidhyanathan said, the best option might be tighter regulations on what corporations such as Amazon and Google can collect. Those companies are often forced to give up that information to the government.
“Companies are at the mercy of the government … but they don’t have to collect everything about everyone and keep it forever,” he said.