Next year, the University of Virginia will begin classes for its first degree in data science, a graduate program that could help train people to analyze data for corporations, scientific researchers and the federal government.
It’s a growing field, especially in health care, where collection of patient data can help researchers study entire populations for risks and trends.
But many data scientists could wind up combing through consumer data to help social media websites such as Facebook, for instance, target their advertising. That could send those graduates into a field lacking professional codes of ethics specific to the work they do.
“Most data scientists are part of larger professional associations,” said Donald E. Brown, director of UVa’s newly created Data Science Institute. “There is no equivalent professional society with a separate code for data scientists.”
Those associations are the bodies that generally spell out codes of ethics for their fields. It’s a way for professionals to hold each other accountable, Brown said.
That hasn’t happened yet in data science, an extraordinarily technical field that’s drawing a lot of attention from the general public as companies such as Google collect massive stores of information on users.
Right now, most data scientists belong to broader engineering associations — such as the Institute of Electrical and Electronics Engineers or the Association for Computing Machinery — that have only general guidelines. The Association for Computing Machinery has a short section on privacy in its code of ethics, for example:
“It is the responsibility of professionals to maintain the privacy and integrity of data describing individuals,” the code reads. “This includes taking precautions to ensure the accuracy of data, as well as protecting it from unauthorized access or accidental disclosure to inappropriate individuals.”
But what sort of personal information should companies keep and how should they use it?
Deborah G. Johnson, professor of applied ethics at UVa, is considering those kinds of questions. She is part of the Data Science Institute’s corps of experts in law, humanities and social sciences who are working with data scientists to look at the ethical implications of collecting and storing personal data.
“The basic premise is that these new data analytics both raise all kinds of legal ethical questions and also that we’re going to need to create policies and laws that shape the way analytics works,” Johnson said.
A few universities have research institutes on mass data collection, but they mostly focus on the business or financial side of it. Ethical standards are still in their infancy and change constantly as new issues arise.
For example, Johnson said, driverless cars currently in the testing phase might be able to collect and store GPS coordinates tracking the movements of car owners.
“What happens to that data and how can it be used?” Johnson said. “Who controls it? … You’re going to need all kinds of new regulations and policies.”
Johnson said she’s equally concerned about new-media companies tracking users’ browsing, shopping and reading habits. The resulting customization might make for a more pleasant experience, but not necessarily an enriching one, she said.
“They customize things so much and they kind of feed our consumer instincts and feed who we already are … we become more insular,” she said. “They make us more who we already are, rather than give us the opportunity to become something else.”
John W. Whitehead, a constitutional attorney, author and founder of the Albemarle County-based Rutherford Institute, said the minimum ethical standard would be informed consent — warning consumers their data could be used and asking them to agree to it in writing.
“Any company collecting data should get consent with the consumer on what they’re doing with the data and who they’re selling it to,” he said.
Whitehead said he also favors the Consumer Privacy Bill of Rights proposed by the White House two years ago. The proposal says consumers have the right to “reasonable limits” on the collection of data, along with access to “easily understandable” information about a company’s privacy practices.
The proposal would require changes to privacy law that could be made only by Congress.
UVa plans to host its first conference on big-data ethics, law and policy on April 11. For more information, email Johnson at firstname.lastname@example.org.