The Problems of Data Ownership and Data Security
Data science is one of the fastest growing fields in the world. We’re creating data all the time, and recent estimates suggest that approximately 90% of the data that exists in the world today has been created over the last two years alone. Much of this data is created by everyday people, through posts on social media, emails, and instant messages. However, with the explosion of the data science field, the question of who owns this data and who has access to it is becoming increasingly relevant.
Recently, French journalist Judith Duportail made use of a European Union data privacy law to request a copy of the consumer data that dating app Tinder had collected on her. Duportail received a substantial report, some 800 pages in all, which contained a massive amount of information about her, including her hobbies, occupation, “likes” on Facebook, romantic and sexual preferences, musical tastes, geolocation, and more.
Duportail knows how difficult it is to “see data”. In her article for The Guardian, she described how, even though every decision and action we take can be collected in some form of database, it never feels as if we are being monitored. In the article, Duportail expressed her fears about what would happen if this data were compromised, and about the kinds of things people could learn about her.
This is not to say that data collection on this scale is necessarily an invasion of privacy, or at least not usually an unwilling one. Users of apps and social media typically provide their data willingly to companies and agree to privacy policies specified in lengthy terms and conditions. Users also receive some benefit from the data that is collected about them: companies often use it to personalize the experience of their users, making their interaction with apps or websites more attractive and efficient.
However, there are very real ethical concerns about who gets access to your data and in what format. Frequently your information is sold off to third parties, and even if it isn’t, there is always the chance that the security of your data could be undermined by hacking. Data protection activists are arguing for more legal recourse for citizens whose data is distributed without their consent or compromised by hacking.
In a recent article published in The New York Times, Zeynep Tufekci sounded off on the “maddening unaccountability” of corporations like Equifax, whose servers were recently breached, allowing the intimate identifying information of around 145 million Americans to be stolen. Tufekci argues that there must be more regulations to hold large corporations accountable for what they do with people’s data.
Information technology arrived on the scene only recently, so it has faced fewer of the kinds of regulations that consumers and citizens, in more progressive eras, managed to impose on other industries.
Though Tufekci acknowledges that technical factors contribute to the lack of sufficient cybersecurity, she argues that the main reason for repeated breaches is a political one.
“Big corporations have poured large amounts of money into our political system, helping to create a regulatory environment in which consumers shoulder more and more of the risk, and companies less and less,” Tufekci says.
Tufekci says she understands that no piece of software can ever be perfectly safe from bugs or hacks, and hence that it would be unreasonable for a person to sue in every instance of a software glitch. Most data breaches and software failures, however, are preventable. They result from negligence on the part of companies unwilling to invest in reliable products and security features. For this reason, Tufekci says there is a pressing need to establish regulations that would hold companies accountable when a person’s data is unethically sold or compromised in a preventable manner.
Who Owns the Data You Generate?
Aside from the problem of someone’s data being sold without their permission or stolen by hackers, another question worth considering is how much access a person should have to their own data. As the source of a set of data, shouldn’t consumers have access to the data that technology companies collect about them?
In the current legal landscape, you probably don’t own the data that is collected about you. Modern legal concepts of privacy and copyright are usually insufficient to deal with the concept of data ownership. As a result, there is no legal concept that defines you as the owner of information about your life. Certain laws do guarantee a consumer’s right to access data about them. The EU provision that allowed Duportail to request a copy of her data from Tinder is one example. In the US, however, as Tufekci notes, data privacy is not tightly regulated.
Even if legal precedent for data ownership is currently insufficient, an ethical case for allowing private citizens access to their data can easily be made. If people can access the data that has been collected about them, they have the potential to learn interesting things about their lifestyles and preferences. As was revealed to Duportail when she requested a copy of her data from Tinder, the ability to analyze aggregated data about our lives can lead to powerful insights about ourselves. It would seem unfair to withhold this opportunity from people when they are the source of the data in the first place.
A New Model of Data Ownership
Because of the power and opportunity inherent in gaining more knowledge about yourself, people like Luigi Zingales and Guy Rolnik of the University of Chicago’s Booth School of Business have argued for an updated model of data ownership, one that would give each person ownership of their own “social graph”: a digital database and map of all the digital connections that person makes. Zingales and Rolnik note that while companies like Facebook and Google allow an individual some limited access to the data they generate, through tools like APIs, these companies can cut off anyone they see as a competitive threat, and the average person has no idea how to access these tools.
“I have a suspicion of the very idea that data can be owned. I don’t find ownership a very good model for data, let me put it that way. I think part of what we have to evolve is some different metaphors, different language, different frameworks, for understanding what’s going on with data,” says Kelly.
Kelly argues that saying any one person owns their data is absurd, similar to saying a person owns the 99.99% of their genes that they share with other people. He favors a more holistic model of data access instead. Every party, agency, channel, or node that accesses a set of data would have some claim to it, but also some rights and responsibilities that it would have to abide by when accessing that data.
Regardless of what solutions are proposed to the problems of data security and ownership, it is hard to dispute that the current situation must be remedied. Duportail’s experience with her own data and the Equifax breach both highlight the need for new models of data ownership, privacy, and security.