Building upon the roots of data protection and privacy

By Martin Abrams

In December 2015, European Data Protection Supervisor Giovanni Buttarelli issued an opinion that suggested we need to re-invent data protection for the era of big data, not to compromise on principles, but rather to assure big data is used to serve people.

Privacy and data protection have always been different but linked by common human needs. Privacy is about a space where one and one’s intimate circle are free from the scrutiny and judgement of others. It is about the concepts of physical privacy extended.

Alan Westin’s ground-breaking book Privacy and Freedom established the need in every culture for people to have space where they may do silly things, think different thoughts, and express behaviors that differ from the norms. This private space has never been perfect – there have always been peeping toms and gossips. In European law, this translates into the fundamental right to autonomy and family life.

Data protection, though, has always been about the wise use of data transformed into information to serve the needs of individuals and/or a community. As Peter Hustinx explained in a 2014 paper, data protection goes beyond privacy to the full range of fundamental rights impacted by the processing of information. Data protection is in essence a check on the inappropriate power that may arise when organizations hold and process data.

The need for data protection begins with the creation of the historical record: the invention of writing in ancient Sumer. The technology of information gathering, aggregation, and manipulation was quickly owned by the powerful. The scribes in ancient Sumer worked for the powerful. The data they created was used by those in charge to maintain power.

Yet, technology favors no one over time. Technology can rock the powerful. Writing could be used to spread new thoughts and aggregate interests. Information technology may be used to create new markets and displace established ones. And this pace of economic creation and destruction has accelerated.

Now, consider the evolution of privacy law.

The Magna Carta, the basis for the English-speaking world’s legal system, defined the space where people could expect solitude. An individual’s home is his or her castle. An individual controls access to his or her home, and the papers that are contained in that home. Records are an extension of that space. The behavior that takes place in that home is only observable on the invitation of that homeowner.

What a person did in a public space and with others was open to observation. Those actions were recordable by other individuals who participated, or by others in the public common. There have always been societal norms and even laws to cover the use and sharing of those observations.

World War II brought us the invention of the computer. By the 1960s, it was clear that we needed to move beyond the Magna Carta. Scholarship in the sixties led to privacy and data protection law in the 1970s, which in turn led to the OECD Privacy Guidelines.

The key concept then was that data came from me, and there should be a transfer of that data from me to others based on a clear delineation of the rules related to that data. The nature of the use was not the issue. The transfer of control related to data was.

Yet, there was also a recognition that data generation often involved more than one party. There was also the public record, which had always existed, but was now much more visible as it moved from dusty basements to mainframe computers. So, U.S. laws began to cover the fair use of data such as credit reporting data. This was less about control and more about fair application by lenders and employers, who hold power positions. The concept of appropriate use was captured in European law and was the genesis of the legal basis to process.

Now, consider the evolution of technology and its impact on privacy and data protection. It took thousands of years to get from the invention of writing to the printing press, and a few hundred years to move from the printing press to Brownie cameras. Next came the move from card-sorting machines to mainframe computers. The 1970s brought database technologies. The 1980s marked an acceleration of technologies with distributed processing and statistical analysis against large and broad data sets. The early 1990s were dominated by a decline in communications and processing costs, followed by the consumer Internet and the real birth of an observational world. The year 2000 brought processing with common modules (the product of Y2K). Processing and communications costs kept falling, making way for personal organizers and eventually smart phones. Big data followed, along with the Internet of Everything. The pace of information technology has accelerated from millennia between big changes to decades, to years, to, it seems, minutes.

Not only has technology changed, the nature of power has changed as well. In the year 2000, national intelligence was about nation states. In 2001, it became about decentralized powers. The threats were no longer just spies, but also the bomb maker taking lessons over the Internet.

So, both privacy and data protection have become very complicated. The simple concepts contained in 1972 law do not capture all of the governance challenges of 2016.

What do not change are some essential human needs:

We need privacy to have a space where we can be flawed or perfect without others seeing, hearing, and/or knowing.
We need to assure that thinking with data serves people’s needs and does not further empower those in dominant power positions.
We need information to keep us safe, to have a more competitive marketplace, to improve healthcare, to better education, and to provide more opportunities.
We do not want to be hurt by the wrongful use of information or arrogant insights.
We do not want our futures predestined by predictions.

We need privacy to have the space to be ourselves, yet technologies continue to expand the public common and make use of the data so much easier. Moreover, use of data and processing to meet our needs means discovering new insights to improve healthcare, education, economic participation, and even better shopping. Addressing the needs of individuals and communities requires us to think with data. Thinking with data is modern analytics.

Much of the functionality of the law was dependent on data being treated as an extension of, and something akin to, our physical being. Yet, the flow of technology has been to expand observation beyond the protection of the castle. When I use my wearable connected to my smart phone, even in my castle, I do not have the protections associated with castle walls.

However, stepping in front of innovation is akin to trying to stop an avalanche with a shovel. Channeling the application of innovation, though, is about assuring civility.

So, how do we reinvent data protection, as suggested by the EDPS? Who has to have an active role? What should those roles be? How do we do this with cross-cultural sensitivity? Reinventing data protection means reinventing the roles of data users, regulators and just plain people.

Start with data users. From there, the concept of accountability expands. The goal of being a responsible and answerable custodian does not change, but the aspects of being one become much more holistic. This is particularly so when data is used beyond the expectations of reasonable people.

Big data is a great metaphor for this. By its very nature, big data is a repurposing of data – even if big data is listed as a purpose in a privacy statement.

First, we need to segment big data into two phases. This concept of a two-phase approach to big data was explored in a paper written by Paula Bruening, Meg Leta Jones, and me in 2013.[1] The first phase is discovery and involves thinking with the data. The second phase is application and involves acting with the knowledge that comes from thinking with data. The risks to individuals are different in these two phases.

Once one understands whether one is thinking or acting, one needs to understand the reasons for the processing. What is the problem one is trying to solve? By understanding the problem, one may begin to comprehend challenges related to using data to solve that problem.

Second, we need to understand the mechanics of processing the data we are using. In many ways, this matches up with the accuracy issues clearly articulated by the OECD guidelines. The size and diversity of data sets do not solve accuracy issues, as some would suggest; they just create greater complexity in determining whether the accuracy of data will affect the credibility of solutions. The legality and morality of the data also come into play. Some data is precluded by law, some by contract or promise, and some is just not appropriate.

Third is a clear understanding of the linear algebra associated with looking at benefits and risks. Who gets benefits, who has risks? If there is a mismatch between the bearer of risk and the recipient of benefits, it is not likely that processing is in balance. These issues are explored in “A Unified Ethical Frame for Big Data Analysis.”[2] Data protection has always required understanding the full range of risks to fundamental rights and freedoms, and the unified ethical frame explores how one may balance out those issues. In the end, this leads to a simple assessment of whether the processing of data is legal, fair and just. If not, processing needs to be revised to make it so.

Fourth, checks and balances for the data users matter. Controllers must stand ready to demonstrate that their assessments are conducted with integrity. Standing ready to demonstrate creates the means to show sound process to an oversight agent. To truly create checks and balances, and assure individual participation, data users must educate the public on how data is used beyond the expectations of individuals. Education creates the means for individual participation, an essential element of accountability.

On the role of the privacy enforcement agent, regulators need new skill sets. They must be able to ask the active questions that help determine whether both discovery and application meet the test of being legal, fair, and just. That means that regulators should be willing and able to differentiate between the two. To play their role effectively, regulators need the authority to spot check processing.

Then, there is the role people play. In the early seventies, when data and systems were the same, people were expected to play the active role of governing data through the choices they made. In that scenario, sunshine solved many problems. Today, sunshine is not enough. Processing, even when well explained, may be beyond people’s ability to manage. However, people may act as a community in a fashion we have never seen before. As stated above, this is very much dependent on the quality of explanatory materials provided by data users. My colleague Peter Cullen is exploring the role of people in the Holistic Governance Project he is leading at IAF.

Therefore, in closing, I would suggest that as practitioners we need to understand the difference between privacy and data protection. Both sides of the legal and governance coin are important, but differentiating them gets us closer to effective solutions that meet societal needs. In addition, we can no longer depend on individuals to police the market by reading notices and giving consent. The nature of processing has become very complex. Yet, the system cannot devolve into one where the data user is the sole arbiter of which processing is appropriate and which is not. The data user as daddy is paternalism beyond being fair and just. This means people, as individuals or a group, must have a meaningful role in setting norms for the processing of data beyond expectations. Last, regulators too must differentiate issues related to autonomy from those related to the fair and just use of data to foster more knowledge and better application of data-driven solutions.

[This blog is based on a talk delivered at the Office of the Information and Privacy Commissioner, Feb. 3, 2016]

Abrams is Executive Director and Chief Strategist of The Information Accountability Foundation, a research and educational non-profit organization.

[1] The Centre for Information Policy Leadership (2013), “Big Data and Analytics: Seeking Foundations for Effective Privacy Guidance”, www.hunton.com/files/Uploads/Documents/News_files/Big_Data_and_Analytics_February_2013.pdf.

[2] The Information Accountability Foundation (2014), “A Unified Ethical Frame for Big Data Analysis”, http://informationaccountability.org/wp-content/uploads/IAF-Unified-Ethical-Frame.pdf.