User Vulnerabilities in the Data Cycle
and How to Mitigate Them
Continuing last week's conversation about Privacy-Enhancing Design as a way to improve user privacy and reduce user vulnerabilities, this week I would like to take a step back and discuss the different types of user vulnerabilities that exist throughout the data cycle and negatively affect user privacy. My focus is to present them and show how different stakeholders can play a positive role in mitigating these vulnerabilities.
User vulnerabilities that impact privacy can occur in any of the three main phases of the data cycle: data collection, data processing, and data use. Each of these phases is connected to a techno-economic background, which influences how existing biases and stakeholders' interests interplay to create or exacerbate user vulnerabilities. These vulnerabilities, and the way organizations exploit them, generate situations of power imbalance and unfairness that data protection law does not currently deal with properly, and they lead to negative effects on user privacy.
In the next paragraphs, I will briefly describe how these vulnerabilities appear in the data cycle:
A) Data Collection
The data collection phase of the data cycle occurs through the UX design of a product or service. Through buttons, settings, notifications, screens, and every other element of the interaction with the user, the organization is able to collect personal data.
At this point of the data cycle, the main vulnerabilities are the cognitive biases that are exploited through the use of deceptive design (i.e., dark patterns in data protection), which we have discussed extensively in recent editions of this newsletter. There is a power and informational asymmetry between organizations and users, which allows organizations to build interfaces that manipulate users into sharing more - or more sensitive - personal data.
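To make this concrete, below is a minimal sketch, in Python, of how interface defaults can exploit the default bias; the setting names and both configurations are entirely hypothetical, not taken from any real product:

```python
from dataclasses import dataclass

# Hypothetical settings models, for illustration only.

@dataclass
class DeceptiveDefaults:
    # Deceptive pattern: sharing is pre-enabled, so user inaction
    # (the most common behavior) results in maximal disclosure.
    ad_personalization: bool = True
    share_with_partners: bool = True

@dataclass
class PrivacyEnhancingDefaults:
    # Privacy-Enhancing Design: the protective option is the default,
    # and any sharing requires a deliberate opt-in by the user.
    ad_personalization: bool = False
    share_with_partners: bool = False

# A user who never opens the settings screen:
print(DeceptiveDefaults())         # shares everything by default
print(PrivacyEnhancingDefaults())  # shares nothing until they opt in
```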
The techno-economic background is composed of the enormous incentives organizations have to collect vast amounts of data, as personalized ads (fueled by personal data) sponsor products and services that are made available for "free." Moreover, the absence of data protection legislation regulating best practices for data protection design, as well as the lack of an express ban on deceptive design, further incentivizes the proliferation of unfair design.
B) Data Processing
The data processing phase of the data cycle occurs through mathematical models, statistical knowledge, algorithms, and machine learning. Data scientists and engineers usually prepare the data, perform the processing operations, and present the output in a readable format to the other interested stakeholders in the organization.
At this point of the data cycle, the main vulnerability is algorithmic bias. According to Lee, Resnick, and Barton, "bias in algorithms can emanate from unrepresentative or incomplete training data or the reliance on flawed information that reflects historical inequalities. If left unchecked, biased algorithms can lead to decisions that can have a collective, disparate impact on certain groups of people even without the programmer’s intention to discriminate." There is a power and informational asymmetry between organizations and users, which ends up allowing organizations to deploy "black box" algorithms whose criteria and methods are unknown to users.
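To illustrate the kind of collective, disparate impact described above, here is a minimal sketch of one common bias check: comparing favorable-outcome rates across groups. The numbers, names, and threshold below are illustrative assumptions, not taken from the authors cited:

```python
def disparate_impact_ratios(outcomes: dict[str, tuple[int, int]]) -> dict[str, float]:
    """Compare each group's favorable-outcome rate to the highest group's rate.

    outcomes maps a group name to (favorable decisions, total decisions).
    """
    rates = {group: favorable / total for group, (favorable, total) in outcomes.items()}
    reference = max(rates.values())  # group with the highest favorable rate
    return {group: rate / reference for group, rate in rates.items()}

# Invented loan-approval results from a hypothetical "black box" model:
results = {"group_a": (80, 100), "group_b": (45, 100)}

for group, ratio in disparate_impact_ratios(results).items():
    # A ratio below ~0.8 is a common rule of thumb (the "four-fifths rule")
    # for flagging potential disparate impact; it is a heuristic, not proof.
    flag = "review" if ratio < 0.8 else "ok"
    print(f"{group}: ratio = {ratio:.2f} -> {flag}")
```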
The techno-economic background is composed of the incentives organizations have to process data in a quick and efficient way - to comply with the interests of the various stakeholders - and to use this data in a financially advantageous way, often failing to implement algorithmic hygiene. Moreover, the lack of clear regulatory mandates on algorithmic transparency and accountability further reduces the scrutiny faced by organizations exploiting data and exacerbates the power and informational imbalances between users and organizations.
C) Data Use
The third and last phase of the simplified data cycle I am describing here is data use (which, in some instances, is also called data processing). Data use can be defined as "instances where data are reviewed to inform a recommendation for action in strategic planning, policymaking, program planning and management, advocacy, or delivering services." I am separating data processing and data use into two different categories so that the factors in play and the vulnerabilities of each of them can be analyzed separately.
At this point of the data cycle, the main vulnerabilities are unfair uses of personal data, or situations in which the purpose (finality) attributed to certain data is unfair in itself or in the consequences it generates (regardless of the algorithm or machine learning process deployed in the processing phase). The "unfairness" of the data use is mainly associated with the covert collection, cross-matching, and exposure of secondary information. A non-exhaustive list of potentially unfair personal data uses includes full body scans in airports, biometric identification, facial recognition, profiling, virtual assistants, and security cameras in private areas.
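As a simple illustration of how such covert secondary uses diverge from what the user originally agreed to, here is a sketch of a purpose-limitation check; the registry, field names, and purposes are hypothetical:

```python
# Hypothetical registry of purposes declared at collection time.
DECLARED_PURPOSES: dict[str, set[str]] = {
    "location_history": {"navigation", "fraud_detection"},
}

def use_is_allowed(data_field: str, purpose: str) -> bool:
    """Allow a use only if it matches a purpose declared at collection."""
    return purpose in DECLARED_PURPOSES.get(data_field, set())

print(use_is_allowed("location_history", "fraud_detection"))  # True: declared use
print(use_is_allowed("location_history", "ad_profiling"))     # False: covert secondary use
```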
The techno-economic background is composed of the incentives organizations (and governments) have to reach immediate and overarching security-related solutions, as well as pressure from the market and from the various stakeholders involved to develop new disruptive products and services (which often include excessive and invasive uses of personal data). Moreover, the absence of tangible regulatory limits on data uses and the lack of a more material approach to fairness in data protection law contribute to the proliferation of data uses that do not adequately respect human dignity.
Below is a visual summary I prepared of the three types of user vulnerabilities found in the data cycle:
After presenting the three types of user vulnerability, I would like to discuss what can be done about them.
First, I would like to point out that current data protection laws are clearly insufficient to deal with these challenges, especially because they fail to set material, empirical, and interdisciplinary rules on fairness and transparency as applied to privacy. The challenges of data protection, especially online, are clearly interdisciplinary and must necessarily involve the regulation of some aspects of UX design, software engineering, and technology development and use. Data Protection Law has historically avoided regulating design, code, and technology, but perhaps, at this point, this is what is required to advance user privacy. In this context, frameworks like Data Protection Design and Privacy-Enhancing Design are good examples of interdisciplinary efforts directed at bridging data protection with different fields (in this case, UX design) and generating regulatory waves that will shape how privacy is implemented in practice in the next 5-10 years.
Second, something that every professional dealing with technology or any phase of the data cycle can do is get informed about privacy harms and best privacy practices in their field. Privacy is a broad topic that goes far beyond the General Data Protection Regulation (GDPR) or the fine print of privacy policies. It is about our vulnerabilities being shielded, our autonomy being supported, and our human dignity being respected - online and offline. These are essential rights and values that cannot be neglected or exchanged for clicks.
I will continue bringing up this debate and pushing for new theoretical and regulatory approaches that might help technology, innovation, data, and privacy to be seen as inseparable topics that must be thought about and dealt with together. There is no point in creating products and services that seem useful, usable, beautiful, or cool if, instead of respecting our values as human beings, they exploit our vulnerabilities, reducing us to data points in a spreadsheet.
See you next week. All the best, Luiza Jarovsky