The legal war against OpenAI has started

Plus: Threads app, Meta, and privacy: here we go again

Jul 06, 2023

👋 Hi, Luiza Jarovsky here. Read about my work, invite me to speak, tell me what you've been working on, or just say hi here.

This week's newsletter is sponsored by MineOS:

We’re only halfway through 2023, and it’s already the most eventful year for American data privacy. With the recent postponement of the effective date for California’s CPRA to 2024, Connecticut’s CTDPA, Colorado’s CPA, and 5 new state laws having passed already, we figured it was time for a quick rundown. Because there’s so much to cover, MineOS created this video overview to help people get caught up within minutes. Get the need-to-know video about the new American data privacy laws and how they compare to the 5 state laws already on the books.

🔥 AI Risk Management Framework: privacy matters

The AI Risk Management Framework was released by the National Institute of Standards and Technology (NIST) on January 26th, 2023, and it states that its goal is: "to offer a resource to the organizations designing, developing, deploying, or using AI systems to help manage the many risks of AI and promote trustworthy and responsible development and use of AI systems." (page 2).

This is one of the most important US-based documents on the trustworthiness of AI systems. AI developers and tech companies around the world have been reading it and looking for concrete guidance on how to design their AI systems to mitigate risks and harm. Although it is not focused on privacy implications, it also has a section about privacy (Section 3.6). The paragraph below has caught my attention, as it specifies the Framework's approach to privacy-related values:

"Privacy values such as anonymity, confidentiality, and control generally should guide choices for AI system design, development, and deployment. Privacy-related risks may influence security, bias, and transparency and come with tradeoffs with these other characteristics. Like safety and security, specific technical features of an AI system may promote or reduce privacy. AI systems can also present new risks to privacy by allowing inference to identify individuals or previously private information about individuals." (page 17)

In terms of privacy aspects, it also adds that "Privacy-enhancing technologies (“PETs”) for AI, as well as data minimizing methods such as de-identification and aggregation for certain model outputs, can support design for privacy-enhanced AI systems."

This is an essential document for anyone developing AI, especially those focused on AI ethics, responsible AI, or trustworthy AI. It establishes central concepts and principles and helps to pave the way to safer AI systems. Anyone developing or investing in AI should have in mind that privacy matters and must be incorporated within AI systems.

🔥 The legal war against OpenAI has started

It looks like the legal war against OpenAI has officially started, with two lawsuits on the same day.

On June 28, Clarkson Law Firm filed a class action lawsuit against OpenAI (Microsoft was also named as a defendant) in the US District Court for the Northern District of California. The lawsuit is very “informative” in a privacy sense, as it discusses various privacy-relevant topics, such as scraping, children's privacy, the right to be forgotten, consent, bias, misinformation, and more. If you do not have the time to read the 157 pages, I especially recommend reading the section "Overview of Risks", starting on page 47. Below are some interesting quotes to give you a taste of the lawsuit:

“The massive, unparalleled collection and tracking of users’ personal information by Defendants endangers individuals’ privacy and security to an incalculable degree. This information can be exploited and used to perpetrate identity theft, financial fraud, extortion, and other malicious purposes. It can also be employed to target vulnerable individuals with predatory advertising, algorithmic discrimination, and other unethical and harmful acts.” (page 47)

“Given ChatGPT’s ability to generate human-like understanding and responses, there is a high likelihood that users might share (and already are sharing) their private health information while interacting with the model, by asking health-related questions or discussing their medical history, symptoms, or conditions. Moreover, this information can be logged and reviewed as part of ongoing efforts to “train,” improve and monitor each model’s performance.” (page 49)

“Compounding this massive invasion of privacy, OpenAI offers no effective procedures at this time for individuals to request for their information/training data to be deleted. Instead, OpenAI simply provides an email address that consumers can contact if they would like to have their information removed. But this “option” is illusory. Regardless of whether individuals can technically request for ChatGPT to remove their data, it is not possible to do so completely, because Defendants train ChatGPT on individual inputs, personal information, and other user and nonuser data, which Defendants cannot reliably and fully extract from its trained AI systems any more than a person can “unlearn” the math they learned in sixth grade” (page 50)

A new chapter in the intersection between privacy and AI has begun, and I am waiting to read the opinion of the Court in this case.

On the same day (June 28), another lawsuit was filed in California against OpenAI, this one on the grounds of copyright infringement. Here are two quotes from this other lawsuit:

“The unlawful business practices described herein violate the UCL because they are unfair, immoral, unethical, oppressive, unscrupulous or injurious to consumers, because, among other reasons, Defendants used Plaintiffs’ protected works to train ChatGPT for Defendants’ own commercial profit without Plaintiffs’ and the Class’s authorization. Defendants further knowingly designed ChatGPT to output portions or summaries of Plaintiffs’ copyrighted works without attribution, and they unfairly profit from and take credit for developing a commercial product based on unattributed reproductions of those stolen writing and ideas.” (page 13)

Ironically or not, lawyers will likely be among the last ones to be replaced by robots, as the legal war against companies developing AI-based products and services is just starting.

🔥 Dark Patterns in AI and the normalization of “fake”

I have discussed dark patterns in privacy extensively in this newsletter and in my academic paper ("Dark Patterns in Personal Data Collection: Definition, Taxonomy, and Lawfulness"). When talking about dark patterns in privacy, we usually refer to deceptive UX design practices that make people share more data, or more sensitive data, than they otherwise would. I have also written about dark patterns in code, as proposed by various authors. These are situations in which a privacy dark pattern would involve both UX and code, would not be visible to the user (only through auditing), and, as with UX dark patterns, would undermine user autonomy.

I've recently proposed a third type of dark pattern: dark patterns in AI. These are AI applications or features that attempt to make people: a) believe that a particular sound, text, picture, video, or any sort of media is real or authentic when, in fact, it was AI-generated (false appearance/deepfakes); and b) believe that a human is interacting with them when it's an AI-based system (anthropomorphism). Dark patterns in AI deploy sophisticated deception techniques that cause data subjects to change their behavior and share personal data. These practices are not covered by data protection concepts such as notice, consent, or traditional data protection principles, but they can cause privacy harm.

Authorities have started to notice them. For example, the Italian Data Protection Authority in February banned the AI companion Replika, and the FTC has been warning against manipulative AI in recent blog posts. Dark patterns in AI are becoming a major concern, and the potential risk will continue growing in the coming years with the advancement of AI. Unfortunately, my guess is that examples of this kind of dark pattern will continue as a strong trend, especially with the current normalization of “fake” or generative AI-based products.

For example, Adobe has recently announced that it will integrate Adobe Firefly, its generative AI tool, into its widely popular Photoshop program. One of the consequences of this integration is that Photoshop users will be able not only to edit photos and pictures but also to use “generative fill” and create over the original image with “text to image” prompts. With the wide popularization and normalization of both generative AI creation and generative AI “filling,” soon we will not have a clue whether something that we see is true or not. This phenomenon, especially on this scale and with high economic incentives, is new, and the consequences for human societies are yet unseen.

🔥 Threads, Meta, and privacy: here we go again

Today (July 6), Meta - the company behind Facebook, Instagram, and WhatsApp - launched a new app called Threads. These are some of the privacy UX and data protection issues that caught my attention: