⚖️ AI systems & GDPR compliance
The latest developments in AI policy & regulation | Edition #107
👋 Hi, Luiza Jarovsky here. Welcome to the 107th edition of this newsletter on AI policy & regulation, read by 26,000+ subscribers in 135+ countries. I hope you enjoy reading it as much as I enjoy writing it.
📅 Registration is open. The next cohorts of our 4-week Bootcamps on the EU AI Act and on Emerging Challenges in AI, Tech & Privacy start in July. Don't miss them, and join 750+ professionals who have attended our training programs at the AI, Tech & Privacy Academy.
👉 A special thanks to Didomi for sponsoring this week's free edition of the newsletter. Join their webinar:
Ad Tech is evolving, enhancing targeted advertising while addressing data privacy concerns. However, these innovations face a complex regulatory landscape, particularly in the US, where each state has its own legislation. The rise of connected TVs adds further complexity, enabling targeted ads based on user behavior that bring their own regulatory challenges. Join Didomi's exclusive webinar on June 13 at 11 am EST / 5 pm CEST, hosted by privacy leaders from Telly, Fieldfisher, and Didomi. Register here
⚖️ AI systems & GDPR compliance
This is a long-running debate that I've been covering for months. Here's where we stand now:
➡️ Many compliance issues are still open
➵ ChatGPT was launched 18 months ago, and millions of people use advanced general-purpose AI systems daily. However, EU data protection authorities still don't have a firm position on practical issues related to AI systems and GDPR compliance.
➵ Two weeks ago, I wrote about the European Data Protection Board's ChatGPT Taskforce Report and how it failed to clarify how AI providers (such as OpenAI) can, in practice, rely on legitimate interest, especially given the current characteristics of general-purpose AI systems and the practical evidence on how the AI ecosystem is developing.
➵ EU data protection authorities’ positions on the topic vary. The Dutch Data Protection Authority, for example, has said that “scraping is almost always illegal,” others have said that “it depends,” and others have remained silent.
➡️ The EU AI Act
➵ Another interesting aspect of this debate is that the EU AI Act will soon become enforceable (in phases), and it requires providers of high-risk AI systems to declare that they comply with the GDPR (see Article 47 and Annex V). It's essential that all aspects of GDPR compliance are clarified so that the legal ecosystem around AI works properly.
➡️ Additional developments
This week, we have three additional developments in this context:
The French Data Protection Authority's recommendations on AI & GDPR;
The European Data Protection Supervisor’s take on scraping to train AI;
Noyb vs. Meta AI.
➵ I discuss all three below and what they mean in practice, but my position remains that we urgently need a straightforward position from EU authorities on how to apply legitimate interest in the context of general-purpose AI systems.
➡️ Privacy & data protection must be effective in the age of AI
➵ Given the characteristics of these systems (e.g., LLMs, deep learning), how they are trained (e.g., scraping), and the difficulty of effectively implementing data protection principles, data subjects’ rights, privacy by design, and so on, authorities need to be clear on how data protection law will continue to be meaningful in the “age of AI.”
➵ The field of data protection has been developing for decades, and if it wants to continue offering solid fundamental rights protection to all of us, it needs to face emerging AI challenges and openly discuss how to overcome them.
➡️ Learn more:
➵ To learn more about AI compliance & regulation, check out our training programs at the AI, Tech & Privacy Academy. There are two Bootcamps starting in July, don't miss them!
🏛️ The CNIL's recommendations on AI & GDPR
➡️ The CNIL - the French Data Protection Authority - published its recommendations on GDPR compliance in the context of AI system development. It's a must-read for everyone in AI.
➡️ These are the 7 steps they recommend:
Step 1: Define an objective (purpose) for the AI system
Step 2: Determine your responsibilities
Step 3: Define the "legal basis" that allows you to process personal data
Step 4: Check whether you can re-use certain personal data
Step 5: Minimize the personal data you use
Step 6: Set a retention period
Step 7: Carry out a Data Protection Impact Assessment (DPIA)
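For teams operationalizing these recommendations, the seven steps can be tracked like any other project checklist. Here is a minimal sketch in Python; the record structure and field names are my own illustration, not part of the CNIL guidance:

```python
from dataclasses import dataclass, field

# The seven steps from the CNIL's recommendations, tracked as a simple
# checklist. The data structure is hypothetical, for illustration only.
CNIL_STEPS = [
    "Define an objective (purpose) for the AI system",
    "Determine your responsibilities",
    "Define the legal basis for processing personal data",
    "Check whether you can re-use certain personal data",
    "Minimize the personal data you use",
    "Set a retention period",
    "Carry out a Data Protection Impact Assessment (DPIA)",
]

@dataclass
class ComplianceChecklist:
    completed: set = field(default_factory=set)

    def complete(self, step_number: int) -> None:
        """Mark a step (1-7) as done."""
        if not 1 <= step_number <= len(CNIL_STEPS):
            raise ValueError(f"unknown step: {step_number}")
        self.completed.add(step_number)

    def pending(self) -> list[str]:
        """Return descriptions of steps not yet completed."""
        return [desc for i, desc in enumerate(CNIL_STEPS, start=1)
                if i not in self.completed]

checklist = ComplianceChecklist()
checklist.complete(1)
checklist.complete(3)
print(len(checklist.pending()))  # 5 steps remain
```

Note that the steps are not strictly sequential in practice: the legal-basis analysis (Step 3) often has to be revisited after the re-use and minimization checks (Steps 4-5).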
➡️ Quotes:
"If you plan to re-use a dataset that contains personal data, make sure it is legal. That depends on the method of collection and the source of the data in question. You, as a controller (see “Determine your responsibilities”), must carry out certain additional checks to ensure that such use is lawful."
"The personal data collected and used must be adequate, relevant and limited to what is necessary in the light of the objective defined: this is the principle of data minimisation. You must respect this principle and apply it rigorously when the data processed is sensitive (data concerning health, data concerning sex life, religious beliefs or political opinions, etc.)."
"The Data Protection Impact Assessment (DPIA) is an approach that allows you to map and assess the risks of processing on personal data protection and establish an action plan to reduce them to an acceptable level. In particular, it will lead you to define the security measures needed to protect the data."
➡️ On the specific requirements to use legitimate interest as a lawful basis to process personal data, it says that "a how-to sheet specific to the legal basis of legitimate interest will be published shortly."
🏛️ The EDPS’ take on scraping to train AI
➡️ Last week, the European Data Protection Supervisor (EDPS) published its “orientations for ensuring data protection compliance when using Generative AI systems.” The document is focused on EU institutions, but it also reveals the EDPS' position on scraping to train AI.
➡️ On page 7, while answering the question "how to know if the use of a generative AI system involves personal data processing?" the EDPS orientation states:
"The EDPS has already cautioned against the use of web scraping techniques to collect personal data, through which individuals may lose control of their personal information when these are collected without their knowledge, against their expectations, and for purposes that are different from those of the original collection. The EDPS has also stressed that the processing of personal data that is publicly available remains subject to EU data protection legislation. In that regard, the use of web scraping techniques to collect data from websites and their use for training purposes might not comply with relevant data protection principles, including data minimisation and the principle of accuracy, insofar as there is no assessment on the reliability of the sources." (link below)
➡️ At a minimum, AI companies are clearly failing to comply with the lawfulness-of-processing requirement (Article 6 of the GDPR), data subjects' rights, and various data protection principles.
➡️ There is a lot of uncertainty and a "let's see how it goes" attitude, which does not help anyone, including a whole ecosystem of companies that are becoming AI deployers and want to comply with EU law.
➡️ At this point, a firm answer from the European Data Protection Board (EDPB) or the EDPS would be extremely helpful.
🥊 Noyb vs. Meta AI
Noyb urges Austria, Belgium, France, Germany, Greece, Italy, Ireland, Netherlands, Norway, Poland & Spain to act against Meta's AI practices. This is big. Here's what you need to know:
➡️ According to noyb's official release:
"Over the past few days, Meta has informed millions of Europeans that its privacy policy is changing once again. Only on closer inspection of the links in the notification did it become clear that the company plans to use years of personal posts, private images or online tracking data for an undefined 'AI technology' that can ingest personal data from any source and share any information with undefined 'third parties.' Instead of asking users for their consent (opt-in), Meta argues that it has a legitimate interest that overrides the fundamental right to data protection and privacy of European users. (...) noyb has now filed complaints in 11 European countries, asking the authorities to launch an urgency procedure to stop this change immediately before it comes into force on 26 June 2024."
➡️ This is related to many of the topics I've been discussing in recent months, including tech companies' privacy policy changes to train AI and their non-compliance with data protection principles, data subject rights, lawfulness of processing, and more (to learn in-depth about AI compliance issues, register for my upcoming 4-week Bootcamp).
➡️ It's also related to my newsletter article on the European Data Protection Board's (EDPB) lack of a firm position, and to what I said above regarding the EDPS' orientations on interpreting the GDPR in the context of AI training.
➡️ In terms of legal technique, noyb's complaints invoke the "urgency procedure" specified in Article 66 of the GDPR. The urgency stems from the fact that Meta's processing for undisclosed "artificial intelligence technology" is set to take effect on June 26.
➡️ noyb's complaints list violations of Articles 5(1) and (2), 6(1), 9(1), 12(1) and (2), 13(1) and (2), 17(1)(c), 18(1)(d), 19, 21(1) and 25 GDPR. It says it plans to file complaints in the remaining EU Member States in the coming days. As always, noyb and Max Schrems are acting fast!
➡️ This is a very important development at the intersection of AI and privacy. I'm looking forward to hearing what the eleven EU Data Protection Authorities will say. I recommend you read noyb's official release, as well as its file "Overview of Meta’s falsehoods and 'spin.'"
💻 On-demand course: Limited-Risk AI Systems
Check out our June on-demand course: Limited-Risk AI Systems. I discuss the EU AI Act's category of limited-risk AI systems, as established in Article 50, including examples and my insights on potential weaknesses. In addition to the video lesson, you receive additional material, a quiz, and a certificate.
➡️ Paid subscribers of this newsletter get free access to our monthly on-demand courses. If you are a paid subscriber, request your code here. Free subscribers can upgrade to paid here.
➡️ For a comprehensive program on the AI Act, register for our 4-week Bootcamp in July. It's a live online program with me - don't miss it.
📊 Meta's AI plans
To better understand Meta's AI plans, here's what Mark Zuckerberg said during Meta's earnings call on February 1st:
"When people think about data, they typically think about the corpus that you might use to train a model up front. On Facebook and Instagram there are hundreds of billions of publicly shared images and tens of billions of public videos, which we estimate is greater than the Common Crawl dataset and people share large numbers of public text posts in comments across our services as well. But even more important than the upfront training corpus is the ability to establish the right feedback loops with hundreds of millions of people interacting with AI services across our products (...)"
🏛️ AI & copyright infringement
A US judge sides with Google in an AI lawsuit alleging privacy violations, copyright infringement, and more in the context of AI training. The plaintiffs can file a second amended complaint. Quote:
"In light of the concerns expressed by Judge Chhabria in his order dismissing the complaint in the matter of Cousart v. OpenAI LP, No. 23-cv-4557-VC, and given the overlap in the plaintiffs named, the involved plaintiffs’ counsel, and the claims asserted in this case and Cousart, Google’s motion to dismiss Plaintiffs’ amended complaint is GRANTED. Plaintiffs shall file a second amended complaint within 21 days of this order."
➡️ For now, most of the legal issues around scraping to train AI, such as privacy, copyright, and more, are still open.
➡️ Read this decision here and the original complaint here.
🔥 AI Governance is HIRING
Below are ten AI governance positions posted last week. Bookmark, share, and be an early applicant:
1. Barclays (US): AI Governance and Oversight - apply
2. The Alan Turing Institute (UK): Policy Fellow, AI Governance - apply
3. Zurich Australia (Australia): AI Governance Lead - apply
4. Venquis (Germany): Consultant - Generative AI Governance - apply
5. RWE (Germany): AI Governance Expert - apply
6. M&T Bank (US): AI Governance Senior Consultant - apply
7. GEICO (US): Director, Data, Model & AI Governance - apply
8. EY (India): Senior, AI Governance (Risk Consulting) - apply
9. Deloitte (Canada): Consultant, AI Governance, Risk and Data - apply
10. BIP (Italy): AI Governance Specialist - apply
➡️ For more AI governance and privacy job opportunities, subscribe to our weekly job alert.
➡️ To upskill and land your dream AI governance job, check out our training programs in AI, tech & privacy. Good luck!
✨ AI: from personalization to replacement
As advanced AI agents and assistants are integrated into our devices, everyone should be aware of a continuous process that began around 25 years ago. I summarize it in 5 minutes; watch:
🎤 Are you looking for a speaker in AI, tech & privacy?
I would welcome the opportunity to:
➵ Give a talk at your company;
➵ Speak at your event;
➵ Coordinate a training program for your team.
➡️ Get in touch
⏰ Reminder: upcoming training opportunities
The EU AI Act Bootcamp
🗓️ Tuesdays, July 16 to Aug 6, 10am PT / 6pm UK time
👉 Register here
Emerging Challenges in AI, Tech & Privacy
🗓️ Wednesdays, July 17 to Aug 7, 10am PT / 6pm UK time
👉 Register here
📩 To receive our AI, Tech & Privacy Academy weekly emails with learning opportunities, subscribe to our Learning Center.
I hope to see you there!
🙏 Thank you for reading!
If you have comments on this week's edition, write to me, and I'll get back to you soon.
To receive the next editions in your email, subscribe here.
Have a great day.
Luiza