
Privacy in AI: Bottleneck Points

25 May 2023 marks the 5th anniversary of the world’s landmark data protection and privacy [1] legislation, the General Data Protection Regulation (“GDPR”). The GDPR has shaped how business is done in the EU and beyond by seeking a balance between business interests and individuals’ privacy. Lately, the breakneck speed of AI technologies, which are often trained on publicly available and/or privately held personal datasets, has grabbed the attention of regulators and other stakeholders. The GDPR, as “future-proof” legislation, covers and regulates every data processing system that interacts with personal data, whatever the nature of the system, including AI systems. There are strong indications that the enforcement wheel of the GDPR will be turned towards AI systems during its second five years.

Within this scope, it is crucial to underline that the GDPR requires AI developers and deployers to integrate certain control mechanisms to ensure the safe and trustworthy use of AI systems. Given the hardships faced when conceptualizing and applying GDPR controls over inherently complex algorithms, we often observe the need for a more structured focus on privacy risks in AI systems.

In this article, we focus on three aspects of the relationship between two lifelong partners, the GDPR and AI systems, to provide a sneak peek at where to start with GDPR compliance for AI systems.

Where Does Privacy Stand in Trustworthy AI? 

AI systems frequently use large-scale personal data to learn and make predictions, which raises questions concerning the gathering, use, and storage of such data. Given the volume of data being gathered and processed, there is a chance that it could be misused through hacking or other security flaws. Privacy is therefore vital to ensuring Trustworthy AI. The recently published “AI Risk Management Framework” by NIST explains that AI systems should treat “privacy-enhanced” design as an integral part of development. Similarly, with a wider focus, the French Data Protection Authority (“CNIL”) has published a rulebook on ensuring GDPR compliance in AI systems. A 2020 survey cited by the European Consumer Organization showed that 45-60% of Europeans agree that AI will lead to more abuse of personal data [2]. However, consumers are not the only reason to take privacy into account. High-quality privacy and data management policies are vital to an organization’s branding, commercial operations, and security management, as they go beyond simply assuaging customers’ anxieties and concerns.

Beyond concerns coming from individuals, regulators in the EU have also stepped up their efforts to enforce GDPR compliance in AI systems. The Italian Data Protection Authority (“Garante”) recently blocked the famous large language model ChatGPT over the lack of a legal basis for scraping the personal data of Italian citizens during the development of the model, as well as the lack of an age verification tool to keep minors from using it [3]. Following these developments and the growing scrutiny, the European Data Protection Board (“EDPB”) has set up a task force to prepare a coordinated EU-level response to the privacy risks arising from ChatGPT [4]. No timeline for the work of this task force has been communicated by the EDPB. Meanwhile, ChatGPT has been reopened for access in Italy after a set of privacy controls was implemented [5].

In sum, privacy is, and will remain, one of the most serious headlines in ensuring trustworthy AI. On top of that, the growing scrutiny over AI systems’ compliance with data protection rules requires businesses to take action on the AI tools they develop or deploy.

To Get It Right: Training Datasets and Privacy by Design

The term “privacy by design”, also mentioned in the GDPR, means nothing more than “data protection through technology design.” Behind it is the idea that data protection in data processing procedures is best adhered to when it is built into the technology at the time of its creation. The disadvantages of skipping a privacy-by-design framework become especially visible in the development of AI systems, as some of them have a “black-box” nature [6] that makes it harder to detect and fix ethical and regulatory issues later in the system’s lifecycle.
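In practice, privacy by design often starts at data ingestion. The minimal Python sketch below shows one common building block: replacing direct identifiers with keyed pseudonyms and coarsening sensitive attributes before a record ever reaches a training set. The pipeline, field names, and coarsening rule here are hypothetical illustrations, not a mechanism prescribed by the GDPR.

import hashlib
import hmac
import secrets

# Secret key kept outside the training environment (hypothetical setup);
# without it, pseudonyms cannot be linked back to the original identifiers.
PEPPER = secrets.token_bytes(32)

def pseudonymize(identifier: str) -> str:
    """Replace a direct identifier with a keyed, non-reversible pseudonym."""
    return hmac.new(PEPPER, identifier.encode(), hashlib.sha256).hexdigest()

def prepare_training_record(raw: dict) -> dict:
    """Strip direct identifiers before the record reaches the training set."""
    decade = (raw["age"] // 10) * 10
    return {
        # Linkable within the dataset, but not to the person without the key
        "user": pseudonymize(raw["email"]),
        # Coarsen the exact age into a band to reduce re-identification risk
        "age_band": f"{decade}-{decade + 9}",
        # Keep only task-relevant signals; name and email are dropped here
        "features": raw["features"],
    }

print(prepare_training_record(
    {"email": "jane@example.com", "age": 34, "features": [0.2, 0.7]}
))

Because the key is held separately, the same person maps to the same pseudonym across records, so a model can still learn per-user patterns, while the raw identifier never enters the training data.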

This is also clearly visible in the intense regulatory and public focus on the datasets used by AI developers to train AI systems. If a dataset is processed unlawfully by an AI system, the system’s lifecycle starts from a flawed footing. The GDPR requires data controllers to have a legal ground for using even publicly available personal data. The misconception that all data freely available on the internet can be used without restriction puts businesses into high-risk scenarios where flawed and unlawfully trained AI tools are sold and distributed. Garante’s decision to ban ChatGPT originated precisely from the lack of a legal basis for using publicly available datasets.

Thus, the relationship between AI systems that process personal data and the GDPR raises several fairly complicated legal and technical issues that call for a number of assessments and justifications by data controllers, which businesses will typically need to document. The two main instruments within the GDPR for lowering the risks around AI systems’ compliance are the following:

  • Data protection impact assessments (“DPIA”) are required when data processing is deemed “high risk” and are often conducted by data protection and privacy professionals. Before starting to develop, or buying, an AI system, a DPIA can immensely help to form a full picture of the privacy risks, which may vary based on the purpose for which the AI system is deployed;
  • A legitimate interest assessment acts as a justification for businesses that have chosen legitimate interest as their basis for processing personal data (with further considerations included). It is especially crucial to conduct, as most businesses want to refrain from seeking data subjects’ consent for every new tool they develop and/or use. Not every assessment will come out positive, but it is one of the most operational tools for foreseeing regulatory issues around AI systems and giving AI projects a solid kick-start.

Already Deployed AI Systems: Focus on Accountability and Transparency

GDPR compliance for AI systems does not end with the development phase. Under the GDPR, businesses must also disclose to individuals what information they hold about them and how it is used throughout the lifecycle of an AI system. As a result, businesses that want to analyse someone’s personal information using artificial intelligence will usually need to inform them through privacy policies.

More importantly, data subjects have a “right of explanation” whenever automated decision-making about an individual is involved. Businesses must inform everyone who will be impacted about the automated decision-making, including the existence of the practice, its significance, and the logic involved.
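What such an explanation can look like in practice is sketched below in Python, assuming a deliberately simple linear scoring model. The feature names, weights, and threshold are illustrative assumptions; real systems need model-appropriate explanation techniques, but the principle of returning the decision together with its main drivers stays the same.

# Illustrative weights for a hypothetical linear credit-scoring model
WEIGHTS = {"income": 0.6, "debt_ratio": -0.8, "years_employed": 0.3}
BIAS = -0.1
THRESHOLD = 0.0

def decide_and_explain(applicant: dict) -> dict:
    """Return the automated decision with each feature's contribution."""
    contributions = {name: WEIGHTS[name] * applicant[name] for name in WEIGHTS}
    score = BIAS + sum(contributions.values())
    return {
        "decision": "approved" if score >= THRESHOLD else "declined",
        "score": round(score, 3),
        # Sorted by absolute impact, so a notice can name the main drivers
        "main_drivers": sorted(
            contributions.items(), key=lambda kv: abs(kv[1]), reverse=True
        ),
    }

print(decide_and_explain(
    {"income": 0.5, "debt_ratio": 0.9, "years_employed": 2}
))

For a linear model, per-feature contributions are exact; the point of the sketch is that explainability is far easier when the hooks for it are designed in from the start.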

In the case of an opaque algorithm that was not developed in accordance with the privacy-by-design approach, ensuring adequate transparency and explainability can prove genuinely challenging.

Deloitte’s focus on holistic Trustworthy AI

At Deloitte, our wide range of technical expertise, combined with our expertise in regulatory risk, has allowed us to build a Trustworthy AI™ framework. Privacy, as an integral and crucial part of this framework, in combination with all the other controls for a fairer and safer AI environment, can help organizations boost their capabilities for the safe development and deployment of AI systems.

The GDPR’s first five years were surely a game-changer in many respects, and the regulation will keep playing a vital, central role in the new phase of “AI for business”, backed by a more consistent track record and regulators’ increased enforcement capabilities. One thing is for sure: there is no black and white when it comes to the GDPR and AI, and every AI system requires its own analysis to keep capabilities on the right track with respect to ethical and regulatory concerns.
