The EDPB has responded to the Irish authority, clarifying when AI models can be considered anonymous and when personal data may legitimately be used to develop them, while leaving leeway to national data protection authorities.
The EU’s data protection body has clarified the circumstances in which personal data may be used to develop AI models, in an opinion that sets out a three-step test for assessing whether such use rests on a legitimate interest.
The opinion published this week by the European Data Protection Board (EDPB) – the coordination body for national privacy regulators across the EU – followed a request from the Irish Data Protection Authority in November, seeking clarification on whether personal data could be used in AI training without breaching EU law. Ireland’s DPA acts as a watchdog for many of the largest US tech companies, which have their European headquarters in Dublin.
Reaffirming model anonymity and ‘legitimate interest’
The opinion outlines that for an AI model to be considered truly anonymous, the likelihood of identifying the individuals whose data was used to train it must be “insignificant”.
The EDPB also established a framework for determining when a company may rely on a “legitimate interest” as a valid legal basis for processing personal data to develop and deploy AI models without obtaining explicit consent from individuals.
The three-step test for assessing legitimate interest requires identifying the interest, evaluating whether the processing is necessary to achieve it, and confirming that the interest is not overridden by the fundamental rights of the individuals concerned. The EDPB also stressed the importance of transparency, ensuring that individuals are informed about how their data is being collected and used.
The EDPB stressed in the opinion that ultimately it is the responsibility of national data protection authorities to assess, on a case-by-case basis, whether GDPR has been violated in the processing of personal data for AI development.
Models developed using unlawfully extracted and processed personal data may not be deployed, the opinion states.
Reactions from civil society and industry
The opinion was welcomed by the Computer & Communications Industry Association (CCIA), which represents major tech companies, including those developing AI models. “It means that AI models can be properly trained using personal data. Indeed, access to quality data is necessary to ensure that AI output is accurate, to mitigate biases, and to reflect the diversity of European society,” said Claudia Canelles Quaroni, CCIA Europe’s Senior Policy Manager. However, CCIA also called for more legal clarity to avoid future uncertainties.
Digital rights advocates, however, raised concerns, particularly about the anonymity of AI models. “Although this may seem plausible in theory, it is unrealistic to fine-tune such a distinction to the threshold set earlier, creating significant challenges in ensuring effective data protection,” said Itxaso Dominguez de Olazabal, Policy Advisor at EDRi.
Dominguez de Olazabal also highlighted the broad discretion given to national authorities, warning that it could lead to inconsistent enforcement. “This lack of alignment has already proven problematic under the GDPR, threatening the effective protection of fundamental rights. Harmonisation is key to ensuring digital rights are upheld universally.”
Looking ahead: web scraping guidelines
Further guidelines are expected from the EDPB to address emerging issues, such as web scraping – the automated extraction of data from websites, including text, images, and links, to train AI models and enhance their capabilities. These additional clarifications will be crucial as AI development continues to rely heavily on vast amounts of data.