ChatGPT and Microsoft Copilot in Public Administration – Lawfulness under GDPR and EDPB Opinion 28/2024?

ANSWER

General remarks on general-purpose AI systems

It is no secret that these models are trained on personal data without a legal basis.

The above solutions are general-purpose AI systems based on LLM models. The EDPB Opinion concerns AI models, not the use of systems. A model, within the meaning of the Opinion, is a component of an AI system (analogous to an engine in a car), which was created through machine learning. In this assessment, it may apply when GPT models are used via an API.

In light of the above and the business terms of use for ChatGPT, an organisation using ChatGPT is responsible for the input data entered into the model via the chat and the output data generated in response to prompts. In this case, OpenAI acts as a processor.

This does not change the fact that, in light of the EDPB Opinion and the current state of technical knowledge, personal data can be extracted from the model. Organisations should implement technical and organisational measures prohibiting users from attempting such data extraction from the model.

This means that the transfer of data to OpenAI should be governed by a data processing agreement. Attention must also be paid to the legal bases for processing and to the applicable principles. Furthermore, a policy should be implemented prohibiting, for example, data extraction from the model, adversarial attacks on the model, and circumvention of AI system safeguards.

Copilot

The term Copilot covers several different services: Copilot Chat, MS 365 Copilot Chat, and Copilot for MS 365. The first service is not covered by a data processing agreement or the EU Data Boundary. Separate packages exist for the public sector and education. A definitive answer cannot be given without information about the specific service selected and the manner of its implementation.

Publicly available DPIAs concerning Copilot in the public and education sectors identify the following risks:

Lack of transparency: it is unclear what personal data Microsoft collects and stores regarding the use of Microsoft 365 Copilot, particularly with regard to diagnostic data.
Users who request access receive incomplete and unclear information.
Microsoft 365 Copilot is likely to generate inaccurate and incomplete personal data. Users sometimes fail to notice that they are working with inaccurate data due to excessive reliance on the AI tool (known as automation bias).
Data is transferred to Microsoft outside the data processing agreement and EU Data Boundary when web search functionality is used.

The above are examples drawn from two DPIAs relating to specific implementations. Each organisation should individually assess the risks associated with its deployment, taking into account the purpose of the system and its functions.

Verification of data accuracy by the model user

In practice, due diligence may involve the supplier providing appropriate documentation or conducting independent testing of the model. For many closed models, this may currently be difficult; however, the AI Act will partly address this issue through certification mechanisms, CE marking, and — in the case of general-purpose models — requirements including the disclosure of

training data sources (Article 53 of the AI Act).

Show all

Previous Next