Why does this matter?
AI models process data on the provider’s infrastructure, which is almost always located outside your company. Even if you do not enter complete personal data, the contextual description alone may be enough to identify an individual or to disclose confidential information. In such cases, your organisation loses control over where the data is processed, how extensively it is logged, and how the provider may use it further.
Most common risks
- Personal data embedded in prompts
First names, surnames, email addresses, client numbers and contract numbers are often pasted in mechanically. Entering them into an AI tool means transferring them outside the organisation.
- Excessive context
Information that appears “neutral”, e.g., descriptions of client interactions, project names, internal dependencies, may itself constitute confidential information.
- Data processing outside the organisation’s control
Many models run on infrastructure located outside the EU. Without appropriate contracts and safeguards, this may result in unlawful data transfers.
- No guarantee of deleting data from logs
Disabling chat history does not necessarily affect technical logs or telemetry, which the provider may retain for long periods.
Practical example
In 2023, Samsung Electronics experienced a widely publicised incident involving the public version of ChatGPT. Employees looking for faster ways to analyse code errors and resolve technical issues began pasting proprietary source code fragments, internal configuration descriptions and summaries of project meetings into the tool. All of this information was processed on the provider’s external infrastructure, beyond the company’s control, and some of it could have been used to train the model. After discovering the situation, Samsung banned the use of public AI generators, conducted a risk assessment and process audit, and began developing its own isolated AI solution accessible only within the internal environment.
How to use AI generators lawfully and safely?
- Avoid disclosing personal data to AI models.
Once entered into an AI tool, personal data, even data that seems insignificant, ends up on the external provider’s infrastructure and therefore lies outside your organisation’s control.
- Keep prompts at a high level of abstraction that does not expose operational details of the company.
Descriptions of functions, processes, projects or business relationships can be as sensitive as personal data, as they may reveal the organisation’s structure, intentions or strategic direction.
- Use only tools provided or approved by your company, particularly those with defined data-processing safeguards.
Corporate environments do not eliminate risk but at least provide clarity on logging scope, data-processing locations and how the provider handles input data.
- Assume that any information submitted to an AI tool may be recorded in logs and remain in circulation on the provider’s side.
In practice, this means that no content should be entered into a model if it cannot be disclosed to an external entity, regardless of how trivial or fragmented it may appear.
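The first recommendation can be partly supported by a simple pre-submission check. The following is a minimal sketch, not a complete safeguard: it replaces a few identifier types with placeholders before a prompt leaves the organisation. The function name `redact_prompt` and the client-number and contract-number formats (`CL-` plus six digits, `CTR/year/number`) are hypothetical and would need to be adapted to the formats your organisation actually uses.

```python
from __future__ import annotations

import re

# Illustrative patterns only. The client and contract formats below are
# assumptions made for this sketch, not formats taken from any real system.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "CLIENT_NO": re.compile(r"\bCL-\d{6}\b"),
    "CONTRACT_NO": re.compile(r"\bCTR/\d{4}/\d+\b"),
}


def redact_prompt(prompt: str) -> tuple[str, dict[str, int]]:
    """Replace every pattern match with a placeholder and count the hits."""
    counts: dict[str, int] = {}
    for label, pattern in PATTERNS.items():
        # subn returns the new string and the number of substitutions made.
        prompt, hits = pattern.subn(f"[{label}]", prompt)
        if hits:
            counts[label] = hits
    return prompt, counts


if __name__ == "__main__":
    raw = "Client CL-204981 (anna.kowalska@example.com) disputes contract CTR/2023/117."
    clean, found = redact_prompt(raw)
    print(clean)   # Client [CLIENT_NO] ([EMAIL]) disputes contract [CONTRACT_NO].
    print(found)   # {'EMAIL': 1, 'CLIENT_NO': 1, 'CONTRACT_NO': 1}
```

Pattern matching of this kind catches only identifiers with a predictable shape. It does not recognise names, project descriptions or other contextual details, so the remaining recommendations still depend on the person writing the prompt.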
Summary
AI generators offer significant potential, but they do not remove the responsibility to protect personal data and confidential information. The primary risk stems not from file uploads but from careless prompt formulation, which may reveal personal data or essential information about your organisation’s activities. Deliberate and controlled use of AI tools is now an indispensable element of information hygiene in the workplace.