How to implement a voicebot in compliance with GDPR? Key principles and best practices

03 December 2025

Voice assistants (voicebots) are becoming an increasingly popular business tool — from customer service in call centres, through banking, to e-commerce. They automate conversations, reduce costs, and increase service accessibility. However, every voice command involves the processing of personal data — often highly sensitive, and sometimes biometric data. That is why businesses implementing such solutions must ensure not only functionality and dialogue quality, but also compliance with the GDPR and the e-Privacy Directive. How can this be done effectively and securely? In this article, we present practical guidance based on the EDPB guidelines on voice assistants.

The Author:

r.pr. Katarzyna Szczypińska

Technological background

The basic operating principle of a voice assistant is simple – it enables an oral dialogue with the user. Three main elements of the information flow can be distinguished:

  1. Physical element – the hardware component with which the assistant is integrated (smartphone, speaker, smart TV, etc.) and which has microphones, speakers, and network and computing capabilities (more or less advanced depending on the case).
  2. Software element – the component that enables human-machine interaction, integrating modules for automatic speech recognition, natural language processing, dialogue, and speech synthesis. It may operate directly within the physical hardware, but in many cases it runs remotely.
  3. Resources – external data, such as content databases, ontologies, or business applications, which provide knowledge (e.g. “tell me the time on the west coast of the United States”, “read my emails”) or enable the requested action to be performed in a specific way (e.g. “increase the temperature by 1.5°C”).

Assistants allow the installation of third-party components or applications that extend their basic functions. Each assistant uses a different term for these components, but all of them require the exchange of users' personal data between the assistant provider and the application developer.

The assistant operates in such a way that, once connected to the receiving device (smartphone, speaker, vehicle), it remains in standby mode. More precisely – it is constantly listening. However, until a specific wake word is detected, no sound is transmitted from the receiving device and no other operations are performed – apart from detecting the wake word itself. For this purpose, a buffer of several seconds is used. When the user utters the wake word, the assistant locally compares the sound with that word. If they match, the assistant opens a listening channel and the audio content is immediately transmitted (which in many cases means sending this data to remote servers via the Internet). This is followed by interpretation of the user's command and the preparation and delivery of a response or execution of the command, after which the assistant returns to standby mode.
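The standby behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `run_standby_loop`, `detect_wake_word`, `transmit` and the frame counts are hypothetical names and values, and real assistants use dedicated on-device models rather than a simple callback.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List


def run_standby_loop(
    frames: Iterable[bytes],
    detect_wake_word: Callable[[List[bytes]], bool],  # hypothetical local detector
    transmit: Callable[[bytes], None],                # hypothetical upload function
    buffer_frames: int = 3,                           # assumed size of the local buffer
    command_frames: int = 2,                          # assumed length of the open channel
) -> None:
    """Keep audio in a short local buffer and transmit only after the wake word."""
    buffer: Deque[bytes] = deque(maxlen=buffer_frames)  # oldest frames drop out automatically
    remaining = 0  # frames left before the listening channel closes again
    for frame in frames:
        if remaining > 0:
            transmit(frame)   # channel is open: audio leaves the device
            remaining -= 1
            continue
        buffer.append(frame)  # standby: audio stays on the device
        if detect_wake_word(list(buffer)):
            buffer.clear()    # buffered audio is discarded once the word is matched
            remaining = command_frames


# Usage: only the frames that follow the wake word are transmitted.
sent: List[bytes] = []
run_standby_loop(
    [b"noise", b"chat", b"hey", b"cmd1", b"cmd2", b"private"],
    detect_wake_word=lambda buf: buf[-1] == b"hey",
    transmit=sent.append,
)
```

In this sketch the frames spoken before the wake word and after the command window never reach `transmit`, which mirrors the principle that nothing is sent in standby mode.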

e-Privacy Directive

Processing personal data is one of the core functions of voice assistants, and therefore the relevant EU legal framework for them is primarily the GDPR. In addition, the e-Privacy Directive (implemented in Poland by the Telecommunications Law) establishes specific standards for all entities that wish to store information on, or obtain access to, information stored on, a subscriber’s or user’s terminal equipment in the EEA. Pursuant to Article 5(3) of the e-Privacy Directive:

“Member States shall ensure that the storing of information, or the gaining of access to information already stored, in the terminal equipment of a subscriber or user is only allowed on condition that the subscriber or user concerned has given his or her consent, having been provided with clear and comprehensive information, in accordance with Directive 95/46/EC, inter alia about the purposes of the processing. This shall not prevent any technical storage or access for the sole purpose of carrying out the transmission of a communication over an electronic communications network, or as strictly necessary in order for the provider of an information society service explicitly requested by the subscriber or user to provide the service.”

In practice, this means that if the processing of data by a voice assistant is not “strictly necessary in order to provide the service” (i.e. to carry out the user’s command), access to information stored by the assistant on a phone, tablet or television requires the user’s consent. This applies, for example, to profiling users or using their data for machine learning. However, note that data controllers would have to link consent to specific users. Therefore, controllers may process data of unregistered users only for the purpose of carrying out their commands.

Processing personal data

The definition of personal data under Article 4(1) of the GDPR covers a broad range of different data and applies in a technologically neutral context to any information relating to an “identified or identifiable natural person.” Any interaction between the data subject and a voice assistant may fall within the scope of this definition. From the moment the first command is issued until a response is obtained, an action is taken, or follow-up action is performed (e.g., setting up a weekly notification) – the initial input of personal data gives rise to the generation of further personal data. This includes primary data (e.g., account data, voice recordings, command history), observed data (e.g., device data relating to the data subject, activity logs, online activities), as well as inferred or derived data (e.g., user profiling). The assistant may also process data of different individuals – registered users, unregistered users, or persons who accidentally uttered the wake word. The more services or functions the assistant offers and the more it is connected to other devices or services managed by other parties, the greater the volume of personal data processed and the broader the scope of secondary processing. This gives rise to a multitude of processing operations, carried out primarily by automated means.

Data controller and data processor

The main stakeholders can be identified in the context of the roles assigned to them: supplier or designer, application developer, integrator, owner, or a combination of these roles. Various scenarios are possible – depending on who performs which tasks within the business relationship between the stakeholders, what the user’s command is, what personal data are processed, and what actions related to such processing are carried out and for what purposes. Each party may have one or more roles, as it may be an independent data controller, joint controller, or data processor in relation to one data processing operation, and in relation to another operation – act in a different role.

For example, a bank offers its clients an application that can be operated directly using a voice assistant to manage accounts. Two entities are involved in the processing of personal data: the designer of the assistant and the developer of the banking application. In the scenario presented, the bank is the data controller with respect to the provision of the service, as it determines the purposes and essential means of processing associated with the application enabling interaction with the assistant. However, if the designer of the voice assistant wished to use the data collected and processed for the purposes of the bank's service to improve its own speech recognition system, the designer would become the data controller in relation to that specific processing operation.

Transparency

Article 13 of the GDPR provides that where personal data of the data subject are collected from that person, the data controller shall, at the time of obtaining the personal data, provide the data subject with all necessary information about the processing of the data, e.g. who the controller is, what the purposes and legal bases of the processing are, how long the data will be processed, and to whom the data will be disclosed. The phrase “at the time of obtaining the data” in practice means that the information should be provided at the very moment the data are collected. There should be little difficulty in providing such information to a registered user if creating an account is necessary in order to use the voice assistant. However, in the case of other persons whose data may be processed by the assistant, this may prove difficult or even impossible. Other problems that may arise include the complexity of the ecosystem (the information must be provided by the controller – but which entity is the controller?) or the specificity of the voice interface (digital systems are not yet adapted to purely voice-based interaction, as demonstrated by the almost universal use of a supplementary screen).

According to the EDPB, users should be informed about the current state of the device (whether it is listening). For this purpose, specific voice signals and visible icons or LEDs, or displays on the device, may be used.

Some voice assistant providers include third-party applications in the assistant's default configuration, so that those applications can be launched using specific wake words. In such cases, users should also receive the necessary information about data processing by third parties.

If the device does not have a screen, the information may be made available via an easy-to-understand link. As an example, existing solutions may be cited, such as customer call centres' practice of informing the caller that the conversation is being recorded and directing them to the applicable privacy policy.

Purpose limitation and legal basis for processing

Among the most common purposes of processing personal data by voice assistants are:

  • executing user commands,
  • improving assistants through training the machine learning model and human review and labelling of voice transcriptions,
  • identifying the user (using voice data),
  • profiling users in order to personalise content or advertising.

For example: in the course of executing a user command relating to directions to the nearest petrol station, the assistant processes the user’s voice and location. Any processing of personal data that is necessary to carry out the user’s command may therefore be based on the legal basis of contract performance. Contract performance may constitute a legal basis for processing personal data using machine learning, if this is necessary for the provision of the service. Processing of personal data using machine learning for other, non-essential purposes, such as improving service quality, should not be based on this legal basis.

An example of improving the assistant through training machine learning systems and manually reviewing voice recordings and transcriptions is a situation in which the user must issue the same voice command three times because the assistant does not understand it, and then the three voice commands and the related transcriptions are passed to reviewers for checking and correcting the transcriptions. Next, the voice commands and corrected transcriptions are added to the assistant’s training dataset to improve its performance. Article 6(1)(b) of the GDPR does not constitute an appropriate legal basis for processing for the purposes of improving service quality or developing new features within an existing service. In light of Article 5(3) of the e-Privacy Directive, it therefore remains necessary to obtain consent for the processing of data for this purpose.

If voice data are to be used to identify a user, the processing will involve special categories of data – biometric data. In such a case, the only applicable legal basis will be the explicit consent referred to in Article 9(2)(a) of the GDPR.

An example of processing data for the purpose of personalising content or advertising is a situation in which a user browses the Internet and the assistant adds labels to their profile indicating topics of interest, so as later to present tailored offers. Content personalisation may constitute (but does not always constitute) an inherent and expected element of the operation of the assistant. Whether such processing can be regarded as an inherent element of a voice assistant service will depend on the precise nature of the service provided, on the expectations of the average data subject, in light not only of the terms and conditions governing the provision of that service, but also of the way it is advertised to users, as well as on whether the service can be provided without personalisation.

Where personalisation takes place in the context of a contractual relationship and as part of a service expressly requested by the end user (and the processing is limited to what is strictly necessary for the provision of that service), such processing may be based on Article 6(1)(b) of the GDPR. If the processing is not strictly "necessary for the performance of a contract", the data controller must in principle obtain the data subject's consent. This follows from the fact that consent will be required under Article 5(3) of the e-Privacy Directive in the case of storing or gaining access to information. Accordingly, consent under Article 6(1)(a) of the GDPR will also generally constitute the appropriate legal basis for the processing of personal data following those operations, as reliance on legitimate interest could in some cases risk infringing the additional level of protection provided for in Article 5(3) of the e-Privacy Directive. As regards profiling the user for advertising purposes, it should be noted that this purpose is never considered a service expressly requested by the end user. Therefore, consent should be systematically collected from users for processing for this purpose.
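The relationship between purposes and legal bases discussed in this section can be expressed as a simple gate in the assistant's backend. The sketch below is an assumption-laden illustration, not a complete legal assessment: the `Purpose` names, the `may_process` function and the `strictly_necessary` flag are hypothetical, and deciding whether personalisation really is strictly necessary remains a case-by-case judgment.

```python
from enum import Enum
from typing import Set


class Purpose(Enum):
    # Hypothetical labels for the purposes listed in this section.
    EXECUTE_COMMAND = "execute_command"
    IMPROVE_MODEL = "improve_model"
    VOICE_IDENTIFICATION = "voice_identification"
    PERSONALISATION = "personalisation"
    ADVERTISING_PROFILING = "advertising_profiling"


def may_process(
    purpose: Purpose,
    consents: Set[Purpose],
    strictly_necessary: bool = False,
) -> bool:
    """Return True if processing for this purpose may go ahead."""
    if purpose is Purpose.EXECUTE_COMMAND:
        return True  # contract performance, Article 6(1)(b) GDPR
    if purpose is Purpose.PERSONALISATION and strictly_necessary:
        return True  # inherent part of a service expressly requested by the user
    # Model improvement, voice identification and advertising profiling
    # always depend on consent recorded for that specific purpose.
    return purpose in consents


# Usage: an unregistered user has given no consents at all.
no_consents: Set[Purpose] = set()
can_answer = may_process(Purpose.EXECUTE_COMMAND, no_consents)
can_train = may_process(Purpose.IMPROVE_MODEL, no_consents)
```

Note that the `strictly_necessary` flag is deliberately ignored for advertising profiling, since that purpose is never treated as a service expressly requested by the user.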

Data retention

In accordance with the storage limitation principle under the GDPR, a voice assistant should retain data no longer than is necessary for the purposes for which the personal data are processed. Therefore, retention periods should be linked to the various processing purposes. Data controllers must limit not only the retention period, but also the type and amount of data. Where the user withdraws consent, the data collected from them may no longer be used (e.g. for further training of the model). Nevertheless, a model previously trained using such data does not necessarily have to be deleted. However, it is necessary to implement measures that reduce the risk of re-identification to an acceptable threshold.
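A minimal sketch of purpose-linked retention might look as follows. The retention periods, the `Record` fields and the `records_to_keep` function are hypothetical; actual periods have to be set by the controller for each purpose.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Set, Tuple

# Hypothetical retention periods, one per processing purpose.
RETENTION: Dict[str, timedelta] = {
    "execute_command": timedelta(days=1),
    "improve_model": timedelta(days=90),
}


@dataclass(frozen=True)
class Record:
    user_id: str
    purpose: str
    created_at: datetime


def records_to_keep(
    records: List[Record],
    now: datetime,
    withdrawn: Set[Tuple[str, str]],  # (user_id, purpose) pairs with withdrawn consent
) -> List[Record]:
    """Keep only records that still have a purpose, a valid period and consent."""
    kept = []
    for record in records:
        period = RETENTION.get(record.purpose)
        if period is None:
            continue  # no defined purpose means no reason to retain
        if now - record.created_at > period:
            continue  # retention period for this purpose has expired
        if (record.user_id, record.purpose) in withdrawn:
            continue  # consent withdrawn: no further use of the data
        kept.append(record)
    return kept
```

Running such a filter on a schedule, and deleting everything it does not return, is one way to make the link between purposes and retention periods operational.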

Voice recording anonymisation presents a particular challenge, as users may be identified on the basis of the content of the message itself and the characteristics of the voice itself. However, some research is being carried out into techniques that could make it possible to remove contextual information, such as background noise, and anonymise the voice.

Security

To process personal data securely, a voice assistant should protect their confidentiality, integrity and availability. Any assistant service requiring confidentiality will need some access control and user authentication mechanism. Without access control, anyone able to issue voice commands may access, modify or delete users’ personal data (e.g. inquire about received messages, the user’s address or calendar events).

User authentication may be based on one or more of the following factors: something one knows (e.g. a password), something one possesses (e.g. an electronic card), or something one is (e.g. a voiceprint). Voice assistants typically do not require or offer an identification or authentication mechanism when the device providing the service has only one user account. Most assistants trust their local networks. Any compromised device on the same network may therefore change the settings of a smart speaker, enable the installation of malicious software, or assign fake applications or skills to that speaker without the user’s knowledge or consent.

A voice assistant, like any other software, is exposed to software vulnerabilities. Any vulnerability may affect millions of users. If a voice assistant functions properly, it does not send any information to the cloud until the wake word is detected. Nevertheless, software vulnerabilities could allow an attacker to bypass the assistant’s settings. It would then be possible, for example, to obtain a copy of all data sent to the cloud and forward it to a server controlled by the attacker.

Processing of special categories of data

The processing of special categories of data may take place, for example, when managing appointment times in users’ calendars or when processing voice or biometric data for the purpose of identifying a user. The very content of issued commands may also contain special-category data (e.g. reveal a person’s religion or political views).

Example: a group of users configures a voice assistant to use voice recognition. Each user then registers their voice model. Later, one of them asks the assistant to access meetings in their calendar. Since access to the calendar requires user identification, the assistant computes a voice model from the voice issuing the command and checks whether it matches a registered user and whether that particular user has access to the calendar.
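The matching step in this example can be illustrated with a short sketch. It assumes that voice models are numeric vectors compared by cosine similarity against a fixed threshold; this is a common approach but not one described in the source, and the function names, vectors and the 0.9 threshold are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence, Set


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity of two voice models represented as equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify_speaker(
    sample: Sequence[float],
    enrolled: Dict[str, Sequence[float]],
    threshold: float = 0.9,  # assumed acceptance threshold
) -> Optional[str]:
    """Return the best-matching registered user, or None if nobody matches."""
    best_user, best_score = None, threshold
    for user, model in enrolled.items():
        score = cosine_similarity(sample, model)
        if score >= best_score:
            best_user, best_score = user, score
    return best_user


def may_read_calendar(
    sample: Sequence[float],
    enrolled: Dict[str, Sequence[float]],
    calendar_users: Set[str],
) -> bool:
    """Identify the speaker first, then check that user's calendar permission."""
    user = identify_speaker(sample, enrolled)
    return user is not None and user in calendar_users


# Usage with toy three-dimensional voice models.
enrolled_models = {"anna": [1.0, 0.0, 0.0], "ben": [0.0, 1.0, 0.0]}
allowed = may_read_calendar([0.9, 0.1, 0.0], enrolled_models, {"anna"})
```

Because the sample is compared only against the models of registered users who enrolled their voice, the sketch stays within the data of users who have given explicit consent.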

The processing of biometric data for the purpose of identifying a user (as in the example above) will require the explicit consent of the data subject. When voice data are used for biometric identification or authentication, data controllers are required to ensure transparency regarding where biometric identification is used and how voice models (biometric models) are stored and transferred across different devices. Detecting the voice of the relevant speaker also requires comparing it with the voices of other persons in the vicinity of the assistant. To avoid such collection of biometric data without the knowledge of the data subjects, while at the same time enabling the assistant to recognise the user, priority should be given to solutions based solely on user data.

Data minimisation

Controllers should minimise the amount of data collected directly or indirectly and obtained through processing and analysis. For example, they should not perform any analysis of the user’s voice or other audio information in order to obtain information about the user’s mental state, possible illness, or life circumstances.

Where background noise occurs, it should be noted that even if it does not contain voice data, it may contain situational data that can be processed in order to obtain information about the participant (e.g. their location). Designers of voice assistants should consider technologies that remove background noise in order to avoid recording and processing voices and situational information in the background.

Mechanisms for exercising the rights of data subjects

The data controller should provide information about the rights of data subjects when they activate the voice assistant, and no later than when processing the user’s first voice command. Given that the main mode of interaction with assistants is voice, their designers should ensure that users – both registered and unregistered – can exercise their rights by means of easy-to-understand voice commands.

With regard to the right of access, it should be remembered that simply directing users to the history of their interactions with the voice assistant will not usually enable the data controller to fulfil all obligations arising from the right of access. The available data generally constitute only part of the information processed in the context of providing the service. As regards the right to rectification, it should be borne in mind that it applies to any opinions and conclusions of the data controller, including profiling, and should take into account the fact that the vast majority of data is highly subjective.

As regards the right to erasure, it should be noted that it is difficult to enforce it through anonymisation of sets of personal data due to the inherent problems associated with the anonymisation of voice data and the great diversity of personal data collected from the data subject, observed about them and inferred from them. However, since the GDPR is technology-neutral and technology is evolving rapidly, it cannot be ruled out that the right to erasure may be realised through anonymisation. In the case of any data processing, and in particular where the registered data subjects consent to the transcription of voice recordings and their use for the purpose of improving service quality, providers of voice assistants should, at the user’s request, be able to delete the original voice recording as well as any related transcription of personal data. The data controller should ensure that, once the right to erasure has been exercised, the data can no longer be processed. For example, if, before submitting a request for erasure, a user made an online purchase using their voice assistant, the assistant provider may delete the voice recording relating to the online purchase and prevent any further use of that recording in the future. The purchase will, however, remain valid.
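The purchase example can be sketched as follows. The `Interaction` structure and the `erase_voice_data` function are hypothetical and assume that each stored interaction keeps the recording, its transcription and an optional order reference side by side.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Interaction:
    user_id: str
    recording: Optional[bytes]      # original voice recording
    transcript: Optional[str]       # transcription derived from the recording
    order_id: Optional[str] = None  # purchase placed through this command, if any


def erase_voice_data(interactions: List[Interaction], user_id: str) -> int:
    """Delete recordings and transcriptions of one user; return how many were erased."""
    erased = 0
    for item in interactions:
        if item.user_id != user_id:
            continue
        if item.recording is not None or item.transcript is not None:
            item.recording = None   # the recording can no longer be processed
            item.transcript = None  # the related transcription is removed as well
            erased += 1
    return erased


# Usage: the order reference survives, so the purchase remains valid.
log = [
    Interaction("u1", b"\x00\x01", "buy milk", order_id="A-1"),
    Interaction("u2", b"\x02\x03", "what is the weather"),
]
erased_count = erase_voice_data(log, "u1")
```

In a real system the same request would also have to reach backups, training datasets and any processors holding copies, which this sketch does not cover.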

Data processing by voice assistant providers falls within the scope of the right to data portability, because the processing operations are based mainly on the consent of the data subject or on a contract to which the data subject is a party. In practice, the right to data portability should facilitate switching between different voice assistant providers. As regards the format, assistant providers should provide personal data using commonly used open formats (e.g. .mp3, .wav, .csv, .gsm) together with the relevant metadata used to accurately describe the meaning of the exchanged information.
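As an illustration of such an export, the sketch below writes a user's command history to CSV and describes each column in a separate metadata file. The field names, file names and the `export_command_history` function are hypothetical; audio files would be delivered alongside in an open format such as .wav.

```python
import csv
import io
import json
from typing import Dict, List

FIELDS = ["timestamp", "transcript", "audio_file"]  # hypothetical export columns


def export_command_history(rows: List[Dict[str, str]]) -> Dict[str, str]:
    """Return the contents of a portable export: a CSV file plus its metadata."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    # Metadata explaining the meaning of each exported column.
    metadata = {
        "timestamp": "time the command was received, ISO 8601, UTC",
        "transcript": "text recognised from the voice command",
        "audio_file": "name of the accompanying recording in the export",
    }
    return {
        "history.csv": out.getvalue(),
        "metadata.json": json.dumps(metadata, indent=2),
    }


# Usage: one command exported for the requesting user.
bundle = export_command_history(
    [{"timestamp": "2025-01-01T10:00:00Z", "transcript": "play jazz", "audio_file": "0001.wav"}]
)
```

Shipping the column descriptions together with the data is what allows another provider to import the history without guessing what each field means.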

Summary

In 2023, the U.S. Federal Trade Commission (FTC) accused Amazon of violating the Children’s Online Privacy Protection Act by failing to delete children’s voice and location data upon request and by using such data to improve its algorithms. The matter resulted in the imposition of a fine of USD 25 million. In January 2025, Apple agreed to settle a five-year court case concerning allegations that it had secretly activated Siri for more than a decade in order to record conversations through iPhones and other devices equipped with the virtual assistant.

The above cases show that even the largest companies must comply with privacy protection rules and ensure that users have control over their data. Although the use of voice assistants can significantly improve everyday convenience, we should remember that, from the user’s perspective, what matters most is what ambient sounds the assistant collects and what it does with them thereafter.
