Meta May Use User Data to Train AI: What the German Ruling Says

The reasoning behind the ruling issued by the Higher Regional Court of Cologne, which on May 23, 2025 recognized Meta's right to use public content made available by its users to train AI models, is finally public. The court accepted Big Tech's argument that "there is no reasonable alternative that Meta can pursue to achieve its interests as effectively with other, less invasive means."
The origin of the controversy
The ruling was issued in a lawsuit filed by a German consumer association, which alleged that Meta's decision violated its users' right to the protection of their personal data.
Specifically, the association accused Meta of failing to demonstrate that using its users' data to train an AI was necessary and proportionate under the General Data Protection Regulation (GDPR), and argued that the activity was prohibited because it also involved the processing of "special" categories of personal data (health data, for example) without any of the exceptions provided for by the GDPR being applicable.
Meta defended itself by arguing that it had a “legitimate interest”, compatible with the GDPR, in using public content circulating on its platforms, and that it had adopted a series of measures reducing the risks to individuals' rights to an acceptable level.
In particular, the ruling states, Meta declared that it had limited the use of data to content made public by its users; that it had given users the option of switching their content from public to private, thereby excluding it from use; that it had informed users and given them an effective way to object to the processing; that it had de-identified the information relating to single individuals; that it had "tokenized" that information (i.e. reduced it to the mathematical values the model needs to perform its calculations), thereby decoupling it from the personal identity of individuals; and that it had adopted security measures throughout the model's development cycle.
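To make the "tokenization" step concrete, here is a minimal, purely illustrative sketch: text is mapped to integer IDs that a model can compute with. The toy whitespace vocabulary below is an assumption for demonstration; production systems use subword tokenizers and far larger vocabularies, and nothing here reflects Meta's actual pipeline.

```python
# Toy illustration of tokenization: words become integer IDs.
# Hypothetical and deliberately simple; real tokenizers split
# text into subword units, not whitespace-separated words.

def build_vocab(corpus):
    """Assign an integer ID to every distinct lowercased word."""
    vocab = {}
    for text in corpus:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab

def tokenize(text, vocab):
    """Map a post's words to their integer IDs (unknown words -> -1)."""
    return [vocab.get(word, -1) for word in text.lower().split()]

posts = ["Great match today", "today was a great day"]
vocab = build_vocab(posts)
print(tokenize("great day today", vocab))  # [0, 5, 2]
```

Once reduced to such numerical sequences, the content is what the model trains on, which is the basis of Meta's claim that the data is decoupled from personal identity.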
In ruling in favor of Meta, the German court has set out in black and white a series of principles that significantly scale back the prevailing interpretation of data protection law (widespread in Italy as well), principles that also hold outside AI-related matters.
The GDPR also protects economic interests, not just the rights of the individual
“In addition to legal and ideological interests, economic interests are also considered legitimate interests,” writes the Court, recalling a ruling by the Court of Justice of the European Union that had recognized the relevance of a sports federation's commercial interest in communicating its athletes' data.
Furthermore, the ruling continues, “Anonymization of such datasets is not practicable, since it is often difficult to obtain the consent of the data subjects with reasonable effort. This interpretation also reflects the 'duality' of the protection purposes of the GDPR, which not only serves to protect personal data, but also to ensure the free circulation of such data and therefore their usability.”
So, even though it is the personal data protection regulation itself that says this, and there should have been no need to spell it out, the ruling specifies that companies' interests carry the same dignity as individuals' rights. In other words: there is no prevalence "in principle" that prevents the use of personal data in the context of economic activity. What matters, the Court reiterates, is that this use is actually necessary and indispensable to achieve a lawful result, even one not expressly provided for by law.
To understand the scope of this principle, consider the issues related to the storage of internet traffic data and email metadata, the use of analytics, or the “pay or okay” model (more bluntly, “pay with money or pay with data”). In light of this ruling, it is not true that these activities are unlawful as such; rather, the relationship between the “sacrifice” concretely imposed on the user and the supplier's objectives must be assessed case by case. If, in practice, the risks to the fundamental rights and freedoms of the person are sufficiently limited, a company cannot be prevented from processing the related personal data.
The risks to be considered are only those directly connected to the functioning of the model
Another fundamental principle for the development of AI in the European Union is that, when assessing the consequences of processing personal data, only those connected to the training of the AI itself should be considered.
The judges write on this point: “Other possible violations of the law that could arise from the subsequent functioning of the AI (such as disinformation, manipulations, other harmful practices) are currently not sufficiently foreseeable and can be prosecuted separately. In any case, the possibility that such risks will materialize to such an extent as to make the legitimate use of the AI impossible and, ultimately, to call into question the appropriateness of the data processing is remote.”
With great lucidity, the judges affirm the principle that, in evaluating whether personal data may be used to train an AI, only the direct consequences of using the data in question must be considered, not the possibility that someone could later use the model to commit illegal acts. In that case, the court notes, other existing rules apply because, one may infer, the AI model is the tool with which the law is violated, not the author of the violation.
Total anonymization is not necessary
Another point of contention between the parties concerned de-identification: the elimination of data relating to individuals, even though the photos themselves remained.
Meta considered it sufficient to have eliminated data such as full names, email addresses, telephone numbers, national identification numbers, user identifiers, credit and debit card numbers, bank account numbers and bank codes, license plates, IP addresses, and postal addresses, and to have transformed the content into an unstructured, “tokenized” form. On this point, the ruling states: “While this does not exclude that, despite de-identification, identification may still occur in individual cases, the court considers that these measures will reduce the risk overall”.
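As a rough illustration of the kind of de-identification described here (and only that: the patterns and placeholders below are assumptions for demonstration, not Meta's actual rules), a pre-processing pass might replace recognizable identifiers with neutral placeholders before tokenization:

```python
import re

# Illustrative toy de-identification pass: strip a few of the PII
# categories listed in the ruling. Meta's real pipeline is not public.
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "phone": re.compile(r"\+?\d[\d\s()-]{7,}\d"),
    "ipv4":  re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

def deidentify(text: str) -> str:
    """Replace each match of a PII pattern with a neutral placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text

post = "Write me at jane.doe@example.com or call +49 221 555 0123"
print(deidentify(post))
# Write me at [EMAIL] or call [PHONE]
```

As the court itself acknowledges, such pattern-based scrubbing lowers the risk of re-identification without eliminating it entirely in every individual case.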
Training an AI is not processing targeted at a specific individual
Here too, it is worth quoting the judgment verbatim: “the development of Large Language Models based on very large datasets does not usually concern the targeted processing of personal data or the identification of a specific person”, and again: “the prerequisites that allow non-targeted processing are sufficiently satisfied by the purpose of AI training, which is intended to create general models for calculating probabilities and not to profile individual people”, as well as by the numerous protective measures adopted by the defendants.
This is a central passage of the ruling because it reiterates another aspect that has practically never been considered in the (Italian) application of the GDPR: the regulation applies to processing that identifies, or makes identifiable, a specific person, not categories or groups. Therefore, given that the tokenization of the content posted on Meta's social networks was achieved through sufficient de-identification of the individuals, the processing of the data thus obtained does not violate the law.
Here too, the practical consequences of this legal principle go beyond AI: they disprove, for example, the thesis that all profiling carried out using trackers, IP addresses, or other tools that identify devices or software, rather than the person using them, is systematically in violation of the law.
A message to the European Commission and national data protection authorities
As noted several times, this case takes on a broader significance that transcends the Meta question, because it concerns the relationship between the ideological presuppositions of regulation and the industrial consequences of technological development.
It is quite evident that, over the course of almost ten years, the GDPR has been interpreted one-sidedly, to the detriment of the legitimate interests of those who innovate, in the name of a fetishization of “privacy” (a term that does not even appear in the European regulation).
As a result, national data protection authorities have adopted soft-law provisions and measures that failed to take due account of what the regulation had provided for since its enactment: as long as one stays within the perimeter of the law, there are no absolute prohibitions on the processing of personal data, only a balancing of interests to be verified case by case.
The GDPR is certainly not perfect and would need a thorough overhaul, but this ruling demonstrates that it can be interpreted reasonably, taking into account the rules that protect research and business as well.
To be clear, this is not a call for a “free hand” for Big Tech or for businesses in general, sacrificing the person on the altar of profit; but neither can the opposite be done, in the name of a never-clarified ambiguity about the role that information technologies can and must play in transforming our society.
This is the point the European Commission should consider when adopting the implementing acts of the AI Act and when identifying the changes to the GDPR that are finally under discussion.