In the context of the enforcement of Regulation (EU) 2024/1689 on Artificial Intelligence (the AI Act), providers of general-purpose AI (GPAI) models will be required to publish a sufficiently detailed summary of the data used to train their models. This obligation, set out in Article 53 of the regulation, serves a dual purpose: enhancing transparency for stakeholders with legitimate interests, such as copyright holders, while safeguarding the trade secrets and competitiveness of the providers subject to these rules. This report focuses on the key issues raised by this obligation, in particular the search for an effective and efficient measure that balances transparency, protection of the legitimate interests of all stakeholders, and the stimulation of innovation.
The working group, comprising representatives from technology companies, AI researchers, and legal experts, has worked to identify the legitimate interests at play in this transparency exercise, while offering clear recommendations on the format and substance of the required summary. These recommendations aim to ensure that rights are effectively exercised without undermining the competitive advantage of providers.
At the outset, the study explores in depth the legal, technical, and ethical implications of the transparency requirement. It provides an analysis of the obligation to develop and publish the summary, particularly in light of the need to protect the legitimate interests of all stakeholders. The report highlights potential tensions between the rights of copyright holders, who need to verify the lawful use of their data, and the concerns of providers regarding the excessive disclosure of competitively sensitive information.
The recommendations seek to reconcile these competing imperatives. They advocate for a training data summary that is clear, accessible, and sufficiently detailed to ensure practical utility and effectiveness, while avoiding excessive burdens on providers. The inclusion of narrative explanatory elements is encouraged to contextualize decisions, particularly where certain information cannot be disclosed for legitimate reasons. Finally, the report emphasizes the importance of harmonizing practices through technical standards, optimizing personal data management in compliance with the GDPR, and establishing points of contact to facilitate exchanges between stakeholders.
This study, conducted in the context of increasing regulation of digital technologies, underscores the need for effective but proportionate transparency. Such transparency is essential to build user trust and support ethical and responsible innovation across Europe.