A fifth round of the trilogue negotiations on the Artificial Intelligence (AI) Act is scheduled for October 24, 2023. In a joint statement with Creative Commons and Wikimedia Europe, COMMUNIA calls on the co-legislators to take a holistic approach to AI transparency and agree on proportionate solutions.
As discussed in greater detail in our Policy Paper #15, COMMUNIA deems it essential that the flexibilities for text-and-data mining enshrined in Articles 3 and 4 of the Copyright in the Digital Single Market Directive are upheld. For this approach to work in practice, we welcome practical initiatives for greater transparency around AI training data to understand whether opt-outs are being respected.
The full statement is provided below:
Statement on Transparency in the AI Act
The undersigned are civil society organizations advocating in the public interest, and representing knowledge users and creative communities.
We are encouraged that the Spanish Presidency is considering how to tailor its approach to foundation models more carefully, including an emphasis on transparency. We reiterate that copyright is not the only prism through which reporting and transparency requirements should be seen in the AI Act.
General transparency responsibilities for training data
Greater openness and transparency in the development of AI models can serve the public interest and facilitate better sharing by building trust among creators and users. As such, we generally support more transparency around the training data for regulated AI systems, and not only on training data that is protected by copyright.
We also believe that the existing copyright flexibilities for the use of copyrighted materials as training data must be upheld. The 2019 Directive on Copyright in the Digital Single Market and specifically its provisions on text-and-data mining exceptions for scientific research purposes and for general purposes provide a suitable framework for AI training. They offer legal certainty and strike the right balance between the rights of rightsholders and the freedoms necessary to stimulate scientific research and further creativity and innovation.
We support a proportionate, realistic, and practical approach to meeting the transparency obligation, which would put less onerous burdens on smaller players, including non-commercial players and SMEs, as well as on models developed using FOSS, in order not to stifle innovation in AI development. Too burdensome an obligation on such players may create significant barriers to innovation and drive market concentration, confining the development of AI to a small number of large, well-resourced commercial operators.
Lack of clarity on copyright transparency obligation
We welcome the proposal to require AI developers to disclose the copyright compliance policies followed during the training of regulated AI systems. We remain concerned, however, about the lack of clarity on the scope and content of the obligation to provide a detailed summary of the training data. AI developers should not be expected to list every individual item in the training content. We maintain that such a level of detail is neither practical nor necessary for implementing opt-outs and assessing compliance with the general purpose text-and-data mining exception. We would welcome further clarification by the co-legislators on this obligation. In addition, an independent and accountable entity, such as the foreseen AI Office, should develop processes to implement it.