Can Voss salvage the text and data mining exception?

Last week’s news was dominated by COREPER’s agreement on a negotiating mandate for the proposed Directive on Copyright in the Digital Single Market. The verdict: Member States have agreed on a text that fails to address the biggest shortcomings of the Commission’s proposal, and in a number of ways actually makes it worse.

But Rapporteur MEP Axel Voss also recently published his first proposal for a compromise amendment on Article 3, the exception for text and data mining (TDM).

Since the release of the original Commission proposal, we’ve criticised the TDM exception as not going far enough to achieve its intended objectives, because it would limit the beneficiaries of the exception only to research organisations, and only for purposes of scientific research. While there were interesting amendments floated by a few of the Parliamentary committees, it seems that few of the progressive changes have been seriously considered by JURI.

In parallel, the Council presidencies have not done anything that would significantly improve the situation either. Their main contribution has been the introduction of an optional provision, often referred to as “3a”. This additional arrangement would cover TDM activities that fall under temporary reproductions and extractions, and would apply to beneficiaries beyond research organisations, and for uses other than scientific research. But those acts would be limited in that they would only apply to works for which rights holders have not explicitly prohibited such uses.

Voss’ compromise amendment is a mashup of Article 3 of the Commission’s proposal and Article 3a of the Council text. In contrast to his approach in many other areas, the changes here seem to be a reasonable attempt at arriving at a compromise between those who agree with the Commission’s original narrow approach and those who, like us, argue for a much broader exception that allows anyone to engage in text and data mining for any purpose. The devil, of course, is in the details of the proposed text.

Baseline beneficiaries and scope: same old song?

Voss’ proposal is based on the original thinking from the Commission’s proposal. It keeps the core restriction that limits beneficiaries to research organisations, and only for purposes of scientific research, but those baseline beneficiaries would be covered by an exception that allows them to conduct TDM on anything to which they have acquired lawful access:

b) reproductions and extractions of works or other subject-matter to which they have acquired lawful access made in order to carry out on a non-for-profit basis text and data mining for the purposes of scientific research by research organisations and cultural heritage institutions.

For scientific research institutions and cultural heritage institutions, this comes pretty close to the “right to read is the right to mine” standard. However, the inclusion of the word “acquired” in the phrase “to which they have acquired lawful access” could turn out to be a poison pill for these beneficiaries. The addition of “acquired” introduces legal uncertainty into the equation, and could limit the types of works that beneficiaries can mine under the exception, especially if “acquired” is interpreted in such a way that beneficiaries need to have actively concluded licenses with content providers. This seemingly innocuous change should be rejected to restore a more flexible interpretation for potential beneficiaries.

New text to permit other TDM uses: Will it work?

On the positive side, Voss’ compromise proposal attempts to expand the range of beneficiaries covered by Article 3 to include users beyond those within research organisations, and for purposes outside of strictly scientific research.

A crucial problem of the Commission’s plan was that the TDM exception would be available only to research organisations that operate on a not-for-profit basis or pursuant to a public interest mission as recognised by a Member State. The practical effect of this limitation is that the private sector would be excluded from the benefits of the exception. Under the Commission’s proposal, the ability to undertake TDM would therefore be off limits to important stakeholder groups such as journalists, citizen scientists, social enterprises, civil society organisations and cultural heritage organisations, all of whom stand to benefit from automated data analysis.

Voss’ inclusion of an additional provision attempts to bridge this gap.

a) reproductions and extractions made by research organisations in order to carry out text and data mining of as regards works or and other subject-matter to which they have lawful access for the purposes of scientific research that are lawfully available online, provided that the rightholder has not reserved such uses in a machine readable format.

This version constitutes a clear improvement on the Commission’s original plan. And it’s also a better approach than the Council’s “3a” wording because the expansion to the exception would be mandatory, not merely optional. This clause means that any user would be allowed to conduct TDM on works that are lawfully available online, except for those materials for which right holders have reserved that right in a machine readable format. The requirement for such reservations to be made in machine readable format is important (and goes beyond the Council’s suggestion that such reservations could be made “by technical means”) since it opens the possibility of engaging in text and data mining of online resources at scale.

Voss would not be Voss if he didn’t include some special provisions for press publishers: the compromise amendment text gives press publishers the ability to opt out of the scope of the extended exception (but not the baseline exception) by “listing their websites in a central point of information online.” What looks at first sight like special treatment of press publishers actually contains an interesting concept that should be expanded beyond press publications.

Opting out (reserving the rights to text and data mining) via a central online information point (and providing this information in a machine readable format) would make it relatively easy for anyone engaging in text and data mining to respect such opt outs. From our perspective the “machine readable format” requirement should be combined with the “central point of information online” idea to create a more robust mechanism that both ensures that opt outs can be respected with minimal effort and that text and data mining of online resources can happen at scale.
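To make the idea more concrete, here is a minimal sketch of how a text and data mining tool could respect opt outs published at such a central information point. Everything specific in it is our own assumption for illustration: the compromise text does not define a registry format, so the JSON layout, the "reserved_hosts" field and the function names below are hypothetical.

```python
import json
from typing import Iterable, List, Set
from urllib.parse import urlsplit

# Hypothetical registry payload. The proposal does not specify any format;
# this JSON structure is an assumption made purely for illustration.
REGISTRY_JSON = """
{
  "reserved_hosts": ["news.example", "archive.example"]
}
"""


def load_reserved_hosts(registry_json: str) -> Set[str]:
    """Parse the (hypothetical) registry and return the opted-out hosts."""
    data = json.loads(registry_json)
    return {host.lower() for host in data["reserved_hosts"]}


def is_mining_reserved(url: str, reserved_hosts: Set[str]) -> bool:
    """Return True if the URL's host, or a parent domain of it, has opted out."""
    host = (urlsplit(url).hostname or "").lower()
    return any(
        host == reserved or host.endswith("." + reserved)
        for reserved in reserved_hosts
    )


def filter_minable(urls: Iterable[str], reserved_hosts: Set[str]) -> List[str]:
    """Keep only the URLs whose hosts have not reserved TDM uses."""
    return [url for url in urls if not is_mining_reserved(url, reserved_hosts)]


if __name__ == "__main__":
    reserved = load_reserved_hosts(REGISTRY_JSON)
    candidates = [
        "https://news.example/story/1",
        "https://www.archive.example/item/42",
        "https://blog.example/post",
    ]
    for url in filter_minable(candidates, reserved):
        print(url)
```

The point of the sketch is that a single lookup against one machine readable list is enough to decide whether a source may be mined, which is what would make respecting opt outs feasible at scale.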

All in all, the compromise amendment takes some reasonable steps toward improving the TDM exception. Voss’ proposal would be a substantial improvement over the Commission’s draft plan. But overall it will continue to constrain the way Europeans and European businesses can interact with online information, which will limit innovation in the EU. We still think that a simplified, mandatory exception that permits TDM by anyone for any purpose is the best option to future-proof this aspect of the copyright reform and promote an active and innovative EU technology sector. Unfortunately, under the prevailing political situation Voss’ compromise may be the closest we can get to such an approach.
