Yesterday the Brussels-based think tank The Lisbon Council published the policy brief Text and Data Mining for Research and Innovation: What Europe Must Do Next. It was written by Sergey Filippov and Paul Hofheinz. In the paper, the authors analyse contemporary text and data mining (TDM) trends, and make recommendations for how European policymakers can better support researchers who wish to engage in TDM activities.
The authors observe that Europe has fallen behind other parts of the world in text and data mining research. One reason is the ambiguous legal environment surrounding TDM in Europe. In 2014 the United Kingdom adopted a copyright exception for text and data mining for non-commercial research purposes, but the situation in other European countries is not so clear. The European Commission has not been entirely helpful, either. In its December 2015 communication on copyright, it said it would consider introducing an exception for TDM. However, instead of recommending a robust exception that would truly support text and data mining as an increasingly important research tool, the Commission suggested a narrow interpretation that would restrict TDM only to those affiliated with a “public interest research institution”, and only for “scientific research purposes.”
In their paper, Filippov and Hofheinz say that European researchers may be “hesitant to perform valuable analysis that may or may not be legal”, and that scholars “are forced, on occasion, to outsource their text-and-data-mining needs to researchers elsewhere in the world.” They recognize that some of the language in play—such as “public interest research organisation”, “scientific research purposes”, and “non-commercial”—could be open to misinterpretation, or even be at odds with the underlying public policy intention.
As we have noted, the Commission’s approach would cover only an extremely limited set of beneficiaries. Text and data mining by anyone who is not associated with a “public interest research organisation” would require a license, which would put up additional barriers to using TDM for citizens, innovators and European companies. It’s unsettling to see that established commercial publishers are trying to position TDM as something they are permitted to control. The Lisbon Council paper observes that some publishers are requiring users to get a license in order to conduct text and data mining on the publisher’s research database, even though those researchers already have legal access to the content. In addition, incumbent publishers are introducing further restrictions on TDM, such as the condition that a researcher may only engage in it for non-commercial purposes. The authors of the report see these additional conditions as increasingly burdensome for researchers, especially since the terms differ from publisher to publisher.
Our recommendations on EU copyright reform call for “the development of clear rules for researchers who must be able to read and analyse all information that is available to them, whether through text and data mining or otherwise.” In other words, we believe that “the right to read is the right to mine”. In this light we agree with a primary recommendation offered in the Lisbon Council report:
We believe that the only workable and justifiable solution is the least ambiguous one: a harmonised, mandatory exception at the EU level covering all text-and-data-mining activities, for any purpose, commercial and non-commercial, and an exception that cannot be overridden by a contract and is applicable to all rights holders – corporate, individual, public and private.
Read the full report here.