What Is a ‘Research Organisation’ and Why It Matters: From Text and Data Mining to AI Research

GRUR International, 2025

Copyright, text and data mining


Article{nokey, title = {What Is a ‘Research Organisation’ and Why It Matters: From Text and Data Mining to AI Research}, author = {Quintais, J.}, doi = {https://doi.org/10.1093/grurint/ikaf030}, year = {2025}, date = {2025-03-19}, journal = {GRUR International}, keywords = {Copyright, text and data mining}, }

Towards a European Research Freedom Act: A Reform Agenda for Research Exceptions in the EU Copyright Acquis



This article explores the impact of EU copyright law on the use of protected knowledge resources in scientific research contexts. Surveying the current copyright/research interface, it becomes apparent that the existing legal framework fails to offer adequate balancing tools for the reconciliation of divergent interests of copyright holders and researchers. The analysis identifies structural deficiencies, such as fragmented and overly restrictive research exceptions, opaque lawful access provisions, outdated non-commercial use requirements, legal uncertainty arising from the three-step test in the EU copyright acquis, obstacles posed by the protection of paywalls and other technological measures, and exposure to contracts that override statutory research freedoms. Empirical data confirm that access barriers, use restrictions and the absence of harmonised rules for transnational research collaborations impede the work of researchers. Against this background, we advance proposals for legislative reform, in particular the introduction of a mandatory, open-ended research exemption that offers reliable breathing space for scientific research across EU Member States, the clarification of lawful access criteria, a more flexible approach to public-private partnerships, and additional rules that support modern research methods, such as text and data mining.

Copyright, open science, research exceptions, right to research, technological protection measures, text and data mining, three-step test


Online publication{nokey, title = {Towards a European Research Freedom Act: A Reform Agenda for Research Exceptions in the EU Copyright Acquis}, author = {Senftleben, M. and Szkalej, K. and Sganga, C. and Margoni, T.}, url = {https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5130069}, year = {2025}, date = {2025-02-11}, abstract = {This article explores the impact of EU copyright law on the use of protected knowledge resources in scientific research contexts. Surveying the current copyright/research interface, it becomes apparent that the existing legal framework fails to offer adequate balancing tools for the reconciliation of divergent interests of copyright holders and researchers. The analysis identifies structural deficiencies, such as fragmented and overly restrictive research exceptions, opaque lawful access provisions, outdated non-commercial use requirements, legal uncertainty arising from the three-step test in the EU copyright acquis, obstacles posed by the protection of paywalls and other technological measures, and exposure to contracts that override statutory research freedoms. Empirical data confirm that access barriers, use restrictions and the absence of harmonised rules for transnational research collaborations impede the work of researchers. Against this background, we advance proposals for legislative reform, in particular the introduction of a mandatory, open-ended research exemption that offers reliable breathing space for scientific research across EU Member States, the clarification of lawful access criteria, a more flexible approach to public-private partnerships, and additional rules that support modern research methods, such as text and data mining.}, keywords = {Copyright, open science, research exceptions, right to research, technological protection measures, text and data mining, three-step test}, }

Generative AI, Copyright and the AI Act

Computer Law & Security Review, vol. 56, num: 106107, 2025


This paper provides a critical analysis of the Artificial Intelligence (AI) Act's implications for the European Union (EU) copyright acquis, aiming to clarify the complex relationship between AI regulation and copyright law while identifying areas of legal ambiguity and gaps that may influence future policymaking. The discussion begins with an overview of fundamental copyright concerns related to generative AI, focusing on issues that arise during the input, model, and output stages, and how these concerns intersect with the text and data mining (TDM) exceptions under the Copyright in the Digital Single Market Directive (CDSMD). The paper then explores the AI Act's structure and key definitions relevant to copyright law. The core analysis addresses the AI Act's impact on copyright, including the role of TDM in AI model training, the copyright obligations imposed by the Act, requirements for respecting copyright law—particularly TDM opt-outs—and the extraterritorial implications of these provisions. It also examines transparency obligations, compliance mechanisms, and the enforcement framework. The paper further critiques the current regime's inadequacies, particularly concerning the fair remuneration of creators, and evaluates potential improvements such as collective licensing and bargaining. It also assesses legislative reform proposals, such as statutory licensing and AI output levies, and concludes with reflections on future directions for integrating AI governance with copyright protection.

AI Act, Content moderation, Copyright, DSA, Generative AI, text and data mining, Transparency


Article{nokey, title = {Generative AI, Copyright and the AI Act}, author = {Quintais, J.}, url = {https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4912701}, doi = {https://doi.org/10.1016/j.clsr.2025.106107}, year = {2025}, date = {2025-01-30}, journal = {Computer Law & Security Review}, volume = {56}, number = {106107}, pages = {}, abstract = {This paper provides a critical analysis of the Artificial Intelligence (AI) Act's implications for the European Union (EU) copyright acquis, aiming to clarify the complex relationship between AI regulation and copyright law while identifying areas of legal ambiguity and gaps that may influence future policymaking. The discussion begins with an overview of fundamental copyright concerns related to generative AI, focusing on issues that arise during the input, model, and output stages, and how these concerns intersect with the text and data mining (TDM) exceptions under the Copyright in the Digital Single Market Directive (CDSMD). The paper then explores the AI Act's structure and key definitions relevant to copyright law. The core analysis addresses the AI Act's impact on copyright, including the role of TDM in AI model training, the copyright obligations imposed by the Act, requirements for respecting copyright law—particularly TDM opt-outs—and the extraterritorial implications of these provisions. It also examines transparency obligations, compliance mechanisms, and the enforcement framework. The paper further critiques the current regime's inadequacies, particularly concerning the fair remuneration of creators, and evaluates potential improvements such as collective licensing and bargaining. It also assesses legislative reform proposals, such as statutory licensing and AI output levies, and concludes with reflections on future directions for integrating AI governance with copyright protection.}, keywords = {AI Act, Content moderation, Copyright, DSA, Generative AI, text and data mining, Transparency}, }

The paradox of lawful text and data mining? Some experiences from the research sector and where we (should) go from here


Scientific research can be tricky business. This paper critically explores the 'lawful access' requirement in European copyright law which applies to text and data mining (TDM) carried out for the purpose of scientific research. Whereas TDM is essential for data analysis, artificial intelligence (AI) and innovation, the paper argues that the 'lawful access' requirement in Article 3 CDSM Directive may actually restrict research by complicating the applicability of the TDM provision or even rendering it inoperable. Although the requirement is intended to ensure that researchers act in good faith before deploying TDM tools for purposes such as machine learning, it forces them to ask for permission to access data, for example by taking out a subscription to a service, and for that reason provides the opportunity for copyright holders to apply all sorts of commercial strategies to set the legal and technological parameters of access and potentially even circumvent the mandatory character of the provision. The paper concludes by drawing on insights from the recent European Commission study 'Improving access to and reuse of research results, publications and data for scientific purposes' that offer essential perspectives for the future of TDM, and by suggesting a number of paths forward that EU Member States can take already now in order to support a more predictable and reliable legal regime for scientific TDM and potentially code mining to foster innovation.

ai, CDSM Directive, Copyright, text and data mining


Article{nokey, title = {The paradox of lawful text and data mining? Some experiences from the research sector and where we (should) go from here}, author = {Szkalej, K.}, url = {https://ssrn.com/abstract=5000116}, doi = {https://doi.org/10.2139/ssrn.5000116}, year = {2024}, date = {2024-11-04}, abstract = {Scientific research can be tricky business. This paper critically explores the 'lawful access' requirement in European copyright law which applies to text and data mining (TDM) carried out for the purpose of scientific research. Whereas TDM is essential for data analysis, artificial intelligence (AI) and innovation, the paper argues that the 'lawful access' requirement in Article 3 CDSM Directive may actually restrict research by complicating the applicability of the TDM provision or even rendering it inoperable. Although the requirement is intended to ensure that researchers act in good faith before deploying TDM tools for purposes such as machine learning, it forces them to ask for permission to access data, for example by taking out a subscription to a service, and for that reason provides the opportunity for copyright holders to apply all sorts of commercial strategies to set the legal and technological parameters of access and potentially even circumvent the mandatory character of the provision. The paper concludes by drawing on insights from the recent European Commission study 'Improving access to and reuse of research results, publications and data for scientific purposes' that offer essential perspectives for the future of TDM, and by suggesting a number of paths forward that EU Member States can take already now in order to support a more predictable and reliable legal regime for scientific TDM and potentially code mining to foster innovation.}, keywords = {ai, CDSM Directive, Copyright, text and data mining}, }

Machine readable or not? – notes on the hearing in LAION e.v. vs Kneschke

Kluwer Copyright Blog, 2024

Artificial intelligence, Germany, text and data mining


Online publication{nokey, title = {Machine readable or not? – notes on the hearing in LAION e.v. vs Kneschke}, author = {Keller, P.}, url = {https://copyrightblog.kluweriplaw.com/2024/07/22/machine-readable-or-not-notes-on-the-hearing-in-laion-e-v-vs-kneschke/}, year = {2024}, date = {2024-07-22}, journal = {Kluwer Copyright Blog}, keywords = {Artificial intelligence, Germany, text and data mining}, }

TDM: Poland challenges the rule of EU copyright law

Kluwer Copyright Blog, 2024

Copyright, EU, Poland, text and data mining


Online publication{nokey, title = {TDM: Poland challenges the rule of EU copyright law}, author = {Keller, P.}, url = {https://copyrightblog.kluweriplaw.com/2024/02/20/tdm-poland-challenges-the-rule-of-eu-copyright-law/}, year = {2024}, date = {2024-02-20}, journal = {Kluwer Copyright Blog}, keywords = {Copyright, EU, Poland, text and data mining}, }

Generative AI and Author Remuneration

IIC, vol. 54, pp: 1535-1560, 2023


With the evolution of generative AI systems, machine-made productions in the literary and artistic field have reached a level of refinement that allows them to replace human creations. The increasing sophistication of AI systems will inevitably disrupt the market for human literary and artistic works. Generative AI systems provide literary and artistic output much faster and cheaper. It is therefore foreseeable that human authors will be exposed to substitution effects. They may lose income as they are replaced by machines in sectors ranging from journalism and writing to music and visual arts. Considering this trend, the question arises whether it is advisable to take measures to compensate human authors for the reduction in their market share and income. Copyright law could serve as a tool to introduce an AI levy system and ensure the payment of equitable remuneration. In combination with mandatory collective rights management, the new revenue stream could be used to finance social and cultural funds that improve the working and living conditions of flesh-and-blood authors.

collective rights management, Copyright, Freedom of expression, text and data mining, three-step test


Article{nokey, title = {Generative AI and Author Remuneration}, author = {Senftleben, M.}, doi = {https://doi.org/10.1007/s40319-023-01399-4}, year = {2023}, date = {2023-11-07}, journal = {IIC}, volume = {54}, pages = {1535-1560}, abstract = {With the evolution of generative AI systems, machine-made productions in the literary and artistic field have reached a level of refinement that allows them to replace human creations. The increasing sophistication of AI systems will inevitably disrupt the market for human literary and artistic works. Generative AI systems provide literary and artistic output much faster and cheaper. It is therefore foreseeable that human authors will be exposed to substitution effects. They may lose income as they are replaced by machines in sectors ranging from journalism and writing to music and visual arts. Considering this trend, the question arises whether it is advisable to take measures to compensate human authors for the reduction in their market share and income. Copyright law could serve as a tool to introduce an AI levy system and ensure the payment of equitable remuneration. In combination with mandatory collective rights management, the new revenue stream could be used to finance social and cultural funds that improve the working and living conditions of flesh-and-blood authors.}, keywords = {collective rights management, Copyright, Freedom of expression, text and data mining, three-step test}, }

Compliance of National TDM Rules with International Copyright Law: An Overrated Nonissue?

IIC - International Review of Intellectual Property and Competition Law, vol. 53, pp: 1477-1505, 2022


Seeking to devise an adequate regulatory framework for text and data mining (TDM), countries around the globe have adopted different approaches. While considerable room for TDM can follow from the application of fair use provisions (US) and broad statutory exemptions (Japan), countries in the EU rely on a more restrictive regulation that is based on specific copyright exceptions. Surveying this spectrum of existing approaches, lawmakers in countries seeking to devise an appropriate TDM regime may wonder whether the adoption of a restrictive approach is necessary in the light of international copyright law. In particular, they may feel obliged to ensure compliance with the three-step test laid down in Art. 9(2) of the Berne Convention, Art. 13 of the TRIPS Agreement and Art. 10 of the WIPO Copyright Treaty. Against this background, the analysis raises the question whether international copyright law covers TDM activities at all. TDM does not concern a traditional category of use that could have been contemplated at the diplomatic conferences leading to the current texts of the Berne Convention, the TRIPS Agreement and the WIPO Copyright Treaty. It is an automated, analytical type of use that does not affect the expressive core of literary and artistic works. Arguably, TDM constitutes a new category of copying that falls outside the scope of international copyright harmonization altogether.

Artificial intelligence, Auteursrecht, text and data mining


Article{nokey, title = {Compliance of National TDM Rules with International Copyright Law: An Overrated Nonissue?}, author = {Senftleben, M.}, url = {https://link.springer.com/article/10.1007/s40319-022-01266-8}, doi = {https://doi.org/10.1007/s40319-022-01266-8}, year = {2022}, date = {2022-11-25}, journal = {IIC - International Review of Intellectual Property and Competition Law}, volume = {53}, pages = {1477-1505}, abstract = {Seeking to devise an adequate regulatory framework for text and data mining (TDM), countries around the globe have adopted different approaches. While considerable room for TDM can follow from the application of fair use provisions (US) and broad statutory exemptions (Japan), countries in the EU rely on a more restrictive regulation that is based on specific copyright exceptions. Surveying this spectrum of existing approaches, lawmakers in countries seeking to devise an appropriate TDM regime may wonder whether the adoption of a restrictive approach is necessary in the light of international copyright law. In particular, they may feel obliged to ensure compliance with the three-step test laid down in Art. 9(2) of the Berne Convention, Art. 13 of the TRIPS Agreement and Art. 10 of the WIPO Copyright Treaty. Against this background, the analysis raises the question whether international copyright law covers TDM activities at all. TDM does not concern a traditional category of use that could have been contemplated at the diplomatic conferences leading to the current texts of the Berne Convention, the TRIPS Agreement and the WIPO Copyright Treaty. It is an automated, analytical type of use that does not affect the expressive core of literary and artistic works. Arguably, TDM constitutes a new category of copying that falls outside the scope of international copyright harmonization altogether.}, keywords = {Artificial intelligence, Auteursrecht, text and data mining}, }

Implementing User Rights for Research in the Field of Artificial Intelligence: A Call for International Action

Flynn, S., Geiger, C., Quintais, J., Margoni, T., Sag, M., Guibault, L. & Carroll, M.
European Intellectual Property Review, vol. 2020, num: 7, 2020


Last year, before the onset of a global pandemic highlighted the critical and urgent need for technology-enabled scientific research, the World Intellectual Property Organization (WIPO) launched an inquiry into issues at the intersection of intellectual property (IP) and artificial intelligence (AI). We contributed comments to that inquiry, with a focus on the application of copyright to the use of text and data mining (TDM) technology. This article describes some of the most salient points of our submission and concludes by stressing the need for international leadership on this important topic. WIPO could help fill the current gap on international leadership, including by providing guidance on the diverse mechanisms that countries may use to authorize TDM research and serving as a forum for the adoption of rules permitting cross-border TDM projects.

Artificial intelligence, Auteursrecht, frontpage, machine learning, tdm, text and data mining


Article{Flynn2020b, title = {Implementing User Rights for Research in the Field of Artificial Intelligence: A Call for International Action}, author = {Flynn, S. and Geiger, C. and Quintais, J. and Margoni, T. and Sag, M. and Guibault, L. and Carroll, M.}, url = {https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3578819}, year = {2020}, date = {2020-04-21}, journal = {European Intellectual Property Review}, volume = {2020}, number = {7}, pages = {}, abstract = {Last year, before the onset of a global pandemic highlighted the critical and urgent need for technology-enabled scientific research, the World Intellectual Property Organization (WIPO) launched an inquiry into issues at the intersection of intellectual property (IP) and artificial intelligence (AI). We contributed comments to that inquiry, with a focus on the application of copyright to the use of text and data mining (TDM) technology. This article describes some of the most salient points of our submission and concludes by stressing the need for international leadership on this important topic. WIPO could help fill the current gap on international leadership, including by providing guidance on the diverse mechanisms that countries may use to authorize TDM research and serving as a forum for the adoption of rules permitting cross-border TDM projects.}, keywords = {Artificial intelligence, Auteursrecht, frontpage, machine learning, tdm, text and data mining}, }

The New Copyright in the Digital Single Market Directive: A Critical Look

European Intellectual Property Review, vol. 42, num: 1, pp: 28-41, 2020


This article provides an overview and critical examination of the new Directive on copyright and related rights in the Digital Single Market. Despite some positive aspects, the Directive includes multiple problematic provisions, including the controversial new right for press publishers and the new liability regime for content-sharing platforms. On balance, the Directive denotes a normative preference for private ordering over public choice in EU copyright law, and lacks adequate safeguards for users. It is also a complex text with multiple ambiguities, which will likely fail to promote the desired harmonization and legal certainty in this area.

Collective licensing, Copyright, digital content, Digital Single Market, EU law, exceptions and limitations, frontpage, Licensing, Online services, text and data mining


Article{Quintais2019e, title = {The New Copyright in the Digital Single Market Directive: A Critical Look}, author = {Quintais, J.}, url = {https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3424770}, year = {2020}, date = {2020-01-07}, journal = {European Intellectual Property Review}, volume = {42}, number = {1}, pages = {28-41}, abstract = {This article provides an overview and critical examination of the new Directive on copyright and related rights in the Digital Single Market. Despite some positive aspects, the Directive includes multiple problematic provisions, including the controversial new right for press publishers and the new liability regime for content-sharing platforms. On balance, the Directive denotes a normative preference for private ordering over public choice in EU copyright law, and lacks adequate safeguards for users. It is also a complex text with multiple ambiguities, which will likely fail to promote the desired harmonization and legal certainty in this area.}, keywords = {Collective licensing, Copyright, digital content, Digital Single Market, EU law, exceptions and limitations, frontpage, Licensing, Online services, text and data mining}, }