The paradox of lawful text and data mining? Some experiences from the research sector and where we (should) go from here

Abstract

Scientific research can be tricky business. This paper critically explores the 'lawful access' requirement in European copyright law which applies to text and data mining (TDM) carried out for the purpose of scientific research. Whereas TDM is essential for data analysis, artificial intelligence (AI) and innovation, the paper argues that the 'lawful access' requirement in Article 3 CDSM Directive may actually restrict research by complicating the applicability of the TDM provision or even rendering it inoperable. Although the requirement is intended to ensure that researchers act in good faith before deploying TDM tools for purposes such as machine learning, it forces them to ask for permission to access data, for example by taking out a subscription to a service, and for that reason provides the opportunity for copyright holders to apply all sorts of commercial strategies to set the legal and technological parameters of access and potentially even circumvent the mandatory character of the provision. The paper concludes by drawing on insights from the recent European Commission study 'Improving access to and reuse of research results, publications and data for scientific purposes' that offer essential perspectives for the future of TDM, and by suggesting a number of paths forward that EU Member States can take already now in order to support a more predictable and reliable legal regime for scientific TDM and potentially code mining to foster innovation.

ai, CDSM Directive, Copyright, text and data mining

Bibtex

Article{nokey, title = {The paradox of lawful text and data mining? Some experiences from the research sector and where we (should) go from here}, author = {Szkalej, K.}, url = {https://ssrn.com/abstract=5000116}, doi = {https://doi.org/10.2139/ssrn.5000116}, year = {2024}, date = {2024-11-04}, abstract = {Scientific research can be tricky business. This paper critically explores the 'lawful access' requirement in European copyright law which applies to text and data mining (TDM) carried out for the purpose of scientific research. Whereas TDM is essential for data analysis, artificial intelligence (AI) and innovation, the paper argues that the 'lawful access' requirement in Article 3 CDSM Directive may actually restrict research by complicating the applicability of the TDM provision or even rendering it inoperable. Although the requirement is intended to ensure that researchers act in good faith before deploying TDM tools for purposes such as machine learning, it forces them to ask for permission to access data, for example by taking out a subscription to a service, and for that reason provides the opportunity for copyright holders to apply all sorts of commercial strategies to set the legal and technological parameters of access and potentially even circumvent the mandatory character of the provision. The paper concludes by drawing on insights from the recent European Commission study 'Improving access to and reuse of research results, publications and data for scientific purposes' that offer essential perspectives for the future of TDM, and by suggesting a number of paths forward that EU Member States can take already now in order to support a more predictable and reliable legal regime for scientific TDM and potentially code mining to foster innovation.}, keywords = {ai, CDSM Directive, Copyright, text and data mining}, }

Generative AI, Copyright and the AI Act (v.2)

Abstract

Published 1 November 2024. This is a revised and extended version of a paper initially published in August 2024. This paper examines the copyright-relevant rules of the recently published Artificial Intelligence (AI) Act for the EU copyright acquis. The aim of the paper is to provide a critical overview of the relationship between the AI Act and EU copyright law, while highlighting potential gray areas and blind spots for legal interpretation and future policy-making. The paper proceeds as follows. After a short introduction, Section 2 outlines the basic copyright issues of generative AI and the relevant copyright acquis rules that interface with the AI Act. It mentions potential copyright issues with the input or training stage, the model, and outputs. The AI Act rules are mostly relevant for the training of AI models, and the Regulation primarily interfaces with the text and data mining (TDM) exceptions in Articles 3 and 4 of the Copyright in the Digital Single Market Directive (CDSMD). Section 3 then briefly explains the AI Act’s structure and core definitions as they pertain to copyright law. Section 4 is the heart of the paper. It covers in some detail the interface between the AI Act and EU copyright law, namely: the clarification that TDM is involved in training AI models (4.1); the outline of the key copyright obligations in the AI Act (4.2); the obligation to put in place policies to respect copyright law, especially regarding TDM opt-outs (4.3); the projected extraterritorial effect of such obligations (4.4); the transparency obligations (4.5); how the AI Act envisions compliance with such obligations (4.6); and potential enforcement and remedies (4.7). Section 5 offers some concluding remarks, focusing on the inadequacy of the current regime to address one of its main concerns: the fair remuneration of authors and performers.

AI Act, Content moderation, Copyright, DSA, Generative AI, text and data mining, Transparency

Bibtex

Working paper{nokey, title = {Generative AI, Copyright and the AI Act (v.2)}, author = {Quintais, J.}, url = {https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4912701}, year = {2024}, date = {2024-11-01}, abstract = {Published 1 November 2024. This is a revised and extended version of a paper initially published in August 2024. This paper examines the copyright-relevant rules of the recently published Artificial Intelligence (AI) Act for the EU copyright acquis. The aim of the paper is to provide a critical overview of the relationship between the AI Act and EU copyright law, while highlighting potential gray areas and blind spots for legal interpretation and future policy-making. The paper proceeds as follows. After a short introduction, Section 2 outlines the basic copyright issues of generative AI and the relevant copyright acquis rules that interface with the AI Act. It mentions potential copyright issues with the input or training stage, the model, and outputs. The AI Act rules are mostly relevant for the training of AI models, and the Regulation primarily interfaces with the text and data mining (TDM) exceptions in Articles 3 and 4 of the Copyright in the Digital Single Market Directive (CDSMD). Section 3 then briefly explains the AI Act’s structure and core definitions as they pertain to copyright law. Section 4 is the heart of the paper. It covers in some detail the interface between the AI Act and EU copyright law, namely: the clarification that TDM is involved in training AI models (4.1); the outline of the key copyright obligations in the AI Act (4.2); the obligation to put in place policies to respect copyright law, especially regarding TDM opt-outs (4.3); the projected extraterritorial effect of such obligations (4.4); the transparency obligations (4.5); how the AI Act envisions compliance with such obligations (4.6); and potential enforcement and remedies (4.7). 
Section 5 offers some concluding remarks, focusing on the inadequacy of the current regime to address one of its main concerns: the fair remuneration of authors and performers.}, keywords = {AI Act, Content moderation, Copyright, DSA, Generative AI, text and data mining, Transparency}, }

Machine readable or not? – notes on the hearing in LAION e.v. vs Kneschke

Kluwer Copyright Blog, 2024

Artificial intelligence, Germany, text and data mining

Bibtex

Online publication{nokey, title = {Machine readable or not? – notes on the hearing in LAION e.v. vs Kneschke}, author = {Keller, P.}, url = {https://copyrightblog.kluweriplaw.com/2024/07/22/machine-readable-or-not-notes-on-the-hearing-in-laion-e-v-vs-kneschke/}, year = {2024}, date = {2024-07-22}, journal = {Kluwer Copyright Blog}, keywords = {Artificial intelligence, Germany, text and data mining}, }

TDM: Poland challenges the rule of EU copyright law

Kluwer Copyright Blog, 2024

Copyright, EU, Poland, text and data mining

Bibtex

Online publication{nokey, title = {TDM: Poland challenges the rule of EU copyright law}, author = {Keller, P.}, url = {https://copyrightblog.kluweriplaw.com/2024/02/20/tdm-poland-challenges-the-rule-of-eu-copyright-law/}, year = {2024}, date = {2024-02-20}, journal = {Kluwer Copyright Blog}, keywords = {Copyright, EU, Poland, text and data mining}, }

Generative AI and Author Remuneration

IIC, vol. 54, pp: 1535-1560, 2023

Abstract

With the evolution of generative AI systems, machine-made productions in the literary and artistic field have reached a level of refinement that allows them to replace human creations. The increasing sophistication of AI systems will inevitably disrupt the market for human literary and artistic works. Generative AI systems provide literary and artistic output much faster and cheaper. It is therefore foreseeable that human authors will be exposed to substitution effects. They may lose income as they are replaced by machines in sectors ranging from journalism and writing to music and visual arts. Considering this trend, the question arises whether it is advisable to take measures to compensate human authors for the reduction in their market share and income. Copyright law could serve as a tool to introduce an AI levy system and ensure the payment of equitable remuneration. In combination with mandatory collective rights management, the new revenue stream could be used to finance social and cultural funds that improve the working and living conditions of flesh-and-blood authors.

collective rights management, Copyright, Freedom of expression, text and data mining, three-step test

Bibtex

Article{nokey, title = {Generative AI and Author Remuneration}, author = {Senftleben, M.}, doi = {https://doi.org/10.1007/s40319-023-01399-4}, year = {2023}, date = {2023-11-07}, journal = {IIC}, volume = {54}, pages = {1535-1560}, abstract = {With the evolution of generative AI systems, machine-made productions in the literary and artistic field have reached a level of refinement that allows them to replace human creations. The increasing sophistication of AI systems will inevitably disrupt the market for human literary and artistic works. Generative AI systems provide literary and artistic output much faster and cheaper. It is therefore foreseeable that human authors will be exposed to substitution effects. They may lose income as they are replaced by machines in sectors ranging from journalism and writing to music and visual arts. Considering this trend, the question arises whether it is advisable to take measures to compensate human authors for the reduction in their market share and income. Copyright law could serve as a tool to introduce an AI levy system and ensure the payment of equitable remuneration. In combination with mandatory collective rights management, the new revenue stream could be used to finance social and cultural funds that improve the working and living conditions of flesh-and-blood authors.}, keywords = {collective rights management, Copyright, Freedom of expression, text and data mining, three-step test}, }

Compliance of National TDM Rules with International Copyright Law: An Overrated Nonissue?

IIC - International Review of Intellectual Property and Competition Law, vol. 53, pp: 1477-1505, 2022

Abstract

Seeking to devise an adequate regulatory framework for text and data mining (TDM), countries around the globe have adopted different approaches. While considerable room for TDM can follow from the application of fair use provisions (US) and broad statutory exemptions (Japan), countries in the EU rely on a more restrictive regulation that is based on specific copyright exceptions. Surveying this spectrum of existing approaches, lawmakers in countries seeking to devise an appropriate TDM regime may wonder whether the adoption of a restrictive approach is necessary in the light of international copyright law. In particular, they may feel obliged to ensure compliance with the three-step test laid down in Art. 9(2) of the Berne Convention, Art. 13 of the TRIPS Agreement and Art. 10 of the WIPO Copyright Treaty. Against this background, the analysis raises the question whether international copyright law covers TDM activities at all. TDM does not concern a traditional category of use that could have been contemplated at the diplomatic conferences leading to the current texts of the Berne Convention, the TRIPS Agreement and the WIPO Copyright Treaty. It is an automated, analytical type of use that does not affect the expressive core of literary and artistic works. Arguably, TDM constitutes a new category of copying that falls outside the scope of international copyright harmonization altogether.

Artificial intelligence, Auteursrecht, text and data mining

Bibtex

Article{nokey, title = {Compliance of National TDM Rules with International Copyright Law: An Overrated Nonissue?}, author = {Senftleben, M.}, url = {https://link.springer.com/article/10.1007/s40319-022-01266-8}, doi = {https://doi.org/10.1007/s40319-022-01266-8}, year = {2022}, date = {2022-11-25}, journal = {IIC - International Review of Intellectual Property and Competition Law}, volume = {53}, pages = {1477-1505}, abstract = {Seeking to devise an adequate regulatory framework for text and data mining (TDM), countries around the globe have adopted different approaches. While considerable room for TDM can follow from the application of fair use provisions (US) and broad statutory exemptions (Japan), countries in the EU rely on a more restrictive regulation that is based on specific copyright exceptions. Surveying this spectrum of existing approaches, lawmakers in countries seeking to devise an appropriate TDM regime may wonder whether the adoption of a restrictive approach is necessary in the light of international copyright law. In particular, they may feel obliged to ensure compliance with the three-step test laid down in Art. 9(2) of the Berne Convention, Art. 13 of the TRIPS Agreement and Art. 10 of the WIPO Copyright Treaty. Against this background, the analysis raises the question whether international copyright law covers TDM activities at all. TDM does not concern a traditional category of use that could have been contemplated at the diplomatic conferences leading to the current texts of the Berne Convention, the TRIPS Agreement and the WIPO Copyright Treaty. It is an automated, analytical type of use that does not affect the expressive core of literary and artistic works. Arguably, TDM constitutes a new category of copying that falls outside the scope of international copyright harmonization altogether.}, keywords = {Artificial intelligence, Auteursrecht, text and data mining}, }

Implementing User Rights for Research in the Field of Artificial Intelligence: A Call for International Action

Flynn, S., Geiger, C., Quintais, J., Margoni, T., Sag, M., Guibault, L. & Carroll, M.
European Intellectual Property Review, vol. 2020, num: 7, 2020

Abstract

Last year, before the onset of a global pandemic highlighted the critical and urgent need for technology-enabled scientific research, the World Intellectual Property Organization (WIPO) launched an inquiry into issues at the intersection of intellectual property (IP) and artificial intelligence (AI). We contributed comments to that inquiry, with a focus on the application of copyright to the use of text and data mining (TDM) technology. This article describes some of the most salient points of our submission and concludes by stressing the need for international leadership on this important topic. WIPO could help fill the current gap on international leadership, including by providing guidance on the diverse mechanisms that countries may use to authorize TDM research and serving as a forum for the adoption of rules permitting cross-border TDM projects.

Artificial intelligence, Auteursrecht, frontpage, machine learning, tdm, text and data mining

Bibtex

Article{Flynn2020b, title = {Implementing User Rights for Research in the Field of Artificial Intelligence: A Call for International Action}, author = {Flynn, S. and Geiger, C. and Quintais, J. and Margoni, T. and Sag, M. and Guibault, L. and Carroll, M.}, url = {https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3578819}, year = {2020}, date = {2020-04-21}, journal = {European Intellectual Property Review}, volume = {2020}, number = {7}, pages = {}, abstract = {Last year, before the onset of a global pandemic highlighted the critical and urgent need for technology-enabled scientific research, the World Intellectual Property Organization (WIPO) launched an inquiry into issues at the intersection of intellectual property (IP) and artificial intelligence (AI). We contributed comments to that inquiry, with a focus on the application of copyright to the use of text and data mining (TDM) technology. This article describes some of the most salient points of our submission and concludes by stressing the need for international leadership on this important topic. WIPO could help fill the current gap on international leadership, including by providing guidance on the diverse mechanisms that countries may use to authorize TDM research and serving as a forum for the adoption of rules permitting cross-border TDM projects.}, keywords = {Artificial intelligence, Auteursrecht, frontpage, machine learning, tdm, text and data mining}, }

The New Copyright in the Digital Single Market Directive: A Critical Look

European Intellectual Property Review, vol. 42, num: 1, pp: 28-41, 2020

Abstract

This article provides an overview and critical examination of the new Directive on copyright and related rights in the Digital Single Market. Despite some positive aspects, the Directive includes multiple problematic provisions, including the controversial new right for press publishers and the new liability regime for content-sharing platforms. On balance, the Directive denotes a normative preference for private ordering over public choice in EU copyright law, and lacks adequate safeguards for users. It is also a complex text with multiple ambiguities, which will likely fail to promote the desired harmonization and legal certainty in this area.

Collective licensing, Copyright, digital content, Digital Single Market, EU law, exceptions and limitations, frontpage, Licensing, Online services, text and data mining

Bibtex

Article{Quintais2019e, title = {The New Copyright in the Digital Single Market Directive: A Critical Look}, author = {Quintais, J.}, url = {https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3424770}, year = {2020}, date = {2020-01-07}, journal = {European Intellectual Property Review}, volume = {42}, number = {1}, pages = {28-41}, abstract = {This article provides an overview and critical examination of the new Directive on copyright and related rights in the Digital Single Market. Despite some positive aspects, the Directive includes multiple problematic provisions, including the controversial new right for press publishers and the new liability regime for content-sharing platforms. On balance, the Directive denotes a normative preference for private ordering over public choice in EU copyright law, and lacks adequate safeguards for users. It is also a complex text with multiple ambiguities, which will likely fail to promote the desired harmonization and legal certainty in this area.}, keywords = {Collective licensing, Copyright, digital content, Digital Single Market, EU law, exceptions and limitations, frontpage, Licensing, Online services, text and data mining}, }

Text and Data Mining in the Proposed Directive: Where do we stand?

Kluwer Copyright Blog, 2018

Copyright, directive, frontpage, text and data mining

Bibtex

Article{Zeybek2018b, title = {Text and Data Mining in the Proposed Directive: Where do we stand?}, author = {Zeybek, B.}, url = {http://copyrightblog.kluweriplaw.com/2018/03/23/text-data-mining-proposed-directive-stand/}, year = {2018}, date = {2018-03-26}, journal = {Kluwer Copyright Blog}, keywords = {Copyright, directive, frontpage, text and data mining}, }

Een auteursrechtelijke uitzondering voor TDM: is het genoeg?

AMI, num: 2, pp: 80-86, 2017

Abstract

With its recent proposal for a DSM Directive, the European Commission aims to clear the way for the use of ‘text and data mining’ (TDM) for the purposes of scientific research. In doing so, it confirms that TDM is in principle a copyright-relevant act, which may put users who fall outside the exception in an unfavourable position. At the same time, the focus on copyright overlooks the underlying problems. This article examines the ‘TDM exception’, questions its effectiveness and considers how future-proof it is.

Auteursrecht, excepties, frontpage, tdm, text and data mining, wetenschappelijk onderzoek

Bibtex

Article{Caspers2017, title = {Een auteursrechtelijke uitzondering voor TDM: is het genoeg?}, author = {Caspers, M.}, url = {https://www.ivir.nl/publicaties/download/AMI_2017_2-1.pdf}, year = {2017}, date = {2017-05-23}, journal = {AMI}, number = {2}, abstract = {With its recent proposal for a DSM Directive, the European Commission aims to clear the way for the use of ‘text and data mining’ (TDM) for the purposes of scientific research. In doing so, it confirms that TDM is in principle a copyright-relevant act, which may put users who fall outside the exception in an unfavourable position. At the same time, the focus on copyright overlooks the underlying problems. This article examines the ‘TDM exception’, questions its effectiveness and considers how future-proof it is.}, keywords = {Auteursrecht, excepties, frontpage, tdm, text and data mining, wetenschappelijk onderzoek}, }