Generative AI and Creative Commons Licences – The Application of Share Alike Obligations to Trained Models, Curated Datasets and AI Output

JIPITEC, vol. 15, iss. 3, 2024

Abstract

This article maps the impact of Share Alike (SA) obligations and copyleft licensing on machine learning, AI training, and AI-generated content. It focuses on the SA component found in some of the Creative Commons (CC) licences, distilling its essential features and layering them onto machine learning and content generation workflows. Based on our analysis, there are three fundamental challenges related to the life cycle of these licences: tracing and establishing copyright-relevant uses during the development phase (training), the interplay of licensing conditions with copyright exceptions and the identification of copyright-protected traces in AI output. Significant problems can arise from several concepts in CC licensing agreements (‘adapted material’ and ‘technical modification’) that could serve as a basis for applying SA conditions to trained models, curated datasets and AI output that can be traced back to CC material used for training purposes. Seeking to transpose Share Alike and copyleft approaches to the world of generative AI, the CC community can only choose between two policy approaches. On the one hand, it can uphold the supremacy of copyright exceptions. In countries and regions that exempt machine-learning processes from the control of copyright holders, this approach leads to far-reaching freedom to use CC resources for AI training purposes. At the same time, it marginalises SA obligations. On the other hand, the CC community can use copyright strategically to extend SA obligations to AI training results and AI output. To achieve this goal, it is necessary to use rights reservation mechanisms, such as the opt-out system available in EU copyright law, and subject the use of CC material in AI training to SA conditions. Following this approach, a tailor-made licence solution can grant AI developers broad freedom to use CC works for training purposes. In exchange for the training permission, however, AI developers would have to accept the obligation to pass on – via a whole chain of contractual obligations – SA conditions to recipients of trained models and end users generating AI output.

ai, Copyright, creative commons, Licensing, machine learning

Bibtex

Article{nokey, title = {Generative AI and Creative Commons Licences – The Application of Share Alike Obligations to Trained Models, Curated Datasets and AI Output}, author = {Szkalej, K. and Senftleben, M.}, url = {https://www.jipitec.eu/jipitec/article/view/415}, year = {2024}, date = {2024-12-13}, journal = {JIPITEC}, volume = {15}, issue = {3}, pages = {}, abstract = {This article maps the impact of Share Alike (SA) obligations and copyleft licensing on machine learning, AI training, and AI-generated content. It focuses on the SA component found in some of the Creative Commons (CC) licences, distilling its essential features and layering them onto machine learning and content generation workflows. Based on our analysis, there are three fundamental challenges related to the life cycle of these licences: tracing and establishing copyright-relevant uses during the development phase (training), the interplay of licensing conditions with copyright exceptions and the identification of copyright-protected traces in AI output. Significant problems can arise from several concepts in CC licensing agreements (‘adapted material’ and ‘technical modification’) that could serve as a basis for applying SA conditions to trained models, curated datasets and AI output that can be traced back to CC material used for training purposes. Seeking to transpose Share Alike and copyleft approaches to the world of generative AI, the CC community can only choose between two policy approaches. On the one hand, it can uphold the supremacy of copyright exceptions. In countries and regions that exempt machine-learning processes from the control of copyright holders, this approach leads to far-reaching freedom to use CC resources for AI training purposes. At the same time, it marginalises SA obligations. On the other hand, the CC community can use copyright strategically to extend SA obligations to AI training results and AI output. 
To achieve this goal, it is necessary to use rights reservation mechanisms, such as the opt-out system available in EU copyright law, and subject the use of CC material in AI training to SA conditions. Following this approach, a tailor-made licence solution can grant AI developers broad freedom to use CC works for training purposes. In exchange for the training permission, however, AI developers would have to accept the obligation to pass on – via a whole chain of contractual obligations – SA conditions to recipients of trained models and end users generating AI output.}, keywords = {ai, Copyright, creative commons, Licensing, machine learning}, }

Annotatie bij Hof van Justitie EU 9 maart 2021, Hof van Justitie EU 22 juni 2021 & Hoge Raad 27 januari 2023

Nederlandse Jurisprudentie, iss. 34, num. 314, pp. 6726-6728, 2024

case law, Copyright

Bibtex

Case note{nokey, title = {Annotatie bij Hof van Justitie EU 9 maart 2021, Hof van Justitie EU 22 juni 2021 & Hoge Raad 27 januari 2023}, author = {Hugenholtz, P.}, url = {https://www.ivir.nl/publications/annotatie-bij-hof-van-justitie-eu-9-maart-2021-hof-van-justitie-eu-22-juni-2021-hoge-raad-27-januari-2023-stichting-brein-news-service-europe/annotatie_nj_2024_314/}, year = {2024}, date = {2024-12-05}, journal = {Nederlandse Jurisprudentie}, issue = {34}, number = {314}, keywords = {case law, Copyright}, }

Copyright, the AI Act and extraterritoriality

Kluwer Copyright Blog, 2024

AI Act, Copyright

Bibtex

Online publication{nokey, title = {Copyright, the AI Act and extraterritoriality}, author = {Quintais, J.}, url = {https://copyrightblog.kluweriplaw.com/2024/11/28/copyright-the-ai-act-and-extraterritoriality/}, year = {2024}, date = {2024-11-28}, journal = {Kluwer Copyright Blog}, keywords = {AI Act, Copyright}, }

The paradox of lawful text and data mining? Some experiences from the research sector and where we (should) go from here

Abstract

Scientific research can be tricky business. This paper critically explores the 'lawful access' requirement in European copyright law which applies to text and data mining (TDM) carried out for the purpose of scientific research. Whereas TDM is essential for data analysis, artificial intelligence (AI) and innovation, the paper argues that the 'lawful access' requirement in Article 3 CDSM Directive may actually restrict research by complicating the applicability of the TDM provision or even rendering it inoperable. Although the requirement is intended to ensure that researchers act in good faith before deploying TDM tools for purposes such as machine learning, it forces them to ask for permission to access data, for example by taking out a subscription to a service, and for that reason provides the opportunity for copyright holders to apply all sorts of commercial strategies to set the legal and technological parameters of access and potentially even circumvent the mandatory character of the provision. The paper concludes by drawing on insights from the recent European Commission study 'Improving access to and reuse of research results, publications and data for scientific purposes' that offer essential perspectives for the future of TDM, and by suggesting a number of paths forward that EU Member States can already take in order to support a more predictable and reliable legal regime for scientific TDM and potentially code mining to foster innovation.

ai, CDSM Directive, Copyright, text and data mining

Bibtex

Article{nokey, title = {The paradox of lawful text and data mining? Some experiences from the research sector and where we (should) go from here}, author = {Szkalej, K.}, url = {https://ssrn.com/abstract=5000116}, doi = {https://doi.org/10.2139/ssrn.5000116}, year = {2024}, date = {2024-11-04}, abstract = {Scientific research can be tricky business. This paper critically explores the 'lawful access' requirement in European copyright law which applies to text and data mining (TDM) carried out for the purpose of scientific research. Whereas TDM is essential for data analysis, artificial intelligence (AI) and innovation, the paper argues that the 'lawful access' requirement in Article 3 CDSM Directive may actually restrict research by complicating the applicability of the TDM provision or even rendering it inoperable. Although the requirement is intended to ensure that researchers act in good faith before deploying TDM tools for purposes such as machine learning, it forces them to ask for permission to access data, for example by taking out a subscription to a service, and for that reason provides the opportunity for copyright holders to apply all sorts of commercial strategies to set the legal and technological parameters of access and potentially even circumvent the mandatory character of the provision. The paper concludes by drawing on insights from the recent European Commission study 'Improving access to and reuse of research results, publications and data for scientific purposes' that offer essential perspectives for the future of TDM, and by suggesting a number of paths forward that EU Member States can already take in order to support a more predictable and reliable legal regime for scientific TDM and potentially code mining to foster innovation.}, keywords = {ai, CDSM Directive, Copyright, text and data mining}, }

Opinion of the European Copyright Society on CG and YN v Pelham GmbH and Others, Case C-590/23 (Pelham II)

Mezei, P., Senftleben, M. & Sganga, C.
European Copyright Society, 2024

Abstract

In its questions for preliminary ruling, the German Federal Court of Justice asked for clarification as regards the definition of pastiche under EU copyright law; and, in essence, whether and how this concept applies to musical sampling. In the present Opinion, the European Copyright Society takes the view that pastiche is an autonomous concept of EU law. Article 5(3)(k) InfoSoc Directive (ISD) should be read as an overarching provision including three forms of permitted use that share their underlying nature but shall be judged differently. The meaning of pastiche cannot be understood as a mere imitation of an artistic style and it need not entail an explicit interaction with the original work. The presence of humour or mockery is not a necessary requirement for the application of the pastiche exception. Also, the expression resulting from the exercise of the pastiche exception need not itself be an original work. Finally, the intention of the user to create pastiche plays no role in the review of the legality of any given use. At the same time, legitimate forms of pastiche need to have their own features that are distinguishable from the copyrighted expression in pre-existing works used as source materials. Overall, the use of the pastiche exception for purposes of musical sampling, as in the underlying Metall auf Metall case, complies with all three steps of Article 5(5) ISD.

Copyright

Bibtex

Online publication{nokey, title = {Opinion of the European Copyright Society on CG and YN v Pelham GmbH and Others, Case C-590/23 (Pelham II)}, author = {Mezei, P. and Senftleben, M. and Sganga, C.}, url = {https://europeancopyrightsociety.org/wp-content/uploads/2024/11/ecs-opinion-pelham-ii-1.pdf}, year = {2024}, date = {2024-11-06}, journal = {European Copyright Society}, abstract = {In its questions for preliminary ruling, the German Federal Court of Justice asked for clarification as regards the definition of pastiche under EU copyright law; and, in essence, whether and how this concept applies to musical sampling. In the present Opinion, the European Copyright Society takes the view that pastiche is an autonomous concept of EU law. Article 5(3)(k) InfoSoc Directive (ISD) should be read as an overarching provision including three forms of permitted use that share their underlying nature but shall be judged differently. The meaning of pastiche cannot be understood as a mere imitation of an artistic style and it need not entail an explicit interaction with the original work. The presence of humour or mockery is not a necessary requirement for the application of the pastiche exception. Also, the expression resulting from the exercise of the pastiche exception need not itself be an original work. Finally, the intention of the user to create pastiche plays no role in the review of the legality of any given use. At the same time, legitimate forms of pastiche need to have their own features that are distinguishable from the copyrighted expression in pre-existing works used as source materials. Overall, the use of the pastiche exception for purposes of musical sampling, as in the underlying Metall auf Metall case, complies with all three steps of Article 5(5) ISD.}, keywords = {Copyright}, }

Opinion of the European Copyright Society on the CG and YN v Pelham GmbH and Others, Case C-590/23 (Pelham II)

Mezei, P., Senftleben, M., Sganga, C. & Geiger, C.
Kluwer Copyright Blog, 2024

Copyright

Bibtex

Online publication{nokey, title = {Opinion of the European Copyright Society on the CG and YN v Pelham GmbH and Others, Case C-590/23 (Pelham II)}, author = {Mezei, P. and Senftleben, M. and Sganga, C. and Geiger, C.}, url = {https://copyrightblog.kluweriplaw.com/2024/11/07/opinion-of-the-european-copyright-society-on-the-cg-and-yn-v-pelham-gmbh-and-others-case-c-590-23-pelham-ii/}, year = {2024}, date = {2024-11-07}, journal = {Kluwer Copyright Blog}, keywords = {Copyright}, }

Everything is harmonized. The CJEU’s decision in Kwantum v. Vitra

Kluwer Copyright Blog, 2024

Copyright

Bibtex

Online publication{nokey, title = {Everything is harmonized. The CJEU’s decision in Kwantum v. Vitra}, author = {Hugenholtz, P.}, url = {https://copyrightblog.kluweriplaw.com/2024/11/06/everything-is-harmonized-the-cjeus-decision-in-kwantum-v-vitra/}, year = {2024}, date = {2024-11-06}, journal = {Kluwer Copyright Blog}, keywords = {Copyright}, }

Copyright and the Expression Engine: Idea and Expression in AI-Assisted Creations

Chicago-Kent Law Review (forthcoming), 2024

Abstract

This essay explores AI-assisted content creation in light of EU and U.S. copyright law. The essay revisits a 2020 study commissioned by the European Commission, which was written before the surge of generative AI. Drawing from traditional legal doctrines, such as the idea/expression dichotomy and its equivalents in Europe, the author argues that iterative prompting may lead to copyright protection of GenAI-assisted output. The paper critiques recent U.S. Copyright Office guidelines that severely restrict registration of works created with the aid of GenAI. Human input, particularly in the conceptual and redaction phases, provides sufficient creative control to justify copyright protection of many AI-assisted works. With many of the expressive features being machine-generated, the scope of copyright protection of such works should, however, remain fairly narrow.

Artificial intelligence, artistic expression, Copyright

Bibtex

Article{nokey, title = {Copyright and the Expression Engine: Idea and Expression in AI-Assisted Creations}, author = {Hugenholtz, P.}, url = {https://www.ivir.nl/publications/copyright-and-the-expression-engine-idea-and-expression-in-ai-assisted-creations/chicagokentlawreview2024/}, year = {2024}, date = {2024-11-05}, journal = {Chicago-Kent Law Review (forthcoming)}, abstract = {This essay explores AI-assisted content creation in light of EU and U.S. copyright law. The essay revisits a 2020 study commissioned by the European Commission, which was written before the surge of generative AI. Drawing from traditional legal doctrines, such as the idea/expression dichotomy and its equivalents in Europe, the author argues that iterative prompting may lead to copyright protection of GenAI-assisted output. The paper critiques recent U.S. Copyright Office guidelines that severely restrict registration of works created with the aid of GenAI. Human input, particularly in the conceptual and redaction phases, provides sufficient creative control to justify copyright protection of many AI-assisted works. With many of the expressive features being machine-generated, the scope of copyright protection of such works should, however, remain fairly narrow.}, keywords = {Artificial intelligence, artistic expression, Copyright}, }

Generative AI, Copyright and the AI Act (v.2)

Abstract

Published 1 November 2024. This is a revised and extended version of a paper initially published in August 2024. This paper examines the copyright-relevant rules of the recently published Artificial Intelligence (AI) Act for the EU copyright acquis. The aim of the paper is to provide a critical overview of the relationship between the AI Act and EU copyright law, while highlighting potential gray areas and blind spots for legal interpretation and future policy-making. The paper proceeds as follows. After a short introduction, Section 2 outlines the basic copyright issues of generative AI and the relevant copyright acquis rules that interface with the AI Act. It mentions potential copyright issues with the input or training stage, the model, and outputs. The AI Act rules are mostly relevant for the training of AI models, and the Regulation primarily interfaces with the text and data mining (TDM) exceptions in Articles 3 and 4 of the Copyright in the Digital Single Market Directive (CDSMD). Section 3 then briefly explains the AI Act’s structure and core definitions as they pertain to copyright law. Section 4 is the heart of the paper. It covers in some detail the interface between the AI Act and EU copyright law, namely: the clarification that TDM is involved in training AI models (4.1); the outline of the key copyright obligations in the AI Act (4.2); the obligation to put in place policies to respect copyright law, especially regarding TDM opt-outs (4.3); the projected extraterritorial effect of such obligations (4.4); the transparency obligations (4.5); how the AI Act envisions compliance with such obligations (4.6); and potential enforcement and remedies (4.7). Section 5 offers some concluding remarks, focusing on the inadequacy of the current regime to address one of its main concerns: the fair remuneration of authors and performers.

AI Act, Content moderation, Copyright, DSA, Generative AI, text and data mining, Transparency

Bibtex

Working paper{nokey, title = {Generative AI, Copyright and the AI Act (v.2)}, author = {Quintais, J.}, url = {https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4912701}, year = {2024}, date = {2024-11-01}, abstract = {Published 1 November 2024. This is a revised and extended version of a paper initially published in August 2024. This paper examines the copyright-relevant rules of the recently published Artificial Intelligence (AI) Act for the EU copyright acquis. The aim of the paper is to provide a critical overview of the relationship between the AI Act and EU copyright law, while highlighting potential gray areas and blind spots for legal interpretation and future policy-making. The paper proceeds as follows. After a short introduction, Section 2 outlines the basic copyright issues of generative AI and the relevant copyright acquis rules that interface with the AI Act. It mentions potential copyright issues with the input or training stage, the model, and outputs. The AI Act rules are mostly relevant for the training of AI models, and the Regulation primarily interfaces with the text and data mining (TDM) exceptions in Articles 3 and 4 of the Copyright in the Digital Single Market Directive (CDSMD). Section 3 then briefly explains the AI Act’s structure and core definitions as they pertain to copyright law. Section 4 is the heart of the paper. It covers in some detail the interface between the AI Act and EU copyright law, namely: the clarification that TDM is involved in training AI models (4.1); the outline of the key copyright obligations in the AI Act (4.2); the obligation to put in place policies to respect copyright law, especially regarding TDM opt-outs (4.3); the projected extraterritorial effect of such obligations (4.4); the transparency obligations (4.5); how the AI Act envisions compliance with such obligations (4.6); and potential enforcement and remedies (4.7). 
Section 5 offers some concluding remarks, focusing on the inadequacy of the current regime to address one of its main concerns: the fair remuneration of authors and performers.}, keywords = {AI Act, Content moderation, Copyright, DSA, Generative AI, text and data mining, Transparency}, }

Geoblocking measures sufficient to prevent a “communication to the public”? The CJEU gets a second chance

Kluwer Copyright Blog, 2024

Copyright, Geoblocking, right of communication to the public

Bibtex

Online publication{nokey, title = {Geoblocking measures sufficient to prevent a “communication to the public”? The CJEU gets a second chance}, author = {Toepoel, I. and Valk, E.G.}, url = {https://copyrightblog.kluweriplaw.com/2024/10/31/geoblocking-measures-sufficient-to-prevent-a-communication-to-the-public-the-cjeu-gets-a-second-chance/}, year = {2024}, date = {2024-10-31}, journal = {Kluwer Copyright Blog}, keywords = {Copyright, Geoblocking, right of communication to the public}, }