By Roy S. Kaufman, CCC
Bronze sponsor of the ALPSP Annual Conference and Awards 2025.
AI is not one thing. It’s a collection of technologies, models, and use cases: everything from large language models that generate text to tools that fine-tune outputs or access content on demand for specialized tasks. That diversity matters when we talk about copyright, licensing, and litigation, because different AI uses raise very different legal and business issues. Moreover, how materials are (or were) gathered for AI, how they are stored, and indeed which judge reviews the case may well change the outcome.
Recent U.S. court decisions underscore this point. They don’t settle the AI copyright debate, but they do shape the way we should think about licensing and litigation going forward. They accomplish this paradoxically, by increasing the uncertainty around unlicensed use of content.
The Warhol Decision: Transformative Use is Not a Free Pass
Fair use cases rarely reach the Supreme Court, so those that do tend to receive a lot of scrutiny. The Warhol case, which was not about AI, narrowed the way “transformative use” should be applied in fair use analysis. The justices emphasized that transformative use isn’t binary. It’s a matter of degree, and it must be weighed against factors like commercial purpose and market harm.
In plain terms: it’s not enough to say, “we changed it, so it’s fair use.” Courts will look closely at whether the secondary use serves the same purpose as the original; if it does, the case for fair use gets much weaker.
For scholarly publishers, that reasoning is significant. AI developers often argue that ingesting and reusing your works is “transformative” because the technology outputs something new. But if the AI performs a similar function to your original, such as supplying factual or analytical content that competes in the same market, it may be seen as substitution, not transformation.
The AI Cases
There are more than 40 pending AI training cases in the U.S. (plus others in Japan, Europe, China, the UK, and elsewhere), and we now have preliminary decisions in three. If I were to sum them up in a word, it would be “inconsistent.” To vastly oversimplify the results: training AI is non-transformative infringement (Thomson Reuters v. Ross); training is transformative and mostly fair use, with the major caveat that pirated materials cannot be used (Bartz); or training is mostly not fair use, but was fair use in the case before the court because the lawyers did not plead their claims correctly, and it was irrelevant that pirated content was used (Kadrey).
I can criticize and disagree with major elements of the last two cases, but what is important is that courts looking at similar use cases (training) are today arriving at vastly different results, and these cases will take years to be resolved. And, because litigation tends to solve yesterday’s problem tomorrow, none of the decided cases mentions fine-tuning, retrieval-augmented generation (RAG), or the Model Context Protocol (MCP), which are newer AI applications and have been the subject of reported deals between AI companies and scholarly publishers.
Why This Matters for Scholarly Publishing
Fair use isn’t a blanket defense for mass, unlicensed ingestion, storage, and copying of copyrighted content. On the other hand, publishers simply should not rest on their rights, given some clearly adverse language in the Bartz and Kadrey cases. Moreover, publishers cannot assume their materials have not been used simply because they refuse to license. In fact, the opposite is true.
From a litigation perspective, whether the AI company uses content for training, fine-tuning, RAG, or MCP, we need to consider:
- Does the AI use substitute for the original, in whole or in part?
- Could there be an actual or potential licensing market for this use?
But litigation is not an end; it is a means. Publishing is not about preventing others from using content. It is about creating high-quality works and ensuring they are used appropriately, with compensation flowing back to sustain future content creation.
Practical Steps for Publishers
While the legal landscape is still evolving, the only “wrong” thing to do is wait on the sidelines. The necessary steps are:
- Define your licensing terms now. Specify how your works can be used in AI training, RAG, MCP, or fine-tuning. When content is used without your consent, you have no control and do not get paid. Imperfect licenses are better than no licenses.
- License directly or join collective licensing arrangements. Working with CCC and others, publishers can pool rights to create scalable, attractive licensing opportunities for AI developers and users.
- Speak up. Engage in industry associations, submit comments to policymakers, and ensure your perspective as an academic publisher is heard in AI governance debates.
- Experiment strategically. Some AI developers are already licensing content. Early deals will help you learn the market’s value and refine your terms.
The Road Ahead
Litigation will continue for years, and different courts will reach different conclusions. Waiting for “final clarity” on training could mean missing years of licensing revenue and letting market practices solidify in ways that disadvantage you.
Ultimately, all courts need to take market harm seriously, and they will recognize the legitimacy of licensing as a solution. For scholarly publishers, that’s both a shield and an opportunity.
The future of AI will be shaped not just in courtrooms, but in direct licensing negotiations, collective rights initiatives, and the choices we make today about how our works can (and cannot) be used.