Adobe sued over alleged AI training with pirated books


A new class action lawsuit accuses Adobe of using pirated books in AI training. Here’s what this means for content creators and marketers.

Another tech giant is facing legal scrutiny over how it built its AI. Adobe now faces a proposed class action lawsuit alleging that the company used pirated books, including those written by the lead plaintiff, to train its SlimLM model. The case adds even more heat to the ongoing legal storm over AI and copyrighted content.

For marketers and content creators who rely on GenAI tools to accelerate their campaigns, this case is more than just a tech headline. It raises pressing questions about the legality of the data behind AI tools, the reputational risks of using them, and what responsible GenAI use should look like going forward.

This article details the lawsuit, explains the broader trend of lawsuits in AI development, and highlights what marketers should do now to avoid future fallout.


What happened?

The lawsuit was filed by author Elizabeth Lyon, who claims Adobe used pirated versions of her nonfiction books in the training data for its SlimLM language model. Adobe describes SlimLM as a small language model optimized for document-related tasks on mobile devices.

According to the complaint, SlimLM was trained using SlimPajama-627B, a dataset released by Cerebras in June 2023. This dataset is allegedly derived from another dataset known as RedPajama, which includes the Books3 dataset. Books3 contains over 190,000 books and has been named in several copyright lawsuits involving Apple, Salesforce and now Adobe.

Lyon argues that because SlimPajama includes content from Books3, Adobe effectively trained its AI on copyrighted material without permission. The lawsuit alleges that this dataset was compiled and manipulated in a way that violates copyright laws and harms authors whose work was harvested without consent.

Adobe has not yet released a public statement on the matter.

Why does this keep happening?

This is not the first lawsuit targeting how GenAI systems are trained. Apple, Salesforce and Anthropic have all been embroiled in similar legal disputes. Only a few months ago, Anthropic agreed to pay $1.5 billion to settle claims that it used pirated works to train its chatbot, Claude.

The legal problem boils down to this: AI models need enormous amounts of data to become useful. In the rush to create smarter tools, many companies have used open web data sets that include everything from Wikipedia entries to complete books, sometimes without checking the licensing status of that content.

For marketers, this means that some of the AI tools currently in use could be powered by data obtained through questionable means. The legal and ethical risks are no longer abstract; they are now concrete enough to end up in court.

What Marketers Should Know

Whether you use AI to generate blog copy, automate customer support, or produce social visuals, this lawsuit is a wake-up call. The tools may be effective, but their training data may have hidden drawbacks. Here are four things every marketer should do right now.

1. Know where your AI gets its data

Ask your vendors how their models were trained. If they can’t tell you, that’s a red flag. Look for AI tools that are open about their training data or that use properly authorized and human-curated sources.
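When a vendor’s training corpus is public, you can sometimes check its composition yourself. As a rough illustration, the SlimPajama-627B dataset named in the complaint is published on Hugging Face, and its dataset card describes each record as tagged with the RedPajama subset it came from. Here is a minimal sketch, assuming the `datasets` library and those documented field names (treat them as assumptions, not verified schema):

```python
# Hedged sketch: peeking at source labels in a public training dataset.
# Assumes the Hugging Face `datasets` library (pip install datasets zstandard)
# and the record layout described on the SlimPajama dataset card, where each
# record's `meta` carries a `redpajama_set_name` tag such as "RedPajamaBook".
from itertools import islice

from datasets import load_dataset

# Stream a handful of records rather than downloading the full 627B-token corpus.
ds = load_dataset("cerebras/SlimPajama-627B", split="train", streaming=True)

for record in islice(ds, 5):
    meta = record.get("meta") or {}
    source = meta.get("redpajama_set_name", "unknown") if isinstance(meta, dict) else meta
    print(source, "->", record["text"][:60].replace("\n", " "))
```

Tags such as "RedPajamaBook" in the output are exactly the kind of provenance signal this lawsuit turns on.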

2. Audit your AI-driven content workflows

Map how and where AI is used in your content production. Document which tools are used and what type of content they generate. This will help you respond quickly if a legal challenge arises over the origin of that content.
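There is no standard format for this kind of audit trail. As a minimal sketch, a team could append one record per AI-assisted asset to a JSON Lines file; every tool, model, and field name below is a hypothetical placeholder:

```python
# Hypothetical sketch of an AI-content audit trail: one JSON Lines record
# per AI-assisted asset. Tool, model, and field names are placeholders.
import json
from datetime import datetime, timezone
from pathlib import Path

LOG_FILE = Path("ai_content_audit.jsonl")

def log_ai_asset(asset_id: str, tool: str, model: str,
                 content_type: str, human_reviewed: bool) -> None:
    """Append one audit record for an AI-assisted asset."""
    record = {
        "asset_id": asset_id,
        "tool": tool,                   # vendor/product used
        "model": model,                 # model name or version, if disclosed
        "content_type": content_type,   # e.g. "blog_copy", "social_visual"
        "human_reviewed": human_reviewed,
        "logged_at": datetime.now(timezone.utc).isoformat(),
    }
    with LOG_FILE.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

# Example: record a blog post drafted with a (hypothetical) AI writing tool.
log_ai_asset("blog-2025-031", "ExampleWriter", "example-model-v2",
             "blog_copy", human_reviewed=True)
```

An append-only log like this is deliberately simple: it can live alongside existing workflows and still answer "which tool produced this asset, and when?" months later.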

3. Include AI indemnification in contracts

Make sure your contracts with AI vendors include indemnification clauses that protect your company from liability if the model turns out to have been trained on infringing data. Legal teams should be proactive about this going forward.

4. Build a responsible AI use policy

Start creating internal guidelines for how AI is used across your marketing and content teams. Include standards around transparency, attribution, and when human review is required. This is a key element for both compliance and trust.
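Parts of such a policy can even be made machine-checkable. The sketch below encodes a few illustrative rules (human review, disclosure, attribution) as a simple check over content records like the audit entries above; the rules and field names are invented for illustration, not a legal or compliance standard:

```python
# Hypothetical sketch: internal AI-use rules expressed as checks that run
# against content records like the audit entries above. Rules and field
# names are illustrative placeholders only.
POLICY_RULES = {
    "human_reviewed": "All AI-assisted content must pass human review.",
    "ai_disclosed": "AI assistance must be disclosed where required.",
    "sources_attributed": "Quoted or referenced sources must be attributed.",
}

def check_policy(record: dict) -> list[str]:
    """Return the policy rules this content record fails to satisfy."""
    return [rule for field, rule in POLICY_RULES.items()
            if not record.get(field, False)]

draft = {"human_reviewed": True, "ai_disclosed": False, "sources_attributed": True}
for violation in check_policy(draft):
    print("Policy issue:", violation)
# -> Policy issue: AI assistance must be disclosed where required.
```

Even a lightweight check like this turns a policy document into something your team actually runs, rather than a PDF nobody opens.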

Adobe’s latest legal challenge is part of a growing trend. As AI becomes a default part of marketing, pressure is mounting on vendors to clean up their training practices. Marketers can no longer afford to treat AI tools as black boxes. From legal exposure to brand safety, the risks of data misuse are increasing.

Staying informed, asking the right questions, and putting solid policies in place now will help marketers stay ahead of the curve and avoid getting caught in the crossfire of future lawsuits.

This article was created by humans with the help of AI, powered by ContentGrow. Ready to explore full-service content solutions starting at $2,000/month? Book a discovery call today.
