This week a group of major publishers and the author Scott Turow filed a class-action complaint in Manhattan federal court accusing Meta and Mark Zuckerberg of copying millions of copyrighted books and academic works without permission to train the company’s Llama large language models.
The plaintiffs — Hachette, Macmillan, McGraw-Hill, Elsevier, Cengage and Turow among them — say Meta copied and distributed everything from textbooks and scientific articles to novels, and that the alleged pirated material includes titles such as The Fifth Season by N.K. Jemisin and The Wild Robot by Peter Brown. The complaint says Zuckerberg was directly involved in approving those practices and asks the court to let the publishers represent a larger class of copyright owners and to award an unspecified amount in damages.
The scale of the allegations is meant to raise the stakes. The complaint centers on what the plaintiffs describe as “millions of copyrighted works,” and it presses for class certification that would sweep in a broad set of creators and publishers. If the court grants that request, the case could consolidate claims from rights holders who say their works were used as raw training data without authorization or compensation.
The filing arrives amid growing litigation over how large language models are built. Last year Anthropic agreed to pay $1.5 billion to settle a class-action suit by authors who alleged their work had been used without permission. A San Francisco court, by contrast, previously ruled against 13 writers, including Ta-Nehisi Coates, in a comparable case against Meta. The split outcomes have not yet produced a settled rule of law.
A Meta spokesperson said the company would fight the claims aggressively, arguing that artificial intelligence is enabling major innovations and that courts have rightly found that training AI on copyrighted material can qualify as fair use.
The complaint presses a contrary narrative. The publishers say Meta’s systems copied and redistributed copyrighted text at scale rather than using licensed or public-domain material, and they single out internal decisions — including, they say, direct approval by Zuckerberg — as evidence that the conduct was intentional rather than accidental. The filing also points to the phrase sometimes associated with the company’s approach to product development as reflective of that mindset.
That contrast — legal defenses rooted in emerging fair-use precedent versus an allegation of deliberate mass copying — is the central friction in the case. Courts have reached differing results. The Anthropic settlement shows companies may choose to pay to resolve claims, while the San Francisco ruling demonstrates that plaintiffs do not always prevail. The new complaint tests whether courts will treat the wholesale ingestion of copyrighted corpora differently when the defendant is a dominant social-media and AI company and a CEO is named personally.
For authors and publishers the case is straightforward: they seek compensation and a ruling that would limit unlicensed use of creative and scholarly work. For Meta the stakes include not only potential damages but also how the law will shape permissible methods for training large language models going forward. The plaintiffs did not quantify damages in the complaint; they asked only that the court allow them to represent a broad class and leave the monetary relief to later proceedings.
The legal fight now moves to briefing and motions in Manhattan federal court, where the publishers will push for class certification and Meta will mount a vigorous defense built on fair use and other arguments. Given Anthropic’s settlement and the mixed results in prior litigation, the case is likely to run for months and could influence whether companies change how they source training data or choose to resolve disputes through settlements.
At bottom, the filing makes a clear demand: publishers want courts to stop what they call mass-scale copying and to compensate creators. The outcome will shape not only this company’s practices but the contours of copyright law as it meets a new generation of AI systems.