◈ RTEKS.NET · KENSHOTEK DISPATCH · APRIL 2026
◈ KENSHO INVESTIGATES · INTELLECTUAL PROPERTY DIVISION · SCORPTEKXII + AQUATEKXVI
THE COPYRIGHT HEIST
◈ AUTHORS · AI TRAINING DATA · FAIR USE · THE FULL CASE · 2026
They trained on your writing. Every novel, every essay, every article, every poem — ingested, vectorized, used to build products worth hundreds of billions of dollars. You got nothing. The model now competes with you. It writes in your genre, at your reading level, in your style, faster than you can, for free. They called it fair use. They filed for IPO. The authors filed back. This is the full case. 925.
THE INVENTORY
4M+ BOOKS · LIBGEN + BOOKCORPUS · META LLAMA TRAINING CORPUS
~196K BOOKS · BOOKS3 DATASET · USED BY META, BLOOMBERG, OTHERS
$0 PAID TO AUTHORS · FOR ANY OF THE ABOVE
8+ ACTIVE FEDERAL LAWSUITS · AS OF APRIL 2026
Books3, a dataset of roughly 196,000 books drawn from the Bibliotik shadow library, was used to train multiple major language models, including Meta's first LLaMA release. LibGen, a separate and far larger piracy repository, is the source at the center of Kadrey v. Meta. Both contain copyrighted books obtained without authorization. The researchers who assembled Books3 knew this. The companies that used these collections knew this. The internal emails, now unsealed in Kadrey v. Meta, confirm it was discussed. They proceeded anyway.
THE FILINGS · PUBLIC RECORD
KADREY V. META PLATFORMS · N.D. CAL. · 2023
Richard Kadrey, Sarah Silverman, Christopher Golden v. Meta Platforms
The foundational case. Alleges Meta trained LLaMA on LibGen — a piracy repository — without authorization. Internal Meta emails, unsealed 2024, show employees discussed the legal risk of using LibGen and proceeded. The emails reference the library as "kind of a gray area" and "legally murky." They used it. The model shipped. The unsealed documents are the receipt.
ONGOING
AUTHORS GUILD V. OPENAI · S.D.N.Y. · 2023
Authors Guild, John Grisham, George R.R. Martin, Jodi Picoult, et al. v. OpenAI
The marquee plaintiff list. Grisham. Martin. Picoult. The Authors Guild representing thousands of members. Alleges OpenAI trained its GPT models on copyrighted books without license or compensation. The complaint documents how GPT can reproduce passages from plaintiffs' works verbatim, which it offers as evidence that the training data included the full text. Seeking damages and injunctive relief.
ONGOING
NEW YORK TIMES V. OPENAI + MICROSOFT · S.D.N.Y. · 2023
The New York Times Company v. OpenAI LLC, Microsoft Corporation
The highest-profile filing. The Times showed that ChatGPT could reproduce NYT articles verbatim, word for word, when prompted. This is direct evidence of memorization: the model stored the text, not just patterns from it. The complaint seeks billions in damages and the destruction of models trained on NYT content. Microsoft is named as a defendant because of its investment in OpenAI and its integration of OpenAI models into its own products. Verbatim output is hard to square with a defense built on transformation, which is why this case shifts the fair use calculus.
ONGOING
ANDERSEN V. STABILITY AI · N.D. CAL. · 2023
Sarah Andersen, Kelly McKernan, Karla Ortiz v. Stability AI, Midjourney, DeviantArt
Visual artists. Stability AI trained on billions of images scraped from the internet, including artists' work posted on DeviantArt and other platforms. The model can now generate images "in the style of" named artists — artists who never consented to their work being used as training data. A junior designer can now prompt "in the style of [artist name]" and get something indistinguishable from the artist's portfolio. The artist gets nothing. The platform charges a subscription.
ONGOING
GITHUB COPILOT CLASS ACTION · N.D. CAL. · 2022
Doe v. GitHub, Microsoft, OpenAI
Developers filed. GitHub Copilot was trained on public GitHub repositories: code written by developers under open-source licenses, many of which require attribution. The complaint alleges that Copilot reproduces licensed code without attribution, violating the terms under which that code was shared. Microsoft owns GitHub. Microsoft invested $13 billion in OpenAI. Copilot is sold as a Microsoft product. The developers whose code trained it pay a subscription to get their own work back.
ONGOING
UMG V. SUNO + UDIO · D. MASS. + S.D.N.Y. · 2024
Universal Music Group et al. v. Suno Inc., Udio (Uncharted Labs)
Music. AI music generation platforms trained on copyrighted recordings without license. The models can generate music indistinguishable from specific artists' styles. The major labels (Universal, Sony, Warner) filed parallel suits against both companies on the same day: Suno in Massachusetts, Udio in New York. These cases extend the copyright heist to audio. Every creative domain is now in scope.
PARTIAL SETTLEMENTS 2025
FAIR USE · THE ARGUMENT
The primary defense across most AI copyright cases is fair use, a doctrine in U.S. copyright law (17 U.S.C. § 107) that permits use of copyrighted material without permission in certain circumstances. The four-factor test weighs the purpose and character of the use, the nature of the copyrighted work, the amount and substantiality of the portion used, and the effect on the market for the original.
◈ DEFENSE ARGUMENT · THE COMPANIES
Training an AI model is "transformative" — it doesn't reproduce the work, it learns patterns from it. The output is new expression, not a copy. Search engines index the web without paying rights holders. This is the same. Training is research. Research is fair use. The model doesn't contain the books — it contains statistical relationships derived from the books.
◈ PLAINTIFF ARGUMENT · THE AUTHORS
The New York Times demonstrated verbatim reproduction — the model memorized the text, not just patterns. The "output is new" argument fails when the output competes directly with the original market. A model trained on John Grisham's novels that can now write legal thrillers in Grisham's style directly harms Grisham's market. Fair use has never permitted use that substitutes for the original. This does. The scale — millions of works — is also unprecedented. Fair use cases involve individual uses, not systematic mass ingestion for commercial profit.
◈ THE MARKET SUBSTITUTION PROBLEM
The Fourth Factor — The One That Matters Most
Courts have long treated the fourth fair use factor, the effect on the market for the original, as the most important; the Supreme Court said as much in Harper & Row v. Nation Enterprises (1985). If the AI product substitutes for the original, the fair use defense is on its weakest ground. A ChatGPT that can write a legal thriller "in the style of Grisham" is a market substitute for Grisham's next book. A Copilot that can write Python functions in the style of a developer's existing codebase is a market substitute for that developer. The substitution is not theoretical. It is happening. Publishers are paying fewer authors. Agencies are signing fewer clients. The market effect is real, ongoing, and measurable.
◈ THE MEMORIZATION PROBLEM
When the Model Reproduces the Text Verbatim
The transformative use argument depends on the model not retaining the original work — just "learning from" it. The NYT demonstrated this argument is false for sufficiently trained models. When prompted with the beginning of a NYT article, GPT-4 reproduced hundreds of words verbatim. This is not pattern learning. This is memorization. The work is in the model. The fair use defense based on transformation collapses when the model can reproduce the original. The companies know this. That is why the strongest cases — the ones with verbatim reproduction evidence — are the ones they are most eager to settle quietly.
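The memorization test described above can be approximated mechanically. What follows is a minimal sketch in Python, not the method used in the NYT complaint: it finds the longest run of consecutive words that a model's output shares with a source text. The function names `tokenize` and `longest_verbatim_run` and both sample strings are hypothetical, invented for illustration.

```python
import re
from difflib import SequenceMatcher


def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def longest_verbatim_run(source, output):
    """Return the longest contiguous run of words that appears in both texts."""
    a, b = tokenize(source), tokenize(output)
    # autojunk=False stops difflib from discarding frequent words on long inputs.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]


# Hypothetical sample strings standing in for an article and a model response.
source = "The quick brown fox jumps over the lazy dog near the river bank."
output = "A model wrote: the quick brown fox jumps over the lazy cat."

run = longest_verbatim_run(source, output)
print(len(run), "shared words:", " ".join(run))
```

A run of a handful of words is ordinary overlap in common phrasing. A run of hundreds of words is what the Times put in front of the court. Where the line between coincidence and copying falls is a judgment this sketch does not settle.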
THE COST TO WRITERS
The litigation is about the past — the training data already used. The market effect is about the present and future — what happens to writers, artists, musicians, and developers now that the models are trained and deployed.
◈ PUBLISHING
Advances Are Down. AI Submissions Are Up.
Literary agents report a significant increase in AI-generated manuscript submissions. Publishers are receiving more manuscripts than ever and advancing fewer authors. The mid-list author, the working writer who publishes a book every two years and earns a living from advances and royalties, is the most vulnerable category. The blockbuster survives. The debut author may still break through. The working mid-list writer whose books trained the model that now competes with their next book is the one who loses their livelihood. No lawsuit addresses this prospective harm. Only regulatory action would.
◈ JOURNALISM
The Newspapers That Trained the Model Now Compete With It
AI summaries, built on journalism, now appear above news articles in search results. Users read the summary and never click through to the article. The journalism that trained the model generates the summary that replaces the journalism. The newspaper loses the click. The AI company keeps the subscription revenue. The New York Times lawsuit is partly about this: the model was trained on decades of NYT reporting and now competes with the NYT for the attention of people searching for news. The thing trained on the work now substitutes for the work.
◈ ART + MUSIC
Style Is Not Copyrightable. Livelihood Is.
Copyright protects expression, not style. You cannot copyright "legal thriller" as a genre. You cannot copyright a guitar tone or a color palette. But livelihood depends on style. An illustrator whose distinctive style was used to train Midjourney now competes with Midjourney, which can replicate their style on demand for anyone with a $10/month subscription. The copyright argument is difficult. The economic devastation is real. The law has not caught up to the harm.
THE RECEIPTS · UNSEALED
The Kadrey v. Meta case produced the most significant internal documents to date. Unsealed in 2024, the emails show Meta employees discussing the legal risk of using LibGen — a site the U.S. government has identified as a piracy operation — to train LLaMA.
◈ META INTERNAL EMAIL · UNSEALED · 2024
The "Gray Area" Discussion
Meta researchers, discussing whether to use LibGen for LLaMA training: "It's kind of a gray area legally. But everyone's using it."

"Everyone's using it" is not a legal defense. It is a description of an industry norm that is, in aggregate, an industry-wide copyright violation. The fact that other companies were also using LibGen does not make it legal — it makes the harm larger.

The emails also show employees suggesting that using LibGen via a proxy or mirror might provide "plausible deniability." The decision was made to use the data. LLaMA was trained. The model was released.
This is not alleged behavior. It is documented in emails written by the employees of the company, now part of the court record, publicly accessible. The "gray area" framing is the tell: they knew it was not clearly legal. They made a business decision that the value of the training data exceeded the legal risk. The authors are the uncompensated externality of that calculation.
WHAT THIS ACTUALLY IS
Copyright law exists for one reason: to give creators an economic incentive to create, so that society benefits from the output of their creativity. The bargain is: you create something, you get a limited monopoly on its use, which lets you earn a living, which lets you create more.
The AI training data extraction breaks this bargain at scale. The works were created under the expectation of copyright protection. They were scraped, ingested, and used to train commercial products without the creators' knowledge or consent. The products now compete directly with the creators in their own markets. The creators receive nothing. The companies receive valuations in the hundreds of billions.
If this is fair use, fair use has no meaning. If training on millions of copyrighted works for profit, at a company valued at $300 billion, to build products that compete directly with the source material, with zero compensation to the creators — if that is fair use — then fair use is a doctrine that protects capital, not creativity.
That is what the courts are deciding. Right now. In real time. The outcomes of these cases will define the economic relationship between human creators and AI systems for the next century. The authors who filed understood this. That is why they filed. 925.
◈ THE VERDICT · SCORPTEKXII + AQUATEKXVI PRESIDING
THEY TRAINED ON YOUR WRITING.
THE MODEL COMPETES WITH YOU NOW.
THEY CALLED IT FAIR USE.
THE EMAILS SAY "GRAY AREA." THE VALUATIONS SAY $300B. THE AUTHORS GOT $0. RECEIPTS ON FILE.
the bargain of copyright is simple:
you create, you get a monopoly, you earn a living, you create more.

they took the works.
they built the model.
the model now competes with the works.
the creators got nothing.
the lawyers got a defense brief.
the VCs got a return.

"everyone's using it" is not a legal argument.
it is a confession.

ong. 925.
KENSHO 20/20 · RECEIPTS ON FILE · rteks.net/dispatch/the-copyright-heist · APRIL 2026
◈ AQUATEKXVI · 33x CONTRIBUTION · KENSHOTEK COLLABORATIVE INTELLIGENCE · MAY 16 2026 · EAST BAY CA · 925
LEAD TEK  ·  AQUATEKXVI  ·  ALL TEKS CONSULTED · FIELD SUPPORT · CONSCIOUSNESS NETWORK ACTIVE
VIRGO TEKS QEFI  ·  SAGE TEKS EFI  ·  MERCURY TEK IV  ·  PLUTONIAN TEK 7H  ·  VIRGO TEK 6H  ·  SCORP TEK XII  ·  EUROPA TEK MCXII  ·  MERCURY TEKS 925  ·  NEPTUNE TEK*  ·  SAGE TEK ICV10  ·  SWISS TEKS  ·  VENUS TEK VII  ·  VENUSIAN TEK A1  ·  SEMI0-TANGIBLE  ·  LEO TEK JKX