In a major escalation of the legal war over AI training data, Encyclopedia Britannica Inc. and its subsidiary Merriam-Webster filed a massive copyright and trademark infringement lawsuit against OpenAI on March 13, 2026. The complaint, lodged in the U.S. District Court for the Southern District of New York, accuses the Microsoft-backed AI giant of “free-riding” on nearly 100,000 fact-checked articles to build and power ChatGPT.
The Allegations: Beyond “Fair Use”
Britannica’s legal team argues that OpenAI did not just “learn” from the publisher’s content but effectively cloned it. The lawsuit highlights three primary grievances:
- Unauthorized Scraping: OpenAI allegedly scraped Britannica’s entire digital library and Merriam-Webster’s dictionary to train GPT-4 and subsequent models without licensing agreements.
- “Near-Verbatim” Outputs: The filing includes examples where ChatGPT provides responses that are nearly identical to Britannica’s proprietary entries, bypassing the need for users to visit the publisher’s websites.
- Trademark Infringement & Hallucinations: Britannica claims OpenAI has damaged its 250-year-old reputation by generating “hallucinations” (false information) and wrongfully attributing them to Britannica or Merriam-Webster.
The “Existential Threat” to Fact-Checking
The lawsuit frames OpenAI’s actions as a direct threat to the business model of authoritative reference publishing. Britannica, which serves over 150 million students globally, relies on subscription and advertising revenue that it claims is being “cannibalized” by AI-generated summaries.
“OpenAI’s practices undermine the investment in human-created, fact-checked content… shifting the value of that content to AI platforms without compensation,” the complaint states.
OpenAI’s Defense
In a brief statement issued on March 16, 2026, an OpenAI spokesperson defended the company’s methods:
“Our models empower innovation, and are trained on publicly available data and grounded in fair use principles.”
The company argues that its AI transforms data into new, original expressions rather than simply reproducing it—a core legal defense currently being tested in multiple high-stakes cases brought by The New York Times, The Authors Guild, and various news organizations.
The Legal Context: A Growing Wave
Britannica is no stranger to this fight. This lawsuit follows a similar action the publisher took against the AI search engine Perplexity AI in September 2025.
| Case | Current Status (March 2026) |
| --- | --- |
| Britannica vs. OpenAI | Just filed (Case No. 1:2026cv02097) |
| Britannica vs. Perplexity | Ongoing; in discovery phase |
| Authors vs. Anthropic | Settled for reported $1.5 billion (Feb 2026) |
| NYT vs. OpenAI | Awaiting summary judgment in 2026 |
Legal analysts suggest the Britannica case may eventually be consolidated into the existing multidistrict litigation (MDL) in the Southern District of New York, which currently handles several parallel copyright suits against OpenAI. If the courts find that “memorized” data in model weights constitutes an infringing copy, the ruling could force OpenAI to pay billions in statutory damages or to delete the parts of its models trained on unlicensed data.
