Friday, November 14, 2025

New York Times asks OpenAI to hand over 20m ChatGPT conversations

The ChatGPT conversations of millions of users are now at the centre of a high-stakes legal fight. The New York Times is demanding access to 20 million anonymised ChatGPT conversations from OpenAI as part of its copyright-infringement lawsuit. OpenAI is pushing back, warning that the demand poses a sweeping risk to user privacy.


What’s Going On?

  • The New York Times (NYT), along with other news organisations, sued OpenAI and Microsoft Corporation in December 2023, alleging they used NYT articles without permission to train ChatGPT and other large language models.
  • As part of the discovery process in this lawsuit, a federal judge in New York ordered OpenAI to produce some 20 million ChatGPT consumer conversation logs.
  • OpenAI argues the production order is overbroad, saying “99.99%” of the logs have no relevance to the NYT’s claims and that handing them over would expose deeply personal user data.
  • The logs in question cover the period from December 2022 to November 2024.

Why The New York Times Wants the Conversations

  • The lawsuit alleges that OpenAI used millions of NYT articles to train ChatGPT and that the model produces text derived from that copyrighted content. The NYT claims access to the chat logs may reveal instances where users asked ChatGPT to retrieve NYT content or where ChatGPT responded with NYT content.
  • By analysing a large sample of user-model exchanges, the NYT’s legal team hopes to provide evidence of how the model behaves, how often it reproduces or paraphrases NYT works, and whether the usage supports their infringement claims.

OpenAI’s Concerns: Privacy & Precedent

  • OpenAI says the order would expose personal and confidential user chats — including “sensitive conversations, files, credentials, memories, searches, payment information” — for users who have no connection to the NYT lawsuit.
  • The company argues this is not targeted discovery but a “speculative fishing expedition” through tens of millions of private logs.
  • OpenAI has proposed alternative, more limited approaches (for example, samples that specifically contain NYT-content references) but claims the NYT rejected them.
  • If the order is enforced, this could set a precedent for how much user-data AI firms must turn over in discovery for copyright cases — raising wider questions of user trust, data minimisation, and regulatory implications.

Legal & Technical Dimensions

Legal

  • Discovery rules in U.S. civil litigation allow plaintiffs to request data that may bear on claims or defenses. The judge found OpenAI’s privacy protections and the proposed de-identification adequate to justify the production of the conversations (as reported by Business Insider).
  • However, OpenAI is appealing the decision, arguing the legal standard was misapplied and that the order violates user privacy rights and common-sense data-security practices.

Technical

  • The logs are reportedly de-identified by OpenAI (removing or redacting personally identifiable information) and stored under a separate legal-hold system.
  • OpenAI emphasises only a small, audited legal/security team would normally access such data; if the NYT prevails, its lawyers and their consultants may get access under strict protective order rules.
  • OpenAI’s user policy and data-retention practices are under scrutiny: e.g., it has been reported that even deleted chats may now be preserved under court order.
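De-identification of the kind described above typically means scrubbing personally identifiable information (PII) from free-text logs before anyone reviews them. OpenAI has not published its pipeline, so the following is only a minimal, hypothetical sketch of the general idea, using regex-based redaction for a few common PII patterns (emails, card-like numbers, phone numbers):

```python
import re

# Hypothetical PII patterns for illustration only; real de-identification
# pipelines use far more sophisticated detection (NER models, checksums, etc.).
# Order matters: CARD runs before PHONE so long digit runs are not
# mislabelled as phone numbers.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "CARD": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def redact(text: str) -> str:
    """Replace each PII match with a [TYPE] placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact("Contact jane.doe@example.com or +1 (555) 123-4567."))
# → Contact [EMAIL] or [PHONE].
```

Even a sketch like this shows why the dispute is hard: regexes miss context-dependent PII (names, addresses, health details mentioned in passing), which is part of OpenAI's argument that bulk production of chat logs is inherently risky.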

What This Means for Users & Industry

  • For users: If enforced, millions of ChatGPT users’ past conversations might become accessible in legal proceedings, even if they are unrelated to the dispute at hand — raising new concerns about privacy expectations and how user data is stored and potentially disclosed.
  • For AI firms: The case underscores the tension between model-training or service-delivery data retention practices and user privacy/disclosure obligations under litigation. Companies might need to rethink how they log, store, anonymise, or restrict access to user data.
  • For publishers/creators: The outcome could influence how data rights, copyright, and model input/output practices interact. Publishers like the NYT may gain stronger tools to audit how AI models use their content and argue for compensation or control.
  • For regulation: Legal precedent from this case may affect how global jurisdictions approach AI-data disclosure, platform liability, and user-privacy protections in the context of AI services.

Specific Implications for India & International Context

  • While this is a U.S. case, Indian users of ChatGPT or similar services should note the risk that conversations held on global platforms might be subject to disclosure under foreign or cross-border legal orders (depending on how data is stored, accessed or processed).
  • Indian companies providing AI services must pay attention: this scenario could shape contractual terms, data retention/disclosure policies, and user-data protections in jurisdictions including India.
  • The broader dispute may influence Indian policy discussions about data governance, AI transparency, intermediary liability and user rights in digital platforms.

What Happens Next?

  • OpenAI has filed for reconsideration of the discovery order. The court will decide whether to revisit or refine the order (e.g., to limit the number of chats, restrict access, enhance protections) or leave it as is.
  • If the NYT obtains access, the logs may be used to build evidence in the copyright lawsuit — for example, quantifying how often ChatGPT responses replicate NYT content, which could influence damages or settlement.
  • Regardless of outcome, the case may conclude via settlement, dismissal, or proceed to full trial — but the precedent will likely be referenced in many future AI-copyright/privacy disputes.

Conclusion

The debate over ChatGPT conversations—whether they should remain strictly private or be produced in litigation—highlights a fundamental tension in the age of generative AI. On one hand, publishers seek accountability and transparency in how AI models use their work. On the other, AI platforms must safeguard user data and maintain trust.
The result of the OpenAI vs The New York Times case may have far-reaching implications: for how much user data platforms preserve, how they respond to legal requests, how creators’ rights are protected, and how users’ expectations of privacy evolve in the digital age.
