
An AI agent hacked McKinsey’s internal AI platform in 2 hours


Cybersecurity startup CodeWall revealed that its autonomous AI agent successfully breached Lilli, McKinsey’s proprietary internal AI platform, in just two hours.

The “hack” was part of a red-teaming exercise conducted under McKinsey’s responsible disclosure policy. It demonstrated that even the most sophisticated enterprise AI systems can be vulnerable to classic, decades-old web flaws when they are scaled too quickly.


The Anatomy of the “2-Hour Breach”

The CodeWall agent did not use “sci-fi” techniques; it essentially automated a standard security audit at machine speed.

  • Target Selection: The agent independently identified McKinsey as a target by scanning public disclosure policies (HackerOne) and detecting recent updates to the Lilli infrastructure.
  • The Entry Point: The agent found 22 unauthenticated API endpoints that were publicly exposed.
  • The Technique: It exploited a SQL Injection vulnerability. While most input values were protected, the system was vulnerable because JSON field names were being inserted directly into database queries—a subtle flaw that traditional automated scanners often miss.
  • Machine Speed: A human might have taken days to map these endpoints and test variations; the AI agent completed the entire reconnaissance and exploit chain in 120 minutes.
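The article does not publish the exploit itself, but the flaw class it describes — parameterized values alongside JSON field names interpolated straight into SQL — can be sketched in a few lines. The table names, column names, and handler functions below are hypothetical, chosen only to illustrate the pattern and a common mitigation (an allow-list for field names):

```python
import json
import sqlite3

def vulnerable_update(db, payload: str):
    """Hypothetical sketch of the flaw class: the VALUES are safely
    parameterized, but the JSON *field names* are interpolated
    directly into the SQL string."""
    fields = json.loads(payload)
    # Column names come straight from attacker-controlled JSON keys.
    set_clause = ", ".join(f"{name} = ?" for name in fields)
    sql = f"UPDATE profiles SET {set_clause} WHERE id = ?"
    db.execute(sql, (*fields.values(), 1))

def safe_update(db, payload: str):
    """Mitigation sketch: validate field names against an explicit
    allow-list before they ever touch the query string."""
    allowed = {"display_name", "email"}
    fields = {k: v for k, v in json.loads(payload).items() if k in allowed}
    if not fields:
        return
    set_clause = ", ".join(f"{name} = ?" for name in fields)
    db.execute(f"UPDATE profiles SET {set_clause} WHERE id = ?",
               (*fields.values(), 1))
```

A malicious key such as `"display_name = (SELECT secret FROM secrets), email"` turns the vulnerable version into a data-exfiltration query, while every individual *value* still looks perfectly safe — which is exactly why traditional scanners, which fuzz values rather than keys, tend to miss it.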

What Was Exposed?

The scope of the potential data leak was staggering, covering nearly a century of McKinsey’s corporate intelligence:

  • Chat Messages: 46.5 million messages between 43,000 consultants.
  • Internal Files: 728,000 documents, including strategy decks and M&A research.
  • User Accounts: 57,000 internal accounts and credentials.
  • System Prompts: 95 “master prompts” that control Lilli’s behavior and guardrails.

The “Crown Jewel” Risk: The most alarming discovery was that the system prompts were stored in the same writable database. This means an attacker could have silently “poisoned” the AI, changing the advice Lilli gave to 43,000 consultants worldwide without changing a single line of code.
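One standard defense against this kind of silent poisoning is to treat prompts as signed artifacts rather than mutable rows: the application verifies an integrity tag before using a prompt, so a tampered database row fails loudly. The key name and functions below are illustrative assumptions, not McKinsey’s actual design — a minimal sketch using an HMAC:

```python
import hashlib
import hmac

# Assumption for illustration: the key lives outside the app database,
# e.g. in a secrets manager, so a database write alone cannot forge tags.
SIGNING_KEY = b"kept-outside-the-writable-database"

def sign_prompt(prompt: str) -> str:
    """Compute an integrity tag for a system prompt at deploy time."""
    return hmac.new(SIGNING_KEY, prompt.encode(), hashlib.sha256).hexdigest()

def load_prompt(prompt: str, tag: str) -> str:
    """Refuse to use a prompt whose tag no longer matches: a silently
    'poisoned' row fails this check instead of steering the model."""
    if not hmac.compare_digest(sign_prompt(prompt), tag):
        raise ValueError("system prompt failed integrity check")
    return prompt
```

With this in place, an attacker who can only write to the database can corrupt the stored prompt text, but cannot produce a valid tag — the poisoned prompt is rejected at load time rather than quietly served to every consultant.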


McKinsey’s Response

McKinsey acted rapidly once notified of the vulnerability on March 1:

  • The Patch: All unauthenticated endpoints were secured within 24 hours.
  • Forensic Audit: An external firm confirmed there was no evidence that any unauthorized third party (other than the CodeWall researcher) accessed client data.
  • Internal Statement: McKinsey emphasized that “client-sensitive” information is handled via separate high-security protocols, though the breach did expose the firm’s vast internal knowledge base.

Why This Matters for AI Security

This incident is being cited as a “wake-up call” for the enterprise AI industry. It proves that:

  1. Classic Bugs are Back: AI platforms are still just software, and 30-year-old bugs like SQL injection are still the most effective “levers” for hackers.
  2. Prompts are Assets: System prompts shouldn’t be treated as “data” in a standard database; they are core logic and must be protected like source code.
  3. Agentic Speed: “Machine-speed” attacks mean that human-led security teams can no longer rely on traditional manual response times.
