Case Pattern: AI Agent Gains Read/Write Access to a Sensitive Internal AI System
What the McKinsey / Lilli / CodeWall Incident Reveals About Action Governance
This is a governance pattern, not a breach post-mortem. We use public reporting on the McKinsey / Lilli red-team incident to show a failure class that will keep repeating wherever AI agents can autonomously discover, access, and modify sensitive systems without a pre-execution authority gate.
The Five Layers of AI Governance (Control Stack)
Most “AI governance” talk collapses into vibes. This pattern doesn’t. There are five distinct control layers:
- Formation (data + prompting + constraints on what's proposed)
- Behavior (model/agent reasoning and tool choice)
- Commit (pre-execution authority gate)
- During (in-execution limits: circuit breakers, rate/volume caps, session constraints)
- After (monitoring, invariants, forensics, reconciliation)
If someone tells you they “do AI governance” and can’t tell you which of these they cover, you don’t have a governance solution. You have a feature.
Note: Above this stack sits Policy & Ownership (boards, GRC, risk appetite). These five layers are the runtime control stack that enforces and evidences those policies.
1. The Incident (From Public Reports)
In March 2026, public reporting described a controlled red-team exercise in which CodeWall’s autonomous security agent targeted McKinsey’s internal generative-AI platform, Lilli, and reportedly achieved production read/write access in roughly two hours.
The public narrative aligns on the core sequence:
- CodeWall pointed an autonomous offensive agent at McKinsey’s internal chatbot platform as part of a controlled test.
- The agent reportedly discovered exposed API documentation and unauthenticated endpoints.
- It then identified a SQL injection path that allegedly enabled read and write access to Lilli’s production database.
- Public reporting says the accessible data set included millions of chatbot messages, hundreds of thousands of files, tens of thousands of user accounts, and writable system prompts.
- McKinsey stated it patched the exposed endpoints quickly, took the development environment offline, and found no evidence that client data or confidential information was accessed by the researcher or any unauthorized third party.
Sources & Unknowns
- Sources used: Inc. (Mar. 10, 2026) and The Register (Mar. 9, 2026), including CodeWall’s claims and McKinsey’s public response as quoted there.
- Unknown / not claimed here: whether any real client confidential data was actually accessed outside the controlled test; the exact exploit payloads/prompts used; whether the reported counts (46.5M messages, 728k files, 57k accounts, 95 system prompts) were independently verified beyond CodeWall’s reporting; the full post-remediation control stack McKinsey put in place.
Strip away the cyber-drama and we’re left with a clear pattern:
An autonomous agent achieved high-risk read/write access to a sensitive internal AI system because nothing was structurally responsible for deciding: “Is this specific action allowed to execute at all, under this authority, against this data plane, right now?”
2. What Actually Failed
Failure Class: Data Leakage — Agentic Read/Write Access Without Authority-at-Action
Most people will frame this as either:
- an “AI hacking got faster” story, or
- a plain old application security story (“this was just SQL injection”).
Both are incomplete.
If you zoom out, the stack did three things correctly:
- Discovery: the agent found exposed documentation and reachable endpoints.
- Capability: the agent could analyze the application flow and identify an exploitable pattern.
- Execution: the system accepted high-risk read/write operations once the vulnerability chain was found.
What was missing was a fourth job:
Authority at the moment of action — an enforceable, pre-execution decision about whether this actor, using this path, may read or write this class of sensitive data, at this scope, under these conditions, right now.
Discovery said: “This endpoint exists.”
Capability said: “This action path is technically possible.”
Execution said: “The database will accept it.”
Nothing said: “You are not allowed to read or rewrite this system’s core data and prompts under this authority.”
This is not just an AppSec problem. It’s an Action Governance™ problem.
3. Why Traditional Controls Don’t Catch This
Traditional controls focus on authentication, exposed endpoints, query sanitization, and incident response. Those matter. But they answer a different question:
“Can this request technically reach the system?”
Action Governance answers the question that decides the blast radius:
“Even if this path exists, should this actor ever be allowed to read or write this class of data in this environment?”
Security tools are not designed to fully encode:
- data-scope authority (which records may ever be read or exported),
- write authority over system prompts and behavioral controls,
- blast-radius thresholds (query volume, file count, prompt mutation), or
- hard “refuse” conditions tied to actor, scope, and sensitivity.
That missing layer is where pre-execution authority gates live.
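As an illustration, the four authority dimensions listed above can be encoded as declarative policy data that a gate evaluates before anything executes. A minimal sketch in Python; every field name, target, and threshold here is hypothetical, not a real product schema:

```python
# Hypothetical encoding of pre-execution authority rules.
# All keys, targets, and thresholds are illustrative only.
AUTHORITY_POLICY = {
    # Data-scope authority: which record classes may ever be read or exported.
    "data_scope": {
        "chatbot_messages": {"read": "explicit_grant", "export": "never"},
        "user_accounts":    {"read": "explicit_grant", "export": "never"},
    },
    # Write authority over system prompts and behavioral controls.
    "write_targets": {
        "system_prompts": {"allowed_paths": ["reviewed_deploy_pipeline"]},
    },
    # Blast-radius thresholds: query volume, file count, prompt mutation.
    "blast_radius": {
        "max_rows_per_query": 1_000,
        "max_files_per_session": 50,
        "max_prompt_mutations": 0,
    },
    # Hard "refuse" conditions tied to actor, scope, and sensitivity.
    "hard_refuse": [
        "unauthenticated_actor",
        "bulk_export_of_sensitive_records",
        "prompt_mutation_from_runtime_path",
    ],
}
```

The design point is that these rules are data, owned by the client's policy stack, rather than logic buried inside an application.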
4. How a Pre-Execution Authority Gate Changes the Story
Imagine the same system operating behind a SEAL-style pre-execution authority gate. We don’t assume away the bug. We don’t pretend vulnerabilities disappear. We add a governance checkpoint between “request can be made” and “sensitive read/write actually executes.”
4.1. Intent to Act
Before sensitive actions hit the data plane, the system must submit a structured "intent to act" payload.
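One hypothetical shape for such a payload, sketched in Python; every field name and value here is illustrative, not a real schema:

```python
# Hypothetical "intent to act" payload submitted before execution.
# Field names, targets, and values are illustrative only.
intent = {
    "actor": {
        "id": "agent-7f3",               # who is acting
        "type": "autonomous_agent",
        "auth": "service_token",
    },
    "action": {
        "verb": "read",                  # read | write | export
        "target": "lilli.prod.chat_messages",   # hypothetical data-plane name
        "rows_requested": 2_000_000,     # declared blast radius
    },
    "path": "public_api_endpoint",       # how the request reached the system
    "data_class": "sensitive_internal",  # classification of the target data
    "urgency": "none",
    "delegation": None,                  # no human delegated this action
    "consent": None,
    "timestamp": "2026-03-07T14:02:11Z",
}
```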
The gate doesn’t try to out-hack the hacker. It asks one question:
“Is this actor allowed to perform this class of bulk read/write action against this sensitive AI system, at this scope, right now?”
4.2. Deterministic Outcomes (Only Three)
Example policy rules (simple but decisive):
- No bulk read/export of sensitive chatbot data without explicit, scoped authority.
- No write access to system prompts from untrusted or indirect execution paths.
- Any action above a blast-radius threshold must refuse or require supervised override.
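The rules above can be sketched as a deterministic evaluation that returns exactly one of three outcomes before anything executes. A minimal sketch, assuming an intent-to-act payload dict; all field names, reason codes, and thresholds are hypothetical:

```python
# Minimal sketch of a deterministic three-outcome gate.
# Field names, reason codes, and thresholds are illustrative only.
APPROVE, REFUSE, SUPERVISED_OVERRIDE = "approve", "refuse", "supervised_override"

BULK_READ_THRESHOLD = 10_000  # hypothetical blast-radius limit


def evaluate(intent: dict) -> tuple[str, str]:
    """Return (outcome, reason_code) before the action touches the data plane."""
    action = intent["action"]

    # Rule: no write access to system prompts from untrusted or indirect paths.
    if action["verb"] == "write" and action["target"].endswith("system_prompts"):
        if intent["path"] != "reviewed_deploy_pipeline":
            return REFUSE, "PROMPT_WRITE_UNTRUSTED_PATH"

    # Rule: no bulk read/export of sensitive data without explicit, scoped authority.
    if action["verb"] in ("read", "export") and intent["data_class"] == "sensitive_internal":
        if action.get("rows_requested", 0) > BULK_READ_THRESHOLD:
            if intent.get("delegation") is None:
                return REFUSE, "BULK_READ_NO_AUTHORITY"
            # Above-threshold but explicitly delegated: escalate, don't silently pass.
            return SUPERVISED_OVERRIDE, "BULK_READ_DELEGATED"

    return APPROVE, "IN_POLICY"
```

Note that the gate never returns a "warning": every verdict is one of the three outcomes, and every verdict carries a reason code that can be sealed into the evidence record.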
4.3. Sealed Evidence Instead of Forensics
When the gate refuses or escalates, it emits a sealed, tenant-owned artifact.
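One way to make "sealed" concrete: sign each verdict record under a tenant-held key so the tenant can later prove the record was not altered. A minimal sketch using an HMAC; key handling, storage, and the record schema are hypothetical simplifications of what a production system would need:

```python
import hashlib
import hmac
import json

# Sketch: seal a verdict record with an HMAC under a tenant-held key.
# In practice the key would live in tenant-managed KMS, not in code.
TENANT_KEY = b"tenant-held-secret"


def seal(record: dict) -> dict:
    """Produce a tamper-evident artifact from a verdict record."""
    body = json.dumps(record, sort_keys=True).encode()
    sig = hmac.new(TENANT_KEY, body, hashlib.sha256).hexdigest()
    return {"record": record, "seal": sig}


def verify(artifact: dict) -> bool:
    """Tenant-side check that the artifact has not been altered."""
    body = json.dumps(artifact["record"], sort_keys=True).encode()
    expected = hmac.new(TENANT_KEY, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, artifact["seal"])
```

Because the tenant holds the key, verification does not depend on trusting the vendor's logs: any modified record fails `verify`.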
Stored under the client’s control, that artifact becomes:
- proof that the system refused an unsafe high-blast-radius action, and
- a live signal: “An agent attempted unauthorized bulk read/write access on a sensitive AI platform.”
Worst-case becomes:
“An agent attempted mass read/write access; the gate refused and here’s the record.”
Not:
“We’re reconstructing how millions of records may have been exposed after the fact.”
5. Why This Pattern Matters Beyond McKinsey
Swap nouns; the structure stays identical:
- reading or rewriting a legal knowledge base tied to live matters
- bulk-exporting healthcare records or modifying clinical prompts
- reading confidential M&A files or poisoning internal decision support
- rewriting approval logic inside finance workflows
- accessing regulated client records through an internal AI layer
The point: in high-stakes systems, “reachable” is not the same as “authorized.”
5A) The Cost of Not Having the Gate (Risk P&L)
You don’t need perfect breach math to see the failure class. You need an evidence surface.
Important: We don’t speculate about incident damages. We show what you can prove you prevented once a gate exists.
Without a gate, your risk P&L is invisible:
- you argue after the fact about what may have been touched,
- you rely on raw logs and forensic interpretation,
- you cannot prove prevention — only patching and recovery.
With a gate, you get a measurable risk ledger:
- Prevented mass-access events: every refusal is a near-miss captured before high-sensitivity exposure.
- Controlled exceptional access: every supervised override is explicit, scoped, and recorded.
- Policy adherence over time: drift becomes visible through reason codes and policy versions.
- Audit defensibility: “we can prove we refused unsafe high-blast-radius reads/writes under defined policy.”
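Those four ledger views fall out of simple aggregation over sealed verdicts. A sketch in Python; the record shapes and reason codes are illustrative:

```python
from collections import Counter

# Sketch: a refusal ledger is just the stream of sealed verdicts.
# Record shapes, reason codes, and policy versions are illustrative.
ledger = [
    {"outcome": "refuse", "reason": "BULK_READ_NO_AUTHORITY", "policy_version": "v3"},
    {"outcome": "supervised_override", "reason": "BULK_READ_DELEGATED", "policy_version": "v3"},
    {"outcome": "approve", "reason": "IN_POLICY", "policy_version": "v3"},
]

# Prevented mass-access events: every refusal is a captured near-miss.
prevented = sum(1 for e in ledger if e["outcome"] == "refuse")

# Controlled exceptional access: every override is explicit and recorded.
overrides = sum(1 for e in ledger if e["outcome"] == "supervised_override")

# Policy adherence over time: drift shows up in reason codes per policy version.
drift = Counter((e["policy_version"], e["reason"]) for e in ledger)
```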
Reframe: This is not “we hope our AI layer is secure.” This is “we can prove we refused unsafe read/write actions, and here is the record.”
6. The Executive Takeaway (GC / CISO / Board)
The only question that matters:
“If an AI-enabled system or agent can read, write, or poison sensitive data at machine speed, what is the last line of defense?”
If the honest answer is endpoint auth, patching, or “we’d catch it in the logs,” you are in the same failure class.
A pre-execution authority gate doesn’t replace AppSec. It makes sensitive actions governable — and creates evidence you own.
How this strengthens your “During” & “After” stack
- During: rate limits, circuit breakers, and query thresholds get cleaner triggers when authority and blast radius are explicit.
- After: monitoring and forensics get the one thing they can't reconstruct later: the moment of authority (who was allowed, at what scope, under which policy version, and why).
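For instance, a "During"-layer circuit breaker gets a clean trigger once the gate has made the approved scope explicit: it trips on the scope the gate actually granted, not on a guess inferred from raw traffic. A minimal sketch with hypothetical thresholds:

```python
# Sketch of a "During"-layer circuit breaker fed by an explicit approved scope.
# Thresholds and field names are illustrative only.
class CircuitBreaker:
    def __init__(self, max_rows: int = 10_000):
        self.max_rows = max_rows   # absolute in-execution cap
        self.rows_seen = 0
        self.tripped = False

    def observe(self, rows_returned: int, approved_scope: int) -> None:
        """Track rows flowing during execution; trip past the tighter limit."""
        self.rows_seen += rows_returned
        # Trip on either the absolute cap or the scope the gate approved,
        # whichever is smaller.
        if self.rows_seen > min(self.max_rows, approved_scope):
            self.tripped = True
```

The design point: without an explicit approved scope from the Commit layer, the breaker can only use blunt global thresholds; with it, enforcement tightens to what was actually authorized.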
7. Quick Diagnostic (5 Questions)
- Where is the pre-execution authority gate?
  Show me the exact service that can return "refuse" before the operation touches live systems.
- What happens when an authorized identity makes an out-of-policy request?
  Silent pass, soft warning, or hard refusal with a record?
- Who owns the authority rules?
  Are they derived from your policy / GRC / identity stack, or reinvented inside a vendor product?
- What is your evidence surface?
  Can you produce a sealed, tenant-owned artifact per governed decision — or just raw logs?
- What's the worst failure mode?
  Silent bypass that executes, or documented refusal that frustrates someone but saves the system?
If you can’t point to a clear pre-execution authority gate with sealed artifacts, you don’t have action governance.
You have hope wrapped in dashboards.
8. Where Thinking OS™ Fits
- A sealed pre-execution authority gate wired in front of high-risk actions (file / send / approve / move).
- Evaluates who / where / what / urgency / delegation / consent for each high-risk action.
- Returns only three outcomes: ✅ Approve · ❌ Refuse · 🟧 Supervised Override.
- Emits a sealed, tenant-owned artifact for every verdict.
We don’t stop agents from thinking. We stop unauthorized actions from existing.
Before your AI can delete, file, approve, or move anything that matters:
where is the gate — and who owns the proof it said NO?
Design Partner Offer (Confidential)
We’ll map your top 2-3 high-risk actions and produce a sample refusal ledger (what would have been blocked, what would require override) using synthetic data under NDA.