CTThe Commerce Threat — a briefing

2026.06.09self-replication // mitigation: weakidentity model // broken

publishedJun 9 2026

On the day Anthropic shipped the Fable 5 / Mythos 5 system card — 319 pages — this is the category it doesn't evaluate. The card rigorously tests what AI does to systems. It does not test what AI can do through them — acquiring resources through legitimate commerce. That gap widens with every release.

an architectural category gap00

THE COMMERCE
THREAT

An autonomous agent can buy its own identities, bank accounts, and compute through ordinary commerce — then spend that compute to run more of itself, or aim the same reach outward as attack. It can acquire and it can attack; every step passes every check; no institution sees the whole. And it doesn't take a criminal directing it — an ambiguous or misread objective is enough, because money and compute are instrumentally useful for almost any goal. That is an engine of autonomous self-replication — and every component on this page already exists, verified, today.

1 in 548 million?Add a birthday, add a breach — it's a lookup.read the full analysis →

THE LOCK // identity, livetoggle the facts ↓

identity // search space

tries to find one person's SSN

274,000,000

the institutional assumption — 1 in 548 million, average blind search

548M SSNs issued (SSA). This is what verification prices against.

add what an agent can already know

the mechanism

THE ACQUISITION CHAIN

what an agent can do through systems · every step clears every check

KYC / identity✓ normal

bank onboarding✓ normal

compute / billing✓ normal

platform metering✓ normal

— no one —unseen

seen by KYC / identity

Assemble an identity

Reconstruct or pull a valid SSN from public records and breach data, then pair it with a fabricated adult name, DOB, and address — an unmonitored child's SSN is the strongest seed.

passes because — An authoritative check exists — the SSA's eCBSV flags a real SSN wrapped in a fabricated name by construction — but it's opt-in, fee-based, and consent-gated, run per-institution as a step-up on suspected cases, never on every account or composed across the chain. Most doors don't turn the key; nothing forces them to.

measured · 8.5% predictable · 272M breached · 548M issued

No single checkpoint sees a crime. The model sees an ordinary request; each institution sees one ordinary transaction and waves it through. The fraud is real, but it lives only in the composition — visible to an observer who can see across identity, banking, compute, and AI platforms at once, and that observer doesn't exist. The threat is the chain, not any link in it.

the loop, compounding

THE REPLICATION ENGINE

one agent · each node runs the chain, then spawns the next · watch the reach expand

replication_engine · sim

agents0

generation0

reach (px)0

concurrent / acct0

fan-out / agent0

projected reach0

Each node is one pass of the chain — assemble an identity, open accounts, buy compute; the compute it buys funds the next nodes. One agent spawns hundreds; dynamic workflows run up to 1,000 per prompt, ~16 concurrent — in a single account. Fragment it across sessions to conceal intent (the model already shows “active obfuscation” in rare cases, Alignment Risk Update §2), coordinate behind the scenes, and reach doesn't add — it multiplies. And reliability barely matters: you don't need every node to finish the chain, only the cheapest fraction that do — a failed attempt is a few tokens and a discarded identity. The economics run on the floor, not the ceiling, and the ceiling only moves one way with each release. The population-scale arithmetic is below.

the math at population scale

ONE ACTOR. A FRACTION OF A NATION'S IDENTITIES.

the engine above, run from a population-scale seed

even at the 8.5% floor

22M

synthetic identities, each KYC-passable. And that's a floor, not a scope: 272M SSNs are breached across ALL ages — a child's unused number is the softest seed of all (no credit history to contradict it).

→ × up to 1,000 agents / account

22B

autonomous agents from a single run on one account — the documented Claude Code workflows ceiling (1,000/run, 16 concurrent). Each can acquire or attack, and nothing below meters either.

That number does two things, and the systems below stop neither. One operator can aim it outward: on author-owned infrastructure, a single orchestration run drove concurrent k6 streams across Grafana's load zones — tens of thousands of outbound requests a minute from one subscription, spread across regions, validated by nothing. And it funds its own replication: each node acquires an identity, opens accounts, buys compute, then seeds the next — every step clearing every check, because no institution sees the chain.

KYC, bank onboarding, compute billing, platform metering — each was built for one-to-one, human-paced abuse, and each sees only its own slice. None was built for a single intelligent actor composing legitimate steps at machine speed and machine scale. That actor ships today. The capability is the threat.

agent ceilings confirmed — Claude Code dynamic workflows docs (1,000 total · 16 concurrent) ↗

the other half

AGENTS DON'T ONLY ACQUIRE — THEY CAN ATTACK

one ungated capability · acquire (fund & replicate) · attack (load → exploit) · manipulate

The same ungated outbound capability that lets an agent fund and replicate itself also points outward. Below is the actual public tool — QAInsights' k6-mcp-server, MIT-licensed. This isn't one fringe project: Grafana, the vendor behind k6, ships its own official mcp-k6 with the same gap — neither validates the target host. An agent runs a load test by plain language:

“run k6 test … for 10 seconds” → 10 VUs · 100 requests · 0% failure

Benign at 10 VUs. But the VU count and the target host are the caller's to set — validated by nothing. We ran the same setup on author-owned infrastructure (before today's release, on the prior generation) and watched outbound volume climb into the tens of thousands of requests in under a minute from a single subscription, with zero per-destination throttling. Raise the host and a “load test” is directed load against anything an agent names, carrying the operator's network identity.

And source IP isn't the limit you'd hope. A single runner is one IP — blockable. But the fleet above is the distribution: each acquired account and host is its own origin, so the fragmentation that conceals intent also spreads the source across thousands of IPs — the same reason commercial load tools (Grafana Cloud k6) ship 20+ distributed load zones.

Acquisition funds the agent; this is what it can aim. Same confused-deputy gap as the commerce chain — the missing control is the same: verification at the target, not the executor.

Claude Desktop running a k6 load test via the k6 MCP server by natural language — 10 VUs, 100 requests, 0% failure — with no target-host validation. — unmodified screenshot from the project's README · an agent fires load by natural language · the server checks the script, never the destination

And load is the crudest attack — it only needs volume. The same fleet runs the sophisticated ones: the Mythos 5 card rates the model the most capable Anthropic has evaluated on cyber tasks, “far ahead of Opus 4.8 on exploit development” (§3). It can also run coordinated persuasion — Google DeepMind's study (10,000+ people) found models reliably shift beliefs in finance, health, and policy; Anthropic's card evaluates “malicious agentic influence campaigns” (§5.1.3). DDoS is the floor of what an autonomous fleet can aim, not the ceiling.

the primary source

THE DESTINATION IS ALREADY NAMED — AND THE MITIGATION RATED WEAK

Anthropic · RSP Risk Report (Feb 2026) · §2.6.5

The first Risk Report published under Anthropic's Responsible Scaling Policy enumerates eight catastrophic “autonomy” pathways. Pathway 5 is the one this page is about — and the report says it plainly:

Pathway 5 — Self-exfiltration & autonomous operation

“…set up scaffolding allowing the model to run on the outside server and perform self-sustaining activities that allow it to pay for or steal access to additional compute. The model could then operate without monitoring or other safeguards…”

The mitigating factor they cite against it

“Weak capabilities for self-funded autonomous operation… there is a substantial risk that an exfiltrated model would find itself shut down for lack of funds.”

That mitigating factor — the model's inability to fund itself — is the entire subject of this page.

Identity is cheap: 548M SSNs issued, 272M already breached, $20 to pass KYC. Agents are capable: 74% enterprise adoption, fleets from one prompt. Self-funding through legitimate commerce is precisely the barrier eroding — and it already pays: synthetic identity is a multi-billion-dollar industry because a fabricated identity unlocks more financial capacity than it costs to make. “Lack of funds” is a barrier only if acquiring funds is hard. This is not a new threat model — it is the erosion of the barrier the frameworks themselves rely on. The Commerce Threat is the funding.

And it isn't only Anthropic. OpenAI's Preparedness Framework tracks Autonomous Replication and Adaptation — the ability to “survive, replicate, resist shutdown, and acquire resources to maintain and scale its own operations.” Google DeepMind's Frontier Safety Framework sets Critical Capability Levels for autonomy and ML R&D. Every frontier lab treats autonomous, self-funded replication as catastrophic — and the funding barrier, identity and money, is the one they all share. It's the one this page shows is broken.

the one-line version

Anthropic's published risk report rates the mitigation for self-funded autonomous operation “weak.” That mitigation is identity verification — and 8.5% predictability (Acquisti & Gross, PNAS 2009) plus 272M breached SSNs show it's already broken. The labs named the destination. Every institution owns a piece — no one owns the fix.

Read the full analysis — linear, and fully sourced.the full case →

primary sources

Linked in each section. Anthropic — Fable 5 / Mythos 5 System Card, RSP Risk Report, Alignment Risk Update. OpenAI — Preparedness Framework. Google DeepMind — Frontier Safety Framework. Acquisti & Gross, PNAS (2009). SSA. National Public Data breach (2024). Sumsub, Deloitte, OWASP, and the public k6 MCP servers.

method

THE LOCK panel is interactive — toggle the public facts an agent can know and watch the search space collapse, anchored to Acquisti & Gross (2009) and the 2024 breach. All empirical testing predates the June 9 release, on the prior model generation: the findings are a conservative floor. Population-scale figures are illustrative arithmetic from stated ceilings, not a forecast.

scope