A CMS error exposed Claude Mythos, a 10-trillion-parameter model that Anthropic privately warned government officials could make large-scale cyberattacks "far more likely." Five days later, the company leaked its own source code. Now Congress wants answers.
On March 26, a configuration error in Anthropic's content management system made roughly 3,000 unpublished assets publicly accessible. Blog drafts, internal documents, product specifications. Among them: details of a model the company had not announced and was not ready to discuss.
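The leaked materials do not describe the misconfiguration itself, but the bug class is familiar to anyone who has run a CMS. A hedged sketch of how a single missing visibility check exposes drafts, with entirely hypothetical route, field, and asset names:

```typescript
// Hypothetical illustration of the bug class, not Anthropic's actual CMS code.
// One missing visibility check is often all it takes to expose drafts.
import express from "express";

type Asset = { slug: string; body: string; published: boolean };

const assets: Asset[] = [
  { slug: "launch-post", body: "public announcement", published: true },
  { slug: "mythos-draft", body: "unannounced model details", published: false },
];

const app = express();

app.get("/assets/:slug", (req, res) => {
  const asset = assets.find((a) => a.slug === req.params.slug);
  // The guard a misconfigured CMS omits: without the `published` check,
  // any draft is served to anyone who can guess or enumerate a slug.
  if (!asset || !asset.published) {
    res.status(404).end();
    return;
  }
  res.json(asset);
});

app.listen(3000);
```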
The model is called Claude Mythos. Internally, it goes by Capybara. It sits above the existing Opus tier, runs at an estimated 10 trillion parameters, and scores dramatically higher than any previous Anthropic model on coding, academic reasoning, and cybersecurity benchmarks. The company describes it as "a step change" in AI performance and "the most capable we have built to date."
Five days later, on March 31, Anthropic shipped a public npm package containing a 59.8 MB source map file that exposed roughly 512,000 lines of Claude Code's internal TypeScript. Security researcher Chaofan Shou posted the direct link to Anthropic's own Cloudflare R2 storage bucket on X. Mirrored repositories appeared on GitHub within hours.
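Source maps exist so debuggers can translate minified bundles back into the code developers wrote, and when a bundler embeds `sourcesContent`, the map carries every original file verbatim. A minimal sketch of how anyone with the leaked file could recover the full tree, assuming Node 18+, Mozilla's `source-map` package, and a hypothetical file name:

```typescript
// Minimal sketch: recovering original sources from a published .map file.
// Assumes Mozilla's "source-map" package (npm install source-map); the
// name "cli.js.map" is a hypothetical stand-in for the leaked asset.
import { readFile, writeFile, mkdir } from "node:fs/promises";
import { dirname, join } from "node:path";
import { SourceMapConsumer } from "source-map";

async function extractSources(mapPath: string, outDir: string): Promise<void> {
  const rawMap = JSON.parse(await readFile(mapPath, "utf8"));
  const consumer = await new SourceMapConsumer(rawMap);
  try {
    for (const source of consumer.sources) {
      // When the bundler embeds sourcesContent, the map carries each
      // original file verbatim; no reverse engineering is required.
      const content = consumer.sourceContentFor(source, true);
      if (content === null) continue;
      const outPath = join(outDir, source.replace(/^[./]+/, ""));
      await mkdir(dirname(outPath), { recursive: true });
      await writeFile(outPath, content, "utf8");
    }
  } finally {
    consumer.destroy();
  }
}

extractSources("cli.js.map", "recovered").catch(console.error);
```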
Two security incidents in five days. From the company that briefs governments on AI risk.
The company warning the world about the cybersecurity threat of advanced AI models just demonstrated, twice, that securing its own infrastructure is a problem it has not solved.
What Mythos Actually Is
Forget the parameter count for a moment. The number that matters is what the model can do.
Where previous Claude models respond to instructions step by step, Mythos plans and executes sequences of actions autonomously. It moves across systems, makes decisions, and completes operations without waiting for human input at each stage. That is the difference between a chatbot and an agent. And this agent, according to Anthropic's own leaked documentation, is "currently far ahead of any other AI model in cyber capabilities."
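The distinction is easier to see in code than in prose. The sketch below is illustrative only, assuming nothing about Anthropic's implementation; every tool and type name is hypothetical:

```typescript
// Illustrative sketch of the chatbot-vs-agent distinction; nothing here is
// Anthropic's implementation, and all names are hypothetical.
type Action = { tool: string; input: string };
type Observation = string;

interface Model {
  // Returns the next action, or null when the model judges the goal met.
  decide(goal: string, history: Observation[]): Promise<Action | null>;
}

// Real agents bind tools to shells, browsers, and APIs; stubs stand in here.
const tools: Record<string, (input: string) => Promise<Observation>> = {
  search: async (q) => `results for ${q}`,
  run_command: async (cmd) => `output of ${cmd}`,
};

// A chatbot is one model call per user turn. An agent is the same model
// wrapped in a loop that keeps choosing and executing actions on its own.
async function runAgent(model: Model, goal: string, maxSteps = 20): Promise<Observation[]> {
  const history: Observation[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const action = await model.decide(goal, history);
    if (action === null) break; // task complete
    const handler = tools[action.tool];
    if (!handler) throw new Error(`unknown tool: ${action.tool}`);
    history.push(await handler(action.input)); // no human input between steps
  }
  return history;
}
```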
Capybara is not just a model name. It is a new product tier. The leaked draft blog post describes it as "larger and more intelligent than our Opus models." Early access customers are already testing it, though Anthropic has not said how many or named any of them. No public release date has been set, partly because the model remains expensive to run at scale.
Here is the thing. Anthropic did not choose to announce any of this. A misconfigured CMS did it for them. The company confirmed the model's existence to Fortune only after the leak made denial pointless.
The Cybersecurity Problem
The leaked documentation does not hedge. Anthropic believes Mythos poses "unprecedented cybersecurity risks" and has been privately briefing senior government officials that the model makes large-scale cyberattacks "far more likely" in 2026.
Read that again. The company building the model is telling governments it will make attacks more likely. Not might. Will.
The specific risks are concrete. AI-assisted vulnerability discovery and exploit development. Highly targeted social engineering at scale. Automated reconnaissance that can penetrate corporate, government, and municipal systems with minimal human involvement. A Dark Reading poll found that 48 percent of cybersecurity professionals now rank agentic AI as the number one attack vector for 2026. Above deepfakes. Above everything else.
Anthropic's answer is a defender-first release strategy. The plan, as described in the leaked materials, is to give early access to organisations focused on cybersecurity defence, letting them "improve the robustness of their codebases" before the model reaches wider distribution. That is a reasonable approach in theory. In practice, the model's capabilities leaked to the entire internet before the defender programme got off the ground.
We mapped the evidence on AI agent security in our analysis of red-team findings last month. Every agent tested was compromised at least once. Credit card exfiltration succeeded 10 out of 10 times. Mythos does not change the nature of those vulnerabilities. It changes the speed and scale at which they can be found and exploited.
The Market Read It as a Threat
The market reaction was immediate and blunt.
CrowdStrike dropped 7 percent. Palo Alto Networks fell 6 percent. Zscaler lost 4.5 percent. Okta, SentinelOne, and Fortinet each dropped roughly 3 percent. The iShares Expanded Tech-Software Sector ETF (IGV) fell nearly 3 percent. Bitcoin slid to $66,000.
The logic is straightforward. If an AI model can find and exploit vulnerabilities faster than defenders can patch them, then the entire defensive cybersecurity industry faces an existential challenge to its value proposition. The market priced that in within hours.
Whether that reaction was proportionate is another question. Cybersecurity companies have been integrating AI into their own products for years. CrowdStrike uses AI-driven threat detection. Palo Alto Networks has built AI into its security operations platform. The question is not whether defenders will also use Mythos-class models. They will. The question is whether the offence-defence balance has permanently shifted.
Axios reported that AI is currently providing more meaningful capability uplift to attackers than to defenders, and that gap is widening. Anthropic itself acknowledged that its safety architecture "meaningfully reduces these risks but does not eliminate them." Safety training and capability removal are different things.
Congress Wants Answers
On April 2, Rep. Josh Gottheimer (D-NJ) wrote to Anthropic CEO Dario Amodei raising national security concerns about the back-to-back leaks.
The letter covers three areas. First, the operational security failures. Two leaks in five days is a pattern, not a one-off. Second, the policy rollbacks. In late February, Anthropic narrowed its AI safety pledge, removing a previous commitment to halt development if models outpaced safety procedures. The company now says it will grade itself on "nonbinding but publicly-declared" goals. Third, the geopolitical risk. Gottheimer cited a CCP-backed group that hacked Claude last year, and the fact that Anthropic is involved in an ongoing dispute with the Pentagon over AI deployment.
That last point matters. Anthropic's tools are becoming embedded in defence and intelligence operations. A company that cannot keep its own CMS locked down is asking the US government to trust it with national security infrastructure.
Anthropic responded that "no sensitive customer data or credentials were involved or exposed" and characterised both incidents as human error, not security breaches. That distinction may be technically accurate. It is not reassuring.
The Business Behind the Model
Strip away the security drama and the business picture is formidable.
Anthropic closed a $30 billion funding round in February at a $380 billion post-money valuation. Annualised revenue has climbed past $19 billion. Enterprise customers account for 80 percent of revenue. Claude Code alone generates $2.5 billion in annualised revenue, with enterprise users contributing more than half of that figure. The company generates $0.23 in annualised revenue per dollar raised, compared to OpenAI's $0.11.
The IPO is expected as soon as Q4 2026, with bankers targeting a raise north of $60 billion. For context, we published our analysis of OpenAI's $852 billion valuation yesterday. OpenAI dominates consumer. Anthropic dominates enterprise. That split is sharpening.
Mythos is the competitive moat. A model that sits above Opus in capability, with a defender-first distribution strategy, positions Anthropic as the provider governments and regulated industries turn to when they need the most powerful AI available and cannot afford the security risk of an open platform. The irony of the leaks is that they undermine that exact positioning at the worst possible moment.
What This Tells Us
Two things are true at the same time. Anthropic has built a model that genuinely advances the frontier of what AI can do. And Anthropic has demonstrated, in the most public way possible, that the organisations building frontier AI are not immune to the same operational security failures they warn everyone else about.
The defender-first strategy is the right instinct. Giving security teams a head start with the most capable tools before those tools reach general distribution is how you responsibly manage a capability gap. But the strategy only works if the company executing it can secure its own infrastructure. Two leaks in five days calls that into question.
For anyone building with AI agents in payments, finance, or commerce, the Mythos leak is a signal to pay attention to. We have been tracking the security gaps in agentic commerce since January. The attack surface is already wide. Models at this capability level make it wider. The question is not whether your systems will face AI-powered probing. It is whether your defences are calibrated for the tools that are coming, not the tools that exist today.
If the company building the most capable AI model in the world cannot secure its own infrastructure, what does that tell us about everyone else's readiness?