Project Glasswing pairs an unreleased frontier model with 50+ organizations to patch the world's most critical software before attackers catch up.
Anthropic just did something no major AI lab has done before. It built a model so capable at finding security vulnerabilities that it chose not to release it publicly. Instead, the company handed it to a coalition of the world's largest technology companies and told them to use it to fix their software before someone else uses something similar to break it.
The model is called Claude Mythos Preview. The initiative is called Project Glasswing. And what it has already found raises questions about the entire security posture of the software industry.
The cybersecurity landscape just split into before and after. Anthropic is betting that giving defenders a head start matters more than giving everyone access.
What Glasswing Actually Is
Project Glasswing is a coordinated cybersecurity effort built around Claude Mythos Preview, a general-purpose frontier model that Anthropic describes as its most powerful to date. The model was not specifically trained for cybersecurity. According to Anthropic's announcement, it is the model's "strong agentic coding and reasoning skills" that make it so effective at finding and exploiting software flaws.
The launch partners read like a who's who of global technology infrastructure: Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, Nvidia, and Palo Alto Networks. Roughly 40 additional organizations that build or maintain critical software have also been granted access.
Anthropic is committing up to $100 million in usage credits for the program, plus $4 million in direct donations to open-source security bodies, including $2.5 million to Alpha-Omega and OpenSSF through the Linux Foundation, and $1.5 million to the Apache Software Foundation, according to VentureBeat.
Access is restricted by design. Anthropic has no plans to make Mythos Preview generally available.
What the Model Found
In just a few weeks of operation, Claude Mythos Preview identified thousands of zero-day vulnerabilities, that is, flaws previously unknown to their developers, across every major operating system and every major web browser, along with a range of other critical software.
Several of these vulnerabilities had existed undetected for years. Fortune reported that the oldest was a 27-year-old bug in OpenBSD, an operating system specifically known for its strong security posture.
The technical approach is methodical. According to Anthropic's red team blog, the model operates inside isolated containers with no internet access. It is prompted to find vulnerabilities in a given software project, then agentically reads source code, forms hypotheses, runs the software to confirm or reject its suspicions, and outputs a bug report with a proof-of-concept exploit and reproduction steps. A second Mythos Preview agent then verifies whether each finding is genuine and significant.
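The find-then-verify loop described above can be sketched as a simple two-agent pipeline. This is an illustrative assumption, not Anthropic's actual interface: the function names, the BugReport structure, and the toy heuristic standing in for the model's reasoning are all hypothetical.

```python
# Hypothetical sketch of the two-agent find-then-verify loop.
# All names and the report structure are illustrative assumptions,
# not Anthropic's actual tooling.
from dataclasses import dataclass, field

@dataclass
class BugReport:
    """Candidate finding: location, hypothesis, and a proof-of-concept."""
    file: str
    hypothesis: str
    poc: str
    repro_steps: list[str] = field(default_factory=list)
    confirmed: bool = False

def finder_agent(source_files: dict[str, str]) -> list[BugReport]:
    """First agent: read source, form hypotheses, emit candidate reports.
    A toy string heuristic stands in for the model's actual reasoning."""
    reports = []
    for path, code in source_files.items():
        if "strcpy(" in code:  # toy stand-in for a real vulnerability hypothesis
            reports.append(BugReport(
                file=path,
                hypothesis="unbounded strcpy may overflow the destination buffer",
                poc="feed an oversized string to the parsing entry point",
                repro_steps=["build target", "run with long input", "observe crash"],
            ))
    return reports

def verifier_agent(report: BugReport) -> bool:
    """Second agent: independently check that the finding is genuine
    and significant before it reaches a human reviewer."""
    return bool(report.poc and report.repro_steps)

def pipeline(source_files: dict[str, str]) -> list[BugReport]:
    """Run the finder, then gate every candidate through the verifier."""
    candidates = finder_agent(source_files)
    for r in candidates:
        r.confirmed = verifier_agent(r)
    return [r for r in candidates if r.confirmed]
```

The key design choice the sketch captures is the separation of roles: the agent that proposes a finding never gets to confirm it, which filters out hallucinated or insignificant reports before a human ever sees them.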
Anthropic's own blog highlighted the key detail: the model identified vulnerabilities "and developed many related exploits, entirely autonomously, without any human steering." As we covered in our analysis of the original Mythos leak, this model was already raising alarms before Anthropic chose to acknowledge it publicly.
An AI model finding bugs that expert human security teams missed for nearly three decades is not an incremental improvement. It is a capability shift.
The Dual-Use Problem
The same capabilities that make Mythos Preview valuable for defense make it dangerous in the wrong hands. Anthropic is explicit about this tension. The company warned in its blog that "given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely."
This is not a theoretical concern. Claude Mythos Preview's existence was first revealed last month through a data leak, which Fortune first reported. A draft blog post inadvertently made public described the model as "by far the most powerful AI model" Anthropic had ever developed and "currently far ahead of any other AI model in cyber capabilities." Dianne Penn, head of product management at Anthropic, told The Verge the leak was caused by human error and was "not related to software vulnerabilities in any way."
The restricted release is therefore not simply a product decision. It is a containment strategy. Anthropic is giving defenders time to patch their systems before models with comparable capabilities become widely available, whether from Anthropic or from competitors. As Simon Willison noted, GPT-5.4 already has a growing reputation for security vulnerability discovery, and stronger models are on the near horizon from multiple labs.
The broader pattern is clear: AI-powered vulnerability discovery is accelerating faster than the industry's ability to patch. We explored the evidence behind this in our assessment of AI agent security research, where every agent tested in red-team exercises was compromised at least once.
Anthropic's long-term plan is to develop new safeguards and launch them with an upcoming Claude Opus model, one that the company says "does not pose the same level of risk as Mythos Preview." Legitimate security professionals whose work is affected by those safeguards will be able to apply to a forthcoming Cyber Verification Program.
The Government Question
Anthropic also confirmed it has been in "ongoing discussions with US government officials about Claude Mythos Preview and its offensive and defensive cyber capabilities." Penn told The Verge the company had "briefed senior officials in the US government about Mythos and what it can do." Newton Cheng, the cyber lead for Anthropic's frontier red team, said the company is "engaged with" the government but declined to specify exactly who had been briefed.
The timing is notable. This engagement comes despite Anthropic's highly public recent clash with the Trump administration. The company appears to be signaling that regardless of political friction, it views government coordination on frontier AI capabilities as non-negotiable, particularly when those capabilities have direct national security implications.
What This Means for the Industry
Project Glasswing sets a precedent that the industry will have to reckon with. When a single AI model can autonomously discover and exploit vulnerabilities at a scale and speed that human teams cannot match, the entire calculus of software security changes.
Jim Zemlin, CEO of the Linux Foundation, framed the core asymmetry well: security expertise has historically been a luxury reserved for organizations with large security teams, while open-source maintainers, whose software underpins much of the world's critical infrastructure, have been left to manage security on their own. Glasswing, he told VentureBeat, "offers a credible path to changing that equation."
For now, Anthropic is subsidizing the cost. But the program could evolve into a paid service, creating a new revenue stream if partners find it valuable enough to keep using. That commercial angle matters. Anthropic disclosed a revenue milestone on the same day it launched Glasswing, and a separate report confirmed an expanded compute deal with Google and Broadcom covering roughly 3.5 gigawatts of capacity.
The supply chain dimension is just as important. As we detailed in our analysis of the LiteLLM backdoor, open-source infrastructure that AI agents depend on is already being targeted. Glasswing's focus on hardening that same open-source layer is a direct response to a threat that is already live.
Glasswing is not just a security initiative. It is a blueprint for how frontier labs might handle capability overhang: restrict access, coordinate with industry, give defenders a head start, and build safeguards before broad release.
The question now is whether that head start is enough. Anthropic itself has acknowledged that models with similar capabilities are coming from other labs. The window between discovery and proliferation is shrinking. What Glasswing buys is time. What the industry does with that time will determine whether AI-driven cybersecurity becomes a durable advantage for defenders or simply another front in an escalating arms race.
When a single AI model can autonomously find vulnerabilities that human teams missed for decades, does restricting access protect us, or just delay the inevitable?