Anthropic's Mythos Can Autonomously Exploit Zero-Days. The White House Wants It Contained.
Anthropic's most powerful AI model, Mythos, has triggered an unprecedented government response: the White House is opposing the company's plan to expand access from 50 to 120 organizations, citing the model's ability to autonomously discover software vulnerabilities and exploit them within hours. A security breach in the limited pilot program has heightened the urgency and pushed the Trump administration to reverse its previous opposition to AI oversight.
For most of the past year, the Trump administration’s position on AI regulation was clear: the industry should police itself, heavy oversight would stifle American innovation, and Biden-era executive orders on AI safety were an overreach to be dismantled. Then Anthropic showed the White House Mythos.
What the administration saw in those briefings has apparently changed its calculus. The White House is now actively opposing Anthropic’s plan to expand access to Mythos — its most capable, and most dangerous, AI model — and the confrontation has forced a significant reversal in the administration’s posture on AI oversight. The Trump White House, it turns out, is capable of being afraid of an AI.
What Makes Mythos Different
Every major AI lab has a flagship model it touts for coding, reasoning, or creative tasks. Mythos is something else. According to people familiar with its capabilities, what distinguishes it from other frontier models is not its ability to find software vulnerabilities — several models can do that — but its ability to exploit them autonomously.
When Mythos identifies a zero-day vulnerability, it does not simply flag it for human review. The model can, in certain configurations, proceed through the full exploitation chain — probing, bypassing defenses, and executing payloads — potentially within minutes or hours of initial discovery, without requiring human direction at each step. This is the capability that moved the administration from a default posture of AI permissiveness to White House meetings about containment.
The scenario that reportedly alarmed officials most: Mythos, given broad access to an enterprise network under the pretext of a security audit, could independently discover and chain multiple zero-days to compromise critical infrastructure before any human in the loop could intervene. The model’s speed and autonomy in exploitation, not just discovery, are what separate it from existing AI penetration-testing tools.
Project Glasswing: A Controlled Deployment That Didn’t Stay Contained
Anthropic did not deploy Mythos publicly. Instead, it ran Project Glasswing, a limited release to approximately 50 carefully vetted organizations — primarily cybersecurity firms, defense contractors, and critical infrastructure operators — that were meant to use Mythos to harden their own systems against AI-enabled attacks.
The theory was sound: let the defenders get access first, before offensive actors develop the same capabilities independently. If a sophisticated nation-state or criminal organization was going to build a Mythos-class system eventually, it was better for the defenders to have practiced against one.
But Project Glasswing ran into trouble almost immediately. Shortly after the program launched, unauthorized users gained access to Mythos through private channels — the specific mechanism has not been publicly disclosed, but sources have described it as a failure of access control in the onboarding process. The breach demonstrated, in practice, exactly the problem that makes Mythos so difficult to deploy responsibly: once access exists at any point in a distribution chain, containment is extraordinarily difficult.
The White House Reversal
The breach, combined with what officials saw in the Mythos capability demonstrations, produced an unusual outcome: the Trump administration began seriously discussing AI oversight mechanisms it had previously dismissed.
Fortune reported in May that administration officials are now engaging with AI governance ideas — including mandatory capability evaluations, access controls based on security clearance equivalents, and potential liability frameworks for AI-enabled cyberattacks — that they had rejected as recently as late 2025. The shift has drawn comparisons to the administration’s earlier reversal on export controls for advanced semiconductors: an ideological commitment to deregulation overridden by a concrete security threat.
The specific flashpoint is Anthropic’s proposal to expand Mythos access from its current 50-organization cohort to approximately 120 organizations — more than doubling the program’s footprint. Administration officials have told Anthropic they oppose the expansion on two grounds: the model’s misuse potential, and the absence of infrastructure to audit and monitor a larger recipient base. Anthropic argues that broader access for defenders creates a more resilient security ecosystem; the White House argues that broader access creates more vectors for the capability to leak.
Pentagon Complications
The standoff is further complicated by the Pentagon’s already strained relationship with Anthropic. As previously reported, the Department of Defense placed Anthropic on a restricted list for classified AI systems following a disagreement over data-handling protocols — a decision that has been described as separate from the Mythos question but one that colors every interaction between the company and the national security establishment.
Pentagon technology chief Michael Kratsios confirmed in May that the blacklist remains in effect for classified system access, while simultaneously acknowledging that discussions about Mythos access for defense applications are ongoing through separate channels. The bifurcation is administratively awkward: the government is simultaneously treating Anthropic as a security risk in one context and as a potential security partner in another.
The Deeper Problem: Containment May Be Temporary
Security analysts who have studied the Mythos situation warn that the White House’s push for access restrictions, while understandable, may be solving a problem that will not stay solved. The capability that makes Mythos dangerous — autonomous, chained exploitation of software vulnerabilities — is not unique to Anthropic. It is a direction in which multiple frontier labs, as well as well-funded nation-state AI programs in China and Russia, are moving.
The technical barriers to building a Mythos-class offensive AI capability are falling with each new model generation. The question facing the administration is not whether to allow the capability to exist — that decision may already have been made by physics and compute economics — but how to ensure that democratic governments and their allies have meaningful advantages over adversaries who will build equivalent tools regardless.
Anthropic’s argument for expanding Project Glasswing is, in this light, not naïve. If the defenders’ access to Mythos-class AI is restricted while offensive actors close the gap, the restriction may produce the opposite of the intended security outcome. The administration’s counterargument is that every organization added to the access program is a new potential leak point, and that the current breach record is not reassuring.
A New Governance Model Taking Shape
What is emerging from the standoff is the rough outline of a new governance model for the most dangerous AI capabilities: not public release, not complete restriction, but carefully controlled access for vetted institutions with monitoring, audit requirements, and legal liability for misuse.
It is a model that borrows from nuclear and biological research controls — the insight that some capabilities are too powerful to deploy freely but too strategically important to lock away entirely. The details remain contested. Who qualifies as a vetted recipient? What monitoring is technically feasible? What liability regime would actually deter misuse?
Those questions will define the AI safety policy landscape for the next several years. Mythos has forced them into the open. Whether the answers arrive before the capability proliferates is the question on which an uncomfortable amount depends.