
After Mythos

Anthropic’s latest AI model, Claude Mythos, has arrived with visions of impending upheaval and future crises in tow. Within days of the model’s launch, senior federal financial officials convened a special meeting with major U.S. bank CEOs to discuss its cybersecurity implications. The concern was warranted: Anthropic reports that Mythos found and, in some cases, exploited zero-day vulnerabilities in major software infrastructure, including decades-old bugs that had previously escaped extensive review. These accounts, together with independent assessments, indicate that Mythos represents a step-change in AI capability. Frontier AI governance must evolve accordingly. The arrival of genuinely dangerous AI capabilities requires moving beyond the current scheme of voluntary, developer-controlled oversight toward independent, structured pre-deployment assurance.

Progress Worth Reckoning With

Rather than releasing Mythos publicly, Anthropic launched Project Glasswing, a joint effort with dozens of software infrastructure organizations (including AWS, Apple, Microsoft, Google, and Cisco) to identify and patch vulnerabilities in some of the world’s most critical software infrastructure. The scale and urgency of this response reflect the remarkable cyber-proficiency revealed during Mythos’s testing. Anthropic’s chief science officer has disclosed that these capabilities are not the product of specialized training but rather reflect one of many areas in which the model surpasses its predecessors. Presumably, other frontier labs pursuing the same improvements are on a similar trajectory. Indeed, independent cybersecurity experts and policy analysts caution that Mythos’s most concerning capabilities will likely proliferate within a relatively short time as Anthropic’s competitors and open-weight models catch up. Reports of rival models with similar cyber-proficiency are already emerging.

Governance Unequal to the Moment

As one observer has noted, a single company presently controls a model with the demonstrated capacity to imperil critical software infrastructure worldwide; within weeks or months, several others will presumably possess the same capability. Anthropic’s decision to withhold Mythos and establish Glasswing was responsible, yet ultimately voluntary. Little in today’s AI governance landscape would reliably compel the next developer that achieves a comparable step-change to exercise similar restraint. Even among developers that do maintain frontier AI frameworks, operative language could prove ambiguous enough to permit prioritizing deployment over a framework’s internal risk mitigations. This is what allegedly occurred when OpenAI’s GPT-5.3-Codex model triggered the company’s own “high” risk designation for cybersecurity yet was deployed without certain safeguards that, some argued, the classification required.

Existing frontier governance schemes are also largely entity-level in scope, addressing what a single developer does with its own AI technologies. However, the proliferation of AI capabilities is not exclusively, or even primarily, an entity-level phenomenon. It occurs through research diffusion, open-weight releases, and (unauthorized) model distillation. Current schemes fail to meaningfully address these vectors, which involve actors beyond any single developer’s control. Recent reports of massive distillation campaigns by Chinese AI developers underscore this point: geopolitical competitors are already systematically extracting frontier capabilities through unauthorized means. Ultimately, no single developer, however responsible, can govern AI capability proliferation alone.

Elements for an Improved Scheme

The Governance Gap

Today: (largely) voluntary; entity-level scope; developer self-assessed; no pre-deployment oversight.

What’s needed: broader-scope governance; independent oversight; third-party assurance; structured safety cases.

Whether to deploy advanced AI models with genuinely dangerous capabilities should not rest on a developer’s unilateral judgment.

Now that models like Mythos exist, independent pre-deployment oversight is increasingly essential, both to verify the adequacy of a developer’s own safety determinations and to ensure that deployment decisions are informed by external assessments. Bipartisan legislative efforts to establish mandatory federal pre-deployment evaluation for advanced AI systems reflect growing recognition of this need. Meeting it in practice may require building third-party assurance capacity: an ecosystem of independent evaluators capable of rigorously assessing frontier model risks and verifying developer safety claims.

A key tool for structuring pre-deployment scrutiny is the safety case: a structured, evidence-based argument that a system is acceptably safe for a given deployment context. The concept is borrowed from safety-critical industries such as energy and aviation, where affirmative demonstrations of safety have long been standard practice before deployment. For frontier models of Mythos-level significance, safety cases should be a core component of an improved frontier governance scheme. Explicit safety-case commitments vary considerably among frontier AI developers, but the underlying logic is compelling: requiring developers to marshal evidence for specific safety claims and to state their assumptions and residual uncertainties explicitly imposes a discipline that evaluation disclosures (system cards) and broader frontier AI frameworks alone cannot provide.

Yet while safety cases are necessary, they are likely not sufficient. They can discipline decisions that run through a developer, whether closed-source deployments or open-weight releases. But they cannot address proliferative dynamics that bypass developers entirely, such as research diffusion. Governing genuinely dangerous AI capabilities will therefore require complementary measures beyond what any single developer can be expected to provide.

Anatomy of a Safety Case

Safety claim: what the developer asserts about the system’s safety.
Evidence: testing, evaluation, and data that support the claim.
Assumptions & residual uncertainties: what is conditionally true or remains unknown.
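To make this anatomy concrete, the sketch below shows one way a safety case might be represented in machine-readable form, so that review tooling could flag gaps automatically. It is a minimal illustration in Python; the schema, field names, and example content are assumptions made for exposition, not any developer’s actual format.

# Illustrative sketch only: a hypothetical machine-readable safety case.
# The schema and example content are assumptions for exposition; they do
# not reflect any frontier developer's actual safety-case format.
from dataclasses import dataclass, field

@dataclass
class Evidence:
    """Testing, evaluation, or data offered in support of a claim."""
    source: str   # e.g., a red-team report or benchmark suite
    summary: str  # what the evidence shows

@dataclass
class SafetyClaim:
    """What the developer asserts about the system's safety."""
    claim: str
    evidence: list[Evidence] = field(default_factory=list)
    assumptions: list[str] = field(default_factory=list)             # conditionally true
    residual_uncertainties: list[str] = field(default_factory=list)  # remains unknown

@dataclass
class SafetyCase:
    """A structured argument that a system is acceptably safe in a given context."""
    model: str
    deployment_context: str
    claims: list[SafetyClaim] = field(default_factory=list)

    def unsupported_claims(self) -> list[SafetyClaim]:
        # A reviewer-facing check: claims asserted without supporting evidence.
        return [c for c in self.claims if not c.evidence]

# Hypothetical usage: an independent evaluator flags claims lacking evidence.
case = SafetyCase(
    model="example-frontier-model",
    deployment_context="closed-source API deployment",
    claims=[
        SafetyClaim(
            claim="The model cannot autonomously develop working zero-day exploits.",
            evidence=[Evidence("cyber red-team evaluation",
                               "no exploit chain completed across elicitation trials")],
            assumptions=["evaluation scaffolding matches realistic attacker tooling"],
            residual_uncertainties=["post-deployment elicitation techniques may improve"],
        ),
        SafetyClaim(claim="Safeguards cannot be bypassed via fine-tuning."),  # no evidence attached
    ],
)
for c in case.unsupported_claims():
    print("Unsupported claim:", c.claim)

Even this toy structure illustrates the discipline the safety-case form imposes: a claim asserted without attached evidence is immediately visible to an outside reviewer.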

Before the Next Step-Change

Mythos may be the first frontier model in years whose developer concluded it was too dangerous for general release. It will likely not be the last. The voluntary restraint Anthropic exercised and its decision to stand up Glasswing are commendable, but they do not constitute a governance scheme. Policymakers should not assume that the precedent set by Anthropic’s choices will hold as other developers reach comparably significant capability thresholds.

The building blocks of a more durable governance scheme are available, including independent pre-deployment oversight, third-party assurance, and structured safety cases. What is missing is the urgency to assemble them.

Mythos has made the stakes of inaction visible. For all we know, the next step-change could make them tangible.