Mythos - PauseAI Internal Analysis & Talking Points

What happened

Anthropic announced Claude Mythos Preview, a new frontier AI model with unprecedented cyber capabilities. They announced it through Project Glasswing, a partnership with AWS, Apple, Cisco, CrowdStrike, Google, JPMorganChase, Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks.
Mythos has autonomously discovered thousands of high-severity zero-day vulnerabilities (previously unknown flaws) including some in every major operating system and every major web browser, plus a range of other critical software. Previous AI models could find vulnerabilities, but not at anything close to this scale or speed.
Key examples from Anthropic's own disclosure:
- Found a 27-year-old vulnerability in OpenBSD, one of the most security-hardened operating systems in the world, used to run firewalls and critical infrastructure. The flaw allowed remote crashing of any machine running the OS just by connecting to it.
- Discovered a 16-year-old vulnerability in FFmpeg (used by countless applications for video encoding/decoding), in a line of code that automated testing tools had hit 5 million times without catching the problem.
- Autonomously found and chained together several Linux kernel vulnerabilities to escalate from ordinary user access to complete control of the machine. Linux runs the majority of the world's servers.
Mythos did nearly all of this autonomously, without human steering.
Anthropic has chosen not to release Mythos to the general public. Instead, they are giving access to a select group of ~50 tech and security companies to patch vulnerabilities, committing up to $100M in usage credits and $4M in donations to open-source security organisations.
Benchmark results show Mythos is a massive leap over previous models, not an incremental improvement. For example, it scores 93.9% on SWE-bench Verified vs. 80.8% for Anthropic's previous best (Opus 4.6), and 83.1% on CyberGym vs. 66.6%.
Additional cyber benchmarks from the system card:
- 100% on Cybench: solved every tested challenge. No previous model came close. (p.48)
- 84% Firefox exploitation success rate vs. 0.8% for Opus 4.6, a ~100x improvement. Even when the two easiest bugs were removed, Mythos still achieved 85.2% by leveraging four other distinct bugs. (pp.49-51)
- First model to solve a private cyber range end-to-end, requiring discovering and executing a series of linked exploits across different hosts and network segments. Solved a corporate network attack simulation estimated to take a human expert over 10 hours. No other frontier model had previously completed it. (p.52)
Important nuance: the system card notes Mythos "failed to find any novel exploits in a properly configured sandbox with modern patches." (p.52) But the vast majority of real-world systems are NOT properly configured with modern patches, which is precisely the point.
Beyond cyber, bioweapons uplift: in virology protocol uplift trials, Mythos-assisted protocols averaged 4.3 critical failures, compared to 6.6 with Opus 4.6 and 5.6 with Opus 4.5. (p.28) Fewer critical failures indicates greater uplift, so cyber is not the only WMD-relevant capability domain.

Our analysis: why this is severe

The immediate threat: cyber capabilities

This is the first model that, if released openly, would pose a credible risk of civilisational catastrophe. Not a theoretical risk, but a concrete one. Anyone with access to a Mythos-class model could systematically exploit the open-source software that underpins the world's infrastructure.
The primary attack surface is open-source software: code that is publicly readable, and that runs inside almost everything. The most critical example is the Linux kernel, which powers the majority of the world's servers (cloud infrastructure, data centres, banking backends, medical systems, power grid control systems, logistics, government services). The Linux kernel exploit chain alone (going from ordinary user to full machine control) is enough to cause global-scale disruption.
Anthropic, right now, effectively has the capability to compromise most servers on the planet if it chose to use Mythos offensively. They won't, but the capability exists, concentrated in the hands of a single private company, with no democratic mandate or oversight.
For the reasons above, this is a weapon of mass destruction. Anthropic built it.
Anthropic is framing this as responsible behaviour: choosing not to release, investing in defence, partnering with industry. And it's true that not releasing the model is better than releasing it; releasing it would be insanely reckless. But this framing obscures the fundamental problem, which is that the problem is one they manufactured themselves. In short: "They created the threat, and now they're asking us to trust them with the solution."
No one asked for this. No one voted for it. There was no oversight. Anthropic developed Mythos without any external regulatory framework, any binding safety requirements, or any democratic input. A private company unilaterally decided to build a new kind of weapon of mass destruction, a system capable of compromising global infrastructure, and the only thing standing between that capability and catastrophe is their own judgment.

The bigger picture: recursive self-improvement and the road to superintelligence