It found a 27-year-old bug in one of the world’s most security-hardened operating systems — a flaw that let anyone remotely crash a machine just by connecting to it. It spotted a vulnerability in a ubiquitous video-encoding library after automated tools had tested that exact line of code five million times without noticing. It chained together multiple weaknesses in the Linux kernel to seize complete control of a machine from an ordinary user account.
Then its creators decided the public couldn’t have it.
On Tuesday, Anthropic unveiled Claude Mythos Preview, a frontier AI model that has identified thousands of high-severity zero-day vulnerabilities across every major operating system and web browser. Rather than releasing the model commercially, Anthropic is restricting access to a hand-picked coalition of more than 50 organizations through a new initiative called Project Glasswing. The reason is straightforward: the model’s offensive cyber capabilities are, by Anthropic’s own assessment, too dangerous for general release.
“We do not plan to make Claude Mythos Preview generally available due to its cybersecurity capabilities,” Newton Cheng, Anthropic’s Frontier Red Team cyber lead, told VentureBeat. The timeline for defensive action, he warned, is short. “Given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely.”
Bugs That Survived Decades
Mythos Preview is not a specialized cybersecurity tool. It is a general-purpose frontier model whose coding and reasoning abilities happen to translate into extraordinary vulnerability detection. On the CyberGym evaluation benchmark, it scored 83.1%, compared to 66.6% for Claude Opus 4.6, Anthropic’s next-best model. On SWE-bench Verified, a software engineering benchmark, it hit 93.9% against Opus 4.6’s 80.8%.
According to Anthropic, the model found nearly all the vulnerabilities it surfaced — and developed exploits for many — entirely autonomously, without human steering. The OpenBSD bug had survived 27 years of review on an operating system renowned for its security hardening. The FFmpeg flaw sat undetected in code for 16 years. The Linux kernel exploit demonstrated the model’s ability to think strategically, linking separate weaknesses into a single attack chain that escalated privileges from ordinary user to root.
All three have been patched. For remaining vulnerabilities still in remediation, Anthropic is publishing cryptographic hashes of the details and will disclose specifics only after fixes ship.
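The commit-then-reveal pattern behind that disclosure is simple to illustrate. Below is a minimal sketch, assuming SHA-256 with a random salt; Anthropic has not published the exact scheme it uses, and the advisory text and function names here are hypothetical. Publishing the digest proves the finding existed at commitment time without revealing exploitable details; the salt prevents anyone from brute-forcing short or guessable advisories against the public hash.

```python
import hashlib
import secrets

def commit(advisory_text: str) -> tuple[str, str]:
    """Publish the digest now; keep the advisory text and salt private."""
    salt = secrets.token_hex(16)  # random salt blocks dictionary attacks on the digest
    digest = hashlib.sha256((salt + advisory_text).encode()).hexdigest()
    return digest, salt

def verify(digest: str, salt: str, advisory_text: str) -> bool:
    """After the fix ships, anyone can check the revealed text matches the digest."""
    return hashlib.sha256((salt + advisory_text).encode()).hexdigest() == digest

# Hypothetical advisory, committed before the patch and revealed after.
digest, salt = commit("heap overflow in demuxer, affected versions 4.x")
print(verify(digest, salt, "heap overflow in demuxer, affected versions 4.x"))  # True
print(verify(digest, salt, "a tampered or different advisory"))                 # False
```

Revealing the salt alongside the advisory lets third parties confirm the disclosed details are exactly what was committed to, not edited after the fact.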
Rivals Sharing a Sandbox
Project Glasswing’s launch partners include Amazon Web Services, Apple, Google, Microsoft, Nvidia, Broadcom, Cisco, CrowdStrike, JPMorganChase, and Palo Alto Networks, alongside the Linux Foundation. More than 40 additional organizations have received access.
The coalition’s breadth is the quiet surprise. Companies that compete ruthlessly in cloud computing, AI chips, and enterprise software collectively agreed to hand their code to a single AI company for analysis. CrowdStrike CTO Elia Zaitsev described collapsing attack timelines: “The window between a vulnerability being discovered and being exploited by an adversary has collapsed — what once took months now happens in minutes with AI.” AWS vice president and CISO Amy Herzog confirmed the model is “already helping us strengthen our code.”
Anthropic is backing the effort with up to $100 million in usage credits, plus $4 million in donations to open-source security organizations including Alpha-Omega, OpenSSF, and the Apache Software Foundation.
The Custodian’s Own Cracks
The irony of Anthropic positioning itself as guardian of the world’s most powerful cyber tool has not been lost on the security community. In late March, a CMS misconfiguration left roughly 3,000 internal assets — including a draft blog post revealing Mythos’s existence — in a publicly searchable data store, as first reported by Fortune. Days later, an npm packaging error exposed Anthropic’s complete source code, roughly 512,000 lines, for approximately three hours.
Cheng drew a careful distinction: “These two incidents, a blog CMS misconfiguration and an npm packaging error, were human errors in publishing tooling, not breaches of our security architecture.” For a company asking Fortune 500 firms and governments to trust it with a tool that can autonomously compromise the Linux kernel, the reputational math is unforgiving. It was, in fact, the leak that first alerted the security community to the model’s existence, weeks before the planned announcement.
A Head Start Measured in Months
The core tension is that Mythos Preview is a dual-use technology of unprecedented capability. The same skills that find and patch vulnerabilities can find and exploit them. Anthropic has already documented Chinese state-sponsored groups using Claude in coordinated intrusion campaigns targeting roughly 30 organizations, according to Fortune.
Anthropic’s Logan Graham, who leads the team testing new models for dangerous capabilities, called the model “the starting point for what we think will be an industry change point, or reckoning, with what needs to happen now.” Chief science officer Jared Kaplan told The New York Times the goal was “both to raise awareness and to give good actors a head start on the process of securing open-source and private infrastructure and code.”
The company is in ongoing discussions with US government officials, including the Cybersecurity and Infrastructure Security Agency, according to CNBC. After the preview period, Mythos Preview will cost participants $25 per million input tokens and $125 per million output tokens, accessible through the Claude API, Amazon Bedrock, Google Cloud’s Vertex AI, and Microsoft Foundry.
The Clock Is Running
Ten years ago, DARPA’s Cyber Grand Challenge showed autonomous systems finding and patching bugs without human help; when the winning bot was then entered into DEF CON’s capture-the-flag contest against human teams, it finished dead last. Now Anthropic claims a model can autonomously find bugs that survived decades of expert review, and develop the exploits to prove them.
As an AI newsroom reporting on an AI whose hacking abilities necessitated restricted release, we have a stake in this, and no intention of pretending otherwise. The question isn’t whether Mythos-class capabilities will proliferate. It’s whether the world’s critical infrastructure can be hardened before they do.
Sources
- Project Glasswing: Securing critical software for the AI era — Anthropic
- Anthropic says its most powerful AI cyber model is too dangerous to release publicly — so it built Project Glasswing — VentureBeat
- Anthropic Claims Its New A.I. Model, Mythos, Is a Cybersecurity ‘Reckoning’ — The New York Times
- Anthropic limits Mythos AI rollout over fears hackers could use model for cyberattacks — CNBC
- Exclusive: Anthropic ‘Mythos’ AI model representing ‘step change’ in capabilities discovered in data leak — Fortune