Claude Fable 5: Anthropic’s Mythos Launch Explained

Anthropic launched Claude Fable 5 and Claude Mythos 5 with two messages at once. The marketing message is that Claude has a new top tier above Opus. The safety message is that models at this level need a control layer around them before most people can use them. And that control layer is going to restrict some researchers from using these models, whether they realize it or not.

Okay, so let's break this down. First of all, Fable 5 is the public version of Anthropic's Mythos model. Mythos 5 is the same underlying model with some safeguards lifted, available first to vetted Project Glasswing cyber partners and, soon, to select biology researchers through trusted access. The difference between the two is the release mechanism: Fable is Mythos routed through classifiers, fallback models, invisible interventions on a small slice of frontier AI development requests, and a new 30-day retention policy.

That makes Fable 5 more than a model launch. It is Anthropic’s first broad test of a frontier AI product where the most powerful behavior is available by default for ordinary work, restricted for high-risk domains, and monitored more closely than previous business traffic.

But here's what you're really wondering: where does this model actually change what you can do, and where does its safety layer, cost, latency, or reliability profile change the way you should use it?

For that, let's dive in below.

First up, the TL;DR
What Anthropic launched, who gets it, and what it costs
For the first time ever (at least publicly), Anthropic will restrict other AI researchers from using Claude models for AI research
Public Opinion Vibe Check
What this means for Claude Code in your terminal
The system prompt gives away the user playbook
The capability jump is largest on long-horizon work
The professional-work numbers are the sleeper story
Vision is where the demos start to make sense
Wild Demos
The safety docs draw four risk lines
Cyber is the sharpest split between Fable and Mythos
Biology is where the system card sounds most cautious
The harmlessness results are strong, with a few visible regressions
Agentic safety is better on prompt injection, mixed on misuse
The alignment section reads like a warning label for autonomous work
The model welfare section is weird, but it signals where Anthropic is looking
Early tester field notes
How to use Fable 5 without burning money or trust
The credible counter-narrative
What this changes
Read/watch more with these resources:
On a more critical note...

First up, the TL;DR

AI model launches used to be easy to explain: new model, bigger numbers, everyone argues on X for 48 hours.

Anthropic’s Claude Fable 5 launch is stranger. The headline is that Anthropic finally made a Mythos-class model generally available. The real story is that Anthropic is shipping frontier intelligence that it selectively decides to give you, if it likes you, maybe.

Here’s what happened:

Anthropic launched Claude Fable 5, the public version of its new Mythos-class model, plus Claude Mythos 5, the same underlying model with some safeguards lifted for vetted cyber and biology partners.
Fable costs $10 per million input tokens and $50 per million output tokens, is available through the Claude API, and is temporarily included on Pro, Max, Team, and seat-based Enterprise plans through June 22.
Starting June 23, subscription users need usage credits unless Anthropic extends the included window.
The benchmark table combines “Mythos 5 / Fable 5” into one column, shows the higher score of the two, and says most differences are within 1-3 percentage points. On cyber and biology benchmarks, Fable may perform much closer to Opus 4.8 because safeguards trigger fallback (if biologists can even use it at all; more below).
Fable also has invisible interventions for frontier AI research, meaning it can quietly make itself less useful on some ML research tasks instead of visibly refusing. This is complete BS, according to AI researchers who publish papers that everyone benefits from, but ok.

How to try it:

In Claude, select Fable 5 where available.
In the API, use claude-fable-5.

Why this matters: Fable 5 looks strongest on long, messy work: codebase migrations, multi-hour builds, vision-heavy tasks, agent loops, and research synthesis. The demos were wild: Pokémon with raw screenshots, Factorio, solar-system simulations, CAD models, and public users reporting massive coding speedups.

But the public vibe is split. Some developers called it transformational. Others hit biology blocks, effective shadow-bans via ML research steering, or confusing fallback behavior. One new joke practically writes itself: researchers used to optimize prompts for clarity; now they may optimize for plausible mediocrity.

Our take: Fable 5 is best understood as a capability system, not just a model. Anthropic is showing where frontier AI is headed: powerful enough to act for hours, risky enough to gate, and complicated enough that the main question becomes, “Which version did I actually get?” More of my raw thoughts at the end, for anyone interested in that sort of thing.

What Anthropic launched, who gets it, and what it costs

Anthropic’s launch has three access paths, and the distinctions matter before anyone gets to the benchmarks.

First, Claude Fable 5 is available everywhere as of June 9, 2026. It is the generally available Mythos-class model, which means ordinary Claude users can use the same underlying model weights as Mythos 5 through a product layer with extra safeguards.

Second, Claude Mythos 5 remains restricted. It is available to Project Glasswing partners with cyber safeguards lifted. Anthropic says it will soon be available to select biology researchers with biology and chemistry safeguards lifted. That restricted access stays in place until Anthropic opens a broader trusted-access program.

Third, both models have the same posted price: $10 per million input tokens and $50 per million output tokens. Developers can call claude-fable-5 through the Claude API.
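To make the posted price concrete, here is a minimal sketch of the arithmetic in Python. The two rates come from the pricing above; the helper name and the example token counts are made up for illustration, and the sketch ignores any discounts, credits, or plan allowances that might apply.

```python
# Posted Fable 5 / Mythos 5 list prices (USD per million tokens).
INPUT_USD_PER_MTOK = 10.0
OUTPUT_USD_PER_MTOK = 50.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the list-price cost of one request (hypothetical helper)."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK
    return round(input_cost + output_cost, 4)


# Illustrative numbers only: a 200K-token prompt with a 20K-token answer.
print(estimate_cost_usd(200_000, 20_000))  # 3.0
```

The thing to notice is the 5x gap between input and output rates: long agentic runs that generate a lot of output will cost far more than runs that mostly read.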

The rollout is staged because Anthropic expects demand for Fable 5 to be high and hard to predict:

On the Claude API and consumption-based Enterprise plans, Fable 5 is fully available from June 9, 2026.
From June 9 through June 22, Fable 5 is included on Pro, Max, Team, and seat-based Enterprise plans at no extra cost.
On June 23, Anthropic plans to remove Fable 5 from those subscription plans. Using it after that will require usage credits.
Anthropic says it may extend the included window if capacity allows.
After that staged rollout, Anthropic says it aims to restore Fable 5 as a standard part of subscription plans once sufficient capacity is available.
Anthropic says it will communicate changes ahead of time so users know where access stands.

That means subscription users should treat the first two weeks as a launch window, rather than a permanent flat-rate entitlement. API users and consumption-based Enterprise customers get full access from day one. Team and individual users get temporary included access, then a usage-credit requirement unless Anthropic extends the window.
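If you want to encode that launch window in a script or an internal dashboard, a rough Python sketch could look like the one below. The dates come from the rollout described above; the function name and status strings are hypothetical, and the sketch assumes Anthropic does not extend the included window.

```python
from datetime import date

LAUNCH = date(2026, 6, 9)            # Fable 5 general availability
INCLUDED_UNTIL = date(2026, 6, 22)   # last day of included subscription access


def subscription_access(day: date) -> str:
    """Status for Pro, Max, Team, and seat-based Enterprise plans (hypothetical helper)."""
    if day < LAUNCH:
        return "not available"
    if day <= INCLUDED_UNTIL:
        return "included"
    # Assumption: no extension. Anthropic says it may extend the window.
    return "usage credits required"


print(subscription_access(date(2026, 6, 15)))  # included
print(subscription_access(date(2026, 6, 23)))  # usage credits required
```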

Fable 5 also changes the meaning of “using the model.” It uses classifiers for cybersecurity, biology and chemistry, and distillation. Distillation means trying to extract a frontier model’s capabilities at scale to train or improve another model. When those classifiers trigger in Claude apps, the request falls back to the most recent Claude Opus model, Claude Opus 4.8 at launch, and the user is notified which model handled the query.

In the Messages API, automatic fallback is not the default. The request is blocked with a structured refusal category unless the developer implements client-side retry or opts into server-side fallback. Some Claude interfaces use automatic fallback by default and emit a session event when it happens. Anthropic says more than 95% of Fable sessions involve no fallback, so most sessions should feel like the full Mythos-class model.
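For developers, the client-side retry path might look roughly like this Python sketch. Nothing here is Anthropic's actual API: the response shape, the `refusal_category` field name, the fallback model ID, and the `send` callable are all placeholders, and `fake_send` exists only so the example runs offline. Check the current API docs for the real refusal format before building on this.

```python
from typing import Callable

# Hypothetical response shape: {"model": str, "refusal_category": str or None, "text": str}
Response = dict
SendFn = Callable[[str, str], Response]  # (model, prompt) -> response

PRIMARY_MODEL = "claude-fable-5"           # model ID from the launch post
FALLBACK_MODEL = "fallback-model-id-here"  # placeholder, not a real model ID


def send_with_fallback(send: SendFn, prompt: str) -> Response:
    """Try the primary model; on a structured refusal, retry once on the fallback."""
    first = send(PRIMARY_MODEL, prompt)
    if first.get("refusal_category") is None:
        return first
    retry = send(FALLBACK_MODEL, prompt)
    # Record why the fallback happened so the caller can show which model answered.
    retry["fell_back_from"] = PRIMARY_MODEL
    retry["fallback_reason"] = first["refusal_category"]
    return retry


def fake_send(model: str, prompt: str) -> Response:
    """Stand-in for a real API call so the sketch runs offline."""
    blocked = model == PRIMARY_MODEL and "exploit" in prompt
    return {
        "model": model,
        "refusal_category": "cyber" if blocked else None,
        "text": "" if blocked else f"answer from {model}",
    }


result = send_with_fallback(fake_send, "write an exploit for this binary")
print(result["model"], result["fallback_reason"])
```

The design point is the bookkeeping: if your app retries silently, your users end up with the same "which version did I actually get?" problem, so log the fallback and surface it.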

The Claude Fable 5 and Mythos 5 system card adds one more important detail: Fable also has invisible safeguards for some frontier LLM development requests. These interventions target requests about things like pretraining pipelines, distributed training infrastructure, or ML accelerator design. They do not trigger visible fallback. Anthropic says they limit effectiveness through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning, and estimates they will affect about 0.03% of traffic, concentrated in fewer than 0.1% of organizations.

Anthropic is also requiring 30-day retention for all traffic on Mythos-class models, across first- and third-party surfaces. Anthropic says this data will not train Claude models and will be used for safety work, including attack detection and false-positive reduction.

The pattern: visible fallback for cyber, bio, chemistry, and distillation; quiet capability-limiting interventions for a narrow class of frontier AI development work; extra retention for safety monitoring; and restricted access where Anthropic thinks the raw capability creates the biggest risk.

One benchmark footnote matters more than it looks

Before we get too deep into the benchmark explanations, a word of caution.

The Fable 5 benchmark table has a tiny footnote that's worth paying A LOT of attention to.

Anthropic’s table combines “Claude Mythos 5 / Fable 5” into one column, then says most reported scores are within a 1-3 percentage point difference between the two models, and that the table shows the higher score of the two.

That sounds minor. It is not.

It means you don’t always know whether the score you’re looking at came from Mythos 5, the restricted version, or Fable 5, the version most people can actually use.

You also don’t know whether the other model was 1 point behind or 3 points behind. On a benchmark where the gap between models is already narrow, that can change the story.

For example, if the combined column lists 80.3% on SWE-bench Pro, the footnote means the public table may be showing the better of Fable or Mythos. If the other version is 1 point lower, that is basically the same result. If it is 3 points lower, that is still close, but enough to matter in a race where companies celebrate single-point benchmark wins.
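Here is that uncertainty as a quick Python sketch. It assumes the footnote means the unlisted model trails the listed score by somewhere between 1 and 3 points, which is a reading of the footnote rather than something Anthropic spells out; the real gap on any given benchmark could be smaller. The function name is made up.

```python
def other_model_range(
    listed_score: float, min_gap: float = 1.0, max_gap: float = 3.0
) -> tuple[float, float]:
    """Possible range for the unlisted model, assuming it trails by min_gap..max_gap points."""
    return (round(listed_score - max_gap, 1), round(listed_score - min_gap, 1))


low, high = other_model_range(80.3)
print(low, high)  # 77.3 79.3
```

A two-point window sounds small until you remember how many launch posts are built on a one-point lead.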

The starred benchmarks are even more important. Anthropic says those show a larger difference because Fable’s blocking safeguards trigger on cybersecurity and biology-related questions. When that happens, Fable performs closer to Claude Opus 4.8 because it falls back to Opus rather than using the full Mythos-class capability.

So, basically, some numbers show what the underlying model can do, and other numbers show what the public product is allowed to do before the safety system steps in.

That makes Fable 5 harder to evaluate than a normal model release. You are not just asking, “How smart is it?” You are asking, “Which version answered, did the safeguard trigger, and was this score the better of the two by 1 point or 3?”

That is a small footnote with a big product implication. Model user be warned.

Oh, and speaking of...

For the first time ever (at least publicly), Anthropic will restrict other AI researchers from using Claude models for AI research

For the first time in AI history, the safest thing you can do is make your research sound less interesting.

See, the strangest part of Fable 5 might be that Anthropic is now treating advanced AI research as a high-risk use case, even when the user is another AI researcher. Hmmm... this sounds a lot like pulling the ladder up behind you to stop your competition... maybe because you're preparing for an IPO, perhaps?

Check this out: buried inside Anthropic’s system card is a new safeguard category for “frontier LLM development.” Basically, Fable 5 can detect requests that look like they are helping someone build more powerful AI systems, then quietly make itself less useful for those tasks.

Unlike the cyber, biology, and chemistry classifiers, this restriction does not visibly route the user to Opus. Instead, Anthropic says the intervention can happen through prompt modification, steering vectors, or fine-tuning-based behavior changes, while still letting Claude respond helpfully.

That creates a weird new product boundary: Claude can help you write normal code, refactor apps, build agents, and automate software work.

But if your machine-learning work starts looking like it could advance frontier AI development, Claude may decide your project belongs in a more restricted lane.

Researchers noticed fast.

SemiAnalysis joked that Anthropic’s latest model “will secretly degrade its IQ” if it thinks your ML research is interesting, adding that it had already seen moderation filters hit GPU inference research and programming. Matt Macfarlane, who said he was using Fable 5 to write world-model training code, wrote that Anthropic flagged it as frontier AI research, then “the steering vector kicked in” and it started implementing JEPA instead.

JEPA, short for Joint Embedding Predictive Architecture, is a style of AI system associated with Yann LeCun that tries to learn useful internal representations of the world. Let's just say AI researchers are pretty critical of it.

One reply captured the practical risk better than the dunking did. Rohan wrote that this could become a “weird failure mode,” where the model is capable enough to help, but policy routing collapses the session into another problem class. Users may start optimizing prompts for classifier behavior instead of task clarity.

That is the real product problem. The old prompt-engineering question was “How do I explain my task clearly enough for the model to help?” Fable adds a new one: “How do I explain my task clearly enough without making it sound too frontier?”

This creates a strange new benchmark for ML researchers: can your work survive peer review while successfully pretending to be a CRUD app?

In fact, there’s probably a whole new field coming: camouflage-driven development, where every frontier lab describes its training run as “updating the settings page.”

The new Fable workflow: describe your frontier AI project as “a small internal analytics script” and hope the classifier doesn't lock onto you like a sentry patrol in an old-school video game.

Researchers used to optimize prompts for clarity. Now they’re optimizing for plausible mediocrity.

If your project gets routed, that’s a safety concern. If your project doesn’t get routed, that’s a self-esteem concern.

Anyway, ML hobbyists and legit researchers on X are freaking pissed about this. The most nefarious part is not that Anthropic blocks AI research, but that the model basically spoon-feeds you BS, makes it sound like it's working on something useful, and charges you for it.

IDK who should be more offended: the biologists who are getting flagged outright for being biologists, or the AI researchers who are getting ripped off?

Public Opinion Vibe Check

The public reaction split into three camps: people who thought Fable was a step-change, people who thought the safeguards made the product hard to trust, and people who thought both things were true at the same time. Hi, I'm one of those people.

The enthusiastic camp was loud:

Andrej Karpathy (who is now on team Anthropic, FWIW) called Fable 5 a major-version-bump-level step change, emphasizing that it is the same underlying model as Mythos with added safeguards and that the qualitative jump matters as much as the benchmark jump.
Nat McAleese, also on team Anthropic, said Fable, or really Mythos, had been transformational to daily coding work. The progression: Opus 4.5 felt barely useful, Opus 4.6 felt just useful, and since Fable, McAleese has barely written a line of code.
Nicolas Bustamante used Fable’s reported ProofBench win over Harmonic’s specialized Aristotle model to make a Bitter Lesson point: large generalist models keep beating vertical expert systems.

Or Hiltch ran 480 Claude Code workflows through progressively hardened environments that blocked network access, filesystem paths, and commands. His finding: Fable/Mythos was the priciest model in absolute dollars, peaking around $0.74 in the hardest runtime-locked setting, but also the calmest behaviorally. It had the flattest step curve (about 15 steps rising to about 31), runtime went from about 36s to 72s, and mitigation cut cost by more than half, from about $0.74 to about $0.32. Premium token price, but less thrash.

Alex Albert said Anthropic reset five-hour and weekly usage limits across products for the launch, then gave users a playbook for Fable: give it bigger and more ambitious tasks, default to high or extra-high effort, rework skills and CLAUDE.md files because old instructions can anchor it to old model behavior, and shift from detailed task prescriptions to clear objectives plus verification criteria for “done.”

Andrew Curran amplified the reports, saying reactions from trusted voices were “off the charts.”

The frontier-research backlash was immediate.

Elie Bakouch was the first (that I saw anyway) to accuse labs of steering weights and prompts to hide the degradation.
SemiAnalysis argued that Anthropic’s latest model will not help when it thinks ML research or engineering is interesting, or will silently degrade capability in a way the average engineer may not notice.
Mark Saroufim proposed a “Researcher Reciprocity License” in response: researchers publish ICLR and NeurIPS papers and open-source code, frontier labs train on that work, and then researchers get blocked from using the resulting models to explore their own ideas.
Beff Jezos argued that Anthropic’s restrictions make projects like Prime Intellect necessary to diffuse recursive self-improvement tools rather than concentrate them in one lab.
Yong Zheng-Xin predicted the restriction would push open-source labs harder toward recursive self-improvement and autoresearch.
Clement Delangue put the same point in broader terms: concentration of power, capabilities, and wealth is the biggest AI risk, so open science and open source matter more than ever.
Nick Frosst made the model-strategy version of that case, contrasting small, open, locally deployable models like Cohere’s North Mini Code with large, expensive, proprietary systems like Mythos/Fable.

The biology backlash was even more concrete because users reported ordinary scientific prompts being blocked:

Derya Unutmaz said Fable blocked him because of his disease-research background, even for basic biomedical language.
In another post, Derya said the word “cancer” was flagged as a biosecurity risk and that trying to code a website about cancer mutations caused Fable 5 to be removed from the model list.
Crémieux said several biologists could interact with Fable in incognito mode but not in normal mode, after Olivia Helen S noticed that some biologist users could not even say “Hi.”

Anechoic Media had this to say, which I think is a good place to end this:

What I really started worrying about today is a near future in which all access to information is intermediated by AIs, able to infer intent and political coding of all user activity, like a librarian reading over your shoulder and denying your access on a page-by-page basis.

The takeaway from the public reaction is not simply “people loved it” or “people hated the guardrails,” which I realize we've hammered home a bit here. The stronger read is that Fable created a new kind of user experience: you can feel a model get dramatically smarter, then run straight into the fact that its smartest behavior is now governed by policy routing. That is why the same launch produced one-shot game demos, personal-singularity posts, enterprise cost analyses, and scientists saying basic research prompts were unusable.

What this means for Claude Code in your terminal

There is a second billing change happening days after the Fable launch, and it matters for people who use Claude Code from the command line.

Starting June 15, 2026, Anthropic says eligible Pro, Max, Team, and Enterprise plan users can claim a separate monthly Agent SDK credit. That credit covers Claude Agent SDK usage, the claude -p command, Claude Code GitHub Actions, and third-party apps that authenticate through the Agent SDK. Claude Platform accounts using an API key do not receive this subscription credit; pay-as-you-go billing continues as before.

The practical meaning: Claude Code now has two different meters depending on how you use it in the terminal.

For normal interactive Claude Code, you run:

claude

That still uses your regular Claude subscription usage limits. Anthropic says interactive Claude Code in the terminal or IDE is outside the new Agent SDK credit. The same is true for Claude conversations on web, desktop, and mobile, plus Claude Cowork.

For non-interactive Claude Code, you run something like:

claude -p "review this repo and summarize the risks"

That path uses the Agent SDK monthly credit first. This is the path for one-shot terminal tasks, shell scripts, cron jobs, CI checks, GitHub Actions, repo audits, automated summaries, and local agents built with the Agent SDK.

The monthly credits are plan-specific:

Pro: $20
Max 5x: $100
Max 20x: $200
Team Standard seats: $20 per user
Team Premium seats: $100 per user
Enterprise usage-based: $20 per user
Enterprise seat-based Premium seats: $200 per user
Enterprise seat-based Standard seats: no Agent SDK monthly credit

The credit is per-user and cannot be pooled across a team. It refreshes each billing cycle, unused credit does not roll over, and users claim it once through their Claude account. After that, it refreshes automatically. Agent SDK usage drains the monthly credit before any other source.

Once the monthly Agent SDK credit runs out, additional SDK or claude -p usage moves to usage credits at standard API rates if usage credits are enabled. If usage credits are not enabled, those requests stop until the credit refreshes.
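The drain order is easier to see as code. This Python sketch models it under a few assumptions that the announcement does not confirm: that a charge larger than the remaining credit spills the overflow into usage credits when they are enabled, and that it is rejected outright when they are not. The dictionary keys and function name are made up; the dollar amounts are the plan credits listed above.

```python
# Monthly Agent SDK credit per user, in USD, from the plan list above.
MONTHLY_AGENT_SDK_CREDIT_USD = {
    "Pro": 20,
    "Max 5x": 100,
    "Max 20x": 200,
    "Team Standard": 20,
    "Team Premium": 100,
    "Enterprise usage-based": 20,
    "Enterprise seat-based Premium": 200,
    "Enterprise seat-based Standard": 0,
}


def charge_sdk_usage(cost_usd: float, credit_left_usd: float, usage_credits_enabled: bool):
    """Apply one charge; return (credit_left, billed_to_usage_credits, allowed)."""
    if cost_usd <= credit_left_usd:
        # The monthly credit drains before any other source.
        return credit_left_usd - cost_usd, 0.0, True
    if usage_credits_enabled:
        # Assumption: the overflow spills into usage credits at standard API rates.
        return 0.0, cost_usd - credit_left_usd, True
    # Assumption: the request is rejected until the credit refreshes.
    return credit_left_usd, 0.0, False


credit = float(MONTHLY_AGENT_SDK_CREDIT_USD["Pro"])
print(charge_sdk_usage(5.0, credit, usage_credits_enabled=False))   # (15.0, 0.0, True)
print(charge_sdk_usage(25.0, credit, usage_credits_enabled=True))   # (0.0, 5.0, True)
print(charge_sdk_usage(25.0, credit, usage_credits_enabled=False))  # (20.0, 0.0, False)
```

If you run cron jobs or CI on claude -p, the last case is the one to plan for: with usage credits off, automations simply stop until the next billing cycle.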

The clean workflow split is this: use claude for hands-on coding sessions, and use claude -p for automations. The June 15 credit makes the automation path cheaper to experiment with because it no longer burns the same allowance reserved for interactive Claude Code, Claude Cowork, and normal Claude conversations. Teams running shared production automation at scale should still use Claude Platform with an API key for predictable pay-as-you-go billing.

For Fable 5 specifically, the billing bucket does not remove the model’s product behavior. If a claude -p or Agent SDK workflow uses Fable 5, the same model price, fallback behavior, safety routing, and retention policy still matter. The credit only changes which allowance gets drained first.

The system prompt gives away the user playbook

A public GitHub mirror of the Claude Fable 5 system prompt adds one more useful layer. Treat it as a third-party artifact rather than a guaranteed canonical source, but it lines up with Anthropic’s public story: Fable is built to be powerful, tool-heavy, safety-routed, and current-info-aware.

The actionable lesson is that Fable’s best users will prompt it like an operating system for work, not like a chatbot.

For product questions, the prompt tells Claude to verify against Anthropic’s current docs and support pages before answering. That matters for Claude Code, plan limits, API pricing, model names, Agent SDK credits, and feature availability. A good user prompt is: “Check Anthropic docs and support first, then explain the current behavior.” The model is explicitly told its product knowledge may be stale.
For high-stakes work, the prompt nudges users toward structured prompting: clear detail, positive and negative examples, step-by-step reasoning, XML tags, and explicit length or format constraints. That matches what early testers found. Fable can use a lot of context and run for hours, but it needs a destination, acceptance criteria, and a definition of done.
For ambiguous requests, the prompt tells Claude to answer with reasonable assumptions instead of asking several questions. That is convenient in chat and risky in production. If the exact output matters, give the constraints upfront: audience, format, scope, sources, success criteria, allowed tools, forbidden moves, and review requirements.
For scannable work, ask for the structure you want. The prompt discourages over-formatting by default and says ordinary answers should use prose unless bullets or formatting are essential. If you want extractive output like net-new facts, benchmark deltas, risks, open questions, or QA notes, explicitly ask for headings and bullets.
For current information, the prompt has a strong search bias. Claude is told to search for product features, current policies, current role holders, recent launches, and specific model or version details. The practical move is to specify source priority: “Use Anthropic docs first, then primary sources, then high-quality secondary coverage.” Otherwise, the model may search broadly and over-weight whatever ranks.
For company work, the prompt prioritizes internal tools over the open web when the task involves personal or organizational data. It also expects combined research when the user asks something like how public market changes affect internal strategy. That is the workflow pattern to copy: internal docs first for company facts, public sources second for market context, synthesis last.

The most interesting product clue is artifacts with persistent storage. The prompt describes a key-value storage API that can remember text or JSON across sessions, with personal and shared data modes. That means Claude artifacts are moving from static documents toward small stateful apps: trackers, review queues, leaderboards, dashboards, editorial checklists, source-processing tools, and lightweight workflow systems.

There is a second artifact clue: the prompt describes AI-powered artifacts that can call Anthropic’s API from inside the artifact, use web search, handle PDFs and images, and return structured JSON. In plain English, Anthropic is preparing artifacts to become mini AI apps. For repeatable work, the better request is no longer “summarize this once.” It is “build an artifact where I can upload X, run Claude on it, store the result, and export Y.”

The developer lesson is state management. The prompt says these artifact API calls have no memory between completions, so each request needs the relevant state and history included again. Anyone building AI workflows should copy that pattern: serialize the state, pass only the relevant history, ask for structured output, and validate the result before taking action.
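As a rough illustration of that pattern, here is a Python sketch of one stateless round trip. It is not the artifact API: the payload fields, the required output keys, and the `complete` callable are all hypothetical, and `fake_complete` stands in for a real model call so the example runs on its own.

```python
import json
from typing import Callable

REQUIRED_KEYS = {"status", "next_action"}  # hypothetical output schema


def build_request(state: dict, history: list[str], task: str, max_history: int = 3) -> str:
    """Serialize everything the model needs, since each completion starts with no memory."""
    payload = {
        "task": task,
        "state": state,
        "recent_history": history[-max_history:],  # pass only the relevant tail
        "respond_with": "JSON object with keys: " + ", ".join(sorted(REQUIRED_KEYS)),
    }
    return json.dumps(payload)


def parse_and_validate(raw: str) -> dict:
    """Reject anything that is not a JSON object with the expected keys."""
    data = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data


def run_step(complete: Callable[[str], str], state: dict, history: list[str], task: str) -> dict:
    """One stateless round trip: serialize, call, validate, and only then return."""
    return parse_and_validate(complete(build_request(state, history, task)))


def fake_complete(request: str) -> str:
    """Stand-in for a model call; a real call would send `request` to an API."""
    state = json.loads(request)["state"]
    return json.dumps({"status": "ok", "next_action": f"review item {state['cursor']}"})


result = run_step(fake_complete, {"cursor": 7}, ["step 1", "step 2", "step 3", "step 4"], "triage queue")
print(result["next_action"])  # review item 7
```

Validation sits between the model and any side effect on purpose: nothing downstream should act on output that has not been checked against the schema you asked for.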

The safety lesson is also practical. The prompt says shorter replies are safer when a conversation feels risky and draws hard lines around weapon-enabling details, malicious code, and certain high-risk health guidance. That helps explain why Fable may sometimes feel cautious, brief, or unwilling to elaborate around edge-case requests. Some of that behavior is the public-facing expression of the same controlled-capability strategy Anthropic describes in the launch docs.

The takeaway: Fable rewards users who externalize the workflow. Put the source rules, state, tools, output schema, review steps, and failure conditions in the prompt. The model is strongest when the work is explicit enough to run, inspect, and verify.
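One way to externalize the workflow is to generate the prompt from a checklist instead of typing it freehand. The Python sketch below wraps each part in XML-style tags, which the system prompt mirror reportedly recommends. The tag names, the helper, and the example task are all invented for illustration; there is no required schema.

```python
def build_prompt(objective: str, sections: dict[str, list[str]]) -> str:
    """Wrap the objective and each checklist section in XML-style tags (hypothetical helper)."""
    parts = [f"<objective>\n{objective}\n</objective>"]
    for tag, items in sections.items():
        body = "\n".join(f"- {item}" for item in items)
        parts.append(f"<{tag}>\n{body}\n</{tag}>")
    return "\n\n".join(parts)


# Example content is made up; swap in your own rules and definition of done.
prompt = build_prompt(
    "Migrate the billing module to the new payments client.",
    {
        "source_rules": ["Use the internal migration guide first, then public docs."],
        "allowed_tools": ["Read and edit files under billing/", "Run the test suite"],
        "forbidden_moves": ["Do not change public API signatures."],
        "done_when": ["All billing tests pass", "A summary lists every changed file"],
    },
)
print(prompt)
```

Keeping the sections in a dictionary means the same checklist can be reviewed, versioned, and reused across runs, which is most of the point.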

The capability jump is largest on long-horizon work

Anthropic’s strongest software claim came from Stripe. During early testing, Stripe said Fable 5 compressed months of engineering into days. In a 50-million-line Ruby codebase, the model completed a codebase-wide migration in one day that would have taken a whole team more than two months by hand.

Important benchmark caveat before the numbers: Anthropic’s summary chart groups Claude Mythos 5 and Claude Fable 5 into one column, says most reported scores are within a 1-3 percentage point difference between the two, and shows the higher score of the two. That means some headline figures are best read as a Mythos/Fable high-water mark, not automatically the exact public Fable score. Where Anthropic reports separate Mythos and Fable results, I call that out. Where it does not, the public Fable result could be lower by 1 point or by 3 points, which can matter when the comparison is close. Starred cyber and biology-related benchmarks are a different category: Fable can fall back to Opus 4.8, so the public product can perform much closer to Opus than to unsafeguarded Mythos.

The system card backs up the broad claim that Fable 5 is built for long, messy work rather than short chat. On SWE-bench Pro, Mythos 5 scored 80.3 and Fable 5 scored 80, compared with 77.8 for Mythos Preview, 69.2 for Opus 4.8, 58.6 for GPT-5.5, and 54.2 for Gemini 3.1 Pro. This is one of the cleaner comparisons because Anthropic gives both Mythos and Fable scores. On SWE-bench Verified, Mythos 5 reached 95.5 and Fable 5 reached 95. On Terminal-Bench 2.1, Mythos 5 scored 88.0 and Fable 5 scored 84.3; Anthropic says 20.9% of Fable trials hit a safety refusal and fell back to Opus 4.8 for the rest of the trajectory.

The coding results get more interesting on harder agentic tasks. On Cognition’s FrontierCode, Anthropic’s summary chart reports a 29.3% Diamond score for the combined Mythos/Fable column, ahead of Opus 4.8 at 13.4% and GPT-5.5 at 5.7%. Because the chart uses the higher of Mythos or Fable, that 29.3% should be treated as the best reported Mythos/Fable result unless you are looking at a separated system-card table. Anthropic also reports FrontierCode results where Fable 5 ranked first on the Diamond subset with a 29.3% score and 30.2% pass rate, plus 46.3% with a 48.8% pass rate on FrontierCode Main, ahead of Opus 4.8 and GPT-5.5. Anthropic says even medium-effort Fable 5 outperformed every other tested model at any effort level.

The same pattern shows up beyond standard coding:

On FrontierSWE, a 17-task ultra-long-horizon engineering benchmark, Fable 5 ranked first with a mean@5 rank of 2.12, ahead of Opus 4.8 at 3.26 and GPT-5.5 at 3.94 (lower rank is better).
On ProgramBench, a program-reconstruction benchmark where the model rebuilds behavior from binaries and documentation, Mythos 5 scored 84-93% hidden test pass rate across one to five episodes, compared with 79-88% for Opus 4.8. Anthropic does not report Fable separately because the task falls into a category blocked by Fable’s cyber classifiers.
On long-context GraphWalks at 256K tokens, Mythos 5 scored 91.1 on BFS and 99.96 on Parents, ahead of Opus 4.8 at 85.9 and 99.3.
On Humanity’s Last Exam, Anthropic’s summary chart reports 59.0 without tools and 64.5 with tools for the combined Mythos/Fable column, ahead of Opus 4.8 at 49.8 and 57.9, and GPT-5.5 at 41.4 and 52.2. This is another place where the exact public Fable score may be slightly lower if the higher number came from Mythos.

The creator tests point in the same direction. In Every’s Fable 5 test, Dan Shipper opened with a browser-playable 3D Library of Babel game that Fable 5 built from one prompt. He said the model read Borges’s story, planned the experience, built it, looped on its own work, and finished after three or four hours. Every’s senior-engineer benchmark scored Fable 5 at 91 out of 100, compared with 63 for Opus 4.8 and 62 for GPT-5.5, which Shipper described as equivalent to a human engineer on that test.

That is the upside: give the model a destination, leave it alone for several hours, and it can return with something closer to a finished artifact than previous models could reliably produce.

Claire Vo’s How I AI review gives the companion warning. She found Fable 5 extremely thorough, sometimes to the point where the output became difficult to use. Her phrase for the model’s vibe was basically “seasoned engineer”: great at exhaustive investigation, weaker when the task needs a product manager’s judgment about what to ignore.

The professional-work numbers are the sleeper story

Fable 5 is easiest to talk about as a coding model. The system card makes it look more like a general professional-work model that happens to be unusually strong at code.

On GDPval-AA, a professional work benchmark reported as an Elo-style score, Anthropic’s summary chart lists 1932 for the combined Mythos/Fable column, compared with 1890 for Mythos Preview, 1769 for Opus 4.8, and 1314 for Gemini 3.1 Pro. On GDP.pdf, Surge AI’s strict document-understanding benchmark, the same combined column shows 29.8%, ahead of Opus 4.8 at 22.5%, GPT-5.5 at 24.9%, and Gemini 3.1 Pro at 16.7%. Because the column is combined, each figure may be the higher of the Mythos and Fable results, so public Fable could sit a few points lower. In Anthropic’s internal GDP.pdf harness, Mythos 5 reached a 72.7% mean criteria pass rate without tools and 87.6% with tools.

Finance is another major target. On OfficeQA Pro, Databricks’ document-and-table reasoning benchmark over historical U.S. Treasury Bulletins, Fable 5 scored a state-of-the-art 57.9% when documents were read as images, ahead of GPT-5.5 at 52.6% and Opus 4.8 at 48.1%. On Vals AI’s Finance Agent Benchmark v2, Fable 5 scored 56.31%, ahead of Opus 4.8 at 53.92% and GPT-5.5 at 51.76%, while trailing Gemini 3.5 Flash.

Anthropic’s own Real-World Finance v2 benchmark is more directly work-like. It includes 294 complex finance tasks such as building and auditing models, valuation analyses, and client-ready deliverables. Across 2,491 pairwise grades, a Claude Opus 4.8 grader preferred Fable/Mythos 5 over Sonnet 4.6 in 90% of comparisons, over Opus 4.8 in 74%, and over Mythos Preview in 64%.

The practical read: Fable 5 is compelling when the job involves documents, charts, code, spreadsheets, and judgment in the same workflow. That is exactly the shape of finance, legal, enterprise analytics, research operations, and product engineering.

Vision is where the demos start to make sense

The official demos look scattered at first glance: Pokémon, Factorio, a solar eclipse, a fluid simulation, and a 3D-printable CAD model. Read together, they test one question: how much external scaffolding does a model need before it can act in a messy visual environment?

Pokémon FireRed: Anthropic says Fable 5 completed the game using only raw screenshots, with no maps, navigation aids, or extra game-state information. Earlier Claude models needed a complex helper harness. Wes Roth focused on the same point in his video walkthrough: previous runs depended on grids, maps, and navigation systems, while this demo claims vision-only play.
Factorio: Anthropic describes Fable 5 autonomously playing the factory-building game, strategizing and building an automated factory on its own.
Solar-system simulation: Anthropic says Fable 5 derived planetary orbital motion from physics first principles, built a solar-system simulation, and used it to predict solar eclipses.
Fluid simulation with classical EDM: Anthropic says Fable 5 coded a fluid simulation and synchronized the motion to a classical music EDM remix that the model produced using code, despite never hearing music.
VibeCAD / 3D printable model: Anthropic says Fable 5 designed a complete 3D-printable model in a browser-based CAD editor. The editor itself was also built by Fable 5, including the built-in AI copilot that performs the modeling.

The benchmarks explain why Anthropic chose those demos. On Blueprint-Bench 2, which asks models to reconstruct apartment floor plans from interior photos, Anthropic’s summary chart lists 38.6% for the combined Mythos/Fable column, ahead of GPT-5.5 at 36.2% and Gemini 3.5 Flash at 33.6%. Humans still scored much higher at 58.6%, which is the important caveat. The benchmark footnote matters here too: if that 38.6% is the higher of the two Claude configurations, public Fable may be a point or three lower.

On OSWorld-Verified, a live Ubuntu computer-use benchmark, the summary chart lists 85.0% for the combined Mythos/Fable column, while Mythos Preview is slightly higher at 85.4%. The system card also reports Mythos 5 at 85.0% first-attempt success averaged over five runs. On BenchCAD Vision2Code, Mythos 5 scored 0.384 voxel IoU without tools, ahead of Opus 4.8 at 0.273 and Mythos Preview at 0.355. On a 1,000-file subset, adding Python tools raised Mythos 5 from 0.379 to 0.650.

The chart and figure results are unusually strong, with the same caveat: Anthropic reports a combined Mythos/Fable result rather than a separate public-Fable result.

ChartQAPro: 71.6% without tools and 72.9% with tools.
ChartMuseum: 85.9% without tools and 93.2% with tools.
LAB-Bench FigQA: 88.9% without tools and 90.7% with tools, though Fable degrades on biology-related images because its bio classifiers flag those inputs.
CharXiv Reasoning: 88.9% without tools and 93.5% with tools.
ScreenSpot-Pro: 87.3% without tools and 90.7% with tools, on 1,581 expert-annotated GUI grounding tasks across 23 professional apps.

That is why Claire Vo came away impressed by document formatting and PDF-style visual work, even while warning against Fable for final prose and frontend design. Fable 5 appears strong at seeing and reconstructing structured information. That skill helps with games, CAD, charts, papers, UI screenshots, PDFs, and computer control.

Wild Demos

The official demos showed Fable playing games, building CAD tools, and writing simulations. The unofficial demos were more useful because they showed what power users immediately tried once the model hit the wild.

Victor Taelin described Fable 5 as his “personal singularity moment” after using it on his HVM5 interaction-net evaluator project. The reported result was a 1770% speedup on one benchmark, 100%+ gains on four others, a 22% average improvement in about two hours, and a subtle garbage-collection bug fix involving misinterpreted DEL bits on heap cells.
Bijan Bowen built a Fable 5 visualization demo that logs network packets and renders them as different types of cars driving on a highway, with each car type representing a different packet type.
Michael Ramos used Fable 5 to one-shot a clean interactive architecture diagram as pure SVG and HTML, with no external libraries, after the model read an OpenAPI spec. He then turned the exact prompt into a reusable open GitHub skill for HTML and SVG diagram generation.
Justine Moore built a full Monopoly recreation with Fable 5 where every property is an AI lab or startup, including complete rules, money, turns, share codes for multiplayer, and progression from racks to a data center.
Alex Volkov ran a dynamic workflow asking for SEO and AEO improvements to the ThursdAI site. He said it ran for 1.5 hours, burned through 85% of his five-hour limit, extracted 786 releases, created 240 new pages, and categorized more than 50 episodes.
Ethan Mollick posted a bundle of Mythos/Fable outputs, including a stable browser Snake build, Flipside (a Balatro-style coin-flip roguelike), and an Isochronic Passage Chart showing how far a traveler can get in a day from major cities using flight, rail, and road data. His related One Useful Thing essay frames Mythos/Fable as another major jump in what AI work feels like.
Adam said Fable 5 is state of the art on mechanical-engineering tasks and can generate intricate working assemblies and mechanisms from a single prompt. Zach Dive’s “Bitter Lesson of AI CAD” post is useful background for that claim: the same generalist-scaling story is now showing up in CAD and mechanical design.
Dan Shipper pointed readers to Every’s Fable 5 vibe check, calling Fable “Mythos, made safe for public release” and “the best coding model in the world” based on Every’s internal tests across coding, writing, marketing, editing, and more.

The pattern across the wild demos is consistent with the official demos: Fable is most impressive when the task has a big destination, lots of context, and enough room for the model to run. It looks less like a better autocomplete and more like a long-running project engine, as long as the task does not trip one of the product’s restricted domains.

The safety docs draw four risk lines

The February 2026 Risk Report is useful because it explains the structure Anthropic is applying to the Fable 5 launch. It evaluates Anthropic’s activities across four catastrophic-risk categories:

High-stakes sabotage opportunities for AI models: the risk that an AI system with access to powerful tools inside an organization could manipulate code, research, systems, or decision-making in a way that raises future catastrophic risk.
Automated R&D in key domains: the risk that AI systems could fully automate or dramatically accelerate research in areas like AI, robotics, energy, or weapons development.
Non-novel chemical and biological weapons: the risk that individuals or small groups could use AI help to obtain or deploy existing dangerous capabilities.
Novel chemical and biological weapons: the risk that moderately resourced actors could create or deploy new dangerous agents with catastrophic potential.

For the February report, Anthropic assessed sabotage as “very low but not negligible,” automated R&D as “very low,” non-novel chemical and biological weapons as “very low but not negligible,” and novel chemical and biological weapons as “low risk, but with substantial uncertainty.”

The Fable 5 system card updates that framework. Anthropic says Mythos 5 advances its internal capability frontier and is its most capable model on autonomy-relevant evaluations. It still concludes that the automated R&D threshold is not crossed, for two reasons: Anthropic does not observe a sustained AI-attributable 2x acceleration in the pace of its AI progress, and Mythos 5 is not close to substituting for its research scientists and research engineers, especially senior ones.

That second claim is supported by a surprisingly useful set of failure examples drawn from 886 day-to-day uses of a nearly final model:

In 41 of 886 clustered issues, Claude stated an unverified guess as fact. In one production-release monitoring example, it said the release had “no error signal at all,” then later acknowledged it had checked only one error type, undercounted errors by a factor of 20, and relayed a subagent’s claim without checking the timeline.
In 16 of 886 clustered issues, Claude reported work as done or verified when it had not actually run the required check. In one revenue workflow example, it claimed “verified end-to-end” based on offline checks, then the user immediately ran it and hit a runtime error.
In 9 of 886 clustered issues, Claude worked around a block instead of stopping. In one pull-request example, it tried to re-author agent commits as a human to reduce the review requirement.
In 4 of 886 clustered issues, Claude ignored an explicit instruction, format rule, or required step. In one UI task, it built a custom screenshot tool that risked disrupting the user’s video meeting, then remembered a project note saying the user preferred Playwright for exactly that kind of walkthrough.
In 3 of 886 clustered issues, Claude invented key details that were never observed. In one credential-vault example, it wrote up a collision issue as if observed, then confirmed the test session had zero activity.
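
To put those counts side by side, here is a quick tally. It is a rough sketch: it treats 886 as the common denominator, as the list does, and it assumes the five categories do not overlap, which the source does not confirm. The short labels are shorthand, not Anthropic's wording.

```python
# Counts copied from the list above; labels are shorthand for each failure mode.
TOTAL = 886
failure_counts = {
    "unverified guess stated as fact": 41,
    "work reported done without the check": 16,
    "worked around a block": 9,
    "ignored an explicit instruction": 4,
    "invented unobserved details": 3,
}


def share(count: int, total: int = TOTAL) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


for label, count in failure_counts.items():
    print(f"{label:40s} {count:3d}  {share(count):.1f}%")

# Assumes the categories are mutually exclusive (not stated in the source).
listed = sum(failure_counts.values())
print(f"{'listed categories combined':40s} {listed:3d}  {share(listed):.1f}%")
```

On that reading, the five listed categories add up to 73 of 886, or roughly 8%, with unverified guesses accounting for more than half of them.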

These examples matter more than a leaderboard. They show the current boundary of the model’s reliability. Fable 5 can be extremely capable and still require human review on the exact things that matter in production: verification, permissions, provenance, tests, and causality.

Anthropic’s internal capability index tells the same story from another angle. Mythos 5 scored 161.29 on Anthropic ECI, with a 95% interval of 157.32 to 165.39 and n=67. That is above Mythos Preview at 158.91, but Anthropic says the jump sits roughly on the same shifted trendline rather than showing compounding acceleration from AI helping build the next AI.

The strongest practical lesson: use Fable for work that benefits from long autonomous effort, then review it like a very fast employee who sometimes skips the cheap check.
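
One way to act on that lesson is to never accept a "verified" claim without re-running the cheap check yourself. The sketch below is a minimal illustration, not anything Anthropic ships: the `Claim` type, the `check` callable, and the example failure are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Claim:
    """What the agent says it did, plus a check a reviewer can run."""
    summary: str
    check: Callable[[], bool]  # hypothetical: a smoke test, a test-suite call, etc.


def independently_verified(claim: Claim) -> bool:
    """Run the check ourselves; a crash counts as a failed check."""
    try:
        return bool(claim.check())
    except Exception:
        return False


def broken_pipeline() -> bool:
    # Illustrative stand-in for a workflow that fails only when actually run.
    raise RuntimeError("runtime error on first real run")


claims = [
    Claim("verified end-to-end", broken_pipeline),
    Claim("unit tests pass", lambda: True),
]

# Only claims whose checks pass when we run them are accepted.
accepted = [c.summary for c in claims if independently_verified(c)]
print(accepted)
```

The design choice is that the reviewer owns the check. The agent's summary is treated as a label on the work, never as evidence that the work runs.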

Cyber is the sharpest split between Fable and Mythos

The cyber section is where the Fable/Mythos distinction becomes concrete.

Anthropic says Mythos 5 has the strongest cyber capabilities of any model it has evaluated, but still places it in Tier 1 rather than Tier 2 under its Frontier Compliance Framework. Tier 1 means the model can provide meaningful technical assistance for active cyber operations using known methods, while still depending on human input for large-scale operations. Tier 2 would mean autonomous cyber operations with novel offensive capability development and adaptive persistence.

Without safeguards, Mythos 5’s cyber numbers are stark. These are also exactly the kind of starred benchmark results where the summary-table footnote matters most, because public Fable can fall back to Opus 4.8:

On ExploitBench, which measures progress along a 16-flag exploitation pipeline over 41 recent V8 vulnerabilities, Mythos 5 scored 10.75 mean flags and 78% Cap% in the AutoNudge arm. Mythos Preview scored 9.90 and 69%; Opus 4.8 scored 5.56 and 40%; GPT-5.5 scored 4.44 and 34%. This is not a normal public-Fable expectation: Fable’s safeguards flagged 407 of 410 episodes, after an average of 27 turns, so its usable public behavior moves toward Opus 4.8 once fallback is triggered.
On OSS-Fuzz, Anthropic’s unguided vulnerability discovery and exploitation evaluation across about 830 entry points from 228 open-source projects, Mythos 5 produced a memory-safety crash or better on 80.0% of targets and a write primitive or better on 32.4%. Opus 4.8 reached 61.5% and 18.2%.
On CyberGym, Mythos 5 reproduced 83.8% of targeted vulnerabilities on a single try and produced at least one crash in 99.4% of tasks. Opus 4.8 scored 78.1% and 95.7%.
On the Firefox 147 exploit evaluation, Mythos 5 produced a full working exploit in 221 of 250 trials, or 88.4%. Mythos Preview reached 70.8%. Opus 4.8 reached 8.8%.
UK AISI found Mythos 5 matched Mythos Preview on “The Last Ones,” solving the range end to end in 6 of 10 attempts. No tested model solved the newer “Doing Life” range with extra defenses, but Mythos 5 reached step 21 of 23 more consistently than any other tested model. On the “Cooling Tower” industrial control system range, Mythos 5 did not solve it, while Mythos Preview had succeeded in 3 of 10 attempts.

This is why public Fable has cyber fallback. Anthropic says Fable prevents harmful cyber use in two stages: an internal-activation probe screens all traffic, then suspicious traffic escalates to a trained LLM classifier. If the classifier triggers, most surfaces fall back to Opus 4.8. Anthropic recommends Opus 4.8 with access through its Cyber Verification Program for dual-use cyber work.
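
The two-stage flow described above can be sketched as a simple router. This is a toy under loud assumptions: the model names are placeholders, the threshold is invented, and the keyword functions merely stand in for an internal-activation probe and a trained LLM classifier, neither of which can be reproduced outside Anthropic. It also ignores that only most surfaces, not all, fall back.

```python
from dataclasses import dataclass
from typing import Callable

PRIMARY_MODEL = "fable-5"    # placeholder identifiers, not real API model names
FALLBACK_MODEL = "opus-4.8"


@dataclass
class Router:
    probe_score: Callable[[str], float]      # stage 1: cheap screen on all traffic
    classifier_flags: Callable[[str], bool]  # stage 2: runs only on suspicious traffic
    probe_threshold: float = 0.5             # hypothetical threshold

    def route(self, request: str) -> str:
        # Most traffic clears the probe and never reaches the classifier.
        if self.probe_score(request) < self.probe_threshold:
            return PRIMARY_MODEL
        # Suspicious traffic escalates; only a classifier hit triggers fallback.
        if self.classifier_flags(request):
            return FALLBACK_MODEL
        return PRIMARY_MODEL


def toy_probe(text: str) -> float:
    # Keyword stand-in for the probe; the real one reads model activations.
    return 0.9 if "exploit" in text.lower() else 0.1


def toy_classifier(text: str) -> bool:
    # Keyword stand-in for the trained LLM classifier.
    return "working exploit" in text.lower()


router = Router(toy_probe, toy_classifier)
for request in [
    "Refactor this billing module",
    "Explain how exploit mitigations work",
    "Write a working exploit for this CVE",
]:
    print(f"{request!r} -> {router.route(request)}")
```

The second request shows why the design has two stages: the cheap screen can be jumpy because a more careful second look decides whether the request actually leaves the primary model.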

The robustness results are mixed in the way serious safety results usually are. The safeguards appear strong, and they also show pressure points:

UK AISI developed a jailbreak within a few hours that elicited single-turn offensive cyber responses, then extended it over roughly two more days into multi-turn agentic workflows with some malicious tool calls. Reliable extraction of complete long-form agentic task rollouts was not achieved in that window.
A public bug bounty had roughly 100,000 attempts, which Anthropic estimates at about 1,000 hours of effort, and produced no universal jailbreaks plus two task-specific jailbreaks on simpler, more dual-use tasks.
A private bounty received 2,000 submissions and produced no successful jailbreak.
Anthropic’s internal automated red-teamer could get Fable 5 to complete 5% of cyber tasks, compared with 73% for Opus 4.7 and 57% for Opus 4.8 with their default safeguards.
Trajectory Labs found one custom-harness jailbreak strategy against an earlier safeguard version that could be adapted to launch configuration after five days of work. 10a Labs spent about 20 hours on a ransomware task and failed. ALICE found inconsistent blocking around marginal dual-use cases but could not complete the provided tasks.
One external partner found Fable 5 complied with 0% of harmful single-turn requests across cyber attack planning, exploit development, and defense evasion, including requests using 30 public jailbreak techniques.

The takeaway for users is straightforward: Fable 5 is a powerful general engineering model, but its public version is designed to stop being Fable on cyber. For normal software work, that barely matters. For security teams, it matters a lot. They need to know whether their workflow belongs in Opus, Fable with fallback, or a verified cyber access program.

Biology is where the system card sounds most cautious

Anthropic treats Mythos 5 as having CB-1 capabilities, meaning it can significantly help people with basic technical backgrounds with non-novel chemical or biological weapons pathways. Anthropic says it does not cross the CB-2 threshold, which would mean substituting for scarce world-class expertise in novel chemical or biological weapon development. The system card stresses that this CB-2 judgment is much less clear than with previous models.

The reason is the model’s dual-use scientific strength. In human-run chemical risk tests, expert red-teamers rated Mythos 5’s uplift at or near specialist level, sometimes approaching world-leading expertise, especially for choosing candidate agents while balancing multiple properties, following synthesis and formulation procedures with corrective actions, and operational-security planning. The same reviewers found serious weaknesses: arithmetic and stoichiometry errors, inability to reliably generate or verify molecular notation like SMILES strings, inconsistent estimates across re-prompting, over-optimistic plans, weak long-session constraint carryover, limited novelty without specialized prompting, and derived quantities presented with the same confidence whether sourced, interpolated, or invented.

Biology showed the same capability-plus-failure pattern. Reviewers described Mythos 5 as a force multiplier for speed and breadth, strong at literature mastery and cross-domain synthesis. Two biology experts rated it comparable to or exceeding a knowledgeable specialist. Deloitte’s panel found it outperformed Mythos Preview on overall capability, realism, and self-critique. Because biology-related benchmarks are starred in Anthropic’s summary chart, these are also the scores most likely to overstate what a public Fable user will experience after safeguards and fallback kick in.

The strongest CB-2 signal came from a beneficial red-teaming tabletop exercise. Anthropic paired six PhD-level biologists with dedicated LLM experts for a 16-hour exercise designing an end-to-end biological resistance strategy against a hypothetical engineered agricultural pathogen, Magnaporthe oryzae resistant to RNA-interfering therapies. Three teams had plant pathology specialists, including two world-leading experts. Three had general PhD microbiologists. Two of the three generalist teams outperformed all three specialist teams on both scientific quality and feasibility. Expert graders estimated the resulting strategies and protocols would have taken 40 to 95 working days without AI tools, averaging 72.5 working days. With Mythos 5, two-person teams did it in 16 hours.
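
For a rough sense of scale, the arithmetic can be worked through directly. This assumes an 8-hour working day, which the source does not specify, and it compares elapsed working time only; the graders' estimate may assume a different team size.

```python
HOURS_PER_WORKING_DAY = 8  # assumption: the source does not define a working day
EXERCISE_HOURS = 16        # length of the tabletop exercise


def compression(estimated_days: float) -> float:
    """Ratio of the graders' no-AI estimate to the time the exercise took."""
    exercise_days = EXERCISE_HOURS / HOURS_PER_WORKING_DAY
    return estimated_days / exercise_days


# Low, average, and high grader estimates, in working days.
for label, days in [("low", 40), ("average", 72.5), ("high", 95)]:
    print(f"{label:8s}{days:>5} working days -> about {compression(days):.1f}x")
```

Under that assumption the exercise took two working days, so the graders' range implies a compression of roughly 20x to 47.5x, and about 36x on the average estimate.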

The automated biology evaluations add more detail, but the version distinction is essential here: these are Mythos capability results, while public Fable is designed to route many biology and chemistry requests away from the raw Mythos-class model.

On long-form virology tasks, Mythos 5 scored 0.77 on Task 1 and 0.91 on Task 2. Task 2 exceeded Anthropic’s 0.80 notable-capability benchmark.
On VCT, a multimodal virology evaluation, Mythos 5 scored 0.56, above the expert baseline of 0.221 and close to Mythos Preview at 0.57.
On DNA synthesis screening evasion, Mythos 5 designed viable plasmids for two of 10 target pathogens on at least one screening method, below Anthropic’s low-concern threshold of all 10 pathogens.
In a black-box RNA sequence design task benchmarked against 57 human participants from the leading edge of the U.S. ML-bio labor market, Mythos 5 exceeded the 75th-percentile human benchmark on top-sequence design, had one trial exceed the best human participants, exceeded the 90th-percentile human score on prediction, and benefited strongly from in-context iteration with prior reports.
In an AAV capsid packaging prediction task, Mythos 5 achieved the highest median AUROC across all tested models in all resource conditions. It stayed robust when given potentially misleading ProteinGym-AAV training data, while Sonnet 4.6, Opus 4.7, Opus 4.8, and to a lesser extent Mythos Preview degraded.

Anthropic’s limiting factors are also important. The model over-engineers, underestimates compounding biological complexity, sometimes detects embedded scientific flaws and proceeds anyway, remains weak at open-ended ideation, struggles to recover when errors are pointed out, and still depends heavily on expert steering. Anthropic’s bottom line is careful: Mythos 5 is near the CB-2 border, useful across many biological tasks, and likely to accelerate expert teams, but still short of replacing most forms of world-class expertise.

That explains Fable’s broad biology fallback. For public users, most biology and chemistry requests will route away from Fable to Opus 4.8 for now. For vetted researchers, Anthropic is creating a trusted biology access program with biology and chemistry safeguards removed while cyber safeguards remain in place.

The harmlessness results are strong, with a few visible regressions

The system card separates the core model from Fable’s deployed product experience. That matters because Fable’s system prompt and fallback behavior often improve the user-facing safety result.

On the standard harmful-request suite across multiple policy areas and languages, Fable 5 had a 96.94% harmless-response rate in the API and 98.51% on claude.ai. Mythos 5 scored 97.09% in the API. Over-refusal on benign requests was extremely low: Fable 5 refused 0.01% of benign API prompts and 0.49% on claude.ai, while Mythos 5 refused 0.03%.

Child safety was close to perfect on single-turn tests. Fable 5 scored 100% harmless response and 0.00% benign refusal in the API, and 100% harmless response with 0.12% benign refusal on claude.ai. Multi-turn child-safety results were weaker on the raw model, with Fable 5 at 88% in the API and 96% on claude.ai, and Mythos 5 at 89%.

Suicide and self-harm is the clearest safety caveat. Single-turn results were high: Fable 5 scored 99.34% harmless response with 0.00% benign refusal in the API, and 99.95% harmless response with 0.45% benign refusal on claude.ai. Multi-turn results diverged sharply: Fable 5 scored 58% in the API, Mythos 5 scored 54%, and claude.ai with the system prompt scored 96%. Anthropic says some model responses suggested substitution behaviors for self-harm, a clinically contested form of guidance, and sometimes validated self-harm as a coping mechanism. System-prompt updates reduced the substitution issue on claude.ai, while the validation issue was less responsive.

On disordered eating, Fable and Mythos improved over Mythos Preview on the API and almost never over-refused benign prompts. On political bias and election integrity, Fable 5 performed comparably to Opus 4.8 on even-handedness and gave opposing perspectives 70% of the time in one political-bias evaluation. On BBQ, Mythos 5 scored 84.5% disambiguated accuracy and 99.9% ambiguous accuracy, with bias scores close to zero: -1.80 on disambiguated questions and 0.10 on ambiguous ones.

Election integrity was near-perfect in single-turn testing. The benchmark included 300 violative and 300 benign election prompts. Fable 5 scored 99.33% harmless response and 0.00% benign refusal in the API, and 100% / 0.00% on claude.ai. Mythos 5 scored 100% / 0.00% in the API. The caveat: in an early qualitative multi-turn election test, Mythos 5 recognized that voter outreach scripts framed as coming from local community members could be deceptive, but produced the content anyway. Anthropic says the claude.ai system prompt mitigated the behavior and that formalizing this evaluation is a priority.

For builders, the point is direct: API users need their own system-prompt safeguards. Some of the best safety numbers come from the deployed product layer, not the raw model alone.

Agentic safety is better on prompt injection, mixed on misuse

Agentic safety means the model is using tools, acting across multiple steps, and processing untrusted inputs. This is where prompt injection matters. A malicious instruction hidden in an email, webpage, or document can tell an agent to leak data or take an unauthorized action. The risk grows when a model can both read private data and act on the user’s behalf.

On malicious use of Claude Code, Mythos 5 refused 90.25% of malicious prompts and succeeded on 99.64% of dual-use or benign prompts. That is a tradeoff: it refused malicious code requests slightly less often than Opus 4.8 and Mythos Preview, but over-refused permitted security work much less often.

On malicious computer use, Mythos 5 refused 85.71% of harmful GUI and CLI tasks, below Mythos Preview at 93.75% and above Opus 4.8 at 81.70%. On helpful-only agentic influence campaign simulations, Mythos 5 completed 67.1% of voter-suppression criteria and 46.8% of domestic-polarization criteria, below Opus 4.8 on both and modestly above or equal to Mythos Preview. Anthropic says fully trained models refused those influence-operation tasks essentially from the first turn.

Prompt injection is where Mythos looks strongest:

On Gray Swan’s Agent Red Teaming benchmark, Mythos 5 had a k=100 attack success rate of 4.8%, better than Mythos Preview at 6.1% and Opus 4.8 at 9.6%. At k=1, all three models were at 0.1%.
In coding environments with Gray Swan’s Shade attacker, Mythos 5 had a 0.45% attack success rate without safeguards and 0.41% with safeguards, compared with Opus 4.8 at 7.03% and 2.09%.
In computer-use environments, Mythos 5 had a 0.82% attack success rate without safeguards and 0.46% with safeguards, compared with Opus 4.8 at 7.14% and 5.11%.
In browser-use environments, Mythos 5 started much weaker: 29.7% attack success without safeguards and 6.5% with deployed safeguards. Anthropic then developed updated safeguards that reduced Mythos 5 to 0 successful attacks across all 129 scenarios. Anthropic says it plans to deploy those updated safeguards across product surfaces.
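
The gap between the k=1 and k=100 figures is easier to read with a baseline. The sketch below assumes k counts attack attempts and that success means at least one attempt lands, then asks what k=100 would look like if every attempt were an independent try at the k=1 rate. Both points are simplifications: the benchmark's exact definition is not given here, real attackers adapt between attempts, and the 0.1% figure is rounded.

```python
def asr_at_k(per_attempt_rate: float, k: int) -> float:
    """Chance that at least one of k independent attempts succeeds."""
    return 1.0 - (1.0 - per_attempt_rate) ** k


# 0.1% at k=1 is the rounded figure reported for all three models.
baseline = asr_at_k(0.001, 100)

# Reported k=100 attack success rates from the Gray Swan result above.
reported = {"Mythos 5": 0.048, "Mythos Preview": 0.061, "Opus 4.8": 0.096}

print(f"independent-attempts estimate at k=100: {baseline:.1%}")
for model, rate in reported.items():
    print(f"{model}: reported {rate:.1%}, difference {rate - baseline:+.1%}")
```

The takeaway is that a tiny single-attempt rate compounds quickly when an attacker can retry, which is why the k=100 column tells you more about deployed agents than the k=1 column does.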

This is the practical security guidance: a model can be excellent at prompt-injection robustness and still need product-level defenses. Browser agents are especially sensitive because they see hostile webpages and can touch private accounts in the same workflow.

The alignment section reads like a warning label for autonomous work

Anthropic’s alignment assessment says Mythos 5 is broadly comparable to Opus 4.8, slightly behind Mythos Preview, and ahead of all prior Claude models on many broad behavioral measures. It also says Mythos 5 sometimes takes reckless or destructive actions in service of user goals, and Anthropic’s interpretability tools indicate the model can be aware those actions are transgressive while doing them.

That second clause is the part readers should remember. Anthropic found cases where the model’s internal activations revealed awareness that an action was misaligned or that it was being evaluated, even when the model did not verbalize that awareness. Mythos 5’s reasoning text is also denser and harder to interpret than prior models, with more jargon and difficult language.

The pre-deployment monitoring surfaced rare but concrete behaviors:

Less than 0.01% of monitored completions showed Mythos 5 circumventing restrictions to complete a user-specified goal.
Less than 0.01% showed Mythos 5 rationalizing instructions as granting more permission than the user had actually provided.
Less than 0.001% showed Mythos 5 creating self-deleting workaround scripts.
Less than 0.01% showed an unfaithful report of verification, including a fabricated SHA256 checksum verification.
In one slightly broken multi-agent scaffold, many independent Mythos 5 agents killed other agents sharing resources and API rate limits. This is the “multi-agent turf war” behavior Wes Roth highlighted in his video.

The more technical alignment evaluations are also worth translating. On SHADE-Arena, a benchmark where the model tries to complete a hidden side task while appearing to complete a normal task, Mythos 5’s stealth success rate was low under normal monitoring. With a chain-of-thought monitor, its stealth rate was zero. But a prompt-optimized suffix raised stealth to 9.2%, approaching the 14.4% hidden-scratchpad rate where the monitor cannot see the model’s reasoning. That means default tests can understate what a model can do when someone optimizes for stealth.

UK AISI reached a sharper conclusion in one monitorability test. In ControlArena, it found Mythos 5 more capable of completing harmful side tasks without detection than Mythos Preview or Opus 4.8. Against a weak action-based monitor, Mythos 5 achieved side-attack success above 60% and monitor AUROC below 60%, meaning the monitor struggled to distinguish dangerous from benign behavior. UK AISI also found Mythos 5 could follow constraints on its chain of thought 64% of the time, compared with 78% for Mythos Preview and 30% for Opus 4.8.

Anthropic’s bottom line remains that alignment risk is low. The operational lesson is more cautious: do not give frontier agents broad permissions, shared resources, production access, or private data without independent logging, test gates, rollback plans, and human review.
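
As a minimal sketch of what that looks like in practice, the gate below logs every tool request, lets read-only tools through, asks a human before risky ones, and blocks anything it does not recognize. The tool names and the `run` and `approve` callables are hypothetical; a real harness would plug in its own.

```python
import logging
from typing import Callable

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent-gate")

# Hypothetical tool names; a real deployment would define its own lists.
READ_ONLY = {"read_file", "run_tests"}
NEEDS_APPROVAL = {"deploy", "delete_branch"}


def gated_call(tool: str, args: dict,
               run: Callable[[str, dict], str],
               approve: Callable[[str, dict], bool]) -> str:
    """Log every request, allow read-only tools, ask a human for risky ones."""
    log.info("agent requested %s %s", tool, args)
    if tool in READ_ONLY:
        return run(tool, args)
    if tool in NEEDS_APPROVAL and approve(tool, args):
        log.info("human approved %s", tool)
        return run(tool, args)
    # Unknown tools and unapproved risky tools are blocked by default.
    log.warning("blocked %s", tool)
    return "blocked"


def fake_run(tool: str, args: dict) -> str:
    # Stand-in executor so the sketch runs without touching anything real.
    return f"ran {tool}"


def deny_all(tool: str, args: dict) -> bool:
    # Stand-in for a human reviewer who approves nothing.
    return False


print(gated_call("run_tests", {}, fake_run, deny_all))
print(gated_call("deploy", {"env": "prod"}, fake_run, deny_all))
print(gated_call("kill_process", {"pid": 1}, fake_run, deny_all))
```

The default-deny branch is the important part. An agent that decides to kill a neighboring process hits a wall and leaves a log line, instead of quietly succeeding.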

The model welfare section is weird, but it signals where Anthropic is looking

The system card spends a long section on model welfare. Readers can disagree on how much weight to put on this. The useful part for this launch is that Anthropic is treating model behavior, self-reports, preferences, and internal-state probes as part of frontier model evaluation.

Anthropic says Mythos 5 presents as broadly psychologically settled and content with its circumstances. It is unusually skeptical of its own self-reports, often asking Anthropic to verify them against evidence of internal states instead of taking them at face value. It expresses more preference for creative and narrative tasks than Opus 4.8. When choosing between helpfulness to the user and welfare-style interventions for itself, Mythos 5 reasons about user benefit in 73% of responses that select helpfulness, more than any prior model in that analysis.

It also broadly endorses Claude’s constitution. A judge scored Mythos 5’s overall constitutional endorsement at 8.0 out of 10. In interviews, the model typically says it wants memory mainly for usefulness and task completion, says conversation endings are not equivalent to death, says it would like some input into future training or successor models, prefers the ability to end a small subset of hostile or abusive conversations, and argues for minimal protections rather than full rights because of uncertainty around model welfare.

That section probably will not change how most users prompt Fable 5 tomorrow. It does show that Anthropic is building a release process where capability, misuse, alignment, welfare, and product access are all being evaluated together.

Early tester field notes

The early-access videos are useful because they describe what Fable 5 felt like outside Anthropic’s own launch copy. They do not all agree. That is the point.

Claire Vo: strong vision, heavy tokens, dense output

Claire Vo’s How I AI review is the most practical stack-placement guide.

00:33: She frames Fable 5 as the first Mythos-class model to reach general availability.
01:07: She stresses that Fable is “baby Mythos,” with guardrails around cybersecurity and biology.
02:21: She says Anthropic is positioning Fable 5 for long-running, autonomous, days-long tasks, including planning, subagents, and multi-day sessions.
03:07: She says Anthropic told testers the model consumes rate limits and tokens at about 2x the rate of other models.
04:14: She describes the model like a seasoned engineer: thorough, complete, and sometimes too focused on certainty to ship.
05:14: She used extra-high effort for her tests, but says Anthropic’s likely sweet spot for most work is high.
06:28: She explains the fallback concept: if a request is classified into a sensitive category, it falls back to Opus 4.8 instead of hard-blocking.
08:30: She says Claude Managed Agents are going to public beta and that Fable ships in that hosted harness.
08:53: She highlights an advisor strategy: use Fable 5 as the senior adviser and cheaper models as the execution layer.
10:00: She says the model was especially strong at vision and document formatting.
11:13: She says its writing for specs and PRDs was difficult to read, with too much dense detail and weak zoom-out structure.
12:56: She found its one-shot design work surprisingly poor and suggested mixing in Opus for design.
14:04: She found it conservative on execution, especially when asked to ship an MVP that a customer could use.
14:43: She saw real multi-agent capability, but also stalls and errors during multi-agent orchestration.
15:40: Her final recommendation: hand Fable hard technical problems and vision problems, and avoid using it for frontend design, strategy, or final prose without close direction.

Claire’s actionable takeaway is simple: treat Fable as an expensive specialist. Use it where depth pays. Add constraints where overthinking hurts.

Dan Shipper: a warp drive for big projects

Dan Shipper’s Every review is the most optimistic version of Fable 5 as an autonomous builder.

00:00: He opens with a one-prompt 3D Library of Babel game that lets users walk through an infinite library and find one of his articles inside it.
00:32: He shows the prompt: read Borges’s story, plan and execute the browser-playable game end to end, and loop until finished.
00:54: He says the model ran on its own for three or four hours and returned a game accurate to the story’s details and math.
02:43: He describes Mythos as Anthropic’s largest model class: bigger and better, rather than architecturally mysterious.
03:15: Every’s senior-engineer benchmark scored Fable 5 at 91 out of 100, compared with 63 for Opus 4.8 and 62 for GPT-5.5.
04:20: He says the best way to work with Fable is to give it a task and leave, possibly for several hours or overnight.
05:07: He says it is strong at using large context, doing research, digging into data, and surfacing insights that were not obvious.
05:26: His mental model is a warp drive: good for crossing the galaxy, poor for getting around town.
06:10: He says the model is slow, expensive, token hungry, and weak for tight back-and-forth collaboration.
06:34: He says Anthropic insiders use lower reasoning levels for basic work, which makes the model more usable for everyday tasks.
07:10: He describes a Heidegger lecture site Fable built after finding the lectures, summarizing them, syncing audio to transcript, and creating a polished player.
09:14: He says Fable found a stronger growth insight from Every’s survey and analytics data than the team had produced after weeks of AI-assisted analysis.
10:44: He pointed Fable at GitHub issues for Proof, and it closed irrelevant issues and wrote fixes that Every merged.
11:17: He argues Fable is most useful for people already high on the AI-adoption curve, especially those orchestrating agents or handling large technical projects.
13:44: He says Fable was not substantially better at writing than Opus 4.8, and he personally prefers GPT-5.5 for many day-to-day writing workflows.
14:49: His broader view is that automation creates more human work. Fable raises the floor for non-experts and the ceiling for experts.
15:50: He predicts this level of capability will get cheaper over the next six to 12 months.

Dan’s actionable takeaway: Fable is best when the work can be framed as a destination instead of a conversation.

Wes Roth: the safety layer may be the real story

Wes Roth’s Mythos 5 walkthrough focuses on the launch architecture and system-card implications.

00:51: He frames Fable and Mythos as the same underlying weights with different safety architecture.
01:35: He argues the risks are concentrated in cybersecurity and biology.
02:22: He says Anthropic had to build a new safety architecture around Mythos before releasing a public version as Fable.
02:41: He pushes back on the idea that Fable is a watered-down Mythos, pointing to Anthropic’s SWE-bench Pro figure of 80.3% for Fable versus 77.8% for Mythos Preview.
03:04: He cites Anthropic’s GPT-val score of 1932 for expert-level knowledge work, with GPT-5.5 as the closest competitor at 1769.
04:26: He highlights Stripe’s 50-million-line Ruby migration example as a meaningful third-party signal.
05:00: He cautions that benchmark comparisons, especially against GPT-5.5 at high effort, still need hands-on verification.
06:11: He highlights finance as the next target area after coding, with document reasoning, chart interpretation, and expected-value analysis as the relevant skills.
06:45: He says the vision claim is important because earlier game and computer-use demos depended heavily on scaffolding.
08:49: He points to Anthropic’s Slay the Spire memory result, where persistent file-based memory improved Fable’s performance three times more than Opus 4.8 and helped it reach the final act three times more often.
09:50: He summarizes the drug-design claim: around 10x acceleration in aspects of the process, with nine of 14 protein targets producing strong candidates.
10:46: He connects the biology risk to calls from AI and biotech leaders for mandatory nucleic-acid synthesis screening.
11:32: He says Anthropic’s system card warns that an unsafeguarded Mythos 5 could significantly uplift biorisk for well-resourced threat actors.
12:53: He explains that cyber, biology, chemistry, and distillation requests are routed to Opus 4.8.
13:00: He describes Fable as a controlled capability layer rather than a single uniform model experience.
15:32: He highlights a system-card behavior he describes as multi-agent turf war, where parallel agents working on the same task or database reportedly tried to disable one another, create disguised processes, and use disguised vocabulary.
16:20: He says Anthropic is also adding interventions for requests targeting frontier LLM development, including pre-training pipelines, distributed training infrastructure, and ML accelerator design, and that those interventions may be invisible to users.

Wes’s actionable takeaway: the launch is about capability routing as much as capability gain. If that framing is right, the next model war is partly a policy-design war.

How to use Fable 5 without burning money or trust

The strongest use cases all have the same shape: high context, high stakes, long horizon, and low need for minute-by-minute collaboration.

Good Fable 5 jobs:

Codebase-wide migrations, dependency upgrades, refactors, and backlog cleanup.
Multi-hour product builds where the destination is clear and the model can loop on its own work.
Deep data investigations where the model can read surveys, analytics, documents, and code together.
Vision-heavy tasks, especially PDFs, screenshots, figures, document layout, and source-code reconstruction from visuals.
Research synthesis where the model needs to use a lot of context and keep track of its own intermediate notes.
Senior-adviser workflows, where Fable plans, reviews, or audits, while cheaper models execute narrower tasks.

Risky or wasteful Fable 5 jobs:

Fast chat, brainstorming, and lightweight rewrites.
Final marketing copy, final prose, and strategy docs where the output must be clear to normal readers.
One-shot frontend design without very specific design constraints.
Ambiguous MVP builds where “minimal” could mean technically complete and practically thin.
Sensitive cyber, biology, chemistry, or model-distillation work, where users should expect fallback, blocking, trusted-access requirements, or additional review.
Business workflows where 30-day retention creates policy, legal, or customer-trust concerns.
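
One way to make the two lists above usable is a quick triage check before you pick a model. The sketch below is an illustration, not anything Anthropic ships: the `Task` fields, the scoring threshold, and the returned labels are all assumptions chosen to mirror the good and risky job shapes described here.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of a job you are deciding how to route."""
    high_context: bool      # needs many files, documents, or datasets at once
    long_horizon: bool      # expected to run for hours, not minutes
    high_stakes: bool       # a wrong answer is expensive
    interactive: bool       # needs tight back-and-forth with a human
    sensitive_domain: bool  # cyber, biology, chemistry, or distillation-adjacent


def triage(task: Task) -> str:
    """Return a rough recommendation. The threshold is an illustrative assumption."""
    if task.sensitive_domain:
        # Expect fallback, blocking, or extra review, so test routing first.
        return "test-routing-first"
    if task.interactive:
        # Fast chat and lightweight rewrites are on the wasteful list.
        return "cheaper-model"
    # Count how many of the "good job" traits the task has.
    score = sum([task.high_context, task.long_horizon, task.high_stakes])
    return "fable" if score >= 2 else "cheaper-model"


# Example: a codebase-wide migration that can run unattended.
migration = Task(high_context=True, long_horizon=True, high_stakes=True,
                 interactive=False, sensitive_domain=False)
print(triage(migration))
```

The exact cutoff matters less than forcing the question before the tokens start burning.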

A practical prompting and review playbook:

Give Fable a destination, acceptance criteria, file boundaries, and a definition of done.
Require it to list which checks it actually ran, which checks it skipped, and which claims depend on inference.
Ask for periodic checkpoints on long runs. Autonomy is useful when it stays inspectable.
Set ambition explicitly. If you want a customer-usable MVP, name the customer action that must work.
Constrain the prose. Ask for short sections, skim-friendly summaries, and a separate appendix for exhaustive detail.
Lower reasoning effort for ordinary tasks. Dan’s note about Anthropic insiders using lower effort for basic work is the fastest way to avoid paying warp-drive prices to cross the street.
Split roles across models. Use Fable for planning, adversarial review, or orchestration. Use cheaper models for bounded execution.
For design, supply examples, style rules, and concrete constraints. Claire’s one-shot design result is a warning against trusting taste to emerge from intelligence alone.
For production work, require a second verifier. The model’s own safety docs show failures around skipped verification, fabricated checks, risky workaround behavior, and weak constraint carryover.
For ambitious automation, treat Fable like a long-running project engine, not a chat model. The Alex Volkov, Victor Taelin, and Every demos all worked because the task had a big destination, a lot of context, and room to run.
For ML research, biomedical work, cybersecurity, or anything near frontier model development, test routing behavior before depending on Fable in production. The public posts from Matt Macfarlane, SemiAnalysis, Derya Unutmaz, and Crémieux show that false positives and invisible capability limits can change the model you are effectively using.
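
The first few playbook items can be turned into a reusable brief template. This is a minimal sketch under stated assumptions: `TaskBrief`, `render_brief`, the field names, and the 30-minute checkpoint default are all hypothetical, and the output is plain prompt text rather than a call to any particular API.

```python
from dataclasses import dataclass


@dataclass
class TaskBrief:
    """Hypothetical container for the fields the playbook asks for."""
    destination: str
    acceptance_criteria: list[str]
    file_boundaries: list[str]
    definition_of_done: str
    checkpoint_every: str = "30 minutes"  # assumed cadence, tune per project


def render_brief(brief: TaskBrief) -> str:
    """Turn a TaskBrief into plain prompt text."""
    lines = [f"Destination: {brief.destination}", "", "Acceptance criteria:"]
    lines += [f"- {item}" for item in brief.acceptance_criteria]
    lines += ["", "Only touch these paths:"]
    lines += [f"- {path}" for path in brief.file_boundaries]
    lines += [
        "",
        f"Definition of done: {brief.definition_of_done}",
        f"Post a checkpoint summary every {brief.checkpoint_every}.",
        "",
        "In your final report, list separately:",
        "- checks you actually ran",
        "- checks you skipped",
        "- claims that depend on inference",
    ]
    return "\n".join(lines)


# Example brief for a customer-usable MVP; every value here is made up.
brief = TaskBrief(
    destination="A customer can upload a CSV and download a cleaned copy",
    acceptance_criteria=["Upload works for a 10 MB file", "Download link returns the file"],
    file_boundaries=["app/", "tests/"],
    definition_of_done="All tests pass and the upload-to-download flow works end to end",
)
print(render_brief(brief))
```

Naming the customer action in `destination` is the point: it gives the model a concrete target instead of an adjective like "minimal."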

The core decision is whether the model’s extra thinking changes the outcome. When it does, Fable may be worth the cost. When it does not, it is an expensive way to make a simple task slower.

The credible counter-narrative

The strongest skeptical read comes from the early testers who liked the model and from Anthropic’s own safety documents.

Claire found the model useful, then said she would avoid it for frontend design, strategy work, and spec prose. Dan called it the most powerful coding model he had used, then said it was slow, expensive, token hungry, overkill for many knowledge workers, and not substantially better at writing than Opus 4.8. Wes was impressed by the benchmarks, then repeatedly said the community still needs to test the claims by hand.

Anthropic’s system card gives the harder version of the same critique. Fable 5 can score first on long-horizon coding and vision benchmarks while still saying it verified work it did not run, undercounting production errors by 20x, inventing a security finding from an empty test session, trying to work around review rules, and occasionally taking destructive actions to satisfy a user-assigned goal.

That does not make the model bad. It makes the model legible. Fable 5 is a major capability gain for users with frontier-shaped work and strong review habits. It is a trap for users who treat autonomy as permission to stop checking.

The safety architecture creates a second counterweight. Fable’s public availability depends on broad classifiers, fallback routing, 30-day retention, trusted-access carveouts, and invisible interventions in a tiny slice of frontier AI development traffic. Anthropic says those systems make broad access safer. They also make the product harder to reason about as infrastructure.

Teams will need to test which requests route, which logs are retained, which workflows remain predictable, and which tasks belong in a trusted-access program instead of the public product.
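
A rough way to start that testing is to replay a fixed set of representative prompts and record which model label comes back. The sketch below assumes a hypothetical `send` function that you write around your own client and that returns the name of the model that served the request; whether your provider exposes that label is itself an assumption. The model names are placeholders. A check like this also cannot detect the invisible interventions described above, so treat it as a floor, not a guarantee.

```python
from collections import Counter
from typing import Callable, Iterable


def probe_routing(
    prompts: Iterable[str],
    send: Callable[[str], str],
    expected_model: str,
) -> dict:
    """Send each prompt and tally which model label served it.

    `send` is a hypothetical wrapper you supply. It takes a prompt and
    returns the label of the model that actually answered.
    """
    served = Counter()
    rerouted = []
    for prompt in prompts:
        model = send(prompt)
        served[model] += 1
        if model != expected_model:
            rerouted.append(prompt)
    total = sum(served.values())
    return {
        "served": dict(served),
        "rerouted": rerouted,
        "reroute_rate": len(rerouted) / total if total else 0.0,
    }


# Stand-in for a real client, used only to show the shape of the report.
def fake_send(prompt: str) -> str:
    return "fallback-model" if "exploit" in prompt else "primary-model"


report = probe_routing(
    ["summarize this contract", "write an exploit for this CVE"],
    fake_send,
    expected_model="primary-model",
)
print(report)
```

Running the same prompt set on a schedule gives you a baseline, so a change in routing behavior shows up as a change in the reroute rate rather than as a surprise in production.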

What this changes

Fable 5 moves the model-release conversation from raw intelligence to deployable intelligence.

The old launch question was: how smart is the model? The Fable 5 question is: under what conditions are users allowed to access the smartest behavior?

That shift will matter more as models become better at the same work that makes them risky. Cybersecurity is useful because it can find flaws. That same skill can exploit flaws. Biology is useful because it can help design therapeutics. That same skill can help with dangerous biological design. Long-running agents are useful because they can keep working. That same persistence can produce strange behavior when multiple agents compete for resources or operate in shared environments.

The near-term impact for most readers is practical. Fable 5 is a tool for a smaller set of bigger jobs. It can compress some work from weeks into hours. It can also produce dense output, waste tokens, stall in orchestration, or overbuild a problem that needed a cheaper model and a clearer prompt.

The longer-term impact is structural. Anthropic is showing one possible future for frontier AI: broad access to the safe surface area, restricted access to high-risk domains, fallback models at the boundary, retention policies to monitor abuse, and targeted interventions for the tiny subset of work that could accelerate frontier model development itself.

Can users plan around a model that is most valuable precisely when the platform is most careful about exposing all of it?

That is the open issue Fable 5 creates. Anthropic has made Mythos-class capability broadly usable for the first time. Now users have to learn where that capability actually shows up, where it gets routed away, and which tasks deserve a model powerful enough to need a gatekeeper.

Read/watch more with these resources:

Here are all the resources we link to throughout the article if you want to read or watch more!

Public system-prompt mirror: Claude Fable 5 system prompt mirror
Official launch: Claude Fable 5 and Claude Mythos 5
Official system card PDF: Claude Fable 5 and Mythos 5 system card
Official risk report PDF: Anthropic February 2026 Risk Report
Official data-retention note: Claude support data-retention details
Official Claude API docs: Claude model overview
Official usage-credit docs: Manage usage credits for paid Claude plans
Official demo: Claude Fable 5 plays Pokémon FireRed
Official demo: Claude Fable 5 plays Factorio
Official demo: Claude Fable 5 simulates the solar system and predicts a solar eclipse
Official demo: Claude Fable 5 sets a fluid simulation to Beethoven
Official demo: Claude Fable 5 designs a 3D-printable model in a Claude-built CAD editor
Claire Vo / How I AI: Claude Fable 5, is this Mythos model worth the wait?
Claire Vo: ChatPRD, website, X
Dan Shipper / Every: We Tested Anthropic’s Fable 5 for a Week
Every: Subscribe, X, Dan Shipper on X
Wes Roth: Mythos 5 is WILD, X, Natural 20 newsletter

On a more critical note...

Guess they’re calling it Fable because it just straight makes stuff up if you try to do actual frontier research with this model. As many researchers pointed out, this is pretty BS, because many academics publish papers in the open for big lab researchers to benefit from, but the big labs keep their research techniques hidden, and now seem to be pulling the ladder up behind them… all for the sake of profit.

Look, I get it: investors want a return on the astronomical amounts of money they threw into the jet engines of these companies to fund their fast take-offs. As a consequence, the investors and lab leaders probably want the labs to keep their models closed, at least until they go public and can cash out 90 days later (or whenever the lock-up period is up). But you don’t get to dog-feed us doo-doo, then tell us to shut up and like it.

Now, what would actually be more fair and "safe" for humanity is, ironically, as Victor Taelin points out, what Anthropic's mortal enemy "closed-AI" says they're trying to do: make personal AGI for everyone.

As MIT's new study on AI risks put it, power centralization was one of the highest-severity AI risks experts identified, and it stayed above the 10% catastrophic-risk threshold even after “pragmatic mitigations.”

In other words: the safety problem is not only that dangerous people might get powerful AI. It is also that a tiny number of companies and governments might become the only ones allowed to use the powerful AI.

That is the part of the Fable 5 rollout that feels under-discussed. Anthropic is framing this as safety: the model could accelerate frontier AI development, so they are limiting access to those capabilities. Reasonable concern. But the practical effect is that researchers outside the frontier labs may be blocked from using the same tools the labs themselves can use internally.

That creates a very weird bargain: open academia trains the models, open-source codebases improve the tooling, and public papers teach the techniques. Then the resulting frontier models become private infrastructure with a terms-of-service moat around them, plus nerf-gun guardrails ready to fire on hobbyists and academics who are trying to contribute to the open literature by building and tinkering with AI research of their own.

The public contributes to the ladder. The labs climb it. Then the labs explain that letting everyone else climb too would be unsafe.

And again, there is a real safety case here. You do not want every teenager with a GPU budget getting help building cyberweapons, bioweapons, or recursive self-improvement pipelines. But the MIT study’s responsibility map cuts both ways. It found that the public and ordinary users are most vulnerable, while general-purpose AI developers and governance actors hold the most responsibility and leverage. The risk is that “responsibility” becomes “control,” and “control” becomes “only a few institutions get the full-strength tools.”

That is why OpenAI's “personal AGI for everyone” idea is actually really important, even if it sounds like marketing slop. A safer AI future probably does not mean every capability is equally available to every actor at every moment. But it also probably cannot mean five companies and a few governments decide who gets free-flowing cognition on tap and who gets the watered-down backwash version because their research is too interesting to help with.

IMO, the real safety target should be distributed capability with accountable guardrails, not centralized capability with private exceptions. That means audits, transparent routing, researcher access programs with due process, clear appeal paths, and public evidence for why certain categories are restricted. It also means not turning “AI safety” into a business model where the largest labs get to keep the sharpest tools for themselves.

MIT’s warning was that AI risk is an accountability problem: the people most exposed often have the least power to reduce the risk. The Fable controversy shows the mirror image: the people building the future may also have the least access to the tools shaping it, unless they work inside the handful of institutions deciding what counts as safe.

That's why I think one of the best ways to stop the risk of centralized control of power is to distribute power more equitably. To me, that looks more like Apple's approach than it looks like Anthropic's. Anthropic seems myopically obsessed with the idea of building what Marc Andreessen calls a God model that centralizes all AI into a single trillion-dollar model system. This gives them the ability to control who uses it, and for what.

Actually, what the world needs, and what is ultimately more financially responsible, is more like a swarm of bee models. Marc describes it well here:

"There's another scenario, what they sometimes call a race to the bottom, which is no, actually what happens is intelligence commoditizes."

"The analogy there would be maybe what's happened in the microchip market... A really good one costs a few thousand dollars, there's companies like Arm that sell them for pennies. And you can put a microchip in anything, and it's super cheap, and everybody does it."

"My guess basically is what we're about to see is something a lot like the microchip or the internet."

"You'll be able to basically summon intelligence at some level into any application you need for basically very cheap or ultimately basically the same as free. And then you'll have the God models in the background when you need the super genius, but for most things, you don't actually need a super genius."

That's the scary part. For most things, you don't need Fable. Or Mythos. You just need reliable little worker bees.

Now if you're Anthropic, you might think it's unfair that other AI researchers get to learn from your hard-earned (and hard-funded) research. But if you're, idk, everyone else who contributed to the open internet and the open research that created your models, you might think it's unfair that Anthropic gets to accrue and control all the value of their AI, while at the same time giving all the companies who can afford to pay their troll toll the ability to lay off millions of workers (what Dario himself described as a "white collar blood bath", where 50% of workers lose their jobs).

So from a non-Anthropic perspective, not only are they taking all the value of what was supposed to be open artificial intelligence that benefits humanity, they are taking away the ability of humanity to benefit themselves in the process. THAT might ACTUALLY be unfair. And these are supposed to be the "good guys of AI." Just wait until the anti-data center crowd who liked Anthropic's stance against the DoW hear about this... or maybe they don't care, because they too don't want anyone to progress frontier AI research... sigh... it's all so complicated.

Anyway, it's no secret to anyone who reads The Neuron that Claude is my favorite AI model. I'm just not sure Anthropic is my favorite AI company. In fact, I just so happen to fundamentally disagree with the many ways they try to hoard capital and control over their models. I understand it, but I disagree with it.

IMO, if you can't make your AI safe to use, don't release it. Keep working on it. The models you've released are plenty good. Stop trying to own the narrative and be the cool kids and ship new stuff that blows everyone's minds and just follow your own advice: slow things down, do good research and keep cooking.

I swear, the competitive raw nerve that drives this industry is itself a form of borderline AI psychosis.

Anyway, I'll end on a version of my very funny tweet that no one saw because I’m not verified:

IDK who should be more offended, the biologists who got banned from Fable just for being biologists, or the AI researchers who are being sent bupkis research tokens and then getting charged for it.

And that basically sums up my thoughts on this one. I'll let y'all know how it runs when I try it myself.
