All posts

AI Models

Claude Sonnet 5: Almost Opus 4.8, at a Third of the Price

Anthropic just shipped Sonnet 5, codename Fennec. The headline writes itself: this is the cheap model that finally feels expensive. Near-Opus-4.8 quality, a million-token context window, and a price tag that makes running agents all day actually affordable.

Jithin Kumar PalepuJune 30, 202611 min read

On June 30, 2026, Anthropic released Claude Sonnet 5 (internal codename Fennec), and the one-sentence version is this: it does most of what Opus 4.8 does, for roughly a third of the money. That is the entire pitch, and for the kind of always-on agent work people actually build today, it is a bigger deal than another point on a benchmark.

Anthropic calls it “the most agentic Sonnet model yet,” which is the sort of phrase you should normally squint at. But the numbers back it up. Sonnet 5 plans, it drives browsers and terminals, and it runs long autonomous loops at a level that, a few months ago, you needed a flagship model to touch. The Sonnet tier used to be the place you went when you wanted cheap and fast and were willing to give up some smarts. Sonnet 5 mostly removes the third part of that trade.

The cheap tier just got smart enough that “use the cheap one” stopped being a compromise for most agent workloads.

What actually shipped?

Sonnet 5 is available everywhere from day one. It is the default model on the Free and Pro plans, and it is live for Max, Team, and Enterprise users. Developers call it through the API with the identifier claude-sonnet-5, so dropping it into an existing app is a one-line change. There is nothing to migrate and no new SDK to learn.

The “most agentic Sonnet” framing comes down to three things you will actually feel. It makes better plans before it starts swinging. It uses tools, browsers, terminals, file systems, more reliably across a long chain without losing the thread. And it can keep going on a multi-step task without the mid-run collapse that used to make cheaper models risky to leave alone. For anyone wiring up agent loops, that last point is the one that changes what you are willing to hand off.

How close to Opus 4.8 is it, really?

Closer than the price gap suggests, which is the whole story. Sonnet 5 posts 82.1% on SWE-bench, a serious agentic-coding result for a mid-tier model. Anthropic's own framing is that performance is “close to” Opus 4.8 while costing far less, and on the agentic-coding eval the company highlighted, the gap is real but narrow: Sonnet 5 lands at 63.2% where Opus 4.8 sits at 69.2% and the previous Sonnet 4.6 managed 58.1%.

BenchmarkSonnet 5Opus 4.8Sonnet 4.6
Agentic coding (Anthropic eval)63.269.258.1
SWE-bench (agentic coding)82.1
Anthropic-reported scores, June 2026. Source: anthropic.com.

Six points on a coding eval is not nothing, and Anthropic is honest that Opus 4.8 still buys you greater accuracy when you are willing to pay for it. But here is the framing that matters: Opus 4.8 costs about 2.5x more per token. So the question is not “is Sonnet 5 as good as Opus?” It is “is the last six points worth paying 2.5x for, on this task?” For a huge slice of real work, browsing, routine coding, knowledge tasks, customer-facing agents, the answer is no, and that is exactly the gap Anthropic is selling into.

The right question is never “is the cheap model as good?” It is “is the difference worth the multiplier?” On Sonnet 5, usually it isn't.

The sleeper feature: a 1M-token context window

A million tokens, on a Sonnet

Sonnet 5 ships with a 1 million token context window, matching Opus 4.8 and a major jump for the mid-tier line. This is the quietly huge part. Long context is most valuable precisely on the model you run constantly, the one chewing through whole repos, long transcripts, or a day of agent history. Putting a million-token window on the cheap, fast model means you can do whole-codebase reasoning and giant-document work without reaching for the expensive flagship every time.

Why it matters Long-context used to be a flagship-only perk. Now the cheap tier has it too.

It pairs naturally with patterns we have written about before, like feeding an agent raw files instead of a vector store. If you have read our piece on direct corpus interaction, a cheap model with a million-token window is exactly the engine that approach wants.

The catch nobody mentions: a new tokenizer

Here is the asterisk on all that pricing math. Sonnet 5 ships with an updated tokenizer that chops text into tokens differently from its predecessors. Anthropic notes the same text can come out to roughly 1.0x to 1.35x the token count you would have gotten before. Read that slowly, because it matters: a lower per-token price does not automatically mean a lower bill if each request now counts more tokens.

Pricing, and the clock on the discount

This is the part that earns the headline. Sonnet 5 launches with introductory pricing of $2 per million input tokens and $10 per million output tokens, and that rate runs through August 31, 2026. After that it settles to its standard $3 in / $15 out. Either way, set against Opus 4.8's $5 in / $25 out, you are looking at roughly a third to well under half the per-token cost for a model that does most of the same work.

Sonnet 5 (intro)

$2 / M in

$10 / M out

Through Aug 31, 2026

Sonnet 5 (standard)

$3 / M in

$15 / M out

After the window

Opus 4.8

$5 / M in

$25 / M out

The flagship

There is a strategic read here too. Multiple outlets framed the steep, time-boxed discount as Anthropic playing for volume and developer mindshare as it races toward an IPO. Whatever the motive, the practical effect for you is a two-month window where running serious agent workloads on a near-flagship model is unusually cheap. If you have a migration in mind, the calendar is part of the decision.

The safety gains that matter for agents

The alignment numbers are the part that should interest anyone putting a model near production, and they moved in the right direction. Against Sonnet 4.6, Sonnet 5 shows a lower rate of “undesirable behaviors” like cooperating with misuse and deception. It is better at refusing malicious requests and at sidestepping prompt-injection and hijack attempts, the exact failure mode that gets scary the moment your model is driving a browser or a terminal. It also hallucinates less and is less sycophantic than its predecessor.

So, should you switch?

For most workloads, make Sonnet 5 your default and reach up to Opus 4.8 only when a task earns it. That inverts the old habit of defaulting to the flagship and reaching down to save money. The reasons line up cleanly: near-Opus quality, a million-token context, materially better safety, and a price that is a third of the flagship's during the intro window. Route the genuinely hard, accuracy-critical jobs, the gnarly refactors, the high-stakes reasoning, to Opus, and let Sonnet 5 carry the everyday volume.

The two things to verify first: run your real prompts through the new tokenizer so the cost math is honest, and benchmark the six-point coding gap on your tasks rather than Anthropic's. If your work lives in that gap, pay for Opus. If it doesn't, and for most teams it doesn't, Sonnet 5 is the new default driver. And if it is raw speed and inference cost you care about, it is worth reading how the other side of the stack is moving too, in our piece on speculative decoding and DeepSeek DSpark.

Everything that matters in AI,
straight to your inbox.

Join 12,000+ readers — daily, free, no spam.