Anthropic released Claude Sonnet 5 on June 30, positioning it as the most agentic Sonnet model yet. The headline: near-Opus capability at Sonnet pricing. The model scores close to Opus 4.8 on agentic evaluations while costing roughly 40% less per token. For developers who have been waiting for agentic reliability to drop below the Opus price floor, this is the release that matters.
Sonnet 5 is available across all Anthropic plans today. It is the default model for Free and Pro users and accessible to Max, Team, and Enterprise subscribers. It also lands on Amazon Bedrock and the Claude Platform on AWS. Introductory pricing runs through August 31, 2026: $2 per million input tokens and $10 per million output tokens. After that, standard pricing kicks in at $3 and $15 respectively. Opus 4.8, for reference, costs $5 per million input tokens and $25 per million output tokens.
The pricing story is not just about cheaper tokens. It is about what those tokens buy. Anthropic’s own evaluations show Sonnet 5 matching Opus 4.8 on agentic search (BrowseComp) and computer use (OSWorld-Verified) at higher effort levels. The cost-performance curves published in the release show Sonnet 5 covering a wider range of tradeoffs than Opus 4.8, with the orange line (Sonnet 5) consistently above the gray line (Sonnet 4.6) and sometimes touching the yellow line (Opus 4.8). At medium effort, the efficiency gain is substantial.
This is the first Sonnet model of Anthropic’s latest generation. The previous Sonnet generation — 3.5, 3.6, 3.7 — defined the agentic AI era for many developers. They showed that a mid-tier model could handle coding and tool use well enough to build real products. But over the past year, the clearest agentic gains came from Opus-class models. Sonnet 5 narrows that gap. Anthropic says its performance is “close to that of Opus 4.8” on reasoning, tool use, coding, and knowledge work.
The early access feedback in the release is unusually concrete. Lovable’s co-founder notes that Sonnet 5 “refuses unsafe requests cleanly and consistently” and calls a model that knows when to say no as important as one that knows how to build. A tester from an unnamed company describes asking Sonnet 5 to investigate a bug: “Unprompted, it wrote a reproducing test, implemented the fix, then stashed it to confirm the bug came back without the change. All in a single pass.” Another tester from a legal AI startup says Sonnet 5 “sits on the Pareto frontier for Eve’s plaintiff-law tasks” and that the price-to-performance ratio made migration easy.
These quotes point to a specific capability that matters more than benchmark scores: follow-through. Multiple testers describe Sonnet 5 finishing tasks where previous Sonnet models would stop short. It checks its own output without being asked. It holds a plan across stages. For developers building autonomous agents, this is the difference between a demo and a production system.
Safety evaluations show a mixed picture. Sonnet 5 is an improvement over Sonnet 4.6 on agentic safety: better at refusing malicious requests, better at resisting prompt injection hijacks, lower rates of hallucination and sycophancy. On Anthropic’s automated behavioral audit, which tests misaligned behaviors like cooperation with misuse and deception, Sonnet 5 scored lower overall than Sonnet 4.6. But it scored higher than Opus 4.8 and Claude Mythos Preview on the same audit. The model also shows slightly higher partial success rates on cybersecurity exploit development than Sonnet 4.6, though it never produced a full working exploit in evaluations. Anthropic has deployed cyber safeguards by default, matching the safeguards on Opus 4.7 and 4.8.
The AWS launch is significant. Amazon Bedrock is where a large share of enterprise AI inference happens. The AWS blog post emphasizes that Sonnet 5 is designed for “coding, agents, and everyday professional work at scale.” It calls out specific industry use cases: financial services teams using Sonnet 5 for spreadsheet modeling and financial analysis agents that audit their own numbers; productivity workflows like report building and document drafting; browser and desktop automation via computer use. The post includes code examples for invoking Sonnet 5 through the Bedrock Converse API and the Anthropic Messages API via the Bedrock Mantle endpoint.
What this means for the AI economy is straightforward. The Sonnet line has historically been the volume driver for Anthropic. It is the model most developers reach for first. If Sonnet 5 delivers near-Opus agentic performance at Sonnet pricing, it compresses the market. Teams that previously needed Opus for reliable multi-step agentic work can now use Sonnet 5. Teams that were on Sonnet 4.6 get a meaningful upgrade at the same price point. The introductory pricing makes the decision even easier for the next two months.
The competitive pressure on other model providers is real. OpenAI’s GPT-4o and Google’s Gemini 2.5 Pro both sit in a similar price-performance band. Anthropic is effectively saying: you can get Opus-level agentic behavior without paying Opus prices. If that claim holds in production, it reshapes the cost calculus for building autonomous systems. The question is whether Sonnet 5’s agentic reliability holds up under sustained, multi-hour workflows — the kind that stress test a model’s ability to stay on plan without drifting or collapsing.
Anthropic has published a system card with full evaluation details. The changelog at the bottom of the release notes an important correction: the original BrowseComp chart used a simpler methodology that underestimated Sonnet 5’s performance. The updated chart uses a 10 million token budget with compaction and programmatic tool calling. That is a transparent correction, and it matters because BrowseComp is one of the key agentic search evaluations.
The model is available now. The introductory pricing window runs through August 31. For developers building agentic systems, the next two months are a low-risk window to test whether Sonnet 5’s agentic follow-through matches the benchmarks in production.