Introducing Claude Opus 4.8

What does it look like when an AI model can confidently write production-ready code, orchestrate multi-tool agents across hours-long sessions, and catch its own mistakes before you do? On May 28, 2026, Anthropic answered that question with Claude Opus 4.8—and if you're building software today, you should pay attention.

What Is Claude Opus 4.8?

Claude Opus 4.8 is Anthropic's most capable generally available model to date. It's a hybrid reasoning model—meaning it blends fast, intuitive responses with deep, deliberate thinking—and features a 1 million token context window. It's the successor to Opus 4.7 (April 2026) and brings across-the-board improvements for coding, agentic workflows, and enterprise document work.

The headline feature is adaptive thinking: Opus 4.8 automatically calibrates how much reasoning effort to spend on a problem. A simple refactor gets a quick, direct response. A complex multi-service architecture migration gets careful, step-by-step deliberation. You pay for intelligence where it matters, not where it doesn't.

Pricing starts at $5 / million input tokens and $25 / million output tokens, with up to 90% savings via prompt caching and 50% via batch processing. Use claude-opus-4-8 via the Claude API, AWS, Google Cloud, or Microsoft Foundry.

Advanced Coding: Senior-Engineer-Level Output

The biggest upgrade developers will feel immediately is in coding quality. Opus 4.8 doesn't just write code—it plans carefully before it writes, sustains effort across large codebases, and self-corrects before handing work back to you.

Cursor's co-founder Michael Truell put it directly: "On CursorBench, Claude Opus 4.8 exceeds prior Opus models across every effort level. Tool calling is meaningfully more efficient, using fewer steps for the same intelligence."

For developers using Tarsk, this matters a lot. When you spin up a new thread and delegate a complex feature to an AI agent, Opus 4.8 will plan the implementation, navigate your codebase, and push through edge cases—without you needing to hold its hand through each step.

AI Agents: Built for Long-Running Autonomy

Short bursts of AI assistance are easy. The hard problem is reliability across a 30-minute, multi-tool workflow where one wrong decision cascades into a broken pipeline. Opus 4.8 was explicitly built for this.

Persistent memory across sessions — the model carries context from previous interactions to learn how you and your codebase work.
Deliberate planning — it maps out a strategy before executing, reducing mid-task derailments.
Consistent tool use — Devin's CEO Scott Wu noted it "uses tools cleanly and follows instructions with the consistency our autonomous engineering workloads need to keep running unattended."

On one Super-Agent benchmark, Opus 4.8 was the only model to complete every case end-to-end, beating prior Opus models and GPT-5.5 at parity on cost.

Enterprise Workflows: Context That Doesn't Drop

With a 1M token context window and the ability to carry context across sessions, Opus 4.8 can manage multi-day projects end-to-end. For legal, financial, and document-heavy workloads:

It scored the highest on legal agent benchmarks—the first model to break 10% on the all-pass standard (Leya).
On financial document workflows, it delivered the same quality as Opus 4.7 with better citation precision and token efficiency (Hebbia).
For creative work, Staff Writer Katie Parrott described it as "faster, easier to collaborate with, and better at carrying context and style direction across a long session."

How It Fits Into Tarsk

Tarsk is built around the idea that your AI agent should be as capable as your best collaborator. With Opus 4.8 as the engine, you get a model that:

Runs reliably inside long-lived threads without losing context
Delegates work across tools with minimal hand-holding
Asks the right questions before making big changes
Pushes back when a plan isn't sound

Tom Pritchard, Staff Engineer at a leading AI coding tool, summarized it well: "Claude Opus 4.8 has noticeably better judgment. It asks the right questions, catches its own mistakes, pushes back when a plan isn't sound, and builds up confidence around complex, multi-service explorations before making big changes."

You can use Claude Opus 4.8 in Tarsk today via the Anthropic provider. Head to Settings → Providers, connect your API key, and select claude-opus-4-8 as your model.

Key Takeaways

Adaptive thinking means the model spends reasoning effort proportional to task complexity—faster for simple work, deeper for hard problems.

Production-grade code output with self-correction and multi-service awareness lets senior engineers delegate their hardest work with confidence.

Reliable agentic autonomy across long-running sessions makes Opus 4.8 the go-to choice for complex, multi-step automated workflows.

Try Claude Opus 4.8 in Tarsk

Download Tarsk and connect your Anthropic API key to start delegating your hardest coding tasks to the most capable model available.

Download Tarsk Read the Docs