Anthropic Releases Opus 4.6
Anthropic released Claude Opus 4.6 on Thursday, the latest version of its most advanced AI model. Opus 4.5 launched only last November, and with 4.6 the company has sought to broaden its model's capabilities beyond coding and into a wider range of knowledge work.
Opus 4.6 is better at coding, sustains tasks for longer, and creates higher-quality professional work products. It posts gains across most major benchmarks, with a particularly strong showing on ARC-AGI-2, a test focused on problems that are easy for humans but hard for AI systems. On that test, Opus 4.6 nearly doubled Opus 4.5's score and outperformed both Gemini 3 Pro and GPT-5.2. It also beat GPT-5.2 on GDPval-AA, which measures performance on economically valuable knowledge work in finance, legal, and other domains. That said, Opus 4.6 shows small regressions in a couple of areas, including standardized software engineering tests.
Perhaps the most notable addition is what Anthropic calls "agent teams," which let multiple agents split larger tasks into parallel jobs. Scott White, Head of Product at Anthropic, compared the new feature to having a talented team of humans working for you, noting that the segmenting of agent responsibilities allows them "to coordinate in parallel [and work] faster." To demonstrate the feature, Anthropic researcher Nicholas Carlini tasked 16 agents with writing, from scratch, a C compiler capable of compiling the Linux kernel. Across nearly 2,000 Claude Code sessions, and at a cost of $20,000 in API usage, the agent team produced a 100,000-line compiler that can build Linux 6.9 on multiple architectures.
Opus 4.6 is the first Opus model to support a 1 million-token context window, available in beta through the company's developer platform. That's a 5x increase over the 200,000-token window in Opus 4.5. The upgrade allows the model to process up to 1,500 pages of text, 30,000 lines of code, or over an hour of video in a single prompt.
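As a rough check on those figures, the short Python sketch below works out the window ratio and the tokens per page and per line of code that the quoted capacities imply. The constant and function names are illustrative only, and the per-page and per-line values are back-of-the-envelope derivations from the numbers above, not figures published by Anthropic.

```python
# Context window sizes quoted in the article, in tokens.
OPUS_4_5_WINDOW = 200_000
OPUS_4_6_WINDOW = 1_000_000


def window_ratio(new_window: int, old_window: int) -> float:
    """Return how many times larger the new window is than the old one."""
    return new_window / old_window


def implied_tokens_per_unit(window_tokens: int, units: int) -> float:
    """Return the average tokens per unit (page, line) implied by a capacity claim."""
    return window_tokens / units


if __name__ == "__main__":
    print("Window ratio:", window_ratio(OPUS_4_6_WINDOW, OPUS_4_5_WINDOW))
    # Derived from "up to 1,500 pages of text" in a 1M-token window.
    print("Implied tokens per page:", round(implied_tokens_per_unit(OPUS_4_6_WINDOW, 1_500)))
    # Derived from "30,000 lines of code" in a 1M-token window.
    print("Implied tokens per line of code:", round(implied_tokens_per_unit(OPUS_4_6_WINDOW, 30_000)))
```

Actual token counts vary with the content and the tokenizer, so these averages are only a sanity check on the headline numbers.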
Opus 4.6 is also the first Anthropic model to use adaptive thinking, which lets it read contextual clues to decide how much effort to invest in a prompt. Developers get more control over this through an effort parameter, which lets them make explicit tradeoffs between quality, speed, and cost. Previously, the only choice was whether to enable or disable extended thinking. For API users, Claude can now use compaction to summarize context, allowing it to handle longer-running tasks without hitting its context limits.
The new version also integrates Claude directly into PowerPoint as a side panel. Previously, a user could tell Claude to create a PowerPoint deck, but the file would then have to be transferred to PowerPoint to edit the presentation. Now the presentation can be crafted within PowerPoint, with direct help from Claude.
In security testing, Anthropic's frontier red team gave Opus 4.6 access to Python and vulnerability analysis tools in a sandboxed environment but no specific instructions. Claude found more than 500 previously unknown (zero-day) vulnerabilities in open-source code, each validated by an Anthropic team member or outside security researcher. Axios reported that Claude uncovered a flaw in Ghostscript that could cause it to crash, along with buffer overflow flaws in OpenSC and CGIF.
"I think that we are now transitioning almost into vibe working."
That's how Scott White described the shift to CNBC. White told TechCrunch that Opus has evolved from a model that was highly capable in one particular domain, software development, into a program that could be "really useful for a broader set" of knowledge workers. "We noticed a lot of people who are not professional software developers using Claude Code simply because it was a really amazing engine to do tasks," he said.
That enterprise bet is working. Claude Code reached $1 billion in run-rate revenue only six months after becoming generally available in May 2025. Major enterprise deployments include Uber, Salesforce, Accenture, Spotify, Rakuten, Snowflake, Novo Nordisk, and Ramp. Anthropic now has over 300,000 paying business customers, which make up roughly 80% of its business.
Opus 4.6 is also rolling out in GitHub Copilot, where early testing shows it excels in agentic coding on especially hard tasks requiring planning and tool calling. It will be available to Copilot Pro, Pro+, Business, and Enterprise users.
Opus 4.6 is available on Claude for Pro, Max, Team, and Enterprise users. For developers, the model is available on the Claude Developer Platform as well as Amazon Bedrock, Google Cloud's Vertex AI, and Microsoft Foundry. Pricing remains the same as Opus 4.5: $5/$25 per million input/output tokens. US-only inference is also now an option, at a 10% premium.
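To make the pricing concrete, here is a minimal Python sketch that estimates the cost of a single request at the quoted rates. The function name and the example token counts are hypothetical, and the sketch assumes the 10% US-only premium applies equally to input and output tokens, which the announcement as summarized here does not spell out.

```python
# Prices quoted in the article, in US dollars per million tokens.
INPUT_PRICE_PER_MTOK = 5.00
OUTPUT_PRICE_PER_MTOK = 25.00

# ASSUMPTION: the 10% US-only premium is applied to the whole bill,
# i.e. to both input and output tokens.
US_ONLY_PREMIUM = 0.10


def estimate_cost(input_tokens: int, output_tokens: int, us_only: bool = False) -> float:
    """Estimate the dollar cost of one request at the quoted per-token rates."""
    cost = (
        (input_tokens / 1_000_000) * INPUT_PRICE_PER_MTOK
        + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_MTOK
    )
    if us_only:
        cost *= 1 + US_ONLY_PREMIUM
    return cost


if __name__ == "__main__":
    # Hypothetical request: 200,000 input tokens and 10,000 output tokens.
    print(f"Standard: ${estimate_cost(200_000, 10_000):.3f}")
    print(f"US-only:  ${estimate_cost(200_000, 10_000, us_only=True):.3f}")
```

The sketch ignores anything beyond the flat per-token rates, so treat its results as a lower-bound estimate rather than an invoice.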