Anthropic Dispatch: Opus 4.7 Drops and Your Coworker Got Smarter

The Upgrade You Feel Before You Read the Changelog

On April sixteenth, twenty twenty-six, Anthropic shipped Claude Opus four point seven. If you have been using Claude Code daily, you probably noticed something before you read the blog post. The model just started catching things it used to miss. Tasks you had to babysit suddenly came back done, and done right. That is not marketing copy. That is what the early testers actually reported, and it tracks with what people on the ground have been saying for weeks.

The headline numbers are real but they are not the interesting part. Cursor's benchmark went from fifty-eight percent on Opus four point six to over seventy percent on four point seven. Hex, the data platform, said that low-effort Opus four point seven matches medium-effort four point six. Read that again. You can literally ask less of the model and get the same quality output you used to get when you pushed harder. For anyone managing token budgets or rate limits, that is not a minor detail. That is your operating cost dropping while your output quality holds steady.

What Actually Changed Under the Hood

Anthropic describes Opus four point seven as a notable improvement in advanced software engineering, with particular gains on the most difficult tasks. The key phrase there is "most difficult." This is not about writing better hello-world scripts. This is about the long, gnarly, multi-file refactors where previous models would lose the thread halfway through and start hallucinating function signatures.

The model now devises ways to verify its own output before reporting back. That sentence from the announcement deserves its own moment. Previous Opus would confidently hand you broken code and move on. Four point seven apparently stops, checks itself, and tells you when something does not add up. Devin's CEO Scott Wu put it bluntly.

Claude Opus four point seven takes long-horizon autonomy to a new level. It works coherently for hours, pushes through hard problems rather than giving up, and unlocks a class of deep investigation work we could not reliably run before.

Hours. Not minutes. Hours of coherent autonomous work. If you have been running background agents or headless Claude Code sessions, this is the upgrade that makes those workflows actually reliable instead of theoretically possible.

The Vision Upgrade Nobody Is Talking About

Buried in the announcement is a detail that matters more than it sounds. Opus four point seven has substantially better vision with higher resolution image support. For most developers, that is a footnote. But if you are doing anything with visual content, with screenshots, with diagrams, with scanned documents, with captioning pipelines, this is material.

Think about what that means for multimodal workflows. Reading chemical structures, interpreting technical diagrams, understanding dense visual layouts. Solve Intelligence, which works on life sciences patents, called it a major improvement. And for anyone running local VLM captioning because managed APIs block person-captioning, higher resolution upstream means better training data downstream.

The Mythos Shadow

Here is the part Anthropic probably did not want to lead with but put right there in the announcement anyway. Opus four point seven is not their most powerful model. That title belongs to Claude Mythos Preview, which is still under limited release through Project Glasswing. Anthropic explicitly said they experimented with differentially reducing Opus four point seven's cyber capabilities compared to Mythos. They shipped it with automatic safeguards that detect and block prohibited cybersecurity uses.

This is Anthropic doing something genuinely unusual in the industry. They are publicly saying we have a stronger model, we are not releasing it broadly yet, and we are using this slightly less capable model as a testing ground for the safety measures we will eventually need on the bigger one. Whether you think that is responsible caution or competitive disadvantage depends on where you sit. But the transparency is notable.

Security researchers who want the full cyber capabilities can apply to the new Cyber Verification Program. Everyone else gets a model that is better at everything except breaking into systems. For most of us, that is the right trade.

What This Means If You Ship Things

The pricing did not change. Five dollars per million input tokens, twenty-five per million output. Same as four point six. So you get a meaningfully better model at the same cost, or the same quality model at lower effort and therefore lower token spend. Either way, the economics improved.

For Claude Code users on Max or Pro plans, this landed automatically. You did not have to opt in. Your coding assistant just got more reliable overnight. For API users, the model string is claude-opus-four-seven. Drop it in, run your evals, and see what moves.

The real test is not benchmarks though. The real test is whether you find yourself checking the model's work less often. Whether you can hand it the messy refactor and go make coffee. Whether the background session you kicked off at lunch is actually done and correct by the time you get back. That is what the early reports suggest, and that is what makes this a genuine step change rather than an incremental bump.