Claude Opus 4.7 Is Here — Your Godmode Skills Just Got Sharper
🚀 Shipped: Anthropic released Claude Opus 4.7 on April 16.
🎯 The unlock: 4.7 reads skill rubrics more literally and holds focus across longer runs.
✅ What you do: switch the model; every skill in the suite runs sharper for free.
[Chart: instruction-following across 100 skill tasks]
What shipped on April 16
Two days ago Anthropic dropped Claude Opus 4.7. We re-ran our own skills against it yesterday, and rubric adherence was tighter on every run.
The headline changes are less about raw benchmarks and more about behaviour: how the model treats instructions, how long it holds focus, and how often it checks its own work before returning.
Literal instruction following
4.7 takes prompts literally. Where 4.6 would interpret loosely or quietly skip parts, 4.7 does exactly what the letter says.
Long-run consistency
The model holds rigour end-to-end on complex, multi-step tasks. Plus: it now devises ways to verify its own outputs before returning them.
Task Budgets & xhigh
A new xhigh effort level between high and max, plus task budgets (beta) that guide token allocation across long runs.
3× bigger vision + /ultrareview
Images up to 2,576 px on the long edge, more than triple the previous limit. New /ultrareview command for dedicated code review.
On the numbers front: a 13% lift in tasks resolved on Anthropic's 93-task coding benchmark, 3× more production tasks resolved on Rakuten-SWE-Bench than Opus 4.6, and a jump from 54.5% to 98.5% on the visual-acuity benchmark. Pricing is unchanged at $5 / $25 per million tokens.
[Chart: where each model fails; task distribution across 100 tasks]
Why a better model makes skills better, not redundant
Godmode is a suite of Claude Code skills — structured protocols that force the model through ideation, research, build, test, hardening, and polish on every task. A skill is a set of hard rules we hand the model at the start of a run.
A raw-prompt user gains some benefit from Opus 4.7 — the model is just smarter. But someone running a skill gains far more, because the rules we were already enforcing are now enforced harder.
Think of it like this: a skill is a checklist. Opus 4.7 is a contractor who actually reads the checklist instead of skimming it. Same checklist. Different house.
Skill-by-skill — what gets better today
Four headline skills, four different shapes of upgrade. The pattern is the same across all of them: fewer loop re-runs, tighter adherence to each phase, better self-checks before the model hands anything back.
| Skill | Before 4.7 | On Opus 4.7 |
|---|---|---|
| Godmode | 8-layer protocol; sometimes drifted on long runs | Layers hold end-to-end without fraying |
| Godmode+ | Planning + parallel sub-agents, single effort level | xhigh tunes each phase's effort separately |
| Evolution | Self-scoring occasionally generous on weak rationale | Self-verification catches its own loose reasoning |
| One-Shot | Loop frequently re-ran because one dimension scored under threshold | More first-pass successes; fewer loop iterations |
| Gas · Ignition · Blog-Post-GM · Market-Research · Evo-Loop | Briefs, templates, and rubrics followed somewhat loosely | All follow briefs, templates, and rubrics more literally; template fidelity goes up across the board |
The literal-instruction-following unlock
The single biggest behavioural change is how the model reads rubrics. One-Shot's scoring loop is the cleanest example — it depends on a numeric threshold being enforced strictly across six dimensions.
🤖 Opus 4.6: scored 4 of 6 strictly; looser on the other 2.
↓
🔄 Loop re-ran because one dimension quietly scored under the threshold.
↓
🤖 Opus 4.7: all 6 dimensions scored against the exact threshold.
↓
✅ Loop exits on first pass — token spend drops.
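For intuition, here is a minimal sketch of the kind of threshold gate One-Shot's loop depends on. The dimension names and pass mark are illustrative assumptions, not the skill's actual rubric:

```typescript
// Minimal sketch of a strict scoring gate. Dimension names and the
// threshold are illustrative assumptions, not One-Shot's real rubric.
const THRESHOLD = 8; // hypothetical pass mark per dimension

type Scores = Record<string, number>;

function gatePasses(scores: Scores): boolean {
  // Literal enforcement: every dimension must meet the exact threshold.
  // No "close enough" on any single dimension.
  return Object.values(scores).every((s) => s >= THRESHOLD);
}

const run: Scores = {
  correctness: 9, clarity: 8, structure: 8,
  coverage: 9, style: 8, polish: 7, // one dimension under threshold
};

console.log(gatePasses(run) ? "exit loop" : "re-run loop"); // "re-run loop"
```

Under a literal reader, `every` means every: the single dimension at 7 forces a re-run instead of being waved through.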
[Interactive demo: same skill, three models, character-by-character divergence]
The sample skill from the demo:
1. Read package.json
2. Find the "version" field
3. Format it as: v{version}
4. Print the formatted result
5. Exit cleanly with code 0
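Read literally, those five steps admit essentially one implementation. A minimal Node sketch; the file path and the absence of error handling are our own assumptions:

```typescript
// Literal implementation of the five-step sample skill (Node).
// File path and lack of error handling are assumptions, not part of the spec.
import { readFileSync } from "node:fs";

const pkg = JSON.parse(readFileSync("package.json", "utf8")); // step 1: read package.json
const version: string = pkg.version;                          // step 2: find the "version" field
const formatted = `v${version}`;                              // step 3: format as v{version}
console.log(formatted);                                       // step 4: print the result
process.exit(0);                                              // step 5: exit cleanly with code 0
```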
Multiply that across every rubric, checklist, and template in the suite. Tighter rule-following compounds in a skill that has 15+ distinct rules stacked on top of each other.
The one gotcha — retune loose prompts
The flip side of literal reading: prompts that worked by accident on 4.6 now do exactly what they literally say, which may not be what you meant.
Do
- Re-read every rubric; make thresholds numeric.
- Name outputs explicitly ("return JSON with keys X, Y, Z"); see the sketch after these lists.
- Replace soft words: should becomes MUST.
Don't
- Assume old loose wording still works.
- Write "make it good". "Good" isn't a spec.
- Leave ambiguous success criteria in finish-line gates.
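To make "name outputs explicitly" concrete, here is a minimal sketch of a finish-line gate written as a machine-checkable predicate instead of prose. The required key names are our own illustrative assumptions:

```typescript
// Hypothetical finish-line gate: instead of "make it good", list the
// exact keys the output must carry. Key names here are illustrative.
const REQUIRED_KEYS = ["summary", "score", "citations"] as const;

function passesGate(output: unknown): boolean {
  if (typeof output !== "object" || output === null) return false;
  const keys = new Set(Object.keys(output));
  return REQUIRED_KEYS.every((k) => keys.has(k));
}

console.log(passesGate({ summary: "…", score: 9, citations: [] })); // true
console.log(passesGate({ summary: "…" }));                          // false
```

A gate like this leaves nothing to interpret: either the keys are there, or the loop re-runs.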
[Interactive demo: edit your own skill and watch the three models diverge]
Vanilla Opus 4.7 vs Opus 4.7 + Godmode
A sharper model still needs structure. Without a skill, Claude picks its own process every run — sometimes it researches, sometimes it improvises; sometimes it tests, sometimes it ships.
| Task stage | Vanilla Opus 4.7 | Opus 4.7 + Godmode |
|---|---|---|
| Research | Best-effort — depends on the prompt | Enforced research phase with verified assets |
| Verify | Model decides when it's done | 9-phase scoring loop + visual audit |
| Polish | Often skipped on long runs | Dedicated polish phase runs last, every time |
Try the upgrade today
Two paths, depending on where you're starting from:
If you already run a Godmode skill
⚙️ 1. Switch your model to claude-opus-4-7
↓
🧪 2. Run your favourite skill on a task you've run before
↓
📊 3. Compare scores, loop iterations, first-pass success
If you haven't tried a Godmode skill yet
⬇️ 1. Download Godmode Lite (free, linked below)
↓
📁 2. Drop it into ~/.claude/skills/
↓
🚀 3. Run the 4-layer protocol on any task and feel the difference
Download Godmode Lite and run it on your next task
Free 4-layer protocol. No signup. Drops into Claude Code in 60 seconds — and runs sharper on Opus 4.7 than it ever did on 4.6.
Download Lite (Free) · See the full suite →