Claude Opus 4.7 Is Here — Your Godmode Skills Just Got Sharper
🚀 Shipped: Anthropic released Claude Opus 4.7 on April 16.
🎯 The unlock: 4.7 reads skill rubrics more literally and holds focus across longer runs.
✅ What you do: switch the model; every skill in the suite runs sharper for free.
[Chart: instruction-following across 100 skill tasks]
What shipped on April 16
Two days ago Anthropic dropped Claude Opus 4.7. We re-ran our own skills against it yesterday, and rubric adherence was tighter on every run.
The headline changes are less about raw benchmarks and more about behaviour: how the model treats instructions, how long it holds focus, and how often it checks its own work before returning.
Literal instruction following
4.7 takes prompts literally. Where 4.6 would interpret loosely or quietly skip parts, 4.7 does exactly what the letter says.
Long-run consistency
The model holds rigour end-to-end on complex, multi-step tasks. Plus: it now devises ways to verify its own outputs before returning them.
Task Budgets & xhigh
A new xhigh effort level between high and max, plus task budgets (beta) that guide token allocation across long runs.
3× bigger vision + /ultrareview
Images up to 2,576 px on the long edge, more than triple the previous limit. New /ultrareview command for dedicated code review.
On the numbers front: a 13% lift in tasks resolved on Anthropic's 93-task coding benchmark, 3× more production tasks resolved on Rakuten-SWE-Bench than Opus 4.6, and a jump from 54.5% to 98.5% on the visual-acuity benchmark. Pricing is unchanged at $5 / $25 per million tokens.
[Chart: where each model fails; task distribution across 100 tasks]
Why a better model makes skills better, not redundant
Godmode is a suite of Claude Code skills — structured protocols that force the model through ideation, research, build, test, hardening, and polish on every task. A skill is a set of hard rules we hand the model at the start of a run.
A raw-prompt user gains some benefit from Opus 4.7 — the model is just smarter. But someone running a skill gains far more, because the rules we were already enforcing are now enforced harder.
Think of it like this: a skill is a checklist. Opus 4.7 is a contractor who actually reads the checklist instead of skimming it. Same checklist. Different house.
Skill-by-skill — what gets better today
Four headline skills, four different shapes of upgrade. The pattern is the same across all of them: fewer loop re-runs, tighter adherence to each phase, better self-checks before the model hands anything back.
| Skill | Before 4.7 | On Opus 4.7 |
|---|---|---|
| Godmode | 8-layer protocol; sometimes drifted on long runs | Layers hold end-to-end without fraying |
| Godmode+ | Planning + parallel sub-agents, single effort level | xhigh tunes each phase's effort separately |
| Evolution | Self-scoring occasionally generous on weak rationale | Self-verification catches its own loose reasoning |
| One-Shot | Loop frequently re-ran because one dimension scored under threshold | More first-pass successes; fewer loop iterations |
| Gas · Ignition · Blog-Post-GM · Market-Research · Evo-Loop | Briefs, templates, and rubrics followed somewhat loosely | All follow briefs, templates, and rubrics more literally; template fidelity goes up across the board |
The literal-instruction-following unlock
The single biggest behavioural change is how the model reads rubrics. One-Shot's scoring loop is the cleanest example — it depends on a numeric threshold being enforced strictly across six dimensions.
🤖 Opus 4.6: scored 4 of 6 strictly; looser on the other 2.
↓
🔄 Loop re-ran because one dimension quietly scored under the threshold.
↓
🤖 Opus 4.7: all 6 dimensions scored against the exact threshold.
↓
✅ Loop exits on first pass — token spend drops.
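For intuition, here is a minimal sketch of the kind of threshold gate One-Shot's loop depends on. The dimension names and pass mark are illustrative assumptions, not the skill's actual rubric:

```typescript
// Minimal sketch of a strict scoring gate. Dimension names and the
// threshold are illustrative assumptions, not One-Shot's real rubric.
const THRESHOLD = 8; // hypothetical pass mark per dimension

type Scores = Record<string, number>;

function gatePasses(scores: Scores): boolean {
  // Literal enforcement: every dimension must meet the exact threshold.
  // No "close enough" on any single dimension.
  return Object.values(scores).every((s) => s >= THRESHOLD);
}

const run: Scores = {
  correctness: 9, clarity: 8, structure: 8,
  coverage: 9, style: 8, polish: 7, // one dimension under threshold
};

console.log(gatePasses(run) ? "exit loop" : "re-run loop"); // "re-run loop"
```

Under a literal reader, `every` means every: the single dimension at 7 forces a re-run instead of being waved through.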
[Interactive demo: same skill, three models, character-by-character divergence]
The sample skill from the demo:
1. Read package.json
2. Find the "version" field
3. Format it as: v{version}
4. Print the formatted result
5. Exit cleanly with code 0
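Read literally, those five steps admit essentially one implementation. A minimal Node sketch; the file path and the absence of error handling are our own assumptions:

```typescript
// Literal implementation of the five-step sample skill (Node).
// File path and lack of error handling are assumptions, not part of the spec.
import { readFileSync } from "node:fs";

const pkg = JSON.parse(readFileSync("package.json", "utf8")); // step 1: read package.json
const version: string = pkg.version;                          // step 2: find the "version" field
const formatted = `v${version}`;                              // step 3: format as v{version}
console.log(formatted);                                       // step 4: print the result
process.exit(0);                                              // step 5: exit cleanly with code 0
```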
Multiply that across every rubric, checklist, and template in the suite. Tighter rule-following compounds in a skill that has 15+ distinct rules stacked on top of each other.
The one gotcha — retune loose prompts
The flip side of literal reading: prompts that worked by accident on 4.6 now do exactly what they literally say, which may not be what you meant.
Do
- Re-read every rubric; make thresholds numeric.
- Name outputs explicitly ("return JSON with keys X, Y, Z"); see the sketch after these lists.
- Replace soft words: should becomes MUST.
Don't
- Assume old loose wording still works.
- Write "make it good". "Good" isn't a spec.
- Leave ambiguous success criteria in finish-line gates.
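To make "name outputs explicitly" concrete, here is a minimal sketch of a finish-line gate written as a machine-checkable predicate instead of prose. The required key names are our own illustrative assumptions:

```typescript
// Hypothetical finish-line gate: instead of "make it good", list the
// exact keys the output must carry. Key names here are illustrative.
const REQUIRED_KEYS = ["summary", "score", "citations"] as const;

function passesGate(output: unknown): boolean {
  if (typeof output !== "object" || output === null) return false;
  const keys = new Set(Object.keys(output));
  return REQUIRED_KEYS.every((k) => keys.has(k));
}

console.log(passesGate({ summary: "…", score: 9, citations: [] })); // true
console.log(passesGate({ summary: "…" }));                          // false
```

A gate like this leaves nothing to interpret: either the keys are there, or the loop re-runs.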
[Interactive demo: edit your own skill and watch the three models diverge]
Vanilla Opus 4.7 vs Opus 4.7 + Godmode
A sharper model still needs structure. Without a skill, Claude picks its own process every run — sometimes it researches, sometimes it improvises; sometimes it tests, sometimes it ships.
| Task stage | Vanilla Opus 4.7 | Opus 4.7 + Godmode |
|---|---|---|
| Research | Best-effort — depends on the prompt | Enforced research phase with verified assets |
| Verify | Model decides when it's done | 9-phase scoring loop + visual audit |
| Polish | Often skipped on long runs | Dedicated polish phase runs last, every time |
Try the upgrade today
Two paths, depending on where you're starting from:
If you already run a Godmode skill
⚙️ 1. Switch your model to claude-opus-4-7
↓
🧪 2. Run your favourite skill on a task you've run before
↓
📊 3. Compare scores, loop iterations, first-pass success
If you haven't tried a Godmode skill yet
⬇️ 1. Download Godmode Lite (free, linked below)
↓
📁 2. Drop it into ~/.claude/skills/
↓
🚀 3. Run the 4-layer protocol on any task and feel the difference
Download Godmode Lite and run it on your next task
Free 4-layer protocol. No signup. Drops into Claude Code in 60 seconds — and runs sharper on Opus 4.7 than it ever did on 4.6.
Download Lite (Free) · See the full suite →