Built by /blog-post-GM — a Claude Code skill we evolved with our own Evolution engine to write every post in the Godmode voice.

Engineering April 7, 2026 ⏱️ 5 min read

One Skill to Rule Them All: How We Made One-Shot Universal

TL;DR

🎯 What: One-Shot Scripts now handles code, writing, audits, and migrations equally well
📈 Impact: Writing task scores jumped from 0.48 to 0.75 — code stayed at 0.78
🧬 Method: 14 evo-loop iterations, 6 accepted mutations, zero regressions
🏗️ Bonus: Evolution + Evo-Loop both decomposed into the same scripts/ architecture

ONE PIPELINE · THREE LANES · ROUTE FORKS BY TASK TYPE

⚠️ The Problem: A Skill That Only Spoke Code

One-Shot Scripts was our modular execution protocol — same power as One-Shot Beta, broken into individually-evolvable script files. It handled code tasks beautifully. Ask it to build an API, fix a bug, or refactor a module? Flawless.

Ask it to write an ADR, document an API, or draft a postmortem? It fell apart. Seven of eight scoring dimensions failed on writing tasks. The composite score was 0.48 — an F.

Core insight: The skill had detailed checklists for code (handle null, test concurrent access, check for N+1 queries) but a single line for writing: "Produce complete deliverable." That's like giving a chef a 47-step recipe for steak and a Post-it note that says "also make dessert."

🔬 The Diagnosis: Where Writing Tasks Died

We pointed Evo-Loop at One-Shot Scripts and ran it against 12 benchmarks — 3 writing tasks, 2 audits, 7 code tasks. The first iteration exposed the damage:

Phase	Code Task	Writing Task
Phase 2 — Build	14 specific checklist items	1 line: "produce complete deliverable"
Phase 4 — Harden	OWASP, N+1, race conditions, ReDoS	1 line: "hostile-reviewer pass"
Phase 5 — Document	Function docs, README, changelog	Dead — output IS the document
Phase 6 — Verify	Syntax check, run app, trace flow	Dead — no writing equivalent
Delivery Gate	12 web/UI checks	Zero writing checks

Five of nine phases were either dead or useless for non-code tasks. The skill wasn't broken — it was incomplete.

🧬 The Fix: 6 Mutations in 14 Iterations

Evo-Loop ran the skill against benchmarks, scored it, flagged weaknesses, and proposed mutations. Each mutation required human approval before applying. Here's what changed:

📝 Phase 2: Writing Block

Research the requirements, outline before drafting, draft with specifics, check structural completeness, and verify sources. This block parallels the code checklist in depth.

🔍 Phase 4: Stress Test

Play the hostile reviewer: check for logical inconsistencies, inaccurate claims, completeness gaps, bias, and audience mismatch, and verify every code sample.

✅ Phase 6: Writing Verify

Requirement-by-requirement coverage audit. Internal consistency check. Audience calibration. Readability pass. Full start-to-finish read as the intended audience.

🗺️ SKILL.md: Routing Table

Explicit task-type routing. Each phase marked Critical, Light, or Skip per task type. No more guessing which phases apply to writing vs. code.
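As a rough illustration of what such a routing table might look like in code, here is a minimal Python sketch. The phase names come from the diagnosis table above; the specific Critical/Light/Skip assignments, the `ROUTING` dict, and the `phases_to_run` helper are assumptions for illustration, since the actual table lives in SKILL.md and is not reproduced in this post.

```python
from enum import Enum


class Level(Enum):
    CRITICAL = "critical"
    LIGHT = "light"
    SKIP = "skip"


# Hypothetical assignments: the post names the three levels but does not
# publish the real table, so these values are illustrative only.
ROUTING: dict[str, dict[str, Level]] = {
    "code": {
        "build": Level.CRITICAL,
        "harden": Level.CRITICAL,
        "document": Level.LIGHT,
        "verify": Level.CRITICAL,
    },
    "writing": {
        "build": Level.CRITICAL,
        "harden": Level.CRITICAL,
        "document": Level.SKIP,  # the output already is the document
        "verify": Level.CRITICAL,
    },
}


def phases_to_run(task_type: str) -> list[tuple[str, Level]]:
    """Return the phases for a task type, in order, dropping skipped ones."""
    try:
        table = ROUTING[task_type]
    except KeyError:
        raise ValueError(f"no route for task type {task_type!r}") from None
    return [(phase, level) for phase, level in table.items() if level is not Level.SKIP]


if __name__ == "__main__":
    for phase, level in phases_to_run("writing"):
        print(f"{phase}: {level.value}")
```

Failing loudly on an unknown task type, rather than falling back to the code route, keeps the "no more guessing" property the routing table is meant to provide.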

🔄 Loop: Phase Mapping

When a dimension scores low, the loop now knows exactly which phase to re-run, for code AND writing tasks. No more "re-run everything and hope."
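One way to picture the phase mapping is a lookup from (task type, dimension) to the phase that owns it. The sketch below is hypothetical throughout: the post does not name the eight scoring dimensions or the pass threshold, so `completeness`, `accuracy`, `audience_fit`, and the 0.6 cutoff are stand-ins.

```python
# Hypothetical mapping: dimension names and phase assignments are
# placeholders, not the skill's real scoring dimensions.
PHASE_FOR: dict[tuple[str, str], str] = {
    ("writing", "completeness"): "phase_2_build",
    ("writing", "accuracy"): "phase_4_harden",
    ("writing", "audience_fit"): "phase_6_verify",
    ("code", "completeness"): "phase_2_build",
    ("code", "accuracy"): "phase_6_verify",
}


def phases_to_rerun(task_type: str, scores: dict[str, float], threshold: float = 0.6) -> list[str]:
    """Return the phases owning any dimension that scored below threshold."""
    phases = set()
    for dimension, score in scores.items():
        if score < threshold:
            phase = PHASE_FOR.get((task_type, dimension))
            if phase is not None:
                phases.add(phase)
    return sorted(phases)


if __name__ == "__main__":
    low = {"completeness": 0.4, "accuracy": 0.9, "audience_fit": 0.5}
    print(phases_to_rerun("writing", low))
```

Only the phases tied to a failing dimension are returned, which is the difference between a targeted re-run and re-running the whole pipeline.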

🚪 Delivery: Universal Gate

Pre-delivery checklist split into Universal + Code + Web/UI + Writing sections. Every task type gets its own quality gate before shipping.
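The split gate can be thought of as a shared checklist plus per-type sections that are composed at delivery time. This is a minimal sketch under assumptions: the four section names match the post, but the individual check items, the `SECTIONS_FOR_TASK` mapping, and the helper names are invented for illustration.

```python
# Illustrative check items only; the real gate's wording is not in the post.
GATE_SECTIONS: dict[str, list[str]] = {
    "universal": ["every stated requirement is addressed", "no placeholder text remains"],
    "code": ["syntax check passes", "app runs", "main flow traced end to end"],
    "web_ui": ["page renders without console errors"],
    "writing": ["coverage audited per requirement", "internally consistent", "readability pass done"],
}

# Assumed composition: which sections apply to which task type.
SECTIONS_FOR_TASK: dict[str, list[str]] = {
    "code": ["universal", "code"],
    "web_ui": ["universal", "code", "web_ui"],
    "writing": ["universal", "writing"],
}


def checklist(task_type: str) -> list[str]:
    """Flatten the sections that apply to a task type into one ordered list."""
    return [item for section in SECTIONS_FOR_TASK[task_type] for item in GATE_SECTIONS[section]]


def unmet(task_type: str, passed: set[str]) -> list[str]:
    """Return the checks that still block delivery."""
    return [item for item in checklist(task_type) if item not in passed]


if __name__ == "__main__":
    done = {"every stated requirement is addressed", "internally consistent"}
    print(unmet("writing", done))
```

A writing task never sees the code checks and a code task never sees the writing checks, while the universal section applies to both.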

ROUTING TABLE · LIVE · TASK TYPE → SCRIPT FAMILY

📊 The Results

Writing task composite: 0.48 → 0.75. All seven failing dimensions now pass, so zero of eight fail.

Code task composite: 0.78 → 0.78. Zero regression. The writing additions didn't dilute the code instructions.

WRITING · 0.48 → 0.75 CODE · 0.78 → 0.78 6 MUTATIONS · 14 ITERS

PARTICLE ACCRETION · EVERY BURST = ONE ACCEPTED MUTATION

❌ Iter 1: 0.48 — 7/8 dimensions fail
↓
🔧 Mutation 1: Phase 2 writing block
↓
🔧 Mutation 2: Phase 4 stress test
↓
🔧 Mutation 3: Phase 6 writing verify
↓
🔧 Mutation 4: Task-type routing table
↓
🔧 Mutation 5: Loop phase mapping
↓
🔧 Mutation 6: Universal delivery gate
↓
✅ Iter 14: 0.75, 0/8 dimensions fail, plateau reached, done
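The run traced above (score, propose, get human approval, keep the mutation only if nothing regresses, stop at a plateau) can be sketched as a small loop. This is not Evo-Loop's actual code: the `Mutation` type, the callback signatures, the acceptance rule in `is_improvement`, and the `patience` plateau rule are all assumptions made to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Scores = dict[str, float]  # task type -> composite score, e.g. {"code": 0.78}


@dataclass
class Mutation:
    name: str
    apply: Callable[[str], str]  # takes the skill text, returns the mutated text


def is_improvement(before: Scores, after: Scores) -> bool:
    """Assumed rule: no task type gets worse and at least one gets better."""
    no_regression = all(after[k] >= before[k] for k in before)
    some_gain = any(after[k] > before[k] for k in before)
    return no_regression and some_gain


def evo_loop(
    skill: str,
    evaluate: Callable[[str], Scores],
    propose: Callable[[str, Scores], Optional[Mutation]],
    approve: Callable[[Mutation], bool],
    max_iters: int = 14,
    patience: int = 3,
) -> tuple[str, Scores, list[str]]:
    scores = evaluate(skill)
    accepted: list[str] = []
    stale = 0  # iterations since the last accepted mutation
    for _ in range(max_iters):
        if stale >= patience:
            break  # plateau: nothing accepted for a while
        mutation = propose(skill, scores)
        if mutation is None:
            break  # no weaknesses left to target
        stale += 1
        if not approve(mutation):
            continue  # human reviewer rejected the proposal
        candidate = mutation.apply(skill)
        new_scores = evaluate(candidate)
        if is_improvement(scores, new_scores):
            skill, scores = candidate, new_scores
            accepted.append(mutation.name)
            stale = 0
    return skill, scores, accepted
```

Comparing scores per task type, rather than a single blended number, is what lets a writing-focused mutation be rejected if it costs the code benchmarks anything.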

Core insight: The skill didn't need more instructions. It needed equal instructions. Every phase already knew how to handle code in exhaustive detail. The fix was giving writing the same level of care — not inventing a separate system.

🏗️ Bonus: Architecture Decomposition

While we were at it, we decomposed the two remaining monolithic skills — Evolution and Evo-Loop — into the same scripts/ pattern that One-Shot Scripts uses.

	Before	After
File layout	8 companion files loose in skill directory	All scripts in a `scripts/` subdirectory
Discovery	Guess which file does what	Manifest table in SKILL.md maps every script
References	By filename only	Explicit `Read scripts/X.md` instructions

Evolution went from 9 loose files to a 188-line orchestrator + 8 scripts. Evo-Loop went from 4 loose files to a 113-line orchestrator + 3 scripts. All files under 200 lines. All cross-references updated.
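A layout like this is easy to check mechanically. The sketch below assumes a skill directory containing `SKILL.md` and a `scripts/` folder of `.md` files, and flags any file at or over 200 lines or any script that `SKILL.md` never references as `scripts/<name>`. The `audit_skill` function is hypothetical and not part of the shipped skills.

```python
from pathlib import Path


def audit_skill(skill_dir: Path, max_lines: int = 200) -> list[str]:
    """Return a list of layout problems; an empty list means the skill passes."""
    problems: list[str] = []
    skill_md = skill_dir / "SKILL.md"
    scripts = sorted((skill_dir / "scripts").glob("*.md"))

    if skill_md.exists():
        manifest = skill_md.read_text(encoding="utf-8")
        files = [skill_md, *scripts]
    else:
        manifest = ""
        files = scripts
        problems.append("SKILL.md is missing")

    # "All files under 200 lines": a file with max_lines or more is flagged.
    for path in files:
        line_count = len(path.read_text(encoding="utf-8").splitlines())
        if line_count >= max_lines:
            problems.append(f"{path.name} has {line_count} lines (limit is under {max_lines})")

    # Every script should be discoverable from the orchestrator.
    for script in scripts:
        if f"scripts/{script.name}" not in manifest:
            problems.append(f"scripts/{script.name} is not referenced in SKILL.md")

    return problems


if __name__ == "__main__":
    for problem in audit_skill(Path(".")):
        print(problem)
```

Running a check like this after each accepted mutation would catch a script that grows past the size limit or drops out of the manifest.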

SAME REQUEST · ONE FILE VS SIXTEEN · ROUTER PICKS THE RIGHT SCRIPT

💡 Why This Matters for You

If you use One-Shot Scripts, it now handles writing, audits, and migrations as well as code. Write a postmortem. Draft API docs. Plan a migration. The skill knows which phases to run, which to skip, and how to verify the output.

If you use Evolution or Evo-Loop, the new scripts/ architecture means individual scripts can be evolved independently. Mutate just the scoring engine without touching the mutation engine. Evolve just the loop setup without risking the report template.

Core insight: Modular skills evolve faster than monoliths. When everything's in one file, a mutation anywhere risks breaking everything. When each concern lives in its own script, you can iterate surgically.

Get the Universal One-Shot

One-Shot Scripts v1.6.0 ships with all 6 universality mutations. Handles code, writing, audits, and migrations out of the box.
