Built by /blog-post-GM — a Claude Code skill we evolved with our own Evolution engine to write every post in the Godmode voice.
Engineering ⏱️ 5 min read

One Skill to Rule Them All: How We Made One-Shot Universal

TL;DR

🎯 What: One-Shot Scripts now handles code, writing, audits, and migrations equally well
📈 Impact: Writing task scores jumped from 0.48 to 0.75 — code stayed at 0.78
🧬 Method: 14 evo-loop iterations, 6 accepted mutations, zero regressions
🏗️ Bonus: Evolution + Evo-Loop both decomposed into the same scripts/ architecture

⚠️ The Problem: A Skill That Only Spoke Code

One-Shot Scripts was our modular execution protocol — same power as One-Shot Beta, broken into individually-evolvable script files. It handled code tasks beautifully. Ask it to build an API, fix a bug, or refactor a module? Flawless.

Ask it to write an ADR, document an API, or draft a postmortem? It fell apart. Seven of eight scoring dimensions failed on writing tasks. The composite score was 0.48 — an F.

Core insight: The skill had detailed checklists for code (handle null, test concurrent access, check for N+1 queries) but a single line for writing: "Produce complete deliverable." That's like giving a chef a 47-step recipe for steak and a Post-it note that says "also make dessert."

🔬 The Diagnosis: Where Writing Tasks Died

We pointed Evo-Loop at One-Shot Scripts and ran it against 12 benchmarks — 3 writing tasks, 2 audits, 7 code tasks. The first iteration exposed the damage:

| Phase | Code Task | Writing Task |
| --- | --- | --- |
| Phase 2: Build | 14 specific checklist items | 1 line: "produce complete deliverable" |
| Phase 4: Harden | OWASP, N+1, race conditions, ReDoS | 1 line: "hostile-reviewer pass" |
| Phase 5: Document | Function docs, README, changelog | Dead: the output IS the document |
| Phase 6: Verify | Syntax check, run app, trace flow | Dead: no writing equivalent |
| Delivery Gate | 12 web/UI checks | Zero writing checks |

Five of nine phases were either dead or useless for non-code tasks. The skill wasn't broken — it was incomplete.

🧬 The Fix: 6 Mutations in 14 Iterations

Evo-Loop ran the skill against benchmarks, scored it, flagged weaknesses, and proposed mutations. Each mutation required human approval before applying. Here's what changed:

📝 Phase 2: Writing Block

Research requirements, outline before drafting, draft with specificity, structural completeness check, source verification. Parallels the code checklist in depth.

🔍 Phase 4: Stress Test

Play the hostile reviewer. Check logical consistency, claim accuracy, completeness gaps, bias detection, audience mismatch, code sample verification.

Phase 6: Writing Verify

Requirement-by-requirement coverage audit. Internal consistency check. Audience calibration. Readability pass. Full start-to-finish read as the intended audience.

🗺️ SKILL.md: Routing Table

Explicit task-type routing. Each phase marked Critical, Light, or Skip per task type. No more guessing which phases apply to writing vs. code.
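
A routing table like this can be pictured as a small lookup structure. The sketch below is illustrative only: the phase names come from the diagnosis table, but the Critical/Light/Skip assignments, the `ROUTING` dict, and the `phases_to_run` helper are assumptions made up for the example, not the contents of the real SKILL.md.

```python
from enum import Enum


class Level(Enum):
    CRITICAL = "critical"
    LIGHT = "light"
    SKIP = "skip"


# Hypothetical routing table. Phase names follow the post; the levels are
# illustrative guesses, not the values shipped in SKILL.md.
ROUTING = {
    "code": {
        "build": Level.CRITICAL,
        "harden": Level.CRITICAL,
        "document": Level.LIGHT,
        "verify": Level.CRITICAL,
    },
    "writing": {
        "build": Level.CRITICAL,
        "harden": Level.CRITICAL,
        "document": Level.SKIP,  # the output is already the document
        "verify": Level.CRITICAL,
    },
}


def phases_to_run(task_type: str) -> list[tuple[str, Level]]:
    """Return the phases for a task type, in table order, without skipped ones."""
    try:
        table = ROUTING[task_type]
    except KeyError:
        raise ValueError(f"no route for task type {task_type!r}") from None
    return [(phase, level) for phase, level in table.items() if level is not Level.SKIP]
```

Raising on an unknown task type, rather than falling back to the code route, is one way to keep "guessing which phases apply" from creeping back in.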

🔄 Loop: Phase Mapping

When a dimension scores low, the loop now knows exactly which phase to re-run — for code AND writing tasks. No more "re-run everything and hope."
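
One way to picture this mapping is a lookup keyed by task type and scoring dimension. This is a rough sketch under stated assumptions: the post does not name the eight dimensions or the pass threshold, so the dimension names, the `0.6` threshold, and the `RERUN_PHASE` table below are all hypothetical.

```python
# Hypothetical mapping from (task type, scoring dimension) to the phase to re-run.
# Dimension names are invented for illustration; the post does not list them.
RERUN_PHASE = {
    ("code", "correctness"): "build",
    ("code", "robustness"): "harden",
    ("writing", "completeness"): "build",
    ("writing", "accuracy"): "harden",
    ("writing", "audience_fit"): "verify",
}


def phases_to_rerun(task_type, scores, threshold=0.6):
    """Return the distinct phases to re-run for dimensions scoring below threshold.

    The threshold value is an assumption, not a number from the post.
    """
    phases = []
    for dimension, score in scores.items():
        if score >= threshold:
            continue
        phase = RERUN_PHASE.get((task_type, dimension))
        if phase is not None and phase not in phases:
            phases.append(phase)
    return phases
```

With these made-up values, a writing task that scores low on completeness and audience fit would re-run only the build and verify phases and leave the rest alone.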

🚪 Delivery: Universal Gate

Pre-delivery checklist split into Universal + Code + Web/UI + Writing sections. Every task type gets its own quality gate before shipping.
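
The split can be sketched as checklist sections that are composed per task type. The four section names follow the description above; the individual checks and the `SECTIONS_BY_TASK` assignment are placeholders chosen for the example, not the real gate.

```python
# Section names follow the post; the individual checks are placeholders.
GATE_SECTIONS = {
    "universal": ["every stated requirement is addressed"],
    "code": ["syntax check passes"],
    "web_ui": ["page renders without console errors"],
    "writing": ["full read-through as the intended audience"],
}

# Hypothetical assignment of checklist sections to task types.
SECTIONS_BY_TASK = {
    "code": ["universal", "code"],
    "web": ["universal", "code", "web_ui"],
    "writing": ["universal", "writing"],
}


def delivery_checklist(task_type):
    """Build the pre-delivery checklist; unknown task types get the universal section."""
    sections = SECTIONS_BY_TASK.get(task_type, ["universal"])
    return [check for section in sections for check in GATE_SECTIONS[section]]
```

Falling back to the universal section for unknown task types is a design choice of this sketch, so that no task ships with an empty gate.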

📊 The Results

Writing task composite: 0.48 → 0.75. All seven dimensions that failed in iteration 1 now pass.

Code task composite: 0.78 → 0.78. Zero regression. The writing additions didn't dilute the code instructions.

❌ Iter 1: 0.48 — 7/8 dimensions fail

🔧 Mutation 1: Phase 2 writing block

🔧 Mutation 2: Phase 4 stress test

🔧 Mutation 3: Phase 6 writing verify

🔧 Mutation 4: Task-type routing table

🔧 Mutation 5: Loop phase mapping

🔧 Mutation 6: Universal delivery gate

✅ Iter 14: 0.75, 0/8 dimensions fail, plateau reached, done
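
The shape of that run (score, propose, approve, accept, stop at a plateau) can be sketched as a small loop. This is not Evo-Loop's actual implementation: the `evo_loop` function, its callbacks, and the `min_gain` and `patience` parameters are assumptions made for illustration, and iteration counting here is simplified.

```python
from typing import Callable

Scores = dict[str, float]  # task type -> composite score


def evo_loop(
    skill: str,
    score: Callable[[str], Scores],
    propose: Callable[[str, Scores], str],
    approve: Callable[[str], bool],
    target: str = "writing",
    max_iters: int = 14,
    min_gain: float = 0.01,  # assumed plateau tolerance
    patience: int = 3,       # assumed number of stale rounds before stopping
) -> tuple[str, Scores, int]:
    """Keep approved mutations that raise the target score without lowering any other."""
    best = score(skill)
    accepted = 0
    stale = 0
    for _ in range(max_iters):
        candidate = propose(skill, best)
        improved = False
        if approve(candidate):  # human approval gate
            new = score(candidate)
            gained = new[target] - best[target] >= min_gain
            regressed = any(new[k] < best[k] for k in best if k != target)
            improved = gained and not regressed
        if improved:
            skill, best = candidate, new
            accepted += 1
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break  # plateau: several rounds in a row without an accepted gain
    return skill, best, accepted
```

The regression check is what "zero regressions" means in this sketch: a mutation that lifts the writing composite but lowers the code composite is rejected outright.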

Core insight: The skill didn't need more instructions. It needed equal instructions. Every phase already knew how to handle code in exhaustive detail. The fix was giving writing the same level of care — not inventing a separate system.

🏗️ Bonus: Architecture Decomposition

While we were at it, we decomposed the two remaining monolithic skills — Evolution and Evo-Loop — into the same scripts/ pattern that One-Shot Scripts uses.

| Aspect | Before | After |
| --- | --- | --- |
| File layout | 8 companion files loose in skill directory | All scripts in a scripts/ subdirectory |
| Discovery | Guess which file does what | Manifest table in SKILL.md maps every script |
| References | By filename only | Explicit "Read scripts/X.md" instructions |

Evolution went from 9 loose files to a 188-line orchestrator + 8 scripts. Evo-Loop went from 4 loose files to a 113-line orchestrator + 3 scripts. All files under 200 lines. All cross-references updated.
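
Both conventions (every file under 200 lines, every script referenced from SKILL.md) are easy to check mechanically. The sketch below assumes a layout of `SKILL.md` plus `scripts/*.md` and assumes the manifest mentions each script by its `scripts/<name>.md` path; the `check_skill` helper is hypothetical and not part of the skills themselves.

```python
from pathlib import Path

MAX_LINES = 200  # the post states that all files are under 200 lines


def check_skill(skill_dir):
    """Return a list of problems for a skill laid out as SKILL.md + scripts/*.md."""
    root = Path(skill_dir)
    skill_md = root / "SKILL.md"
    manifest = skill_md.read_text(encoding="utf-8")
    scripts = sorted((root / "scripts").glob("*.md"))
    problems = []

    # Size check: the orchestrator and every script must stay under the limit.
    for path in [skill_md, *scripts]:
        line_count = len(path.read_text(encoding="utf-8").splitlines())
        if line_count >= MAX_LINES:
            problems.append(f"{path.name}: {line_count} lines (limit is under {MAX_LINES})")

    # Manifest check: assumes SKILL.md names each script by its scripts/ path.
    for path in scripts:
        if f"scripts/{path.name}" not in manifest:
            problems.append(f"{path.name}: not referenced in SKILL.md")

    return problems
```

Running a check like this after each accepted mutation would catch a script that has grown past the limit or dropped out of the manifest.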

💡 Why This Matters for You

If you use One-Shot Scripts, it now handles any task type you throw at it — not just code. Write a postmortem. Draft API docs. Plan a migration. The skill knows which phases to run, which to skip, and how to verify the output.

If you use Evolution or Evo-Loop, the new scripts/ architecture means individual scripts can be evolved independently. Mutate just the scoring engine without touching the mutation engine. Evolve just the loop setup without risking the report template.

Core insight: Modular skills evolve faster than monoliths. When everything's in one file, a mutation anywhere risks breaking everything. When each concern lives in its own script, you can iterate surgically.

Get the Universal One-Shot

One-Shot Scripts v1.6.0 ships with all 6 universality mutations. Handles code, writing, audits, and migrations out of the box.
