/blog-post-GM — a Claude Code skill we evolved with our own Evolution engine to write every post in the Godmode voice.
One Skill to Rule Them All: How We Made One-Shot Universal
🎯 What: One-Shot Scripts now handles code, writing, audits, and migrations equally well
📈 Impact: Writing task scores jumped from 0.48 to 0.75 — code stayed at 0.78
🧬 Method: 14 Evo-Loop iterations, 6 accepted mutations, zero regressions
🏗️ Bonus: Evolution and Evo-Loop were both decomposed into the same scripts/ architecture
The Problem: A Skill That Only Spoke Code
One-Shot Scripts was our modular execution protocol — same power as One-Shot Beta, broken into individually-evolvable script files. It handled code tasks beautifully. Ask it to build an API, fix a bug, or refactor a module? Flawless.
Ask it to write an ADR, document an API, or draft a postmortem? It fell apart. Seven of eight scoring dimensions failed on writing tasks. The composite score was 0.48 — an F.
Core insight: The skill had detailed checklists for code (handle null, test concurrent access, check for N+1 queries) but a single line for writing: "Produce complete deliverable." That's like giving a chef a 47-step recipe for steak and a Post-it note that says "also make dessert."
The Diagnosis: Where Writing Tasks Died
We pointed Evo-Loop at One-Shot Scripts and ran it against 12 benchmarks — 3 writing tasks, 2 audits, 7 code tasks. The first iteration exposed the damage:
| Phase | Code Task | Writing Task |
|---|---|---|
| Phase 2 — Build | 14 specific checklist items | 1 line: "produce complete deliverable" |
| Phase 4 — Harden | OWASP, N+1, race conditions, ReDoS | 1 line: "hostile-reviewer pass" |
| Phase 5 — Document | Function docs, README, changelog | Dead — output IS the document |
| Phase 6 — Verify | Syntax check, run app, trace flow | Dead — no writing equivalent |
| Delivery Gate | 12 web/UI checks | Zero writing checks |
Five of nine phases were either dead or useless for non-code tasks. The skill wasn't broken — it was incomplete.
The Fix: 6 Mutations in 14 Iterations
Evo-Loop ran the skill against benchmarks, scored it, flagged weaknesses, and proposed mutations. Each mutation required human approval before applying. Here's what changed:
Phase 2: Writing Block
Research requirements, outline before drafting, draft with specificity, structural completeness check, source verification. Parallels the code checklist in depth.
Phase 4: Stress Test
Play the hostile reviewer. Check logical consistency, claim accuracy, completeness gaps, bias detection, audience mismatch, code sample verification.
Phase 6: Writing Verify
Requirement-by-requirement coverage audit. Internal consistency check. Audience calibration. Readability pass. Full start-to-finish read as the intended audience.
SKILL.md: Routing Table
Explicit task-type routing. Each phase marked Critical, Light, or Skip per task type. No more guessing which phases apply to writing vs. code.
Loop: Phase Mapping
When a dimension scores low, the loop now knows exactly which phase to re-run — for code AND writing tasks. No more "re-run everything and hope."
Delivery: Universal Gate
Pre-delivery checklist split into Universal + Code + Web/UI + Writing sections. Every task type gets its own quality gate before shipping.
The Results
Writing task composite: 0.48 → 0.75. All seven dimensions that failed on writing tasks now pass.
Code task composite: 0.78 → 0.78. Zero regression. The writing additions didn't dilute the code instructions.
↓
🔧 Mutation 1: Phase 2 writing block
↓
🔧 Mutation 2: Phase 4 stress test
↓
🔧 Mutation 3: Phase 6 writing verify
↓
🔧 Mutation 4: Task-type routing table
↓
🔧 Mutation 5: Loop phase mapping
↓
🔧 Mutation 6: Universal delivery gate
↓
✅ Iter 14: 0.76 — 0/8 dimensions fail — plateau — done
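The "zero regressions" outcome in the timeline above implies an acceptance rule roughly like the one below. This is a hedged Python sketch, not Evo-Loop's actual logic: in the real loop a human approved each mutation, and `Scores`, `accept_mutation`, and the `tolerance` parameter are hypothetical names introduced for illustration. The only numbers taken from the post are the composites 0.48, 0.75, and 0.78.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scores:
    """Composite scores per task type (hypothetical structure)."""
    code: float
    writing: float


def accept_mutation(before: Scores, after: Scores, tolerance: float = 0.0) -> bool:
    """Accept only if no task type regresses and at least one improves."""
    no_regression = (
        after.code >= before.code - tolerance
        and after.writing >= before.writing - tolerance
    )
    improved = after.code > before.code or after.writing > before.writing
    return no_regression and improved


baseline = Scores(code=0.78, writing=0.48)

# Composites reported in the post: writing improves, code holds steady.
reported = Scores(code=0.78, writing=0.75)

# An invented counterexample: writing improves but code drops.
diluted = Scores(code=0.70, writing=0.75)

print(accept_mutation(baseline, reported))
print(accept_mutation(baseline, diluted))
```

A gate like this is what keeps writing improvements from quietly diluting the code instructions: a gain on one task type never pays for a loss on another.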
Core insight: The skill didn't need more instructions. It needed equal instructions. Every phase already knew how to handle code in exhaustive detail. The fix was giving writing the same level of care — not inventing a separate system.
Bonus: Architecture Decomposition
While we were at it, we decomposed the two remaining monolithic skills — Evolution and Evo-Loop — into the same scripts/ pattern that One-Shot Scripts uses.
| | Before | After |
|---|---|---|
| File layout | 8 companion files loose in skill directory | All scripts in a scripts/ subdirectory |
| Discovery | Guess which file does what | Manifest table in SKILL.md maps every script |
| References | By filename only | Explicit Read scripts/X.md instructions |
Evolution went from 9 loose files to a 188-line orchestrator + 8 scripts. Evo-Loop went from 4 loose files to a 113-line orchestrator + 3 scripts. All files under 200 lines. All cross-references updated.
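If you want to enforce the same conventions on your own skills, a small check like the following could help. It is a sketch under stated assumptions: the 200-line limit and the scripts/ plus SKILL.md layout come from this post, while the `.md` extension for scripts, the plain-substring manifest check, and the function name `check_skill_layout` are our assumptions for illustration.

```python
from pathlib import Path

MAX_LINES = 200  # the post states that all files stay under 200 lines


def check_skill_layout(skill_dir: Path) -> list[str]:
    """Return a list of layout problems for one skill directory."""
    problems: list[str] = []
    skill_md = skill_dir / "SKILL.md"
    scripts = sorted((skill_dir / "scripts").glob("*.md"))

    manifest = ""
    if skill_md.exists():
        manifest = skill_md.read_text(encoding="utf-8")
    else:
        problems.append("SKILL.md is missing")

    # Every file, orchestrator included, must stay under the line limit.
    to_measure = ([skill_md] if skill_md.exists() else []) + scripts
    for path in to_measure:
        line_count = len(path.read_text(encoding="utf-8").splitlines())
        if line_count >= MAX_LINES:
            problems.append(f"{path.name} has {line_count} lines (limit {MAX_LINES})")

    # Every script must be referenced by path somewhere in SKILL.md.
    # Assumption: a plain substring match is enough to detect a manifest entry.
    for path in scripts:
        reference = f"scripts/{path.name}"
        if reference not in manifest:
            problems.append(f"{reference} is not referenced in SKILL.md")

    return problems
```

Calling `check_skill_layout(Path("path/to/skill"))` returns an empty list when the directory follows the pattern, which makes it easy to wire into a pre-commit hook or CI step.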
Why This Matters for You
If you use One-Shot Scripts, it now handles any task type you throw at it — not just code. Write a postmortem. Draft API docs. Plan a migration. The skill knows which phases to run, which to skip, and how to verify the output.
If you use Evolution or Evo-Loop, the new scripts/ architecture means individual scripts can be evolved independently. Mutate just the scoring engine without touching the mutation engine. Evolve just the loop setup without risking the report template.
Core insight: Modular skills evolve faster than monoliths. When everything's in one file, a mutation anywhere risks breaking everything. When each concern lives in its own script, you can iterate surgically.
Get the Universal One-Shot
One-Shot Scripts v1.6.0 ships with all 6 universality mutations. Handles code, writing, audits, and migrations out of the box.