Introducing /one-shot-orchestra — Our Most Powerful Skill Yet
🚀 What it is: A Claude skill that acts as a conductor — spawning parallel Claude processes, each with its own fresh 1M-token memory
🎼 How it works: Conductor writes briefs → launches worker terminals → reads back result.json → synthesises delivery
✨ Why it matters: Big builds stop hitting the memory ceiling. Each worker starts empty, gets exactly what it needs, and reports back
The Launch
Today we're shipping /one-shot-orchestra — the biggest architectural jump we've made on the max-effort build skill. It's live inside Godmode now.
Every prior version (/one-shot, /one-shot-beta, /one-shot-scripts) was a single Claude instance running a long protocol in one working memory. Orchestra is different: it's a conductor pattern. One Claude orchestrates, many Claudes build.
Analogy: /one-shot-scripts was one chef cooking every dish in sequence at a single workbench. /one-shot-orchestra is a head chef running a kitchen of line cooks — each at their own station, each given a written order, plating back to the pass when done.
Why We Built It
Claude's working memory is large but finite. When the main instance has loaded the skill, read the project, made a plan, spawned sub-agents, and started building — memory fills up. On truly big builds, quality drops before the work is done.
The obvious fix is parallelism. But the in-session Agent tool still runs inside the conductor's memory. Sub-agents help, but they don't buy you more memory — they share the same ceiling.
Orchestra breaks the ceiling by launching genuinely separate Claude Code processes in their own terminal windows. Each one boots with a fresh ~1M-token context. None of them inherit the conductor's loaded state.
How It Works Under the Hood
Orchestra's execution loop is a file-based protocol. No sockets, no message bus — just directories and JSON files that both sides can read.
/one-shot-orchestra build X
↓
🎼 Conductor plans — decomposes the task, writes one brief per worker
↓
🖥️ orchestra-spawn.sh launches a mintty terminal per brief
↓
🔧 Each terminal runs claude fresh, loads its brief, works in isolation
↓
📄 Worker writes work/result-<Name>.json — status, artefacts, notes
↓
👻 orchestra-wait-and-kill.sh polls for the result, then force-closes the terminal
↓
🎯 Conductor synthesises — reads all results, verifies, delivers
Every worker gets a shared run directory: a sandbox containing briefs/, work/, media/, a live chat.md, and a rolling telemetry.jsonl. You can watch the whole orchestra cooperate in real time by tailing those files.
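To make the shape of that shared run directory concrete, here is a minimal sketch in Python. The layout (briefs/, work/, media/, chat.md, telemetry.jsonl) follows the post; the helper function itself and the example brief names are illustrative, not the shipped scripts.

```python
import os, tempfile

def scaffold_run(root, briefs):
    """Create the shared run directory described above: one sandbox
    holding briefs/, work/, media/, a live chat.md, and a rolling
    telemetry.jsonl. Illustrative sketch, not the shipped script."""
    for sub in ("briefs", "work", "media"):
        os.makedirs(os.path.join(root, sub), exist_ok=True)
    # One brief file per worker; workers read only their own brief.
    for name, text in briefs.items():
        with open(os.path.join(root, "briefs", name + ".md"), "w") as f:
            f.write(text)
    # Live files a human can tail while the orchestra runs.
    open(os.path.join(root, "chat.md"), "w").close()
    open(os.path.join(root, "telemetry.jsonl"), "w").close()
    return sorted(os.listdir(os.path.join(root, "briefs")))

run = tempfile.mkdtemp()
print(scaffold_run(run, {"Builder-A": "Draft variant A.", "Judge": "Pick the best draft."}))
# → ['Builder-A.md', 'Judge.md']
```

Because both sides only touch files under this one directory, the whole protocol stays inspectable with nothing more than `ls` and `tail -f`.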
Fresh Memory Is the Point
Every token a Claude instance reads costs working memory. By the time a skill has loaded, the project has been scanned, and the plan has been drafted, a meaningful chunk is already spent on knowing what to do instead of doing it.
A fresh-spawn worker starts empty. It gets a one-page brief describing only what it needs. The rest of its memory is free for the actual work — reading files, running tests, building output.
Before (single process)
Plan + execution share one 1M-token memory. Large plans eat into the space available for the build itself.
After (orchestra)
Conductor holds the plan. Each worker holds only its slice. N workers × 1M tokens of useful capacity.
Two Modes, One Conductor
Not every task wants a fresh spawn. Some work benefits from a collaborator who's already read the plan. Orchestra supports both modes and has a written rule for when to pick which.
Fresh-spawn (mintty)
Blind judges, heavy independent file reads, parallel builders that shouldn't share plans. Full 1M memory per worker.
In-session Agent
Critic passes, quick scans, teamwork that needs the conductor's context. Shares memory, starts instantly, no terminal overhead.
The decision table lives at the top of orchestra-protocol.md. Every phase script points to it — so when the conductor hits "need a reviewer here," it knows exactly which mode fits.
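As a toy paraphrase of that decision table, the rule might look like the function below. The field names are hypothetical; the real table lives in orchestra-protocol.md and this only mirrors the split described in this section.

```python
def pick_mode(task):
    """Choose a worker mode per the rule described above: fresh-spawn
    (mintty) when isolation or heavy independent reads matter,
    in-session Agent when the worker needs the conductor's context or
    must start instantly. Field names are hypothetical."""
    if task.get("blind_judging") or task.get("heavy_file_reads") or task.get("independent_build"):
        return "fresh-spawn"
    return "in-session"

print(pick_mode({"blind_judging": True}))  # → fresh-spawn
print(pick_mode({"critic_pass": True}))    # → in-session
```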
The Worker Contract
Every worker, regardless of mode, honours the same contract. When it's done, it writes one JSON file. That's it.
{
"status": "success",
"worker": "Builder-A",
"artifact_paths": ["work/variant-A.txt"],
"notes": "Drafted a single-sentence pitch, no markdown.",
"token_usage": { "total": 48213 }
}
Field by field, here is what the conductor does with each key:

- status · polled by orchestra-wait-and-kill.sh in a loop. The instant this field appears, the conductor terminates the worker terminal.
- worker · the brief↔result match key. The conductor uses it to find which brief just completed and route output to the right downstream phase.
- artifact_paths · file paths the conductor reads next. The Verifier and Polisher phases pick up exactly these files; nothing else is opened.
- notes · free-form summary spliced into chat.md, so the human watching the live feed sees what the worker did without reading every artefact.
- token_usage · written to telemetry.jsonl for the audit trail, and used by /godmode-evolution to grade the efficiency dimension across runs.
The conductor waits on that file, reads it, kills the worker's terminal, and moves on. No ambiguity about "is it done yet?" — the file either exists or it doesn't.
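A minimal sketch of that wait-read-kill handshake, assuming the file-based contract above. The shipped orchestra-wait-and-kill.sh is a shell script; this Python version, including its timeout and poll interval, is illustrative.

```python
import json, os, time

def wait_for_result(path, timeout=600.0, poll=0.5):
    """Block until the worker's result file appears, then parse and
    return it. Returns None on timeout, at which point the conductor
    would kill the worker's terminal anyway and flag the telemetry."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path):
            with open(path) as f:
                return json.load(f)  # the file exists, so the worker is done
        time.sleep(poll)
    return None
```

The conductor runs one such wait per worker; existence of the file is the entire completion signal, so there is no separate "done" flag to get out of sync.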
Smoke-Tested On Arrival
Orchestra ships with test/TEST-BENCH.md, containing three real smoke tests. We ran them before flipping the skill to live.
| Smoke | What it proves |
|---|---|
| 1. Single-worker spawn-and-kill | Fresh spawn works, result.json gets read, terminal gets killed, telemetry logged. |
| 2. Two builders + judge | Parallel spawn + shared workspace + result-JSON hand-off (judge reads both builders' outputs). |
| 3. Timeout & kill | A worker that intentionally hangs gets killed at the deadline with telemetry flagged. |
Audited By Evolution
Every new skill we ship gets graded by /godmode-evolution — the meta-skill whose entire job is making other skills better. Orchestra's scorecard:
| Dimension | Score |
|---|---|
| Contradiction removal | 0.95 |
| Decision clarity | 0.92 |
| Doc coherence | 0.90 |
| Delegation hygiene | 0.88 |
| Smoke test coverage | 0.85 |
| Backwards compat | 0.90 |
| File size discipline | 0.90 |
| Mutation scope | 0.93 |
| Process quality | 0.92 |
| Polish | 0.88 |
| Composite | 0.903 / 1.00 |
The two lowest scores are the honest ones. Smoke tests were written and executed on the golden path, but not for every timeout/backoff edge case; that is the lowest dimension, and we shipped anyway. Delegation hygiene is solid at the protocol level, but we'd still like deeper integration across a couple of phase scripts. Both gaps are logged for the next loop.
Why publish the weak dims: if a rubric only grades what shipped, it learns to ship less and say more. Evolution grades the gaps too — and we print them. It's the difference between a launch and a pitch deck.
What You Get When You Use It
- Bigger builds without memory drops — fresh-spawn workers let Orchestra attack work that would have stalled a single-process skill.
- Live visibility — chat.md updates as workers sign off. You can watch the build happen in any markdown viewer.
- Two-mode flexibility — fresh-spawn when isolation matters, in-session agent when speed does.
- Deterministic hand-off — every worker writes result.json. No guessing, no polling for "is it done."
- Auditable — telemetry for every spawn, every kill, every result, in one jsonl you can replay.
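Replaying that audit trail can be as simple as reading the JSONL back line by line. The event schema below is hypothetical, since the post doesn't show the real field names; only the one-JSON-object-per-line format is from the source.

```python
import json, io

# Hypothetical telemetry lines; the real event names may differ.
raw = (
    '{"event": "spawn", "worker": "Builder-A"}\n'
    '{"event": "result", "worker": "Builder-A", "status": "success"}\n'
    '{"event": "kill", "worker": "Builder-A"}\n'
)

# JSONL replay: parse each line independently, in order.
events = [json.loads(line) for line in io.StringIO(raw)]
print([e["event"] for e in events])  # → ['spawn', 'result', 'kill']
```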
Run Orchestra On Your Next Big Build
Ships inside Godmode. Works with Claude Code on Windows, macOS, and Linux.
Get Godmode →
See How Evolution Grades It →