The Orchestrator Became A Daemon: Orchestra v2
💥 The problem: v1 made the main Claude session BOTH conduct phases AND hold the score — context kept growing across the run
🔧 The fix: Move the orchestrator into a Node daemon. The session becomes a thin narrator that drives a CLI
✅ The result: Every LLM call is a fresh worker. The session never accumulates context. Each worker gets a fresh ~1M context window
What v1 Got Right (And Why It Wasn’t Enough)
Orchestra v1 already solved the original problem. one-shot-scripts ran every phase inside one session, and the session bloated. Orchestra v1 spawned each phase as a fresh terminal worker so the heavy work happened in clean ~1M context windows.
That worked. But the orchestrator itself — the conductor session sequencing all the phases — still accumulated context across the run. Every phase added a brief, a result, a score, a judgment.
Real-world analogy: Imagine a conductor who, between every movement, has to memorise the next sheet music, the last performance review, and the rehearsal notes. By the third movement they’re drowning in paper. v1’s orchestrator was that conductor. v2 moves the paperwork into a separate office.
The Inversion
v2 inverts the relationship. The orchestrator is no longer the Claude session. It’s a Node.js daemon at ~/.claude/skills/one-shot-orchestra/runner/ that owns the state machine, the briefs, the scoring, and the looping.
The Claude session does only three things now: call the CLI to start a run, poll status in a loop, and surface the final artifacts when the daemon says delivered.
How The Daemon Works
The runner is a Node CLI with a handful of subcommands. Each one prints a single JSON envelope on stdout describing what state the run is in.
Here’s the contract:
{
  "state": "phase_running",
  "run_id": "build-login-2026-04-25T...",
  "narrate": "Spawned Builder (phase 2). Watching.",
  "needs_orchestrator": null
}
The Claude session reads narrate, shows it to the user, then reacts to state. If the state expects a spawn (routing, scoring_pending, judging_pending, looping), the session calls orchestra spawn. Otherwise it polls again.
That’s the whole loop. The session never reads phase scripts, never writes briefs, never scores anything itself.
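A minimal sketch of how a client could parse one envelope and choose its next move, written in TypeScript. The field names come from the envelope shown above and the spawn states from the list in the previous paragraph. The `Envelope` type, `parseEnvelope`, and `nextAction` are hypothetical names for illustration, not part of the runner.

```typescript
// Sketch only: the Envelope type and both helpers are hypothetical names.
interface Envelope {
  state: string;
  run_id: string;
  narrate: string;
  needs_orchestrator: unknown; // null in the example; other shapes are not documented here
}

// States the article lists as expecting a spawn.
const SPAWN_STATES = new Set(["routing", "scoring_pending", "judging_pending", "looping"]);

type Action = "spawn" | "poll" | "stop";

// Decide what the session does after reading an envelope.
function nextAction(envelope: Envelope): Action {
  if (envelope.state === "delivered") return "stop";
  if (SPAWN_STATES.has(envelope.state)) return "spawn";
  return "poll";
}

// Parse one line of CLI stdout, failing loudly if it is not an envelope.
function parseEnvelope(line: string): Envelope {
  const parsed = JSON.parse(line);
  if (typeof parsed.state !== "string" || typeof parsed.narrate !== "string") {
    throw new Error("not an Orchestra envelope");
  }
  return parsed as Envelope;
}
```

Keeping the decision in a pure function like this means the session logic holds no run history: each call depends only on the latest envelope.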
The State Machine
Every run walks the same set of states. The session doesn’t need to memorise them — the daemon tells it where things stand on every poll.
The branch off scoring_running is where the auto-loop logic lives. A composite at or above 0.85 ships. A composite below 0.50 auto-loops without asking. Anything in between (0.50 up to, but not including, 0.85) fires a Loop-Judge worker that decides whether to ship or loop.
Why this matters: the looping decision used to be the orchestrator session reading a scorecard and making a judgment call — which meant that judgment lived inside the same context as the next loop’s briefs. Now the judgment happens in its own fresh worker, writes a verdict, and the daemon acts on it.
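The three-way branch on the composite score can be sketched in a few lines of TypeScript. The thresholds 0.85 and 0.50 are the ones stated above. The function name, the constant names, and the `Decision` labels are assumptions made for illustration, and the daemon's real code may look different.

```typescript
// Sketch only: names are hypothetical; thresholds are the ones quoted in the article.
type Decision = "ship" | "auto_loop" | "ask_loop_judge";

const SHIP_THRESHOLD = 0.85;
const AUTO_LOOP_THRESHOLD = 0.5;

function decideAfterScoring(composite: number): Decision {
  if (Number.isNaN(composite)) {
    throw new RangeError("composite score must be a number");
  }
  if (composite >= SHIP_THRESHOLD) return "ship";          // at or above 0.85 ships
  if (composite < AUTO_LOOP_THRESHOLD) return "auto_loop"; // below 0.50 loops without asking
  return "ask_loop_judge";                                 // the middle band goes to a Loop-Judge worker
}
```

Under this reading a composite of exactly 0.50 lands in the Loop-Judge band, because only scores strictly below 0.50 auto-loop.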
What The Session Actually Sees
v1 sessions saw everything: every brief, every result, every scorecard, every loop verdict. v2 sessions see four things, and that’s it.
- Envelopes. One JSON line per CLI call. State + narrate + run_id. A few hundred bytes each.
- Running notes. A cross-phase file the user can append to mid-run. The session writes via orchestra notes append.
- Final scorecard. Composite, weakest dim, judge verdict. Arrives once, at the end of the run.
- Delivered artifacts. A list of {worker, path} pairs. The user opens them. The session does not read them.
The deep stuff — phase scripts, worker briefs, individual result.json files, the chat.md transcript — lives on disk under the run directory. The user can browse it. The session never has to.
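To make the "session surfaces, never reads" idea concrete, here is a small TypeScript sketch of the final hand-off. Only the `{worker, path}` pair shape and the three scorecard contents come from the text above. The scorecard field names (`weakestDimension`, `judgeVerdict`) and the `deliverySummary` helper are assumptions, since the article does not show the actual JSON keys.

```typescript
// Sketch only: scorecard field names are hypothetical.
interface Scorecard {
  composite: number;
  weakestDimension: string;
  judgeVerdict: string;
}

// The {worker, path} pair described in the article.
interface DeliveredArtifact {
  worker: string;
  path: string;
}

// Build the message shown to the user. Paths are listed, never opened.
function deliverySummary(card: Scorecard, artifacts: DeliveredArtifact[]): string {
  const lines = [
    `Composite ${card.composite.toFixed(2)}, weakest dimension: ${card.weakestDimension}, verdict: ${card.judgeVerdict}`,
    ...artifacts.map((a) => `${a.worker}: ${a.path}`),
  ];
  return lines.join("\n");
}
```

The function touches no files, which mirrors the point of the section: artifact contents stay on disk and out of the session's context.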
The Run Loop, Step By Step
Here’s what a v2 run looks like from the session’s side. Six lines of pseudo-bash, basically.
▶️ start the run through the CLI
↓
👁️ auto-open chat.md (live transcript)
↓
🔁 loop: orchestra status <run-id>
↓
⚡ if state needs spawn: orchestra spawn <run-id>
↓
💬 pass `narrate` to the user
↓
✅ break when state == delivered
Compare that to v1, where the orchestrator session ran the full execution-assess-fix-deliver protocol from SKILL.md directly — reading scoring.md, loop-mechanics.md, delivery.md, every phase brief, every result.
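The status, spawn, narrate, and break steps can also be written as a short TypeScript sketch. It assumes the `status` and `spawn` subcommands each return one JSON envelope line, as described earlier. The `SessionDeps` interface, the `driveRun` function, and the injected `cli`, `tell`, and `wait` callbacks are hypothetical, introduced so the loop can run without a real daemon. Starting the run and opening chat.md are left out because the article does not name those commands.

```typescript
// Sketch only: all names here are hypothetical. Only `state` and `narrate` are read.
interface StatusEnvelope {
  state: string;
  narrate: string;
}

interface SessionDeps {
  cli: (args: string[]) => Promise<string>; // runs `orchestra <args>`, resolves to its stdout line
  tell: (message: string) => void;          // surfaces text to the user
  wait: () => Promise<void>;                // pause between polls
}

// States the article lists as expecting a spawn.
const NEEDS_SPAWN = new Set(["routing", "scoring_pending", "judging_pending", "looping"]);

async function driveRun(runId: string, deps: SessionDeps): Promise<void> {
  for (;;) {
    // Poll: one envelope per status call.
    const envelope: StatusEnvelope = JSON.parse(await deps.cli(["status", runId]));
    const needsSpawn = NEEDS_SPAWN.has(envelope.state);
    if (needsSpawn) await deps.cli(["spawn", runId]);
    deps.tell(envelope.narrate);
    if (envelope.state === "delivered") return;
    // Only pause when nothing was spawned; after a spawn, poll again right away.
    if (!needsSpawn) await deps.wait();
  }
}
```

In real use, `cli` might wrap a child-process call and `wait` a timer. Nothing in the loop keeps earlier envelopes around, so its memory footprint stays flat however many phases the run goes through.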
v1 vs v2 At A Glance
| Concern | v1 (session-as-orchestrator) | v2 (daemon-as-orchestrator) |
|---|---|---|
| Who owns the state machine | Claude session, in prose | Node daemon, in code |
| Who writes worker briefs | Claude session | Daemon, from templates |
| Who scores the run | Session reads scoring.md, scores | Fresh Scorer workers per dimension |
| Who decides ship vs loop | Session reads loop-mechanics.md, decides | Fresh Loop-Judge worker writes a verdict |
| Session context per run | Grows with each phase + loop | Stays small — envelopes only |
| Plan-mode approval gate | Supported | Not yet — falls back to v1 |
| Parallel build workers | Up to 12 concurrent | One worker per phase (for now) |
The Right Mental Model
Do think of v2 as: a build server with a thin Slack bot in front of it. The bot announces what just happened. The server does the work.
Don’t think of v2 as: a smarter version of v1. It’s a different shape. The session is no longer the brain; it’s the megaphone.
What v2 Doesn’t Do Yet
The honest list of trade-offs, because the upgrade isn’t free.
- No plan-mode gate. v1 supports a single approval gate after research and before building. v2 doesn’t yet — if the user asks for plan mode, fall back to v1.
- Single worker per phase. v1 can spawn parallel build workers across independent file domains. v2 spawns one. Big multi-domain builds still belong to v1.
- No live tree viewer wiring. v1’s animated D3 tree polls a telemetry feed the daemon doesn’t emit yet. The user watches chat.md instead, which auto-opens at start.
- Showcase and A/B integration target v1 paths. Tools like /godmode-showcase haven’t been ported yet.
When you hit one of these, the SKILL tells you to fall back to v1 and explain why. Both versions ship side by side.
Why This Was The Right Inversion
v1 was good. v2 isn’t a fix — it’s the natural endpoint of the path v1 was already walking. Each Orchestra release moved more responsibility off the orchestrator session. Spawn enforcement stopped it from cheating with in-session agents. Reading-by-reference stopped it from pulling phase scripts into its own memory. v2 finishes the job: the session no longer holds the state machine either.
The pattern: when you build a delegation system, every release of it should narrow the delegator’s job. v1 said “don’t do the work.” v2 says “don’t even hold the plan.”
The session’s only job now is to be present — so the user has someone to talk to while a Node process and a small army of fresh Claude workers do the actual run.
Run Orchestra v2 On Your Next Build
One prompt in, finished product out. Lean session, fresh workers, automatic scoring and looping.