Deep Dive ⏱️ 5 min read

The Orchestrator Became A Daemon: Orchestra v2

TL;DR

💥 The problem: v1 made the main Claude session BOTH conduct phases AND hold the score — context kept growing across the run
🔧 The fix: Move the orchestrator into a Node daemon. The session becomes a thin narrator that drives a CLI
The result: Every LLM call is a fresh worker. The session never accumulates. Workers each get a fresh ~1M window
[Figure: a wireframe code-conductor on a dark stage, with the music score replaced by a glowing terminal window. Caption: The conductor still conducts, but the score lives outside its head now.]
[Interactive figure, orbit/v2: the daemon sits at the center while six fresh-worker terminals orbit it. Every LLM call spawns a new worker that runs once and dissolves, so the session never accumulates context.]

🎭 What v1 Got Right (And Why It Wasn’t Enough)

Orchestra v1 already solved the original problem. one-shot-scripts ran every phase inside one session, and the session bloated. Orchestra v1 spawned each phase as a fresh terminal worker so the heavy work happened in clean ~1M context windows.

That worked. But the orchestrator itself — the conductor session sequencing all the phases — still accumulated context across the run. Every phase added a brief, a result, a score, a judgment.

Real-world analogy: Imagine a conductor who, between every movement, has to memorise the next sheet music, the last performance review, and the rehearsal notes. By the third movement they’re drowning in paper. v1’s orchestrator was that conductor. v2 moves the paperwork into a separate office.

🔁 The Inversion

v2 inverts the relationship. The orchestrator is no longer the Claude session. It’s a Node.js daemon at ~/.claude/skills/one-shot-orchestra/runner/ that owns the state machine, the briefs, the scoring, and the looping.

The Claude session does only three things now: call the CLI to start a run, poll status in a loop, and surface the final artifacts when the daemon says delivered.

[Figure: side-by-side architecture diagram. Left: v1, with one main session connected to stacked phase boxes. Right: v2, with a thin narrator pipe over a hexagonal daemon, surrounded by fresh worker nodes. Caption: v1: session is the conductor. v2: daemon is the conductor, session is the announcer.]

⚙️ How The Daemon Works

The runner is a Node CLI with a handful of subcommands. Each one prints a single JSON envelope on stdout describing what state the run is in.

Here’s the contract:

{
  "state": "phase_running",
  "run_id": "build-login-2026-04-25T...",
  "narrate": "Spawned Builder (phase 2). Watching.",
  "needs_orchestrator": null
}
[Interactive figure, envelope stream: the session ↔ daemon CLI dance. Envelopes travel daemon → session as JSON cards (state, narrate, run_id); spawn pulses travel session → daemon. Six states cycle: bootstrapping → routing → phase_running → scoring_pending → scoring_running → delivered.]

The Claude session reads narrate, shows it to the user, then reacts to state. If the state expects a spawn (routing, scoring_pending, judging_pending, looping), the session calls orchestra spawn. Otherwise it polls again.

That’s the whole loop. The session never reads phase scripts, never writes briefs, never scores anything itself.
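To make the session's side of that contract concrete, here is a minimal Python sketch of how a caller might parse one envelope and decide what to do next. The field names and the four spawn states come from the text above; the `Envelope` class, the `next_action` helper, and its `'done'`/`'spawn'`/`'poll'` labels are hypothetical names for illustration, not part of the runner.

```python
import json
from dataclasses import dataclass
from typing import Optional

# States in which, per the article, the session calls `orchestra spawn`.
SPAWN_STATES = {"routing", "scoring_pending", "judging_pending", "looping"}


@dataclass(frozen=True)
class Envelope:
    """One JSON envelope as printed by a CLI call (hypothetical wrapper type)."""
    state: str
    run_id: str
    narrate: str
    needs_orchestrator: Optional[object] = None


def parse_envelope(stdout_line: str) -> Envelope:
    """Parse the single JSON envelope a CLI call prints on stdout."""
    data = json.loads(stdout_line)
    return Envelope(
        state=data["state"],
        run_id=data["run_id"],
        narrate=data.get("narrate", ""),
        needs_orchestrator=data.get("needs_orchestrator"),
    )


def next_action(envelope: Envelope) -> str:
    """Return 'done', 'spawn', or 'poll' (illustrative labels only)."""
    if envelope.state == "delivered":
        return "done"
    if envelope.state in SPAWN_STATES:
        return "spawn"
    return "poll"
```

Fed the sample envelope above, whose state is `phase_running`, `next_action` returns `'poll'`: the session has nothing to do but show `narrate` and check again.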

🗺️ The State Machine

Every run walks the same set of states. The session doesn’t need to memorise them — the daemon tells it where things stand on every poll.

[Figure: the runner's state machine, a run walking through phases, scoring, judging, loop, and ship. bootstrapping → diagrammer_running → routing → phase_running → scoring_pending → scoring_running, which branches to delivered, to looping (which returns to routing), or to judging_pending → judging → delivered. The daemon walks it in code; the session reads where it stands. Caption: The session reacts; the daemon decides.]
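As a rough illustration of what "the daemon walks it in code" could look like, the states named above fit in a small transition table. This is a Python sketch, not the runner's actual source (the runner is Node). The `judging → looping` edge is inferred from the Loop-Judge being able to choose a loop, and the single `phase_running → scoring_pending` edge is an assumption about how phases chain.

```python
from typing import Dict, List, Set

# Allowed transitions, reconstructed from the state names in the article.
# Edges marked "assumed" are inferences, not confirmed runner behavior.
TRANSITIONS: Dict[str, Set[str]] = {
    "bootstrapping": {"diagrammer_running"},
    "diagrammer_running": {"routing"},
    "routing": {"phase_running"},
    "phase_running": {"scoring_pending"},  # assumed: one edge out of a phase
    "scoring_pending": {"scoring_running"},
    "scoring_running": {"delivered", "looping", "judging_pending"},
    "looping": {"routing"},
    "judging_pending": {"judging"},
    "judging": {"delivered", "looping"},  # "looping" assumed from the judge's ship-or-loop verdict
    "delivered": set(),
}


def is_valid_walk(states: List[str]) -> bool:
    """True if the walk starts at bootstrapping and every hop is an allowed transition."""
    if not states or states[0] != "bootstrapping":
        return False
    return all(b in TRANSITIONS.get(a, set()) for a, b in zip(states, states[1:]))
```

A run that loops once and then ships via the judge is a valid walk under this table; a walk that skips `diagrammer_running` is not.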

The branch off scoring_running is where the auto-loop logic lives. Composite at or above 0.85 ships. Below 0.50 auto-loops without asking. Anything in between fires a Loop-Judge worker that decides whether to ship or loop.

Why this matters: the looping decision used to be the orchestrator session reading a scorecard and making a judgment call — which meant that judgment lived inside the same context as the next loop’s briefs. Now the judgment happens in its own fresh worker, writes a verdict, and the daemon acts on it.
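The three-way branch off scoring_running is simple enough to write down. A minimal sketch in Python, using the 0.85 and 0.50 thresholds stated above; the function name `decide` and the `'ship'`/`'loop'`/`'judge'` labels are hypothetical.

```python
SHIP_THRESHOLD = 0.85       # composite at or above this ships
AUTO_LOOP_THRESHOLD = 0.50  # composite below this loops without asking


def decide(composite: float) -> str:
    """Map a composite score to 'ship', 'loop', or 'judge' (illustrative labels)."""
    if composite >= SHIP_THRESHOLD:
        return "ship"
    if composite < AUTO_LOOP_THRESHOLD:
        return "loop"
    # Anything in between goes to a fresh Loop-Judge worker for a verdict.
    return "judge"
```

Note the boundaries: a composite of exactly 0.85 ships, while exactly 0.50 is not "below 0.50" and so goes to the judge.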

📝 What The Session Actually Sees

v1 sessions saw everything: every brief, every result, every scorecard, every loop verdict. v2 sessions see four things, and that’s it.

📡 Envelopes
One JSON line per CLI call. State + narrate + run_id. A few hundred bytes each.

✏️ Running notes
A cross-phase file the user can append to mid-run. The session writes via orchestra notes append.

🏁 Final scorecard
Composite, weakest dim, judge verdict. Arrives once, at the end of the run.

📂 Delivered artifacts
A list of {worker, path} pairs. The user opens them. The session does not read them.

The deep stuff — phase scripts, worker briefs, individual result.json files, the chat.md transcript — lives on disk under the run directory. The user can browse it. The session never has to.

🔁 The Run Loop, Step By Step

Here’s what a v2 run looks like from the session’s side. Six lines of pseudo-bash, basically.

📜 orchestra start "<task>"

👁️ auto-open chat.md (live transcript)

🔁 loop: orchestra status <run-id>

⚡ if state needs spawn: orchestra spawn <run-id>

💬 pass `narrate` to the user

✅ break when state == delivered
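The steps above can be sketched as a small driver. This is a Python illustration under several assumptions: that the CLI is reachable on PATH as `orchestra`, that the envelope printed by `start` carries the `run_id`, and that polling once per second is reasonable. The `drive_run` and `subprocess_cli` names are hypothetical, and step 2 (auto-opening chat.md) is left out because the transcript path isn't given here.

```python
import json
import subprocess
import time
from typing import Callable, List

# States in which, per the article, the session calls `orchestra spawn`.
SPAWN_STATES = {"routing", "scoring_pending", "judging_pending", "looping"}


def subprocess_cli(args: List[str]) -> str:
    """Run the CLI and return its stdout (assumes an `orchestra` binary on PATH)."""
    result = subprocess.run(
        ["orchestra", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def drive_run(
    task: str,
    run_cli: Callable[[List[str]], str] = subprocess_cli,
    narrate: Callable[[str], None] = print,
    poll_seconds: float = 1.0,
    max_polls: int = 10_000,
) -> dict:
    """Start a run, then poll, spawn when asked, and narrate until delivered."""
    start_envelope = json.loads(run_cli(["start", task]))
    run_id = start_envelope["run_id"]  # assumed to be present on the start envelope

    for _ in range(max_polls):
        envelope = json.loads(run_cli(["status", run_id]))
        narrate(envelope["narrate"])  # pass `narrate` to the user
        if envelope["state"] == "delivered":
            return envelope
        if envelope["state"] in SPAWN_STATES:
            run_cli(["spawn", run_id])
        time.sleep(poll_seconds)

    raise TimeoutError(f"run {run_id} not delivered after {max_polls} polls")
```

Passing `run_cli` as a parameter keeps the loop testable without the daemon: a stub that returns canned envelopes is enough to exercise every branch. The loop itself holds nothing but the latest envelope, which is the point of the design.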

Compare that to v1, where the orchestrator session ran the full execution-assess-fix-deliver protocol from SKILL.md directly — reading scoring.md, loop-mechanics.md, delivery.md, every phase brief, every result.

📉 v1 vs v2 At A Glance

| Concern | v1 (session-as-orchestrator) | v2 (daemon-as-orchestrator) |
| --- | --- | --- |
| Who owns the state machine | Claude session, in prose | Node daemon, in code |
| Who writes worker briefs | Claude session | Daemon, from templates |
| Who scores the run | Session reads scoring.md, scores | Fresh Scorer workers per dimension |
| Who decides ship vs loop | Session reads loop-mechanics.md, decides | Fresh Loop-Judge worker writes a verdict |
| Session context per run | Grows with each phase + loop | Stays small (envelopes only) |
| Plan-mode approval gate | Supported | Not yet; falls back to v1 |
| Parallel build workers | Up to 12 concurrent | One worker per phase (for now) |
[Interactive figure: session memory across an 11-phase run. The v1 session stacks 11 phase blocks, crossing the warn line at P5 and the crit line at P8, and overflows by P10 (115%). The v2 session stays flat at 8% by phase 11, envelopes only.]

💡 The Right Mental Model

Do think of v2 as

A build server with a thin Slack bot in front of it. The bot announces what just happened. The server does the work.

Don’t think of v2 as

A smarter version of v1. It’s a different shape. The session is no longer the brain — it’s the megaphone.

⚠️ What v2 Doesn’t Do Yet

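The honest list of trade-offs, because the upgrade isn’t free: v2 has no plan-mode approval gate yet, and it runs one worker per phase where v1 ran up to 12 build workers concurrently.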

When you hit one of these, the SKILL tells you to fall back to v1 and explain why. Both versions ship side by side.

🔮 Why This Was The Right Inversion

v1 was good. v2 isn’t a fix — it’s the natural endpoint of the path v1 was already walking. Each Orchestra release moved more responsibility off the orchestrator session. Spawn enforcement stopped it from cheating with in-session agents. Reading-by-reference stopped it from pulling phase scripts into its own memory. v2 finishes the job: the session no longer holds the state machine either.

The pattern: when you build a delegation system, every release of it should narrow the delegator’s job. v1 said “don’t do the work.” v2 says “don’t even hold the plan.”

The session’s only job now is to be present — so the user has someone to talk to while a Node process and a small army of fresh Claude workers do the actual run.

Run Orchestra v2 On Your Next Build

One prompt in, finished product out. Lean session, fresh workers, automatic scoring and looping.
