The Orchestra Learned To Celebrate
💥 The problem: The orchestra tree viewer was a sterile log — coloured dots, status labels, no reason to keep watching
🎮 The fix: Judge scores became points, achievements fire live, killstreak banners slide in MOBA-style, the whole thing pulses neon synthwave
✨ The result: Watching agents work now feels like watching Halo — AND the agents see the rules in their briefs, so they actually try to score
Same Data, Different Show
The orchestra runner already exposes everything you'd want to see while a run is happening. telemetry.jsonl ticks every spawn, every result, every score. state.json tells you which phase is running. scoreboard.json is fresh every two seconds.
The original tree viewer rendered all of that as a green-orange-red dot graph. Functional. Sterile. You could read it, but you wouldn't want to watch it.
Real-world analogy: Air traffic radar shows you exactly where every plane is. A sports broadcast shows you exactly where every player is. Same kind of data, but one is a tool you use because you have to, and the other is something you'd pay to watch on a Saturday night. The tree viewer was a radar. We turned it into a broadcast.
The Hybrid Scoring Model
The trick was to keep quality as the floor while letting drama sit on top of it. Inventing points for things that don't matter would have rewarded the wrong behaviours. So scoring is a sandwich.
The base score comes from the existing judge dimensions — correctness, robustness, completeness. Achievements fire from telemetry events. Combos multiply.
+ ⚡ Achievements fire from live telemetry (SPEED_DEMON, CLEAN, FLAWLESS, ONE_SHOT)
+ 🔥 Combos multiply: 2 in a row ×1.2, 3 in a row ×1.5, 5 in a row ×1.8
= 🏆 Worker total, which floats up over the node as a +N popup
Quality drives the score. There are no points for cutting corners. Speed only counts when the output is correct.
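Here is a minimal sketch of how the three layers could stack. The function names, the rounding, and the choice to apply the multiplier to base plus bonuses are all assumptions made for illustration; the article only states the tiers and that the base is the judge score × 100.

```javascript
// Hypothetical sketch, not the shipped engine.
// Combo tiers as described in the article: streak length -> multiplier.
const COMBO_TIERS = [
  { streak: 7, multiplier: 2.0 },
  { streak: 5, multiplier: 1.8 },
  { streak: 3, multiplier: 1.5 },
  { streak: 2, multiplier: 1.2 },
];

// Returns the multiplier for the highest tier the streak has reached.
function comboMultiplier(streak) {
  const tier = COMBO_TIERS.find((t) => streak >= t.streak);
  return tier ? tier.multiplier : 1.0;
}

// judgeScore is in [0, 1]; bonuses is a list of achievement point values.
// Assumption: the multiplier applies to base + bonuses, rounded to an integer.
function workerTotal(judgeScore, bonuses, streak) {
  const base = judgeScore * 100;
  const bonusSum = bonuses.reduce((sum, b) => sum + b, 0);
  return Math.round((base + bonusSum) * comboMultiplier(streak));
}
```

Under these assumptions, a perfect judge score with FLAWLESS and ONE_SHOT on a three-worker streak works out to (100 + 55) × 1.5.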
Achievements That Fire Live
Each achievement is a single object in achievements.js — an ID, a condition function, a points value, an FX cue. Tunable in one file.
| ID | Trigger | Reward |
|---|---|---|
| FIRST_BLOOD | First worker to end with success | +15 · banner |
| SPEED_DEMON | Worker finishes in <50% of phase timeout | +20 · cyan whoosh |
| FLAWLESS | Judge dim score equals 1.0 | +30 · sparkle arpeggio |
| CLEAN | Zero warnings in result digest | +15 · cyan particles |
| ONE_SHOT | Success on attempt 1 | +25 · bullseye thud |
| DOUBLE_KILL → GODLIKE | 2→7 successes in a row | combo ×1.2 → ×2.0 · full-width banner |
| FLAWLESS_VICTORY | Run delivered, zero failures, avg ≥0.95 | +200 · full-stage takeover |
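To make the "single object per achievement" idea concrete, here is a hedged sketch of what a few table rows might look like as data. The field names (`id`, `points`, `fx`, `when`) and the shape of the context object are assumptions; the real achievements.js may differ. FIRST_BLOOD and the combo rows need run-level state, so they are left out.

```javascript
// Hypothetical shapes modelled on the table; not the actual achievements.js.
const ACHIEVEMENTS = [
  { id: 'SPEED_DEMON', points: 20, fx: 'cyan-whoosh',
    // Speed only counts when the output is correct.
    when: (ctx) => ctx.success && ctx.elapsedSec < 0.5 * ctx.timeoutSec },
  { id: 'FLAWLESS', points: 30, fx: 'sparkle-arpeggio',
    when: (ctx) => ctx.judgeScore === 1.0 },
  { id: 'CLEAN', points: 15, fx: 'cyan-particles',
    when: (ctx) => ctx.warnings === 0 },
  { id: 'ONE_SHOT', points: 25, fx: 'bullseye-thud',
    when: (ctx) => ctx.success && ctx.attempt === 1 },
];

// Returns every achievement whose condition holds for this worker result.
function evaluateAchievements(ctx) {
  return ACHIEVEMENTS.filter((a) => a.when(ctx));
}

// Usage: a fast, perfect, first-attempt worker that logged two warnings.
const earned = evaluateAchievements({
  success: true, elapsedSec: 40, timeoutSec: 100,
  judgeScore: 1.0, warnings: 2, attempt: 1,
});
const earnedIds = earned.map((a) => a.id);
const earnedPoints = earned.reduce((sum, a) => sum + a.points, 0);
```

Keeping conditions as plain functions over one context object is what makes the table tunable in a single file: changing a threshold touches one line.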
The scoring engine watches every result-*.json the runner emits, computes the score, applies any combo multiplier, and writes a score_event line into telemetry.jsonl. The viewer was already polling that file every two seconds.
Why this matters: no protocol changes. The viewer just got a new event type to render. The runner just got a new module. Two layers, loose coupling, ship independently.
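The viewer side of that loose coupling can be sketched as a small parser over the telemetry text. This is an illustration under assumptions: the helper name and the line-offset bookkeeping are invented here, and the only thing taken from the runner is that each telemetry line is a JSON object with an `event` field.

```javascript
// Hypothetical helper: returns the score events found in telemetry.jsonl
// text, starting at a line offset remembered from the previous poll.
function readScoreEvents(jsonlText, fromLine = 0) {
  // Only newline-terminated lines are complete. The last split element is
  // the unterminated remainder (possibly empty), so it is never consumed.
  const complete = jsonlText.split('\n').slice(0, -1);
  const events = [];
  for (const line of complete.slice(fromLine)) {
    if (line.trim() === '') continue;
    let record;
    try {
      record = JSON.parse(line);
    } catch {
      continue; // ignore a corrupt line rather than stall the viewer
    }
    if (record.event === 'score_event') events.push(record);
  }
  return { events, nextLine: complete.length };
}
```

A poller would call this every two seconds with the file contents and the `nextLine` from the previous call, so a half-written final line is picked up on the following poll instead of being dropped.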
Two Layers, One File Each
The whole thing is two new modules and one modified one. Below is the actual scoring entry point that fires on every worker end.
```javascript
// runner/src/lib/scoreboard.js
// Called on every worker end: score the result, update the combo, emit telemetry.
export function recordScoreFor(rd, name, resultPath, started, timeoutSec) {
  const result = readJSON(resultPath);
  const score = computeWorkerScore(name, result, started, Date.now(), timeoutSec);
  const combo = applyCombos(loadScoreboard(rd), score);

  // One score_event line per worker end; the viewer picks it up on its next poll.
  appendTelemetry(rd, {
    event: 'score_event',
    worker: name,
    points: score.total,
    breakdown: score.bonuses,
    achievements: score.achievements,
    combo: combo.current,
    multiplier: combo.multiplier,
    comboBanner: combo.banner,
    ts: Date.now() / 1000, // seconds since the epoch
  });

  // Persist the updated board for anything reading scoreboard.json.
  writeScoreboardAtomic(rd, applyToBoard(loadScoreboard(rd), score, combo));
}
```
The viewer side is just a switch on event === 'score_event'. Spawn a popup at the worker's SVG coords, increment the scoreboard, fire the banner if a combo flagged it.
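A rough sketch of that switch, written as a pure function so it can be read without the DOM. The action names and the `nodeCoords` map are assumptions; the event fields are the ones in the `score_event` payload above. Whether `points` already includes the combo multiplier is not stated, so the sketch simply adds what it receives.

```javascript
// Hypothetical viewer-side handler: turns one telemetry event into UI actions
// and updates running totals. Rendering the actions is left to the caller.
function handleTelemetryEvent(state, event, nodeCoords) {
  const actions = [];
  switch (event.event) {
    case 'score_event': {
      // Popup at the worker's SVG coordinates, if the node is on screen.
      const pos = nodeCoords[event.worker];
      if (pos) {
        actions.push({ type: 'popup', x: pos.x, y: pos.y, text: `+${event.points}` });
      }
      // Increment the scoreboard total for this worker.
      state.totals[event.worker] = (state.totals[event.worker] || 0) + event.points;
      // Fire the banner only when the runner flagged a combo.
      if (event.comboBanner) {
        actions.push({ type: 'banner', text: event.comboBanner });
      }
      break;
    }
    default:
      break; // other event types keep their existing rendering
  }
  return actions;
}
```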
Telling The Agents The Rules
The most important detail isn't the visuals at all. Every worker brief gets a === SCORING === block stitched in at composition time, listing the points table in plain language.
=== SCORING ===
Base: judge dim score × 100.
Bonuses you can earn:
• SPEED_DEMON +20 (finish in <50% of phase timeout, with correct output)
• FLAWLESS +30 (judge dim score 1.0)
• CLEAN +15 (zero warnings in your digest)
• ONE_SHOT +25 (success on first attempt — no retries)
Quality drives the score. There are no points for cutting corners.
Speed only counts when the output is correct.
Agents don't see the killstreak banners. The user does. But agents see the points, which means they have a goal beyond “complete the task.” The viewer is the broadcast. The brief is the rules.
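Stitching the block in at composition time could look something like the sketch below. The function names and the rule shape are hypothetical; the point is only that the brief text and the scoring rules can come from the same table, so they never drift apart.

```javascript
// Hypothetical rule list; in practice this would come from the same
// definitions the scoring engine uses.
const BRIEF_RULES = [
  { id: 'SPEED_DEMON', points: 20, hint: 'finish in <50% of phase timeout, with correct output' },
  { id: 'FLAWLESS', points: 30, hint: 'judge dim score 1.0' },
  { id: 'CLEAN', points: 15, hint: 'zero warnings in your digest' },
  { id: 'ONE_SHOT', points: 25, hint: 'success on first attempt, no retries' },
];

// Renders the plain-language scoring block for a worker brief.
function buildScoringBlock(rules) {
  return [
    '=== SCORING ===',
    'Base: judge dim score × 100.',
    'Bonuses you can earn:',
    ...rules.map((r) => `• ${r.id} +${r.points} (${r.hint})`),
    'Quality drives the score. There are no points for cutting corners.',
    'Speed only counts when the output is correct.',
  ].join('\n');
}

// Appends the scoring block to the task text of a brief.
function composeBrief(taskText, rules) {
  return `${taskText}\n\n${buildScoringBlock(rules)}\n`;
}
```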
The Pipeline End-To-End
A single point travels from a worker's result-*.json through the scoring engine, into a score_event line in telemetry.jsonl, and out to the +N popup the next time the viewer polls.
It's a 200-line scoring engine and ~600 lines of CSS / JS in the viewer. The runner already had everything else.
Before vs After
| Concern | Before | After |
|---|---|---|
| Visual feedback | Coloured dots, status text | Neon nodes, +N popups, killstreak banners, particles |
| Scoring | Final scorecard at delivery | Live score events every worker end — with breakdowns |
| What agents know | The task description | The task + the points table + the combo rules |
| Sound | None | Web Audio API tones — pickup chime, achievement cues, victory fanfare |
| Watchability | You alt-tab away | You watch it land |
Two Bugs We Caught Out Loud
The first build of the FLAWLESS VICTORY overlay used a fixed 140px font size. At 900px wide, the giant neon letters punched straight off the right edge of the screen. So we caught the bug, complained about it on Twitter in our heads, and shipped the fix.
Then we re-tested and found the killstreak banners did the same thing — 92px fixed for TRIPLE KILL!, 110px for GODLIKE!, both clipping at narrow viewports. clamp() with a viewport-relative middle value fixed both.
Do: font-size: clamp(40px, 8.5vw, 92px). Floor, fluid middle, ceiling. Fits everything from a 320px phone to a 4K monitor.
Don't: font-size: 92px. Ships at desktop, falls off the screen on a laptop, looks ridiculous on mobile.
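To see what the clamp does at each width, the arithmetic can be replayed in a few lines. The helper name is made up for this example; the formula is the standard CSS definition, clamp(MIN, VAL, MAX) = max(MIN, min(VAL, MAX)), with 1vw equal to 1% of the viewport width.

```javascript
// Illustrative helper: resolves clamp(minPx, preferredVw, maxPx) to pixels
// for a given viewport width.
function clampFontPx(minPx, preferredVw, maxPx, viewportWidthPx) {
  const preferredPx = (viewportWidthPx * preferredVw) / 100;
  return Math.max(minPx, Math.min(preferredPx, maxPx));
}

// The banner rule from the fix: clamp(40px, 8.5vw, 92px).
const phone = clampFontPx(40, 8.5, 92, 320);    // 27.2px preferred, floor wins: 40
const narrow = clampFontPx(40, 8.5, 92, 900);   // fluid middle: 76.5
const desktop = clampFontPx(40, 8.5, 92, 1920); // 163.2px preferred, ceiling wins: 92
```

At the 900px width where the original overlay clipped, the banner now renders at 76.5px instead of a fixed 92px.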
The Victory Easter Egg
When a run delivers with no failures and an average judge score of at least 0.95, the viewer takes over the whole stage.
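The trigger itself is a simple run-level check. The sketch below assumes a summary object with a `delivered` flag and a per-worker `success` and `judgeScore`; those field names are hypothetical, and the threshold follows the achievements table (avg ≥0.95).

```javascript
// Hypothetical run summary shape: { delivered, workers: [{ success, judgeScore }] }.
function isFlawlessVictory(run) {
  if (!run.delivered || run.workers.length === 0) return false;
  const failures = run.workers.filter((w) => !w.success).length;
  const avg =
    run.workers.reduce((sum, w) => sum + w.judgeScore, 0) / run.workers.length;
  return failures === 0 && avg >= 0.95;
}
```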
What This Buys You
- You actually watch: Long agent runs become spectator-friendly. You stop alt-tabbing.
- Agents try harder: The scoring block in every brief gives them a target beyond “done.”
- Bugs are visible: A worker that limps to a low score shows up loud, so you catch it before delivery.
- Demos don't suck: Showing someone an orchestra run used to mean “here's a log file.” Now it's a screen-share.
Watch your agents earn it
One-Shot Orchestra is still in development — the gamification layer ships with it. Until then, browse the rest of the Godmode skill suite.