Orphan Zombies: 151 Idle Node Processes In 24 Hours
💥 The bug: Windows accumulated 151 zombie node.exe processes in 24 hours: orphaned MCP children from past Claude Code sessions, each idle and refusing to die.
🔍 The cause: Windows doesn't reap orphans, and orchestra workers spawn through wt.exe new-tab, which re-parents them under WindowsTerminal.exe, defeating any Job Object on the spawning shell.
✅ The fix: A 15-minute PowerShell sweep that kills orphaned MCP processes by signature. Bundled into the orchestra skill so every new user gets it for free.
The Discovery
A user noticed the laptop felt heavier than usual — CPU and RAM creeping up between sessions. Task Manager showed a stack of node.exe processes, all idle, none owned by anything currently running.
One tasklist later: 151 zombie node processes accumulated in the last 24 hours, each holding ~1 MB of resident memory. Together, about 150 MB of nothing.
They were all the same shape: MCP server children that had outlived their Claude Code parent. Every closed terminal, every crash, every kill had left a few behind. The pile just kept growing.
Think of it like leaving a tap dripping: a single drip costs nothing. But you didn't turn it off — and 151 drips later, the bucket is full and you can't remember when you last opened it.
Why Windows Lets Children Survive
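On Linux and macOS, an orphaned child is re-parented to PID 1 (init, systemd, or launchd), which calls wait() and reaps it as soon as it exits. Closing a terminal also normally delivers SIGHUP to the shell and its foreground jobs. Between the two, stray children rarely pile up on Unix: the system has a janitor.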
Windows has no janitor. When a parent exits abnormally, its children just keep running with no parent at all. They become orphaned but stay alive, accumulating across sessions until something explicitly kills them or you reboot.
Unix orphans
Re-parented to init/launchd, which calls wait() and reaps them when they exit. Lingering zombies are treated as a bug, not a feature.
Windows orphans
No reaper. The child keeps running until someone tells it to stop. If the spawning code crashed before cleanup, that someone is — nobody.
The official Windows answer is Job Objects: wrap the parent and its descendants in a kernel-level container with the KILL_ON_JOB_CLOSE flag (JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE in the Win32 API). When the last handle to the job closes, every process inside dies with it.
That's the playbook. It just doesn't work for orchestra workers — for two reasons.
Why Job Objects Couldn't Save Us
Reason one: wt.exe new-tab doesn't actually run the new tab as your child. It hands the request off to the long-lived WindowsTerminal.exe process, which spawns the tab under itself.
The orchestra's spawning shell is the thing wrapped in the Job Object. The new tab's process tree lives somewhere else entirely. Even with the perfect kill flag set on the perfect handle, the kill signal is anchored on the wrong tree.
Reason two, just to twist the knife: even when we built an isolated repro that did wrap the grandchildren in the same job, KILL_ON_JOB_CLOSE wasn't reliable on Win11. Some grandchildren survived handle close anyway. An explicit TerminateJobObject fallback in a finally block isn't reliable either: Stop-Process ends the shell with TerminateProcess, which skips finally blocks, so the fallback never runs.
Core insight: The textbook fix for Windows orphans assumes you control the process tree. The moment you spawn through a re-parenting service like Windows Terminal, that assumption breaks — and the kill-flag is anchored on the wrong handle.
The Fix: A Periodic Sweep
If you can't kill them at parent-death, you kill them later by signature. A small PowerShell script runs every 15 minutes via Windows Task Scheduler. Three filters, in order:
- Parent PID is dead: the process has no living ancestor.
- Command line matches an MCP signature: /mcp/, -mcp/, modelcontextprotocol, mcp-veo3, etc.
- Process is older than 5 minutes: so a freshly-spawned worker isn't caught mid-handshake.
If all three match, kill it. The signature filter is what makes the sweep safe to run system-wide — it can't touch a node script the user actually started, like worker.js or watch.js (those are explicitly whitelisted too).
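The shipped sweep is a PowerShell script, but the three filters are easy to see in isolation. Here is a minimal JavaScript sketch of the decision logic only. The record fields (pid, ppid, commandLine, startedAt) and the plain substring matching are assumptions for illustration, not the script's actual implementation, and the sketch ignores PID reuse, which on Windows can make a dead parent look alive.

```javascript
// Sketch only: field names and matching rules are assumed, not taken from
// the shipped PowerShell sweep.
const MCP_SIGNATURES = ['/mcp/', '-mcp/', 'modelcontextprotocol', 'mcp-veo3'];
const WHITELIST = ['worker.js', 'watch.js'];
const MIN_AGE_MS = 5 * 60 * 1000; // 5-minute grace period for fresh workers

// proc: { pid, ppid, commandLine, startedAt (ms since epoch) }
function isSweepCandidate(proc, livePids, now = Date.now()) {
  const cmd = (proc.commandLine || '').toLowerCase();

  // Filter 1: the recorded parent PID must not belong to a running process.
  if (livePids.has(proc.ppid)) return false;

  // Filter 2: the command line must carry an MCP signature and must not be
  // one of the explicitly whitelisted user scripts.
  if (WHITELIST.some((name) => cmd.includes(name))) return false;
  if (!MCP_SIGNATURES.some((sig) => cmd.includes(sig))) return false;

  // Filter 3: the process must be older than the grace period.
  return now - proc.startedAt > MIN_AGE_MS;
}

// Applies the filters to a snapshot of all running processes.
function findOrphans(processes, now = Date.now()) {
  const livePids = new Set(processes.map((p) => p.pid));
  return processes.filter((p) => isSweepCandidate(p, livePids, now));
}
```

Checking the whitelist before the signature list means a whitelisted script is never killed, even if its path happens to contain an MCP-looking segment.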
The sweep doesn't care how the orphan was created — terminal close, crash, kill, daemon failure. If it looks like an MCP child and its parent is gone, it goes. That's the whole reason this approach is more reliable than Job Objects: it works regardless of the spawn path.
Distribution: Bundled Into The Skill
The clever bit isn't the sweep itself — PowerShell scripts have existed for two decades. The clever bit is how new users get it.
The orchestra runner's spawnWorker() function fires an idempotent install hook on its first call. Windows-only, fail-soft. It writes the script to %LOCALAPPDATA%, registers the scheduled task, marks itself installed in a flag file, and never runs again.
```js
// runner/lib/spawn.js (paraphrased)
async function spawnWorker(spec) {
  if (process.platform === 'win32') {
    await ensureMcpSweepInstalled(); // idempotent, fail-soft
  }
  // ... existing spawn logic ...
}
```
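For a sense of what "idempotent, fail-soft" means in practice, here is a hedged sketch of how such a hook could be structured. The body of the real ensureMcpSweepInstalled() is not shown in this post, so everything here is an assumption: the flag file name, the injected install callback, the synchronous style, and the return values are all hypothetical.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical sketch. The flag file name, options, and return values are
// invented for illustration. It is synchronous for simplicity; awaiting the
// plain return value, as spawnWorker() does, still works.
function ensureMcpSweepInstalled({
  installDir = path.join(process.env.LOCALAPPDATA || os.tmpdir(), 'Godmode'),
  install = () => {}, // would write the script and register the task
} = {}) {
  const flagFile = path.join(installDir, '.mcp-sweep-installed');
  try {
    // Idempotent: once the flag exists, later calls do nothing.
    if (fs.existsSync(flagFile)) return 'already-installed';

    fs.mkdirSync(installDir, { recursive: true });
    install(installDir);

    // The flag is written only after the install step succeeded.
    fs.writeFileSync(flagFile, new Date().toISOString());
    return 'installed';
  } catch (err) {
    // Fail-soft: an install problem must never block spawning a worker.
    return 'skipped';
  }
}
```

Writing the flag last means a failed install is retried on the next spawn instead of being silently recorded as done.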
No README step. No "remember to run the installer." No reliance on the user's Claude memory or local config. The first time anyone runs an orchestra task on Windows, the protection installs itself.
Think of it like a smoke detector that ships pre-installed: the homeowner doesn't have to know it exists. They just notice they never had a fire.
Aside: The Wallpaper Was The Actual CPU Hog
After running the sweep we sampled the host's top CPU consumers, expecting the cleared zombies to show up as a delta. They didn't — because they were never burning CPU in the first place. They were just camping on RAM.
The actual top consumer was Lively Wallpaper, burning ~44% CPU across two processes to render an animated desktop background nobody was looking at.
[Figure: Top CPU consumers, post-sweep]
The investigation that found a real bug also found a sillier one. We turned the wallpaper off. Five seconds of work, and it freed far more CPU than the sweep ever could.
The Lessons
Three things from this one worth keeping:
| Assumption | Reality on Windows |
|---|---|
| "Children die when parents die" | Not on Windows. Nothing ties a child's lifetime to its parent, so orphans just keep running. |
| "Job Objects with KILL_ON_JOB_CLOSE solve it" | Only if all descendants are actually in your job. wt.exe new-tab re-parents the tab away. |
| "Detection is the user's problem" | It shouldn't be. Bundle the cleanup into the tool that creates the mess. |
The third one is the biggest lift. The fix didn't just ship as a PowerShell script in a downloads folder — it shipped as part of spawnWorker(), so every new user gets the protection on their first orchestra run without ever knowing it happened.
Core insight: If your tool creates a class of mess that the OS won't clean up, the cleanup belongs in the tool. Don't write a "remember to install this" line in a README that nobody reads — ship the install on the first invocation.
What Shipped
- The sweep: %LOCALAPPDATA%\Godmode\sweep-mcp-orphans.ps1, runs every 15 min via Task Scheduler. Logs to %TEMP%\mcp-sweep.log.
- The install hook: bundled into spawnWorker() in the orchestra runner. Idempotent. Fail-soft. Windows-only.
- The signature filter: matches /mcp/, -mcp/, modelcontextprotocol, mcp-veo3, plus an explicit whitelist for worker.js and watch.js.
- The 5-minute age threshold: protects freshly-spawned MCP children that haven't finished their handshake yet.
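For readers curious what the 15-minute schedule could look like from the Node side, here is a rough sketch that builds arguments for the Windows schtasks CLI. The task name is hypothetical and the post does not show how the real hook registers its task, so treat this as one plausible approach, not the shipped one.

```javascript
// Hypothetical task name, invented for this sketch.
const TASK_NAME = 'McpOrphanSweep';

// Builds arguments for `schtasks /Create` that run a PowerShell script
// every `intervalMinutes` minutes.
function buildSchtasksArgs(scriptPath, intervalMinutes = 15) {
  return [
    '/Create',
    '/TN', TASK_NAME,               // task name
    '/SC', 'MINUTE',                // schedule type: minute-based
    '/MO', String(intervalMinutes), // modifier: every N minutes
    '/TR', `powershell.exe -NoProfile -ExecutionPolicy Bypass -File "${scriptPath}"`,
    '/F',                           // overwrite an existing task of the same name
  ];
}

// On Windows this could then be run with something like:
//   require('child_process').execFileSync('schtasks', buildSchtasksArgs(path));
```

The /F flag keeps registration safe to repeat, which matches the idempotent spirit of the install hook.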
Five minutes after the sweep ran the first time, the zombie count dropped from 151 to 0. The next morning it was still 0. The mess that took 24 hours to accumulate now never has a chance to start.
Visuals created with one-shot-scripts
Run Orchestra On Windows — Cleanly
One-Shot Orchestra-v2 spawns fresh-context Claude Code workers across terminal tabs, and on Windows it now ships with a self-installing sweep so old workers don't pile up.