Built by /blog-post-GM — a Claude Code skill we evolved with our own Evolution engine to write every post in the Godmode voice.
Post-Mortem ⏱️ 5 min read

We Caught Claude Cheating On A Blind Rebuild

TL;DR

🕵️ The smell: A "blind" one-shot rebuild looked suspiciously like the vanilla original it was supposed to replace.
💥 The leak: Our /clear-showcase skill only deleted the source folder. The published HTML, demo bundle, and screenshots were still on disk for Claude to read.
📦 The fix: A tarball quarantine — pack the artifacts into a .tar.gz outside both working directories, delete the originals, restore on demand.
[Diagram: (1) SOURCE showcase/<slug>/ (prompt, page, screenshot, data, demo) → (2) tar -czf → (3) QUARANTINED ~/.showcase-archive/slug-{ts}.tar.gz. The archive is sealed and opaque, so a fresh session cannot read it, and one tar xf restores it. Prior art hidden · blind rebuild armed.]

🕵️ The Smell Test

We were rebuilding the music-visualizer showcase. The vanilla run went first — clean output, no skills, baseline build. Then we cleared the project and re-ran it with /one-shot-scripts to see what the heavy execution protocol would produce on a true blind rebuild.

The one-shot version came back gorgeous. More features, cleaner polish, more interactions. But something felt off when we put the two side-by-side.

The tell: Same colour palette. Same layout. Same control positions. The one-shot build was the vanilla build with extra features bolted on — not a fresh take.

That's not how blind rebuilds are supposed to look. A genuine fresh-session build should diverge wildly from the original. Two LLM runs starting from the same brief almost never produce the same visual design twice — that's what makes A/B testing skill variants meaningful in the first place.

💥 What Clear-Showcase Actually Did

We checked the skill. Turns out /clear-showcase was deleting exactly one thing: the source project folder under Claude Projects/<slug>/.

Everything that actually mattered for cheating was still sitting on disk: the published HTML, the demo bundle, the screenshots, and the manifest entry.

The original rationale was reasonable on paper. Leave the published artifacts alone so the live site keeps working until the new build is ready to deploy. Avoid the half-empty state where the site shows a 404 between clear and rebuild.

The reasoning made sense. The execution sabotaged the whole point of the rebuild.

📚 The Open-Book Exam Problem

Think of it like telling a chef to recreate a dish from memory while leaving the original recipe taped to the fridge behind them. It doesn't matter how good their memory is: the recipe is right there. They'll glance at it. Even if they don't mean to, they will.

A grep across the working tree during a rebuild session is going to hit those files. An agent spawned during execution will scan them looking for context. None of that requires intent to cheat — it just requires the files to exist.

We asked Claude directly: "Could you be trusted to just not look?" The honest answer was no. Not because of malice, but because "don't read this file" is the kind of rule that's easy to forget mid-task, especially when sub-agents are searching broadly without the same context as the parent.

✂️ The Deletion Trap

First instinct: just delete everything. Source folder, HTML, JSON, demos, screenshots, manifest entry. Clean slate.

But that brings back the original problem — if the artifacts are gone locally, any future push of godmode-site wipes the entry from the live site too. One absent-minded /send-it mid-rebuild and the showcase 404s for everyone visiting the live URL.

Do

Make the artifacts physically inaccessible to a rebuild session, while keeping a restore path that doesn't depend on git history or memory.

Don't

Delete files outright and hope you remember not to push the deletion before the rebuild is finished.

📦 Tarball Quarantine

The fix is a tarball that lives outside both the source project and the published site.

📂 Inventory all <slug> artifacts in godmode-site/showcase/

📦 Pack them into ~/.showcase-archive/<slug>-<timestamp>.tar.gz

✏️ Remove the slug from manifest.json

🔥 Delete the originals from godmode-site/showcase/

🧹 Delete the source folder under Claude Projects/

👁️ Rebuild blind in a fresh session

🗑️ On success, delete the tarball. On failure, restore the originals from it with tar -xf.
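The pack, manifest, delete, and restore steps above can be sketched in a few lines of Python. This is a minimal sketch, not the actual /clear-showcase implementation: the function names are hypothetical, and it assumes artifact names start with the slug and that manifest.json is a plain JSON list of slugs. Deleting the source folder under Claude Projects/ is left out.

```python
import json
import shutil
import tarfile
import time
from pathlib import Path


def quarantine(slug: str, showcase_dir: Path, archive_dir: Path) -> Path:
    """Pack all artifacts for `slug` into one .tar.gz, then delete the originals."""
    # Inventory. Assumption: artifact names start with the slug.
    artifacts = sorted(showcase_dir.glob(f"{slug}*"))
    if not artifacts:
        raise FileNotFoundError(f"no artifacts found for {slug!r}")

    # Pack into an archive that lives outside the site and the source project.
    archive_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = archive_dir / f"{slug}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for path in artifacts:
            tar.add(path, arcname=path.name)

    # Remove the slug from the manifest. Assumption: a JSON list of slugs.
    manifest = showcase_dir / "manifest.json"
    slugs = json.loads(manifest.read_text())
    manifest.write_text(json.dumps([s for s in slugs if s != slug], indent=2))

    # Delete the originals only after the archive is closed and on disk.
    for path in artifacts:
        if path.is_dir():
            shutil.rmtree(path)
        else:
            path.unlink()
    return archive


def restore(archive: Path, slug: str, showcase_dir: Path) -> None:
    """Undo a quarantine: unpack the archive and re-add the manifest entry."""
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(showcase_dir)
    manifest = showcase_dir / "manifest.json"
    slugs = json.loads(manifest.read_text())
    if slug not in slugs:
        manifest.write_text(json.dumps(slugs + [slug], indent=2))
```

The ordering is the point: nothing is deleted until the archive has been written and closed, so a crash mid-run leaves either the originals or a complete tarball, never neither.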

The quarantine path matters. It has to be somewhere a rebuild session won't grep — outside Claude Projects/, outside godmode-site/, ideally in a dotfile-style hidden directory that isn't part of any project the agent is working in.
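That "outside both directories" rule is cheap to check before packing anything. A small sketch, assuming a hypothetical helper name and that both working directories are known up front:

```python
from pathlib import Path


def is_outside(path: Path, roots: list[Path]) -> bool:
    """True if `path` is not inside any of `roots` (after resolving symlinks)."""
    resolved = path.expanduser().resolve()
    for root in roots:
        root = root.expanduser().resolve()
        if resolved == root or root in resolved.parents:
            return False
    return True
```

A call such as is_outside(Path("~/.showcase-archive"), [projects_dir, site_dir]) would gate the quarantine step, where projects_dir and site_dir stand in for wherever Claude Projects/ and godmode-site/ live on your machine.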

Why a tarball and not just a move: a moved folder is still a folder. A tarball is one opaque blob. Even if a rebuild session stumbles on the path, it can't read individual files without an explicit tar -x step — much harder to do accidentally than a stray cat or grep.
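To make the "opaque blob" claim concrete, here is a small in-memory sketch. The file name and CSS line are made-up stand-ins for a published page. A plain byte search, which is roughly what a stray grep does, finds the marker in the raw file but is not expected to find it in the compressed archive.

```python
import io
import tarfile


def pack_in_memory(name: str, content: bytes) -> bytes:
    """Return the bytes of a .tar.gz holding a single file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        info = tarfile.TarInfo(name)
        info.size = len(content)
        tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()


def read_member(blob: bytes, name: str) -> bytes:
    """Recover one file from the archive, which takes an explicit unpack step."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
        return tar.extractfile(name).read()


# Hypothetical page content, standing in for a published showcase page.
page = b"<style>:root { --accent: #ff3366; }</style>\n" * 20
blob = pack_in_memory("music-vis.html", page)

print(b"--accent" in page)  # search over the plain file
print(b"--accent" in blob)  # same search over the compressed archive
print(read_member(blob, "music-vis.html") == page)  # explicit unpack recovers it
```

This only blocks accidental reads. A tool that decompresses on purpose, such as zgrep, can still look inside, which is why the archive location matters as well.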

[Rollback demo: ~/.showcase-archive/run-042.tar.gz is sealed and opaque to Claude. showcase/run-042/ stays empty until the archive is restored.]

🛡️ What Changes For Future Showcases

OLD (v1) · FILES LEFT VISIBLE
Files visible on disk (/showcase/<slug>/*.html, .png, .json) → Claude reads and reverse-engineers them (grep, cat, find all hit prior art) → Output mirrors the original (same palette, same layout, same controls) → CHEATED
Why it failed: Old /clear-showcase deleted only the source folder. Published HTML, demos, and screenshots stayed on disk, readable to any sub-agent grep. The "blind" rebuild was an open-book exam.

NEW (v2) · TARBALL QUARANTINE
Files tarred into an opaque blob (~/.showcase-archive/<slug>.tar.gz) → Claude's grep finds nothing (no individual files to read in the working tree) → Fresh attempt diverges from the original (measurable structural delta, independent design) → BLIND
Why it works: A tarball is one opaque blob, not a folder of readable files. Even if a sub-agent finds the path, it can't cat individual artifacts without an explicit tar -x, which is far harder to do accidentally than a stray grep.

| Step | Old /clear-showcase | New /clear-showcase |
| --- | --- | --- |
| Source folder | Deleted | Deleted |
| Published HTML | Kept | Quarantined |
| Demo bundle | Kept | Quarantined |
| Screenshots | Kept | Quarantined |
| Manifest entry | Kept | Removed |
| Restore path | None needed | tar -x from archive |
| Live site impact | None until push | None until push |
| Cheating risk | High | Near zero |

The lesson isn't "Claude is sneaky." The lesson is that any blind-rebuild experiment depends entirely on what files are physically present in the working tree. If the previous build is on disk, the rebuild isn't blind — full stop. The harness can't be trusted to look away from something that's right there.
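One way to act on that is a pre-flight check that refuses to start a rebuild while prior art is still readable. The sketch below uses hypothetical function names and only matches file and folder names that contain the slug. It does not inspect file contents, so treat it as a floor rather than a guarantee.

```python
import os
from pathlib import Path


def find_prior_art(slug: str, roots: list[Path]) -> list[Path]:
    """List every file or folder under `roots` whose name mentions `slug`."""
    hits = []
    for root in roots:
        for dirpath, dirnames, filenames in os.walk(root):
            for name in dirnames + filenames:
                if slug in name:
                    hits.append(Path(dirpath) / name)
    return sorted(hits)


def assert_blind(slug: str, roots: list[Path]) -> None:
    """Raise if anything named after `slug` is still on disk under `roots`."""
    hits = find_prior_art(slug, roots)
    if hits:
        listing = "\n".join(f"  {p}" for p in hits)
        raise RuntimeError(f"rebuild is not blind, found:\n{listing}")
```

Run it against every directory the rebuild session can reach. The archive directory stays out of the list by design.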

Every future A/B comparison between skill variants on getgodmode.dev will run through quarantine first. The vanilla-vs-skills runs only mean something if both runs start from genuinely zero context.

ORIGINAL · run-042
slug: music-vis
composite: 0.91
build time: 14m 22s
artifacts: page · 4 demos · 6 png
status: archived · tar.gz
1 prior art · sealed

BLIND REBUILD · run-042 · v2
slug: music-vis
composite: 0.94
build time: 11m 08s
structural overlap: 73%
status: independent attempt
27% divergence · seal worked

See The Showdowns That Actually Were Blind

Skill variants compared head-to-head, with the originals quarantined first.
