The Research Phase: How One-Shot Learned to Look Before Building
♚ The task: Build an advanced 3D chess game. One-Shot produced blobby, amateur pieces — twice.
🔍 The root cause: No research phase. The protocol jumped straight from planning to building from imagination.
🔧 The fix: Phase 1c — Deep Research. Web search for existing implementations, real assets, and API verification before any code is written.
✅ The result: Real 3D Staunton models with 1,078-vertex horse heads instead of hand-crafted box shapes. New phase now runs on every task.
The Task
"Make an advanced 3D chess game." Simple enough prompt. One-Shot Scripts spun up, planned the architecture (Three.js + chess.js), and started building.
The result was a fully functional chess game with AI, move validation, castling, en passant — the works. There was just one problem.
The pieces looked terrible. Smooth clay lumps on sticks. The knight was a box with a box on top. The bishop was a pointed blob. Two full rebuild iterations and they were still amateur-looking.
What Went Wrong
One-Shot's execution was flawless on every dimension except visual quality. The game logic was correct. The AI played well. The UI was polished. But the 3D pieces were generated from imagination — hand-crafting LatheGeometry profiles from memory.
Think of it like an architect designing a house without ever looking at a blueprint. They know what a house should look like, but the proportions are off, the details are wrong, and the result screams "I guessed this."
📋 Phase 1: Plan the architecture (Three.js + chess.js)
↓
🏗️ Phase 2: Hand-craft LatheGeometry profiles from memory
↓
👀 Result: Blobby, featureless shapes
↓
🔄 Loop: Rebuild with "more detail" — still blobby
↓
❌ User verdict: "Looks terrible"
The protocol had no mechanism to say: "Before you build this, go look at how other people did it." It went straight from plan to code.
The Fix That Changed Everything
After the second rejection, we did what should have happened from the start: searched the web. Within minutes we found an open-source repository with real 3D Staunton chess piece models — 688 vertices for a king, 1,078 for a knight with a proper horse head.
Before research
Hand-crafted LatheGeometry with ~35 points per piece. Knight = box + box + cone. 2 rebuild iterations, both rejected.
After research
Real Staunton models loaded from JSON. Knight = 1,078-vertex sculpted horse head. Accepted immediately.
The gap wasn't effort. One-Shot spent more tokens on the hand-crafted approach (two full rebuilds). The gap was reference material.
The best code is code you don't write. A well-modeled 3D horse head from an open-source project will always beat box shapes assembled from imagination. The skill just needed to be told to look for it first.
Phase 1c: Deep Research
We added a new mandatory phase to One-Shot Scripts that runs after planning (1b) and before building (2). It's not just for visual tasks — it fires on every non-trivial task.
Existing Implementations
Search GitHub, blogs, and forums for how others solved the same problem. Use existing solutions instead of reinventing.
API Verification
Verify every library's current API against live docs. Don't trust memory for version-specific details.
Asset Discovery
Search for open-source models, icons, fonts, and templates. Prefer real assets over procedural generation.
Domain Knowledge
Verify domain rules and edge cases from authoritative sources. Check what you think you know.
The phase produces a Research Brief — a structured document listing findings, assets to use, API gotchas, and plan adjustments. If research reveals a better approach, the plan gets updated before any code is written.
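The brief itself is just structured data. A minimal sketch, with field names that are illustrative rather than the skill's actual schema:

```javascript
// Illustrative Research Brief, as Phase 1c might emit it for the chess task.
const researchBrief = {
  findings: [
    { topic: 'chess.js', note: 'v1+ ships as an ES module; a plain script tag fails silently' },
  ],
  assets: [
    { name: 'Staunton piece models', source: 'open-source repo', format: 'JSON vertices' },
  ],
  gotchas: ['OrbitControls lives in the Three.js examples, not the core bundle'],
  planUpdates: ['Load real models instead of hand-crafting LatheGeometry'],
};

// Plan adjustments happen before any code is written.
const mustReplan = researchBrief.planUpdates.length > 0;
```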
What This Caught (In This Task Alone)
The chess game had multiple issues that research would have prevented from the start:
| Issue | Without Research | With Research |
|---|---|---|
| chess.js module format | Loaded as script tag — blank screen, no error | Verified ES module format, used import map |
| Chess piece geometry | Hand-crafted blobs, 2 failed iterations | Found open-source Staunton models, loaded directly |
| Three.js OrbitControls | Wrote a buggy inline version from scratch | Imported official OrbitControls from Three.js examples |
| Piece proportions | Guessed ratios, pieces looked wrong | Real Staunton proportions: king height 3.0, base radius 0.75 |
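The chess.js fix from the first row, for instance, is a standard import map. This is a sketch of the mechanism; the CDN URL is a placeholder, not the one One-Shot used:

```html
<!-- chess.js v1+ is ESM-only: a plain <script src> tag defines nothing and
     fails silently. An import map lets module code import the bare specifier. -->
<script type="importmap">
{
  "imports": {
    "chess.js": "https://example-cdn.test/chess.js/dist/esm/chess.js"
  }
}
</script>
<script type="module">
  import { Chess } from 'chess.js';
  const game = new Chess();
</script>
```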
Every single issue would have been caught by a 5-minute web search. Instead, they cost two full rebuild loops.
Why This Applies to Every Task
The chess game made the problem obvious because 3D models are visual — you can see when they're wrong. But the same blind spot exists in every task type.
Code Features
An NPM package already does what you're building. Research finds it. No research means reinventing it worse.
Bugfixes
The bug is a known issue with a documented workaround. Research finds the GitHub issue. No research means debugging from scratch.
Writing
Factual claims need verification. API docs need current syntax. Research confirms. No research means shipping wrong information.
Architecture
The pattern you're designing has a well-known name and established best practices. Research finds the prior art. No research means reinventing the pattern badly.
Think of it like cooking. A great chef doesn't invent every dish from scratch — they study existing recipes, taste reference dishes, and then create their version. One-Shot was trying to cook dishes it had never tasted.
How It Works in Practice
Phase 1c spawns research agents in parallel — one per knowledge gap identified during planning. Each agent searches, evaluates, and returns findings. The orchestrator synthesizes everything into a Research Brief.
📋 Phase 1b: Planning identifies knowledge gaps
↓
🔍 Agent 1: Search for existing implementations on GitHub
🔍 Agent 2: Verify library APIs against live docs
🔍 Agent 3: Find open-source assets to use directly
↓
📋 Research Brief: findings, assets, gotchas, plan updates
↓
🏗️ Phase 2: Build with real reference material
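As a sketch, the fan-out shape is a parallel map over the gaps. The agent internals are stubbed here; in the real skill each one performs live web searches:

```javascript
// A stubbed "agent": stands in for a search-capable subagent.
async function researchAgent(gap) {
  return { gap, finding: `(search results for: ${gap})` };
}

// Phase 1c: one agent per knowledge gap, all launched in parallel.
async function deepResearch(gaps) {
  const findings = await Promise.all(gaps.map(researchAgent));
  // The orchestrator would synthesize these into the Research Brief.
  return { findings, planUpdates: [] };
}

deepResearch(['existing implementations', 'API verification', 'assets'])
  .then((brief) => console.log(brief.findings.length)); // prints 3
```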
Research depth scales with task complexity. A one-line config change gets a quick API check. A greenfield 3D game gets multiple search agents running in parallel.
The Scoring Change
We also added Deep Research as a scored dimension in the assessment rubric. Weight: 0.10 (10% of the composite score).
If One-Shot skips research and builds from memory, it takes a scoring hit. If it finds real assets, verifies APIs, and updates its plan based on findings, it scores high. The rubric now enforces the behaviour.
| Score | What it means |
|---|---|
| 0.0 | No research done. Built entirely from memory. |
| 0.5 | Partial research. Checked docs but didn't search for existing solutions. |
| 0.85 | Good research. Verified APIs, searched for implementations, produced a brief. |
| 1.0 | Full research. Found and integrated existing assets. Plan updated based on findings. Every assumption verified. |
The Takeaway
One-Shot Scripts already had 8 phases covering everything from planning to testing to security hardening. It was missing the most fundamental one: looking at what already exists before building something new.
Phase 1c is now baked into every One-Shot execution. It's the difference between building from imagination and building from knowledge.
One-Shot Scripts — Now With Deep Research
The execution protocol that researches, builds, tests, and iterates until perfect. One prompt. Zero rework.
See One-Shot Scripts · View Pricing