The Research Phase: How One-Shot Learned to Look Before Building
♚ The task: Build an advanced 3D chess game. One-Shot produced blobby, amateur pieces — twice.
🔍 The root cause: No research phase. The protocol jumped straight from planning to building from imagination.
🔧 The fix: Phase 1c — Deep Research. Web search for existing implementations, real assets, and API verification before any code is written.
✅ The result: Real 3D Staunton models with 1,078-vertex horse heads instead of hand-crafted box shapes. New phase now runs on every task.
The Task
"Make an advanced 3D chess game." Simple enough prompt. One-Shot Scripts spun up, planned the architecture (Three.js + chess.js), and started building.
The result was a fully functional chess game with AI, move validation, castling, en passant — the works. There was just one problem.
The pieces looked terrible. Smooth clay lumps on sticks. The knight was a box with a box on top. The bishop was a pointed blob. Two full rebuild iterations and they were still amateur-looking.
What Went Wrong
One-Shot's execution was flawless on every dimension except visual quality. The game logic was correct. The AI played well. The UI was polished. But the 3D pieces were generated from imagination — hand-crafting LatheGeometry profiles from memory.
Think of it like an architect designing a house without ever looking at a blueprint. They know what a house should look like, but the proportions are off, the details are wrong, and the result screams "I guessed this."
📋 Phase 1: Plan the architecture (Three.js + chess.js)
↓
🏗️ Phase 2: Hand-craft LatheGeometry profiles from memory
↓
👀 Result: Blobby, featureless shapes
↓
🔄 Loop: Rebuild with "more detail" — still blobby
↓
❌ User verdict: "Looks terrible"
The protocol had no mechanism to say: "Before you build this, go look at how other people did it." It went straight from plan to code.
The Fix That Changed Everything
After the second rejection, we did what should have happened from the start: searched the web. Within minutes we found an open-source repository with real 3D Staunton chess piece models — 688 vertices for a king, 1,078 for a knight with a proper horse head.
Before research
Hand-crafted LatheGeometry with ~35 points per piece. Knight = box + box + cone. 2 rebuild iterations, both rejected.
After research
Real Staunton models loaded from JSON. Knight = 1,078-vertex sculpted horse head. Accepted immediately.
The gap wasn't effort. One-Shot spent more tokens on the hand-crafted approach (two full rebuilds). The gap was reference material.
The best code is code you don't write. A well-modeled 3D horse head from an open-source project will always beat box shapes assembled from imagination. The skill just needed to be told to look for it first.
Phase 1c: Deep Research
We added a new mandatory phase to One-Shot Scripts that runs after planning (1b) and before building (2). It's not just for visual tasks — it fires on every non-trivial task.
Existing Implementations
Search GitHub, blogs, and forums for how others solved the same problem. Use existing solutions instead of reinventing.
API Verification
Verify every library's current API against live docs. Don't trust memory for version-specific details.
Asset Discovery
Search for open-source models, icons, fonts, and templates. Prefer real assets over procedural generation.
Domain Knowledge
Verify domain rules and edge cases from authoritative sources. Check what you think you know.
The phase produces a Research Brief — a structured document listing findings, assets to use, API gotchas, and plan adjustments. If research reveals a better approach, the plan gets updated before any code is written.
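The brief itself is just structured data. A minimal sketch, with field names that are illustrative rather than the skill's actual schema:

```javascript
// Illustrative Research Brief, as Phase 1c might emit it for the chess task.
const researchBrief = {
  findings: [
    { topic: 'chess.js', note: 'v1+ ships as an ES module; a plain script tag fails silently' },
  ],
  assets: [
    { name: 'Staunton piece models', source: 'open-source repo', format: 'JSON vertices' },
  ],
  gotchas: ['OrbitControls lives in the Three.js examples, not the core bundle'],
  planUpdates: ['Load real models instead of hand-crafting LatheGeometry'],
};

// Plan adjustments happen before any code is written.
const mustReplan = researchBrief.planUpdates.length > 0;
```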
What This Caught (In This Task Alone)
The chess game had multiple issues that research would have prevented from the start:
| Issue | Without Research | With Research |
|---|---|---|
| chess.js module format | Loaded as script tag — blank screen, no error | Verified ES module format, used import map |
| Chess piece geometry | Hand-crafted blobs, 2 failed iterations | Found open-source Staunton models, loaded directly |
| Three.js OrbitControls | Wrote a buggy inline version from scratch | Imported official OrbitControls from Three.js examples |
| Piece proportions | Guessed ratios, pieces looked wrong | Real Staunton proportions: king height 3.0, base radius 0.75 |
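The chess.js fix from the first row, for instance, is a standard import map. This is a sketch of the mechanism; the CDN URL is a placeholder, not the one One-Shot used:

```html
<!-- chess.js v1+ is ESM-only: a plain <script src> tag defines nothing and
     fails silently. An import map lets module code import the bare specifier. -->
<script type="importmap">
{
  "imports": {
    "chess.js": "https://example-cdn.test/chess.js/dist/esm/chess.js"
  }
}
</script>
<script type="module">
  import { Chess } from 'chess.js';
  const game = new Chess();
</script>
```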
Every single issue would have been caught by a 5-minute web search. Instead, they cost two full rebuild loops.
Why This Applies to Every Task
The chess game made the problem obvious because 3D models are visual — you can see when they're wrong. But the same blind spot exists in every task type.
Code Features
An NPM package already does what you're building. Research finds it. No research means reinventing it worse.
Bugfixes
The bug is a known issue with a documented workaround. Research finds the GitHub issue. No research means debugging from scratch.
Writing
Factual claims need verification. API docs need current syntax. Research confirms. No research means shipping wrong information.
Architecture
The pattern you're designing has a well-known name and established best practices. Research finds the prior art. No research means reinventing the pattern badly.
Think of it like cooking. A great chef doesn't invent every dish from scratch — they study existing recipes, taste reference dishes, and then create their version. One-Shot was trying to cook dishes it had never tasted.
How It Works in Practice
Phase 1c spawns research agents in parallel — one per knowledge gap identified during planning. Each agent searches, evaluates, and returns findings. The orchestrator synthesizes everything into a Research Brief.
📋 Phase 1b: Planning identifies knowledge gaps
↓
🔍 Agent 1: Search for existing implementations on GitHub
🔍 Agent 2: Verify library APIs against live docs
🔍 Agent 3: Find open-source assets to use directly
↓
📋 Research Brief: findings, assets, gotchas, plan updates
↓
🏗️ Phase 2: Build with real reference material
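As a sketch, the fan-out shape is a parallel map over the gaps. The agent internals are stubbed here; in the real skill each one performs live web searches:

```javascript
// A stubbed "agent": stands in for a search-capable subagent.
async function researchAgent(gap) {
  return { gap, finding: `(search results for: ${gap})` };
}

// Phase 1c: one agent per knowledge gap, all launched in parallel.
async function deepResearch(gaps) {
  const findings = await Promise.all(gaps.map(researchAgent));
  // The orchestrator would synthesize these into the Research Brief.
  return { findings, planUpdates: [] };
}

deepResearch(['existing implementations', 'API verification', 'assets'])
  .then((brief) => console.log(brief.findings.length)); // prints 3
```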
Research depth scales with task complexity. A one-line config change gets a quick API check. A greenfield 3D game gets multiple search agents running in parallel.
The Scoring Change
We also added Deep Research as a scored dimension in the assessment rubric. Weight: 0.10 (10% of the composite score).
If One-Shot skips research and builds from memory, it takes a scoring hit. If it finds real assets, verifies APIs, and updates its plan based on findings, it scores high. The rubric now enforces the behaviour.
| Score | What it means |
|---|---|
| 0.0 | No research done. Built entirely from memory. |
| 0.5 | Partial research. Checked docs but didn't search for existing solutions. |
| 0.85 | Good research. Verified APIs, searched for implementations, produced a brief. |
| 1.0 | Full research. Found and integrated existing assets. Plan updated based on findings. Every assumption verified. |
The Takeaway
One-Shot Scripts already had 8 phases covering everything from planning to testing to security hardening. It was missing the most fundamental one: looking at what already exists before building something new.
Phase 1c is now baked into every One-Shot execution. It's the difference between building from imagination and building from knowledge.
One-Shot Scripts — Now With Deep Research
The execution protocol that researches, builds, tests, and iterates until perfect. One prompt. Zero rework.
See One-Shot Scripts · View Pricing