Juho Choi

Project bootstrap

Engineering · May 10, 2026 · 5 min read
#ai-ml #agent-harness #patterns

Start a project with a deliberately minimal harness — interactive USER gates, docs-first, no autonomous loop — and graduate to autonomy only after customer evidence (personas, interviews, simulation reports) appears.

When to use

  • A brand-new agentic project with no existing codebase and no customer data.
  • Solo or small team where the user holds the domain knowledge and the AI handles implementation.
  • The project will run long enough that docs (architecture, ADRs) get re-read by future sessions.
  • You want the AI to implement, not author the spec.

When not to use

  • Retrofitting an existing codebase — docs already exist or the code is the source of truth; the docs-creation step doesn't fit.
  • A one-off script or short experiment — boilerplate overhead exceeds value.
  • You already have a complete written spec — skip docs-creation, go straight to per-feature planning.
  • Mature project with persona / customer-evidence inputs — switch to an autonomous-loop pattern with a critic agent.

Context

A vague product idea handed to an agent fails in two ways: endless clarifying questions that stall, or silent invention that ships code missing the point. The bootstrap shape forces a documented intermediate step — the spec discussion happens once, gets crystallized into a fixed set of docs, and every downstream phase session reads those docs instead of re-discussing.

The "Stage 0" qualifier is deliberate. Mature harness patterns (autonomous loops, persona simulation, evidence-based critic agents) need inputs that do not exist at project start. Introduced too early, a critic agent either rejects everything (it has no evidence to evaluate against) or rubber-stamps everything (it abandons its own rule). USER gates are the only honest substitute until customer evidence accumulates.

Pattern

Three structural decisions:

1. Docs are the first AI-generated artifact. Before any feature code, the user discusses the product in an interactive session, then triggers a docs-creation prompt. The AI writes a fixed set (e.g. PRD, user flow, data schema, code architecture, ADRs) that becomes the single source of truth for everything downstream.

2. USER replaces the critic agent at three gates. The build skill pauses for user input at clarification, plan approval, and task-creation approval. No agent autonomously approves its own plans. The cost is latency; the benefit is preventing intent-drift in the absence of customer evidence.

3. Phase isolation in ephemeral sessions. Once a task is approved, an external runner spawns one fresh claude -p process per phase, sequentially. Each phase reads the docs and the previous phases' outputs from disk; no long-lived session accumulates context across phases.
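The three gates in decision 2 can be sketched as a small sequential check. This is an illustration only, not the template's actual skill: `GATES`, `run_gates`, and the `ask` callback are hypothetical names, and a real skill would pause inside the interactive session rather than call a Python function.

```python
from typing import Callable

# Gate names follow the text: clarification, plan approval, task-creation approval.
GATES = ("clarify", "plan", "task-create")


def run_gates(ask: Callable[[str], str]) -> bool:
    """Return True only if the user approves every gate, in order.

    `ask` is a hypothetical callback that shows a prompt and returns the
    user's reply. Nothing in this function approves on the user's behalf.
    """
    for gate in GATES:
        reply = ask(f"[{gate}] approve? (y/n) ").strip().lower()
        if reply not in ("y", "yes"):
            print(f"stopped at gate: {gate}")
            return False  # later gates are never reached
    return True
```

Passing the built-in `input` as `ask` makes the terminal user the approver. The point of the shape is that the only path to `True` runs through three explicit human replies.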

Minimal file shape:

prompts/
  docs-create.md          # guidance for the foundational docs
  task-create.md          # guidance for task / phase files
.claude/skills/
  plan-and-build/         # docs → discuss → plan → run phases
  commit/                 # disciplined commit playbook
scripts/
  run-phases.py           # one ephemeral claude -p per phase
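Because the pattern depends on this boilerplate existing before the first session, a preflight check can be useful. The sketch below assumes exactly the paths listed in the file shape above; `REQUIRED` and `missing_paths` are hypothetical names and not part of the template.

```python
from pathlib import Path

# Paths copied from the file shape above; adjust if your layout differs.
REQUIRED = (
    "prompts/docs-create.md",
    "prompts/task-create.md",
    ".claude/skills/plan-and-build",
    ".claude/skills/commit",
    "scripts/run-phases.py",
)


def missing_paths(root: str = ".") -> list[str]:
    """Return the required paths that do not exist under `root`."""
    base = Path(root)
    return [p for p in REQUIRED if not (base / p).exists()]
```

An empty list means the minimal shape is in place. Anything else names what still has to be copied in.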

User flow:

claude                                              # interactive
  ↓ user describes product
  ↓ user invokes prompts/docs-create.md
docs/{prd, flow, data-schema, code-architecture, adr}.md
  ↓ user invokes plan-and-build skill
plan-and-build  (3 user gates: clarify → plan → task-create)
  ↓
tasks/N-name/{index.json, phase{0..K}.md}
  ↓ scripts/run-phases.py N-name
phase 0 → phase 1 → ... → phase K                  # each in its own claude -p
  ↓
async notification (Discord or equivalent)
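The last steps of the flow, one fresh process per phase in order, can be sketched as follows. This is a minimal sketch under stated assumptions, not the template's `run-phases.py`: it assumes phase files are named `phase0.md` through `phaseK.md` as in the flow above, that each file's text is passed as the prompt argument to `claude -p`, and that the run stops at the first failing phase. `phase_files` and `run_phases` are hypothetical names, and `index.json` is not read because its schema is not described here.

```python
import re
import subprocess
from pathlib import Path
from typing import Callable, Sequence


def phase_files(task_dir: Path) -> list[Path]:
    """Return phase0.md, phase1.md, ... sorted by phase number."""
    found = []
    for path in task_dir.glob("phase*.md"):
        match = re.fullmatch(r"phase(\d+)\.md", path.name)
        if match:
            found.append((int(match.group(1)), path))
    return [path for _, path in sorted(found)]


def run_phases(
    task_dir: Path,
    command: Sequence[str] = ("claude", "-p"),
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> int:
    """Run each phase in its own process; return how many phases succeeded.

    Assumption: the phase file's text is the prompt argument. Every call
    starts a new process, so nothing carries over in memory and phases
    communicate only through files on disk.
    """
    done = 0
    for path in phase_files(task_dir):
        prompt = path.read_text(encoding="utf-8")
        result = run([*command, prompt])
        if result.returncode != 0:
            break  # stop here; later phases depend on this one
        done += 1
    return done
```

Sorting by the parsed number rather than by filename keeps `phase10.md` after `phase2.md`. The `run` parameter exists only so the loop can be exercised without invoking the real CLI.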

Trade-offs

  • Setup cost. You cannot just run claude and start coding; the boilerplate must exist first. This is mitigated by publishing it once as a GitHub template repo, so cloning is one command.
  • Three blocking gates per task. Each plan-and-build invocation pauses for a user response three times. That is fine for solo synchronous work but adds latency for async teams.
  • Doc-code drift risk. Docs are authoritative only if maintained. Phase 0 of every task is a docs-update step that catches most drift, but stale ADRs can sneak through if not pruned.
  • Five docs is opinionated. PRD + flow + data-schema + code-architecture + ADR is a default, not a prescription. Tiny projects may only need PRD + ADR; complex ones may need more. Adjust the docs-creation prompt.
  • Async notification is a hard prerequisite for unattended runs. Without Discord or equivalent, the user must watch the terminal during phase execution (which can take hours). Set this up before relying on the pattern for anything longer than a single short task.
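For the notification prerequisite, a Discord-style webhook needs only a JSON POST with a `content` field. The sketch below uses the standard library; `build_request`, `notify`, the `send` parameter, and the User-Agent string are hypothetical choices for illustration, and the webhook URL is whatever your own channel provides.

```python
import json
import urllib.request
from typing import Callable


def build_request(webhook_url: str, message: str) -> urllib.request.Request:
    """Build a JSON POST for a Discord-style webhook (body: {"content": ...})."""
    # Discord caps message content at 2000 characters, so truncate defensively.
    body = json.dumps({"content": message[:2000]}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Explicit agent: the default urllib agent may be rejected by some hosts.
            "User-Agent": "phase-runner/0.1",
        },
        method="POST",
    )


def notify(
    webhook_url: str,
    message: str,
    send: Callable = urllib.request.urlopen,
) -> None:
    """Send the message; `send` is injectable so this can be tested offline."""
    send(build_request(webhook_url, message), timeout=10).close()
```

A runner would typically call `notify` once when all phases finish and once on the first failure, so the user only needs to return to the terminal when something actually requires attention.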

Example

claude-harness-template is a concrete Stage 0 boilerplate implementing this pattern. Nine files. Published as a GitHub template repo:

gh repo create new-project \
  --template jh-choi98/claude-harness-template \
  --clone

The template was extracted from a longer-running internal harness (argos) by reading its git log to identify what existed at MVP era versus what was added later for Stage 2+ features (persona simulation, autonomous loops, read-only critic agents). The bootstrap shape was the residue after stripping the Stage 2+ assets that depend on customer evidence.

Related entries: none yet. Future entries in this category will likely cover the Stage 2 graduation (persona-driven ideation, autonomous loops with rollback, read-only critic agents) and per-task git workflow choices (main-direct vs. feature branch).