Sources#
Summary#
Matt Pocock's grill-me skill is a relentless-interviewer prompt: it walks down decision-tree branches one question at a time and offers a recommended answer for each. It replaces "ask the agent for a plan" with "reach a shared understanding before any plan exists." The point is alignment, not output. The goal state is what Frederick P. Brooks calls the design concept in The Design of Design: a shared idea held by all participants in the work. A PRD or a plan is downstream of the design concept; producing one before alignment exists all but guarantees rework.
The skill (verbatim)#
"Interview me relentlessly about every aspect of this plan until we reach a shared understanding. Walk down each branch of the decision tree, resolving dependencies one by one. For each question provide your recommended answer. Ask the questions one at a time…"
That's it. The skill is short on purpose — minimum surface area, maximum behavior change.
Why grill before plan#
Pocock observed that agents in plan mode "really eagerly try to produce a plan": they declare "I think I've got enough" and ship a plan that papers over open questions. The plan reads fine, but it is wrong in ways that don't surface until implementation. Forcing the agent to interview first exposes the open questions while they are still cheap to answer.
The recommendation-with-each-question pattern is load-bearing: it lets the user say "yes, agreed" most of the time, only debating where they actually disagree. A pure question-only interview wastes user attention on obvious calls.
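The interview loop the prompt describes can be sketched in a few lines. This is an illustrative model only, not Pocock's implementation (the skill is a prompt, not code): the `Question` type, the `grill` function, and the sample questions are all hypothetical. It shows the two mechanics the text relies on: questions are asked one at a time in dependency order, and each carries a recommendation the user can accept by replying with nothing.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter
from typing import Callable


@dataclass(frozen=True)
class Question:
    """One node in the decision tree (hypothetical structure)."""
    key: str
    text: str
    recommended: str
    depends_on: tuple[str, ...] = ()


def grill(questions: list[Question], ask: Callable[[Question], str]) -> dict[str, str]:
    """Ask questions one at a time, dependencies before dependents.

    `ask` returns the user's reply; an empty reply means
    "yes, agreed" and the recommended answer is recorded.
    """
    by_key = {q.key: q for q in questions}
    graph = {q.key: set(q.depends_on) for q in questions}
    decisions: dict[str, str] = {}
    for key in TopologicalSorter(graph).static_order():
        q = by_key[key]
        reply = ask(q).strip()
        decisions[key] = reply or q.recommended
    return decisions


# Hypothetical sample questions; "cache" cannot be settled before "storage".
questions = [
    Question("cache", "Cache sessions in memory or Redis?", "memory", depends_on=("storage",)),
    Question("storage", "Where do sessions live?", "Postgres"),
]
# The user disagrees only on storage and accepts the other recommendation.
replies = {"storage": "SQLite", "cache": ""}
decisions = grill(questions, lambda q: replies[q.key])
print(decisions)
```

In this model the user's attention is spent only on the `storage` question, where they overrode the recommendation; the `cache` decision is settled by an empty reply.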
Counter to the "specs to code" movement#
Pocock's strongest negative thesis: specs-to-code is vibe coding by another name. Defenders say "write a careful spec, hand to AI, fix the spec when the code is wrong, never look at the code." Pocock has tried it and reports that it doesn't work.
Reasons:
- The code is the battleground, not the spec
- Specs that don't engage with code degrade into wish-lists
- The feedback loop runs through an extra layer (spec ⇄ AI ⇄ code) instead of where the bugs actually live (code ⇄ tests)
- Without code engagement, the developer's mental model of the system rots
Grilling follows the opposite discipline: the spec is downstream of alignment, alignment is upstream of any artifact, and the developer keeps a hand in the code throughout.
Outputs of a grilling session#
A grilling session can run anywhere from 10 to 100 questions; Pocock has had sessions that went an hour. The artifact at the end is the conversation history itself — kept around as raw material for the PRD step. Pocock's write-a-PRD skill consumes this history (along with another short interview) to produce a destination document.
He explicitly does not review the PRD afterwards:
"What am I testing at this point? What are the failure modes I'm trying to test for? I know that LLMs are great at summarization. I have reached the same wavelength as the LLM. So all I'm doing is checking the LLM's ability to summarize."
This is only safe because the grilling session did the alignment work. Skip grilling and you must read the PRD.
Two essential documents#
After grilling, Pocock generates exactly two documents:
- PRD (destination doc) — what the finished thing looks like, user stories, definition of done, out-of-scope list, implementation decisions, testing decisions, modules to be modified
- Kanban (journey doc) — vertical slices into independently grabbable tickets (see Vertical Slice Tracer Bullets)
He then deletes (or closes) the PRD after implementation completes — see doc rot.
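As a rough data model, the two documents might look like the following. The field names mirror the list above, but the classes, the `blocked_by` field, and the `grabbable` helper are assumptions made for illustration; the source does not specify a schema.

```python
from dataclasses import dataclass, field


@dataclass
class PRD:
    """Destination doc: what the finished thing looks like."""
    user_stories: list[str]
    definition_of_done: list[str]
    out_of_scope: list[str]
    implementation_decisions: list[str]
    testing_decisions: list[str]
    modules_to_be_modified: list[str]


@dataclass
class Ticket:
    """Journey doc entry: one vertical slice (hypothetical shape)."""
    id: str
    title: str
    blocked_by: set[str] = field(default_factory=set)
    done: bool = False


def grabbable(board: list[Ticket]) -> list[Ticket]:
    """Tickets that can be picked up now: not done, all blockers done."""
    finished = {t.id for t in board if t.done}
    return [t for t in board if not t.done and t.blocked_by <= finished]


# Hypothetical board with three slices, the first already finished.
board = [
    Ticket("T1", "Log in with a hard-coded user, end to end", done=True),
    Ticket("T2", "Persist sessions", blocked_by={"T1"}),
    Ticket("T3", "Password reset email", blocked_by={"T2"}),
]
print([t.id for t in grabbable(board)])
```

Modeling blockers explicitly is one way to express "independently grabbable": a ticket is available as soon as everything it depends on is done, regardless of the order in which tickets were written.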
Module map appears in the PRD#
The PRD includes "modules to be modified" — concrete identification of which existing modules change and which new ones are introduced. This connects planning to architecture (see Deep Modules for Agents). The point is to keep the codebase shape in mind throughout planning, not as an afterthought during implementation.
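One way to make the module map concrete is to check the PRD's module list against what already exists in the repository. The sketch below assumes modules are identified by path strings; `split_module_map` and the example paths are hypothetical.

```python
def split_module_map(planned: list[str], existing: set[str]) -> dict[str, list[str]]:
    """Separate modules the PRD changes from modules it introduces."""
    return {
        "modified": sorted(m for m in planned if m in existing),
        "new": sorted(m for m in planned if m not in existing),
    }


# Hypothetical repository contents and PRD module list.
existing_modules = {"auth/session.py", "auth/tokens.py", "billing/invoice.py"}
planned_modules = ["auth/session.py", "auth/password_reset.py"]

module_map = split_module_map(planned_modules, existing_modules)
print(module_map)
```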
When to skip grilling#
Grilling is for human-in-the-loop tasks. For a short well-scoped change ("rename this function across the codebase"), the overhead is wasted. The discipline scales with stakes: bigger feature, fuzzier brief, higher cost of going the wrong way → grill harder.
Connections#
- Matt Pocock — author of the skill
- Vertical Slice Tracer Bullets — the Kanban that follows the PRD
- Deep Modules for Agents — module map in PRD ties planning to architecture
- Agent Loop Pattern — grilling sits at the human-in-loop top of the funnel; loop drains the AFK bottom
- Context Window Smart Zone — grilling uses sub-agents to keep parent context small
- Agent Harness Engineering — "enforce invariants" at the planning layer is "reach alignment before any plan"
- Claude Code Best Practices — the explore→plan→code workflow has the same shape; grill-me is the more aggressive variant of the "explore" step
- Interaction Models — grilling is collaborative real-time iteration; turn-based interfaces are precisely what makes it clunky today, and an interaction model is the substrate that would make grilling-style collaboration feel native
- HTML as the New Markdown — brainstorm → let Claude interview you → plan is the grilling shape; Thariq's HTML plan is a richer destination artifact than a markdown PRD, traded against harder versioning
- Agentic Technical Debt — grilling produces the design concept that goes into CLAUDE.md; the strongest upstream defense against debt-by-session-re-derivation
- Zero-Friction Scope Creep — a strong design concept reached via grilling resists scope sprawl in a way written PRDs alone often don't
- Evals as Product Spec — grilling produces the design concept; evals encode whether it was achieved. Matt's "verification loops" and Cat's "ten great evals" are the same primitive at planning's other end
- Building Is Cheap, Arguing Is Expensive: a productive tension. Fiona Fung's "generate three PRs and compare" relocates design into built artifacts; it is reconciled here by treating the prototype as the medium of the design concept, not a replacement for reaching one
Open questions#
- Can grilling be run AFK against another agent that holds the user's preferences? Pocock's answer in 2026 is "no, this part has to be human-in-the-loop" — but the question is open as agents get better at modeling their principal.
- How does grilling change for team work where multiple humans need to align? Pocock's hint: pair-program with the agent in the room, treat it as a third interlocutor.
Derived#
- The PRD-Replacement Spectrum at AI-Native Speed — the left pole of the spectrum: maximal pre-build alignment, PRD as deleted destination doc
- Where Does the Why Live? — the grilling session is an authoring-time home for the why; but the destination PRD that carries it is deleted, so the why is orphaned for future readers
