Summary#
LLMs do not degrade linearly as context grows; they degrade quadratically, because the number of pairwise attention relationships scales as O(n²) with token count. Matt Pocock (citing Dex Horthy of HumanLayer) frames this as a smart zone / dumb zone split: the first ~100K tokens of any session are the smart zone, where the model performs well; beyond that, the model gets "dumber and dumber" regardless of advertised window size. Practical implication: context budget is a real, hard resource, and the agent harness is responsible for keeping individual sessions within the smart zone.
The constraint#
"Every time you add a token to an LLM, it's kind of like you're adding a team to a football league. The number of matches goes up quadratically."
"It doesn't matter whether you're using 1 million context window or 200K, it's always going to be about [100K]. It starts to just get dumber."
The 1M-token context windows shipping in 2026 don't move the smart zone — they "just shipped a lot more dumb zone." Long context is useful for retrieval (find a fact in five copies of War and Peace) but not for reasoning (write code that depends on all of it).
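The football-league analogy can be made concrete with a little arithmetic. The sketch below only counts unordered token pairs (matches in a single round-robin league); it is a counting illustration, not a model of how any real attention implementation behaves or of how quality actually degrades. The function name and the `SMART_ZONE` constant are illustrative assumptions.

```python
def pairwise_relationships(n_tokens: int) -> int:
    """Unordered token pairs, like matches in a single round-robin league."""
    return n_tokens * (n_tokens - 1) // 2


SMART_ZONE = 100_000  # approximate threshold quoted above, not a measured limit

for window in (100_000, 200_000, 1_000_000):
    ratio = pairwise_relationships(window) / pairwise_relationships(SMART_ZONE)
    print(f"{window:>9,} tokens -> {ratio:.1f}x the pairs of a 100K session")
```

Doubling the context to 200K roughly quadruples the pair count, and a full 1M window carries roughly 100 times the pairs of a 100K session, which is the sense in which a larger window "ships more dumb zone".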
Memento metaphor#
Each session is a fresh start. There is no memory across sessions; the model resets to the system prompt every time. This is a constraint but also a feature: clearing context restores smart-zone behavior cheaply. Persistent state must live somewhere the next session can read it (repo, filesystem, an index-style catalog).
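One way to picture "state the next session can read" is a small handoff file written before clearing and read on startup. This is a minimal sketch under stated assumptions: the file name `HANDOFF.json`, the two fields, and both helper functions are hypothetical and not part of any tool mentioned here.

```python
import json
import tempfile
from pathlib import Path


def save_handoff(path: Path, done: list[str], next_steps: list[str]) -> None:
    """Write the state the next session needs to a file in the repo."""
    path.write_text(json.dumps({"done": done, "next_steps": next_steps}, indent=2))


def load_handoff(path: Path) -> dict:
    """Read the previous session's notes; a missing file means a cold start."""
    if not path.exists():
        return {"done": [], "next_steps": []}
    return json.loads(path.read_text())


# Demo in a temporary directory standing in for the repo root.
with tempfile.TemporaryDirectory() as repo:
    notes = Path(repo) / "HANDOFF.json"  # hypothetical file name
    save_handoff(notes, done=["add login route"], next_steps=["write login tests"])
    # ...context is cleared here; a new session starts from the system prompt...
    state = load_handoff(notes)
    print(state["next_steps"])
```

The point of the design is that nothing in the new session depends on conversational history: everything it needs is in a file it can choose to read.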
Compaction is worse than clearing#
Claude Code's /compact command summarizes the running session into a smaller history. Pocock prefers /clear:
- Compacted history accumulates "sediment" — distortions and lossy summaries — that degrades subsequent work
- Clear-and-restart returns to a known-clean baseline (the system prompt)
- The cost of clearing is paid back by working in the smart zone
The disagreement isn't universal — many developers like compaction because it preserves continuity. The right call depends on whether your task can be resumed cleanly from a written record (then prefer clear) or needs in-flight conversational context (then compaction wins).
Implications for harness design#
- System prompt budget. Anything always-in-context comes off the smart-zone budget. "I have seen people put 250K tokens [in the system prompt], then you're just going into the dumb zone before you can even do anything." Keep CLAUDE.md / AGENTS.md as a table of contents, not an encyclopedia (see Agent Harness Engineering on AGENTS.md as ToC).
- Sub-agents preserve parent context. A sub-agent runs in its own context window; only its summary returns. Pocock's grill-me skill ran a 93.7K-token sub-agent, yet his main session still had ~25K tokens unused.
- Fragment work into many sessions. Loops (see Agent Loop Pattern) and vertical slices (see Vertical Slice Tracer Bullets) work because each iteration starts fresh in the smart zone.
- Reviewer should run in fresh context. If the implementer has already used 80K tokens of the smart zone, asking the same session to review its own work pushes the review into the dumb zone. Cleared context = smart-zone reviewer (see Deep Modules for Agents on push-vs-pull and reviewer placement).
- Push vs pull instructions. Always-in-context instructions cost smart-zone tokens; pull-on-demand (skills) costs nothing until invoked.
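The budget arithmetic behind these bullets can be sketched as a small accounting model. Everything here is an illustrative assumption rather than a real harness API: the `Session` class, its method names, the 10K system prompt, and the 2K summary size are invented for the example. Only the ~100K threshold and the 93.7K sub-agent figure come from the text above.

```python
from dataclasses import dataclass

SMART_ZONE_TOKENS = 100_000  # approximate figure from the talk, not a measured limit


@dataclass
class Session:
    system_prompt_tokens: int  # always in context, so always charged
    used_tokens: int = 0       # tokens added after the system prompt

    def total(self) -> int:
        return self.system_prompt_tokens + self.used_tokens

    def remaining_smart_zone(self) -> int:
        return max(0, SMART_ZONE_TOKENS - self.total())

    def run_inline(self, work_tokens: int) -> None:
        # Work done in this session is charged to this session's context.
        self.used_tokens += work_tokens

    def run_in_subagent(self, work_tokens: int, summary_tokens: int) -> None:
        # The sub-agent spends work_tokens in its own window;
        # the parent is charged only for the summary that comes back.
        self.used_tokens += summary_tokens


inline = Session(system_prompt_tokens=10_000)
inline.run_inline(93_700)

delegated = Session(system_prompt_tokens=10_000)
delegated.run_in_subagent(93_700, summary_tokens=2_000)

print(inline.remaining_smart_zone(), delegated.remaining_smart_zone())
```

With these assumed numbers, doing the work inline exhausts the smart zone, while delegating it leaves the parent with 88,000 tokens of headroom. The same model shows why a large system prompt is costly: it is subtracted before any work begins.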
Status-line token counter as an essential tool#
Pocock recommends a status-line widget showing the exact running token count of each session — without it, developers don't know when they're approaching the dumb zone. He treats this as "absolutely essential information."
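As a rough illustration of what such a widget might display, the sketch below formats a running token count together with the share of the smart zone consumed. It is not Claude Code's status-line interface: the `status_line` function, its output format, and the 100K default are assumptions, and how a harness obtains the token count is left out.

```python
def status_line(tokens_used: int, smart_zone: int = 100_000) -> str:
    """Format a one-line summary of how much smart-zone budget is spent."""
    pct = tokens_used * 100 // smart_zone  # integer percentage of the smart zone
    zone = "smart" if tokens_used < smart_zone else "dumb"
    return f"{tokens_used:,} tok | {pct}% of smart zone | {zone} zone"


print(status_line(80_000))
print(status_line(120_000))
```

Showing both the raw count and a percentage lets the developer decide when to clear before the session crosses the threshold.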
Connections#
- Matt Pocock — popularizer of the smart-zone framing
- Agent Harness Engineering — system-prompt minimalism and AGENTS.md-as-ToC are restatements of the smart-zone principle
- Agent Loop Pattern — fragmenting work to stay in smart zone is why loops are powerful
- Vertical Slice Tracer Bullets — keeping each task small enough to fit in smart zone
- Design Concept Grilling — the grilling session uses a sub-agent so the parent context stays small
- Deep Modules for Agents — clearing-before-review is a smart-zone discipline
- Harness Shrinkage as Models Improve — the smart zone may grow ("the dumb zone has become less dumb lately") but quadratic attention still constrains it
- AI Brain Fry — human-side analog of the smart zone: oversight has its own degradation curve past capacity, mirroring attention degradation past ~100K tokens
- Interaction Models — continuous audio/video at 200ms granularity accumulates context fast; TML names long-session context management as an open problem — the same constraint in a new modality
- HTML as the New Markdown — the human-attention analog: a reader degrades past some volume of undifferentiated markdown the way a model degrades past ~100K tokens; HTML raises the human's effective smart zone by spending tokens on legibility
- Agentic Technical Debt — founders' persistent-context discipline (CLAUDE.md) competes with smart-zone budget; over-long context files become their own problem
Open questions#
- Does the smart-zone marker scale with model size, or is it bounded by attention architecture? Pocock observes "the dumb zone has become less dumb lately" but pegs it at 100K through 2026.
- When sparse-attention or memory-augmented architectures ship, does the smart zone become a soft constraint?
- How should harnesses surface remaining smart-zone budget to the user — token count, percentage, or a richer signal?
Sources#
- Full Walkthrough: Workflow for AI Coding — Matt Pocock — primary articulation
