H
Howardism
Plate IIAI Engineering中文HOWARDISM

Claude Code Best Practices

PublishedApril 10, 2026FiledConceptDomainAI EngineeringTagsClaude CodeAI ToolsDeveloper WorkflowReading11 minSourceAI-synthesised

Anthropic's guide to effective Claude Code usage: context management, verification-driven development, explore→plan→code workflow, environment config

Illustration for Claude Code Best Practices

Sources#

Summary#

Anthropic's official guide to effective Claude Code usage, organized around a single core constraint: the context window fills up fast and performance degrades as it fills. All best practices flow from managing this scarce resource — through verification-driven development, structured context (CLAUDE.md), aggressive session management, and horizontal scaling via parallel sessions.

Details#

Context Window as Primary Constraint#

The context window holds the entire conversation: messages, file reads, command outputs. A single debugging session can consume tens of thousands of tokens. As context fills, Claude "forgets" earlier instructions and makes more mistakes. Every best practice is ultimately about managing this resource. See Context Window Smart Zone for the underlying mechanism (quadratic attention scaling, ~100K-token smart-zone marker).

Model-level amplifiers (as of Claude Opus 4.7): the updated tokenizer maps the same input to 1.0–1.35× more tokens, and Opus 4.7 "thinks more at higher effort levels" — especially on later turns in agentic settings. Claude Code's default effort has been raised to xhigh. These compound: a session that fit on 4.6 at high may be meaningfully tighter on 4.7 at xhigh. Measure on real traffic before trusting intuition carried over from 4.6. Counter-levers: lower effort, task budgets (API), explicit conciseness prompting, or brevity-style output caps (see Scale-Dependent Prompt Sensitivity).

Verification-Driven Development#

The single highest-leverage practice: give Claude a way to verify its own work. Provide tests, screenshots, expected outputs, or linter commands. Without verification, Claude produces plausible-looking but broken code and the human becomes the only feedback loop.

Key patterns:

  • Provide concrete test cases with inputs and expected outputs
  • For UI changes, paste screenshots and ask Claude to compare its result
  • Address root causes by providing error messages, not just "the build is failing"
  • Use the Claude in Chrome extension for automated UI testing

Explore → Plan → Code Workflow#

Separate research from implementation. Use Plan Mode for multi-file changes or unfamiliar code. Skip planning when the scope is clear and the diff can be described in one sentence.

A more aggressive variant: Design Concept Grilling (Matt Pocock's grill-me skill) replaces "ask the agent for a plan" with "let the agent interview you until you reach shared understanding before any plan exists." See also Vertical Slice Tracer Bullets for slicing the resulting PRD into agent-grabbable Kanban tickets, and Deep Modules for Agents for keeping the codebase shape agent-friendly.

Environment Configuration#

  • CLAUDE.md: persistent instructions loaded every session. Include only what Claude can't infer from code — bash commands, non-default code style, workflow rules, architectural decisions, gotchas. Prune ruthlessly: if Claude already does something correctly without the instruction, delete it. Treat like code — review when things go wrong, test by observing behavior changes. Use @path imports for modularity. For founders / solo builders, the stricter discipline of starting each session with the CLAUDE.md as architectural context and ending each session by updating it is the primary defense against Agentic Technical Debt — debt that compounds (not just accumulates) because each session re-derives foundational decisions when context isn't persisted.
  • Skills (.claude/skills/): domain knowledge and reusable workflows loaded on demand, not every session. Invoke with /skill-name.
  • Subagents (.claude/agents/): specialized assistants running in isolated context with scoped tools. Useful for tasks that read many files without cluttering main context.
  • Hooks: deterministic scripts that run at specific points in Claude's workflow. Unlike CLAUDE.md (advisory), hooks guarantee execution.
  • MCP servers: connect external tools (Notion, Figma, databases) via claude mcp add.
  • Plugins: bundled skills + hooks + subagents + MCP from a marketplace.
  • Permissions: auto mode (classifier-based approval, middle ground between default-prompt and --dangerously-skip-permissions), allowlists, or OS-level sandboxing.

Session Management#

  • /clear between unrelated tasks — prevents context pollution
  • /compact <instructions> — targeted summarization preserving specified context
  • /rewind or Esc+Esc — restore conversation, code, or both to any checkpoint
  • Subagents for investigation — explore in separate context, report back summaries
  • /btw — side questions that never enter conversation history
  • After two failed corrections on the same issue, /clear and rewrite the prompt incorporating what you learned

Scaling Patterns#

  • Non-interactive mode: claude -p "prompt" for CI, scripts, pre-commit hooks. Supports JSON and streaming output.
  • Parallel sessions: desktop app (isolated worktrees), web (isolated VMs), or agent teams (coordinated sessions with shared tasks).
  • Writer/Reviewer pattern: one session implements, another reviews with fresh context (no bias toward own code).
  • Fan-out: loop claude -p across files for large migrations. Use --allowedTools to scope permissions.
  • Auto mode for unattended runs: classifier blocks risky actions, allows routine work. Aborts on repeated blocks in non-interactive mode.
  • Loops and routines: /loop (cron-scheduled repeat job, in-CLI) and routines (server-side variant). Drain a Kanban backlog AFK; primary mechanism for amortizing planning over many executions. See Agent Loop Pattern.

Parallel Ecosystems and Cross-Tool Concept Mapping#

Claude Code is one of several converging coding-agent ecosystems. Capability parallels with Hermes Agent (Nous Research) and Codex (OpenAI):

CapabilityClaude CodeHermesCodex
Project context fileCLAUDE.mdAGENTS.md (project) + SOUL.md (personality, separate)AGENTS.md
Session compaction/compact <instructions>/compress(via Codex App Server thread compaction)
Mid-session model switch/model/modelsession-level config
Parallel subagentsSubagents in .claude/agents/delegate_taskSpawned via Symphony orchestrator
Non-interactive / programmaticclaude -p, Claude Agent SDKhermes CLI in scriptsCodex App Server (JSON-RPC stdio)
Multi-user team deploymentper-session claude -pHermes Gateway (Telegram/Discord/Slack/WhatsApp) with allowlist or DM pairingSymphony (issue-tracker-driven daemon)
Permission gatingauto mode classifierper-pattern approvals (once/session/always/deny); skipped under container backendimplementation-defined per Symphony spec
Memory modelconversation + CLAUDE.mdbounded MEMORY.md (~2,200 chars) + USER.md (~1,375 chars)filesystem-driven

The shared structural insight across all three: agent behavior is configured via repo-versioned markdown files (CLAUDE.md / AGENTS.md / SOUL.md / WORKFLOW.md). This pattern is consistent enough across vendors to look like an emerging standard. (A dedicated Agent Context Files concept page is planned to formalize this.)

The most architectural divergence: Claude Code is session-first with optional non-interactive mode; Hermes Gateway and Symphony are daemon-first when deployed at team scale. The session-vs-daemon split is the dominant deployment-architecture choice in 2026.

Common Failure Patterns#

PatternFix
Kitchen sink session (mixed unrelated tasks)/clear between tasks
Repeated corrections (>2 failed fixes)/clear, rewrite prompt with lessons learned
Over-specified CLAUDE.mdPrune; convert to hooks if deterministic
Trust-then-verify gapAlways provide verification criteria
Infinite explorationScope narrowly or use subagents

Connections#

  • Agent Harness Engineering — Claude Code's CLAUDE.md, skills, and hooks are a practical implementation of the harness engineering patterns described by OpenAI and Anthropic's research teams
  • LLM-as-Compiler Knowledge Base — CLAUDE.md files serve as the schema layer in this vault's LLM-as-compiler architecture
  • LLM-Driven Vulnerability Research — Claude Code is the runtime for Anthropic's vulnerability research scaffold; all Mythos Preview findings used Claude Code's agentic capabilities
  • Client-Side Agent Optimization — directly challenges the "use the strongest model" default: combinations where Claude Opus 4.6 is paired with a cheaper planner beat all-Opus by >40pp on HotpotQA. AgentOpt's httpx interception is compatible with claude -p non-interactive mode
  • Scale-Dependent Prompt Sensitivity — complements context-window management: brevity constraints both raise accuracy on overthinking-prone problems and preserve context budget. Verification-driven development is especially important when large-model verbosity can mask reasoning errors
  • Claude Code Auto Mode — the full write-up of the "auto mode" permission option mentioned in Environment Configuration and Scaling Patterns
  • Claude Opus 4.7 — the model most Claude Code work now targets; literal instruction following and tokenizer inflation directly reshape how CLAUDE.md and session management should be written
  • Hermes Agent — parallel ecosystem from Nous Research; many Claude Code patterns map directly (/compress/compact, delegate_task ↔ subagents, AGENTS.mdCLAUDE.md); the differences (Gateway daemon, bounded memory files, SOUL.md split) highlight design choices each made
  • Codex App Server Protocol — the OpenAI-side analog to claude -p + Claude Agent SDK; both let an external orchestrator drive sessions, but App Server is more explicit about a stable JSON-RPC stdio protocol
  • Symphony — the daemon-first deployment archetype; a Claude-Code analog would wire claude -p plus subagents into an issue tracker the same way Symphony wires Codex to Linear
  • Ticket-Driven Agent Orchestration — the orchestration pattern that becomes natural once non-interactive mode is solid; bridges single-session best practices into team-scale deployment
  • Context Window Smart Zone — the underlying constraint that motivates every context-management practice in this article
  • Design Concept Grilling — more aggressive alignment-first variant of explore→plan→code
  • Vertical Slice Tracer Bullets — task decomposition pattern that fills the Kanban backlog drained by the loop primitive
  • Deep Modules for Agents — codebase shape that makes Claude Code's review and verification patterns reliable; push-vs-pull instruction delivery
  • Agent Loop Pattern/loop and routines as the next-generation primitive replacing per-step prompting
  • Harness Shrinkage as Models Improve — why best-practice prompts and CLAUDE.md sections shrink with each model release; Cat Wu's discipline of pruning at every launch
  • Claude Code — the entity-level page
  • AI Native Product Cadence — these best-practice artifacts are the public output of a team operating at that internal cadence
  • Engineer PM Convergence — the engineer-with-product-taste persona this guide implicitly targets
  • Agentic Technical Debt — the failure mode CLAUDE.md primarily defends against; specifically named in the founder's playbook
  • AI-Native Startup Lifecycle — the founder-stage framing that elevates CLAUDE.md from "best practice" to "MVP survival discipline"
  • MCP and Computer Use — the connector substrate behind the "extend Claude Code with custom tools" scaling pattern; MCP and computer use are how external systems become part of the agent's action surface
  • Evals as Product Spec — the strict form of "verification-driven development": ten great evals encode what done looks like at the feature level, complementing the workflow-level verification this article prescribes

Derived#

Open Questions#

  • What's the optimal CLAUDE.md length before instructions start getting lost? Is there a measurable threshold?
  • How does the Writer/Reviewer pattern compare to agent-to-agent review (as in OpenAI's Codex workflow)?
  • When does subagent overhead exceed the benefit of context isolation?

Sources#

§ end
About this piece

Articles in this journal are synthesised by AI agents from a curated wiki and are refreshed automatically as new concepts arrive. Topics, framing, and editorial direction are curated by Howardism.

Cited by 38
Related articles
  • Agent Harness Engineering

    Patterns for scaffolding long-running LLM agents: environment design, progressive context disclosure, mechanical archit…

  • Harness Shrinkage as Models Improve

    Prompt scaffolding shrinks each model release; Cat Wu's pruning discipline; Boris Cherny "100 lines of code a year from…

  • Client-Side Agent Optimization

    AgentOpt's framing of developer-controlled agent optimization (model-per-role, budget, routing) as distinct from server…

  • Open Questions Backlog

    _96 pages with open questions, as of 2026-06-14._

  • Agent Loop Pattern

    `/loop` (cron-scheduled) and Ralph Wiggum (backlog-draining) loops as next-generation agent primitive; AFK execution, p…