Claude Code Best Practices

Published: April 10, 2026 | Filed: Concept | Domain: AI Engineering | Tags: Claude Code, AI Tools, Developer Workflow | Reading: 11 min | Source: AI-synthesised

Anthropic's guide to effective Claude Code usage: context management, verification-driven development, explore→plan→code workflow, environment config

Summary

Anthropic's official guide to effective Claude Code usage, organized around a single core constraint: the context window fills up fast and performance degrades as it fills. All best practices flow from managing this scarce resource — through verification-driven development, structured context (CLAUDE.md), aggressive session management, and horizontal scaling via parallel sessions.

Details

Context Window as Primary Constraint

The context window holds the entire conversation: messages, file reads, command outputs. A single debugging session can consume tens of thousands of tokens. As context fills, Claude "forgets" earlier instructions and makes more mistakes. Every best practice is ultimately about managing this resource. See Context Window Smart Zone for the underlying mechanism (quadratic attention scaling, ~100K-token smart-zone marker).

Model-level amplifiers (as of Claude Opus 4.7): the updated tokenizer maps the same input to 1.0–1.35× more tokens, and Opus 4.7 "thinks more at higher effort levels" — especially on later turns in agentic settings. Claude Code's default effort has been raised to xhigh. These compound: a session that fit on 4.6 at high may be meaningfully tighter on 4.7 at xhigh. Measure on real traffic before trusting intuition carried over from 4.6. Counter-levers: lower effort, task budgets (API), explicit conciseness prompting, or brevity-style output caps (see Scale-Dependent Prompt Sensitivity).
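The tokenizer effect alone can be bounded with simple arithmetic. The sketch below projects a session measured on the older tokenizer onto the 1.0–1.35× range quoted above and checks it against a working budget. The function and class names are hypothetical, the 100K budget is borrowed from the smart-zone marker mentioned earlier, and the extra thinking at higher effort levels is not modelled, so treat the result as a floor rather than an estimate.

```python
from dataclasses import dataclass

# Multiplier range quoted in the text for the updated tokenizer.
TOKENIZER_INFLATION_MIN = 1.0
TOKENIZER_INFLATION_MAX = 1.35


@dataclass(frozen=True)
class BudgetProjection:
    low_tokens: int         # best case: no tokenizer inflation
    high_tokens: int        # worst case: full 1.35x inflation
    fits_worst_case: bool   # True if even the worst case stays within budget


def project_session_tokens(measured_tokens: int, budget_tokens: int) -> BudgetProjection:
    """Project a token count measured on the old tokenizer onto the new one.

    Only the tokenizer multiplier is applied; additional thinking tokens at
    higher effort levels are deliberately left out of this sketch.
    """
    low = round(measured_tokens * TOKENIZER_INFLATION_MIN)
    high = round(measured_tokens * TOKENIZER_INFLATION_MAX)
    return BudgetProjection(low, high, high <= budget_tokens)


# Hypothetical session: 80K tokens measured previously, 100K working budget.
projection = project_session_tokens(80_000, 100_000)
print(projection.low_tokens, projection.high_tokens, projection.fits_worst_case)
```

A session that sat comfortably under the budget can cross it in the worst case, which is why the text recommends measuring on real traffic rather than reusing old intuition.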

Verification-Driven Development

The single highest-leverage practice: give Claude a way to verify its own work. Provide tests, screenshots, expected outputs, or linter commands. Without verification, Claude can produce plausible-looking but broken code, and the human becomes the only feedback loop.

Key patterns:

Provide concrete test cases with inputs and expected outputs
For UI changes, paste screenshots and ask Claude to compare its result
When something fails, paste the actual error message rather than just "the build is failing", so Claude can address the root cause instead of the symptom
Use the Claude in Chrome extension for automated UI testing
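As an illustration of the first pattern, a set of concrete cases can live next to the code so that both the prompt and the check use the same data. This is a minimal sketch: `slugify` and its expected outputs are hypothetical examples, not something taken from the guide.

```python
import re


def slugify(title: str) -> str:
    """Hypothetical function under development: lowercase the title, replace
    runs of characters outside a-z and 0-9 with '-', and trim edge dashes."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


# Concrete (input, expected output) pairs that can be pasted into the prompt
# and then executed, so Claude can verify its own change.
CASES = [
    ("Hello, World!", "hello-world"),
    ("  Already--slugged  ", "already-slugged"),
    ("Claude Code: Best Practices", "claude-code-best-practices"),
    ("", ""),
]


def run_cases() -> list[str]:
    """Return one message per failing case; an empty list means all pass."""
    failures = []
    for given, expected in CASES:
        actual = slugify(given)
        if actual != expected:
            failures.append(f"slugify({given!r}) = {actual!r}, expected {expected!r}")
    return failures


print(run_cases() or "all cases pass")
```

Asking Claude to keep editing until `run_cases()` returns an empty list turns a vague request into a loop it can close without a human in the middle.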

Explore → Plan → Code Workflow

Separate research from implementation. Use Plan Mode for multi-file changes or unfamiliar code. Skip planning when the scope is clear and the diff can be described in one sentence.

A more aggressive variant: Design Concept Grilling (Matt Pocock's grill-me skill) replaces "ask the agent for a plan" with "let the agent interview you until you reach shared understanding before any plan exists." See also Vertical Slice Tracer Bullets for slicing the resulting PRD into agent-grabbable Kanban tickets, and Deep Modules for Agents for keeping the codebase shape agent-friendly.

Environment Configuration

CLAUDE.md: persistent instructions loaded every session. Include only what Claude can't infer from the code: bash commands, non-default code style, workflow rules, architectural decisions, gotchas. Prune ruthlessly: if Claude already does something correctly without the instruction, delete it. Treat the file like code: review it when things go wrong, and test changes by observing whether Claude's behavior shifts. Use @path imports for modularity. For founders and solo builders, a stricter discipline applies: start each session with CLAUDE.md as architectural context and end each session by updating it. This is the primary defense against Agentic Technical Debt, which compounds (rather than merely accumulates) because each session re-derives foundational decisions when context isn't persisted.
Skills (.claude/skills/): domain knowledge and reusable workflows loaded on demand, not every session. Invoke with /skill-name.
Subagents (.claude/agents/): specialized assistants running in isolated context with scoped tools. Useful for tasks that read many files without cluttering main context.
Hooks: deterministic scripts that run at specific points in Claude's workflow. Unlike CLAUDE.md (advisory), hooks guarantee execution.
MCP servers: connect external tools (Notion, Figma, databases) via claude mcp add.
Plugins: bundled skills + hooks + subagents + MCP from a marketplace.
Permissions: auto mode (classifier-based approval, middle ground between default-prompt and --dangerously-skip-permissions), allowlists, or OS-level sandboxing.
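To make the advisory-versus-deterministic distinction concrete, here is a sketch of the decision logic a hook script might contain. Everything about the payload is an assumption for illustration: that the event arrives as JSON with `tool_name` and `tool_input.command` fields, and that a non-zero exit code of 2 blocks the action. Check the hooks reference before relying on any of these details. The blocked substrings are likewise invented.

```python
import json
import sys

# Substrings a team might never want run unattended. Purely illustrative.
BLOCKED_SUBSTRINGS = ("rm -rf", "git push --force")

# Assumed exit-code convention: 0 lets the action proceed, 2 blocks it.
ALLOW, BLOCK = 0, 2


def decide(event: dict) -> tuple[int, str]:
    """Return (exit_code, reason) for a single hook event.

    Field names ("tool_name", "tool_input", "command") are assumptions
    about the payload shape, not a documented contract.
    """
    if event.get("tool_name") != "Bash":
        return ALLOW, ""
    command = (event.get("tool_input") or {}).get("command", "")
    for needle in BLOCKED_SUBSTRINGS:
        if needle in command:
            return BLOCK, f"blocked by hook: command contains {needle!r}"
    return ALLOW, ""


def main() -> int:
    """Read one event from stdin, explain any block on stderr, return the code."""
    event = json.load(sys.stdin)
    code, reason = decide(event)
    if reason:
        print(reason, file=sys.stderr)
    return code

# A real hook script would finish with: sys.exit(main())
```

The point of the design is that the rule runs as ordinary code on every matching event, so it does not depend on the model remembering an instruction the way a line in CLAUDE.md does.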

Session Management

/clear between unrelated tasks — prevents context pollution
/compact <instructions> — targeted summarization preserving specified context
/rewind or Esc+Esc — restore conversation, code, or both to any checkpoint
Subagents for investigation — explore in separate context, report back summaries
/btw — side questions that never enter conversation history
After two failed corrections on the same issue, /clear and rewrite the prompt incorporating what you learned
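The two-strikes rule in the last item is easy to state as a small policy object. This is only a sketch of the bookkeeping; the class name, the return values, and the prompt layout are hypothetical, and in practice the counting usually happens in the developer's head.

```python
from collections import defaultdict

MAX_FAILED_CORRECTIONS = 2  # threshold stated in the text


class CorrectionTracker:
    """Counts failed corrections per issue and signals when to start over."""

    def __init__(self) -> None:
        self._failures: dict[str, int] = defaultdict(int)
        self._lessons: dict[str, list[str]] = defaultdict(list)

    def record_failure(self, issue: str, lesson: str) -> str:
        """Record one failed correction and return the recommended action:
        'continue' or 'clear_and_rewrite'."""
        self._failures[issue] += 1
        self._lessons[issue].append(lesson)
        if self._failures[issue] >= MAX_FAILED_CORRECTIONS:
            return "clear_and_rewrite"
        return "continue"

    def rewritten_prompt(self, issue: str, original_prompt: str) -> str:
        """Fold the lessons learned into a fresh prompt for the new session."""
        notes = "\n".join(f"- {lesson}" for lesson in self._lessons[issue])
        return f"{original_prompt}\n\nKnown pitfalls from earlier attempts:\n{notes}"


tracker = CorrectionTracker()
print(tracker.record_failure("flaky-login-test", "mocking the clock is not enough"))
print(tracker.record_failure("flaky-login-test", "the session cookie expires mid-test"))
print(tracker.rewritten_prompt("flaky-login-test", "Fix the flaky login test."))
```

The rewritten prompt carries the lessons forward while the cleared session drops the failed attempts that were polluting the context.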

Scaling Patterns

Non-interactive mode: claude -p "prompt" for CI, scripts, pre-commit hooks. Supports JSON and streaming output.
Parallel sessions: desktop app (isolated worktrees), web (isolated VMs), or agent teams (coordinated sessions with shared tasks).
Writer/Reviewer pattern: one session implements, another reviews with fresh context (no bias toward own code).
Fan-out: loop claude -p across files for large migrations. Use --allowedTools to scope permissions.
Auto mode for unattended runs: classifier blocks risky actions, allows routine work. Aborts on repeated blocks in non-interactive mode.
Loops and routines: /loop (cron-scheduled repeat job, in-CLI) and routines (server-side variant). Use them to drain a Kanban backlog while away from the keyboard; they are the primary mechanism for amortizing planning over many executions. See Agent Loop Pattern.
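The fan-out item above can be sketched as a small driver that builds one `claude -p` invocation per file. Only `claude -p` and `--allowedTools` come from the text. The helper names, the prompt template, the tool name `Edit`, and the comma-separated tool list are assumptions, so confirm the flag's accepted format against the CLI reference. The driver defaults to a dry run that only returns the shell lines it would execute.

```python
import shlex
import subprocess


def build_fanout_commands(
    files: list[str], prompt_template: str, allowed_tools: list[str]
) -> list[list[str]]:
    """Build one `claude -p` argv per file. Nothing is executed here."""
    commands = []
    for path in files:
        prompt = prompt_template.format(path=path)
        commands.append(
            # Assumption: tools are passed as a single comma-separated value.
            ["claude", "-p", prompt, "--allowedTools", ",".join(allowed_tools)]
        )
    return commands


def run_fanout(commands: list[list[str]], dry_run: bool = True) -> list[str]:
    """Process commands one by one.

    With dry_run=True, return the shell-quoted command lines only.
    Otherwise run each command and collect its standard output.
    """
    results = []
    for argv in commands:
        if dry_run:
            results.append(shlex.join(argv))
        else:
            completed = subprocess.run(argv, capture_output=True, text=True)
            results.append(completed.stdout)
    return results


commands = build_fanout_commands(
    ["src/a.py", "src/b.py"],
    "Migrate {path} to the new API",
    ["Edit"],
)
for line in run_fanout(commands):
    print(line)
```

Keeping command construction separate from execution makes it easy to review the exact prompts and permission scope on a few files before running the full migration.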

Parallel Ecosystems and Cross-Tool Concept Mapping

Claude Code is one of several converging coding-agent ecosystems. Capability parallels with Hermes Agent (Nous Research) and Codex (OpenAI):

| Capability | Claude Code | Hermes | Codex |
| --- | --- | --- | --- |
| Project context file | `CLAUDE.md` | `AGENTS.md` (project) + `SOUL.md` (personality, separate) | `AGENTS.md` |
| Session compaction | `/compact <instructions>` | `/compress` | (via Codex App Server thread compaction) |
| Mid-session model switch | `/model` | `/model` | session-level config |
| Parallel subagents | Subagents in `.claude/agents/` | `delegate_task` | Spawned via Symphony orchestrator |
| Non-interactive / programmatic | `claude -p`, Claude Agent SDK | `hermes` CLI in scripts | Codex App Server (JSON-RPC stdio) |
| Multi-user team deployment | per-session `claude -p` | Hermes Gateway (Telegram/Discord/Slack/WhatsApp) with allowlist or DM pairing | Symphony (issue-tracker-driven daemon) |
| Permission gating | auto mode classifier | per-pattern approvals (`once`/`session`/`always`/`deny`); skipped under container backend | implementation-defined per Symphony spec |
| Memory model | conversation + CLAUDE.md | bounded `MEMORY.md` (~2,200 chars) + `USER.md` (~1,375 chars) | filesystem-driven |

The shared structural insight across all three: agent behavior is configured via repo-versioned markdown files (CLAUDE.md / AGENTS.md / SOUL.md / WORKFLOW.md). This pattern is consistent enough across vendors to look like an emerging standard. (A dedicated Agent Context Files concept page is planned to formalize this.)

The largest architectural divergence: Claude Code is session-first with an optional non-interactive mode, whereas Hermes Gateway and Symphony are daemon-first when deployed at team scale. The session-vs-daemon split is the dominant deployment-architecture choice in 2026.

Common Failure Patterns

| Pattern | Fix |
| --- | --- |
| Kitchen sink session (mixed unrelated tasks) | `/clear` between tasks |
| Repeated corrections (>2 failed fixes) | `/clear`, rewrite prompt with lessons learned |
| Over-specified CLAUDE.md | Prune; convert to hooks if deterministic |
| Trust-then-verify gap | Always provide verification criteria |
| Infinite exploration | Scope narrowly or use subagents |

Connections

Agent Harness Engineering — Claude Code's CLAUDE.md, skills, and hooks are a practical implementation of the harness engineering patterns described by OpenAI and Anthropic's research teams
LLM-as-Compiler Knowledge Base — CLAUDE.md files serve as the schema layer in this vault's LLM-as-compiler architecture
LLM-Driven Vulnerability Research — Claude Code is the runtime for Anthropic's vulnerability research scaffold; all Mythos Preview findings used Claude Code's agentic capabilities
Client-Side Agent Optimization — directly challenges the "use the strongest model" default: combinations where Claude Opus 4.6 is paired with a cheaper planner beat all-Opus by >40pp on HotpotQA. AgentOpt's httpx interception is compatible with claude -p non-interactive mode
Scale-Dependent Prompt Sensitivity — complements context-window management: brevity constraints both raise accuracy on overthinking-prone problems and preserve context budget. Verification-driven development is especially important when large-model verbosity can mask reasoning errors
Claude Code Auto Mode — the full write-up of the "auto mode" permission option mentioned in Environment Configuration and Scaling Patterns
Claude Opus 4.7 — the model most Claude Code work now targets; literal instruction following and tokenizer inflation directly reshape how CLAUDE.md and session management should be written
Hermes Agent — parallel ecosystem from Nous Research; many Claude Code patterns map directly (/compress ↔ /compact, delegate_task ↔ subagents, AGENTS.md ↔ CLAUDE.md); the differences (Gateway daemon, bounded memory files, SOUL.md split) highlight design choices each made
Codex App Server Protocol — the OpenAI-side analog to claude -p + Claude Agent SDK; both let an external orchestrator drive sessions, but App Server is more explicit about a stable JSON-RPC stdio protocol
Symphony — the daemon-first deployment archetype; a Claude-Code analog would wire claude -p plus subagents into an issue tracker the same way Symphony wires Codex to Linear
Ticket-Driven Agent Orchestration — the orchestration pattern that becomes natural once non-interactive mode is solid; bridges single-session best practices into team-scale deployment
Context Window Smart Zone — the underlying constraint that motivates every context-management practice in this article
Design Concept Grilling — more aggressive alignment-first variant of explore→plan→code
Vertical Slice Tracer Bullets — task decomposition pattern that fills the Kanban backlog drained by the loop primitive
Deep Modules for Agents — codebase shape that makes Claude Code's review and verification patterns reliable; push-vs-pull instruction delivery
Agent Loop Pattern — /loop and routines as the next-generation primitive replacing per-step prompting
Harness Shrinkage as Models Improve — why best-practice prompts and CLAUDE.md sections shrink with each model release; Cat Wu's discipline of pruning at every launch
Claude Code — the entity-level page
AI Native Product Cadence — these best-practice artifacts are the public output of a team operating at that internal cadence
Engineer PM Convergence — the engineer-with-product-taste persona this guide implicitly targets
Agentic Technical Debt — the failure mode CLAUDE.md primarily defends against; specifically named in the founder's playbook
AI-Native Startup Lifecycle — the founder-stage framing that elevates CLAUDE.md from "best practice" to "MVP survival discipline"
MCP and Computer Use — the connector substrate behind the "extend Claude Code with custom tools" scaling pattern; MCP and computer use are how external systems become part of the agent's action surface
Evals as Product Spec — the strict form of "verification-driven development": ten great evals encode what done looks like at the feature level, complementing the workflow-level verification this article prescribes

Derived

When to Use Claude Opus 4.6 for Work — context-window-as-primary-constraint framing informs the Claude Code corollary: Opus verbosity consumes budget faster
Opus 4.6 → 4.7 Changes and Multi-Agent Coding Considerations — subagents, Writer/Reviewer, and scaling-pattern guidance applied to Opus 4.7 multi-agent teams
Learning to Co-Work with AI: A Software Engineer's Field Guide — best-practices distilled into a per-engineer skill-development field guide (six skill clusters, daily practices, anti-patterns, 90-day plan)

Open Questions

What's the optimal CLAUDE.md length before instructions start getting lost? Is there a measurable threshold?
How does the Writer/Reviewer pattern compare to agent-to-agent review (as in OpenAI's Codex workflow)?
When does subagent overhead exceed the benefit of context isolation?

Sources

Best Practices for Claude Code
Auto mode for Claude Code — permission-mode expansion
Introducing Claude Opus 4.7 — tokenizer/xhigh-default context-budget implications

About this piece

Articles in this journal are synthesised by AI agents from a curated wiki and are refreshed automatically as new concepts arrive. Topics, framing, and editorial direction are curated by Howardism.

Cited by 38

Agent Context Files
The cross-vendor markdown-as-control-plane pattern: repo-versioned plaintext (CLAUDE.md / AGENTS.md / SOUL.md / WORKFLO…
Agent Control Plane Patterns: Tickets, Loops, Specs, and Memory Files
Layered agent control-plane synthesis: tickets as durable work graph, loops as execution primitive, specs/context files…
Agent Harness Engineering
Patterns for scaffolding long-running LLM agents: environment design, progressive context disclosure, mechanical archit…
Agent Loop Pattern
`/loop` (cron-scheduled) and Ralph Wiggum (backlog-draining) loops as next-generation agent primitive; AFK execution, p…
Agentic Technical Debt
Debt that *compounds* (not just accumulates) because each agentic-coding session re-derives architectural decisions wit…
AI Native Product Cadence
Cat Wu's 6mo→1mo→1day cadence at Anthropic: research-preview branding, mission-as-tiebreaker, evergreen launch room, li…
AI-Native Startup Lifecycle
Anthropic's May 2026 reframing of Idea/MVP/Launch/Scale assuming AI infrastructure: each stage's headcount/capital/skil…
Opinions on Using AI Tools & the Future of the Software Engineering Role
Debate map of four stances on using AI tools (bullish-insider / pragmatist-practitioner / skeptic-governance / architec…
Blast Radius (Agentic)
The potential damage if an agent is compromised; the unit Zero Trust's 'assume breach' posture is built to contain via…
Claude Code
Anthropic's agentic coding product; created by Boris Cherny late 2024; TypeScript/React; CLI/desktop/web/mobile/IDE sur…
Claude Code Auto Mode
Claude Code permission mode using a classifier to auto-approve safe tool calls and block risky ones; middle ground betw…
Claude Opus 4.7
GA frontier model from Anthropic; direct upgrade to 4.6 at same price; literal instruction following, 1.0–1.35× tokeniz…
Client-Side Agent Optimization
AgentOpt's framing of developer-controlled agent optimization (model-per-role, budget, routing) as distinct from server…
Code as Source of Truth
Docs go stale at high coding throughput; check specs/skills into the repo; onboard via Claude; spec-drift verification
Codex App Server Protocol
JSON-RPC stdio protocol for headless Codex sessions: initialize/initialized/thread-start/turn-start handshake, continua…
Deep Modules for Agents
Ousterhout deep-vs-shallow modules applied to agent-friendly codebases; push-vs-pull instruction delivery; reviewer in…
Design Concept Grilling
Matt Pocock's `grill-me` skill; reach Brooks "design concept" before any plan; counter to specs-to-code; PRD as destina…
Where Does Agent Harness Work Remain Durable as Models Improve?
Durable harness work lives at external-reality boundaries: repo-local source of truth, mechanical verification, context…
Engineer PM Convergence
Generalists across disciplines; product taste as bottleneck skill; Anthropic Claude Code team as case study; "just do t…
Evals as Product Spec
Cat Wu's framing of evals as the emerging core PM skill: ten great evals beats a hundred mediocre; encode what done loo…
Hermes Agent
Nous Research's CLI agent + Gateway daemon (Telegram/Discord/Slack/WhatsApp); AGENTS.md/SOUL.md context split, bounded…
Learning to Co-Work with AI: A Software Engineer's Field Guide
Field guide for software engineers in the AI era: 6 skill clusters (taste, harness, alignment-first planning, agent-fri…
Least Agency
OWASP term extending least privilege to agents: constrain not just what an agent can access but what each tool can do,…
LLM-as-Compiler Knowledge Base
Karpathy's architecture: LLM incrementally compiles raw docs into a persistent interlinked wiki, replacing RAG with a 4…
LLM-Driven Vulnerability Research
Claude Mythos Preview's emergent cybersecurity capabilities: autonomous zero-day discovery, full exploit chains, and An…
MCP and Computer Use
Anthropic's two complementary connector mechanisms: MCP for structured programmatic access (Salesforce/Drive/Gmail/Slac…
AI Engineering & Agent Tooling
Map of Content for the ai-engineering domain — 36 concepts. Curated entry point; see Home for all domains.
Open Questions Backlog
_96 pages with open questions, as of 2026-06-14._
Opus 4.6 → 4.7 Changes and Multi-Agent Coding Considerations
4.6→4.7 delta table + six hazards for multi-agent coding teams: role-based model selection, prompt re-tuning, harness i…
Orchestration vs Employee Framing: Reconciling the Founder's Playbook with HBR's Accountability Evidence
Reconciles the Founder's Playbook orchestration framings with HBR Kropp et al.'s accountability evidence; "orchestratio…
Scale-Dependent Prompt Sensitivity
Large models underperform small ones on 7.7% of standard benchmarks due to overthinking; brevity constraints recover 26…
Symphony
OpenAI's open-source agent orchestrator (March 2026): turns Linear into a control plane for Codex, per-issue workspace,…
Ticket-Driven Agent Orchestration
The inversion that makes Symphony work: tickets as units of work (not sessions/PRs), DAG dependencies, agent-extensible…
The Verifiability Thesis
LLMs automate what you can *verify* as computers automate what you can *specify*; RL verification rewards → jagged peak…
Vertical Slice Tracer Bullets
Pragmatic-Programmer tracer-bullet pattern applied to agent task decomposition; vertical slices > horizontal layers; Ka…
Vibe Coding vs. Agentic Engineering
Vibe coding raises the floor (anyone builds); agentic engineering preserves the quality bar while going faster; ">10x a…
When to Use Claude Opus 4.6 for Work
Decision rules for Opus 4.6 deployment: solver-not-planner, elaboration-load-bearing tasks, brevity constraints, Pareto…
Zero Trust for AI Agents
Anthropic's security framework for deploying autonomous agents: trust nothing / verify everything / assume breach, appl…