Sources#
- Anthropic's Boris Cherny: Why Coding Is Solved, and What Comes Next
- Full Walkthrough: Workflow for AI Coding — Matt Pocock
- How Anthropic's product team moves faster than anyone else | Cat Wu (Head of Product, Claude Code)
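Summary#
A loop is an agent process that reruns a prompt until a queue is empty or a stop condition is met. As of mid-2026, three converging implementations indicate that loops are becoming a primitive on par with the single-shot session: Anthropic's /loop slash command (cron-scheduled, recurring), Anthropic's routines (server-side /loop), and Matt Pocock's Ralph Wiggum loop (bash plus claude --permission-mode accept-edits inside a while loop). Boris Cherny calls loops "the future"; Matt Pocock uses them as the AFK backbone of his end-to-end workflow.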
The two loop families#
Cron-scheduled loops (/loop, routines)#
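Used in Claude Code and Cowork. Mechanism: the agent calls cron (through a tool) to schedule a job at a future time; when that time arrives, the job re-enters the agent with instructions to carry out the task. Schedules can recur (every minute, every 5 minutes, daily).
Use cases mentioned by Boris Cherny:
- Babysitting PRs: fixing CI, auto-rebasing
- Keeping CI healthy: fixing flaky tests
- Clustering feedback from Twitter every 30 minutes
- "Dozens of loops running at any given time"
- Overnight: "thousands of agents" doing deeper work
Routines are the same primitive running on a server, so they keep running even when the laptop is closed.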
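The scheduling mechanism described above can be approximated with ordinary system cron. The sketch below is only an analogy, not how /loop or routines are implemented: `make_cron_entry`, the repository path, and the prompt file name are hypothetical, and it assumes the `claude -p` print mode is available for non-interactive runs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a crontab line that reruns a fixed prompt
# every N minutes. System cron stands in for the agent's scheduling tool.
make_cron_entry() {
  local minutes="$1" repo_dir="$2" prompt_file="$3"
  # Accept only intervals that divide an hour evenly, so "*/N" fires at a
  # regular cadence instead of resetting unevenly at the top of the hour.
  if [ "$minutes" -lt 1 ] || [ "$minutes" -gt 59 ] || [ $((60 % minutes)) -ne 0 ]; then
    echo "interval must be 1..59 and divide 60" >&2
    return 1
  fi
  # The single-quoted format keeps $(cat ...) literal, so the prompt file is
  # read when the job fires, not when the entry is generated.
  printf '*/%s * * * * cd %s && claude -p "$(cat %s)"\n' \
    "$minutes" "$repo_dir" "$prompt_file"
}
```

For example, `make_cron_entry 30 /repo prompt.md` prints a line that could be appended to a crontab to rerun the prompt every 30 minutes. Because each firing starts a new session, any state the task needs has to live in the repository or the prompt file.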
Backlog-draining loops (Ralph Wiggum loop)#
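Used by Matt Pocock and others. Mechanism: a shell script runs the agent with a fixed prompt that tells it to pick the next task from the backlog and complete it; the script then restarts. The backlog is a directory of markdown issue files (or GitHub issues).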
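Pocock's once.sh skeleton:

    # Gather the backlog: the contents of every markdown issue file
    issues=$(cat issues/*.md)
    # Recent history, so the agent can see what was just completed
    recent_commits=$(git log -5 --oneline)
    # The fixed prompt that tells the agent to pick and finish the next task
    prompt=$(cat prompt.md)
    # Run one agent session with edits auto-accepted (flags as given in the source skeleton)
    claude --permission-mode accept-edits "$prompt" --context "$issues" "$recent_commits"

The "loop" wrapper simply reruns once.sh until the agent emits a sentinel value (no more tasks) or the harness stops it.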
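The wrapper is not shown in the source, so the following is a minimal sketch under stated assumptions: `run_loop` is a hypothetical name, the sentinel string `NO MORE TASKS` is an assumed convention that the prompt would have to ask the agent to print, and the iteration cap stands in for "the harness stops it".

```shell
#!/usr/bin/env bash
# Assumed sentinel text; the real string is whatever prompt.md tells the agent to print.
SENTINEL="${SENTINEL:-NO MORE TASKS}"

# run_loop [once_cmd] [max_iters]
# Returns 0 when the sentinel is seen, 1 if a session fails, 2 if the cap is hit.
run_loop() {
  local once_cmd="${1:-./once.sh}"
  local max_iters="${2:-20}"
  local i=1 output
  while [ "$i" -le "$max_iters" ]; do
    # One fresh agent session per iteration; abort if the session itself fails.
    output=$("$once_cmd") || { echo "iteration $i failed" >&2; return 1; }
    printf '%s\n' "$output"
    # Stop condition 1: the agent reports an empty backlog.
    case "$output" in
      *"$SENTINEL"*) return 0 ;;
    esac
    i=$((i + 1))
  done
  # Stop condition 2: the harness-side iteration cap.
  echo "stopped after $max_iters iterations" >&2
  return 2
}
```

Calling `run_loop ./once.sh 50` would drain the backlog with at most 50 sessions. The cap matters because a sentinel the agent never prints would otherwise keep the loop running indefinitely.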
The prompt enforces AFK-only task selection: only tasks labeled AFK (as opposed to human-in-loop) are eligible.
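In the source the AFK rule lives in the prompt, so the agent does the filtering. A harness-side pre-filter that mirrors the same rule might look like the sketch below. It assumes, purely for illustration, that each issue file carries a line of the form `label: AFK`; the function name and the label format are hypothetical and not taken from Pocock's setup.

```shell
#!/usr/bin/env bash
# Hypothetical pre-filter: print the first issue file (in glob order) whose
# label line marks it as AFK. Returns non-zero when no AFK task remains.
next_afk_issue() {
  local dir="$1" file
  for file in "$dir"/*.md; do
    [ -e "$file" ] || continue            # directory had no issue files
    # -x matches the whole line, -i ignores case, so "label: afk" also counts
    # while "label: human-in-loop" does not.
    if grep -qix 'label: *afk' "$file"; then
      printf '%s\n' "$file"
      return 0
    fi
  done
  return 1
}
```

A wrapper could call `next_afk_issue issues` before each session and stop when it returns non-zero, which gives a stop condition that does not depend on the agent printing anything.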
Why loops matter#
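- Amortizes planning cost across many runs. One careful planning session (for example via Design Concept Grilling) produces a Kanban backlog (see Vertical Slice Tracer Bullets); the loop then drains it without further human input.
- Makes hours-long tasks feasible. Instead of one giant context window, a loop splits the work across many fresh sessions, each staying within the Context Window Smart Zone.
- Parallelization. Independent backlog items run concurrently in separate sandboxes. Pocock's Sandcastle library does this by creating a git worktree per issue inside Docker containers; a merger agent then reconciles the results.
- Idle compute. Boris's overnight setup runs a loop of a thousand agents during cheap idle hours: the work is not worth a human's evening, but it still produces value at agent cost.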
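The per-issue isolation behind the parallelization point can be sketched with plain `git worktree`. This is not Sandcastle's implementation (the Docker layer and the merger agent are omitted); `plan_worktrees`, the `issue/` branch prefix, and the directory layout are hypothetical. The function is a dry run that only prints the commands it would execute.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run planner: print one `git worktree add` command per
# markdown issue, so each backlog item gets its own checkout and branch.
# Assumes issue file names contain no spaces.
plan_worktrees() {
  local issues_dir="$1" base_dir="${2:-../worktrees}" file name
  for file in "$issues_dir"/*.md; do
    [ -e "$file" ] || continue            # directory had no issue files
    name=$(basename "$file" .md)          # issues/001-login.md -> 001-login
    printf 'git worktree add -b issue/%s %s/%s\n' "$name" "$base_dir" "$name"
  done
}
```

Running `plan_worktrees issues` inside a repository lists the commands for review; piping the output to `sh` would create the worktrees. Separate checkouts let concurrent agents edit files without touching each other's working trees, which leaves reconciliation as a distinct merge step.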
AFK vs human-in-loop tasks#
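Matt Pocock's key distinction:
- AFK tasks: implementation, refactoring, test scaffolding, documentation upkeep, CI auto-fixes. The agent can succeed without step-by-step approval; verification is automatic (tests, types, linters).
- Human-in-loop tasks: alignment, design choices, prioritization, QA. These have no mechanical verification; they require taste and tacit context.
Loops suit the AFK category. Trying to loop over human-in-loop work produces drift: the agent makes plausible but wrong decisions and keeps compounding them.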
Verification is the ceiling#
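Pocock's stronger claim: the quality of the feedback loops sets the ceiling on what a loop can do. Without good tests, types, and linters, a loop is "coding blind." This is the same argument about mechanical enforcement made in Agent Harness Engineering; loops simply expose the cost more starkly, because no human is there to catch drift.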
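One way to make that ceiling concrete is a verification gate that runs after every iteration. The sketch below is an assumption about how such a gate could look, not something described in the source: `verify_all` is a hypothetical helper, and the check commands in the usage note depend entirely on the project.

```shell
#!/usr/bin/env bash
# Hypothetical gate: run each check command in order and fail on the first
# one that exits non-zero. Check output is discarded; only the verdict matters.
verify_all() {
  local check
  for check in "$@"; do
    if ! sh -c "$check" >/dev/null 2>&1; then
      echo "FAILED: $check" >&2
      return 1
    fi
  done
  return 0
}
```

In a TypeScript project the call might be `verify_all "npm test" "npx tsc --noEmit" "npx eslint ."`, with the loop stopping as soon as the gate fails. A gate like this can only catch what the checks encode, which is the sense in which verification caps what the loop can safely do.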
Connection to model trajectory#
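Boris Cherny reports that Opus 4.7 starts loops spontaneously, without being prompted:
"I'll tell it, 'Go grab this data query,' and it will say, 'Hey, I noticed the data changes over time. I'm going to start a loop and give you a report every 30 minutes.'"
This fits Harness Shrinkage as Models Improve: capabilities the harness used to inject become behavior the model exhibits on its own. The loop primitive remains, but users no longer have to invoke it by hand.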
Related links#
- Claude Code Best Practices: the best-practices guide treats /loop as a core workflow primitive
- Engineer PM Convergence: generalist PM-engineers fan work out across multiple loops
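- Boris Cherny: primary advocate and daily driver
- Matt Pocock: Ralph loop and Sandcastle examples
- Harness Shrinkage as Models Improve: loops as the next-generation primitive that replaces per-step prompting
- Context Window Smart Zone: why fragmenting work into many fresh sessions beats a single long session
- Vertical Slice Tracer Bullets: what fills the backlog that the loop drains
- Design Concept Grilling: the planning step that justifies running a loop
- Deep Modules for Agents: modules with strong test boundaries make loops feasible
- Agent Harness Engineering: generalizes the "verification is the ceiling" argument
- Symphony: the daemon-driven equivalent in the orchestration layer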
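- Claude Code Auto Mode: the permission classifier that keeps accept-edits mode safe in AFK loops
- Agentic Misalignment (AM): loops plus weak per-action oversight are exactly AM's threat surface; loop reviewers depend on model-side alignment holding up when no one is watching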
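- AI Brain Fry: the human-side limit on the output multiplier: more loop output → more review → more cognitive fatigue → more missed errors
- Human-AI Accountability Redesign: loops force the redesign question into the open; span-of-control redesign is the partner that unattended loop deployments are missing
- AI-Driven Formal Proof Search: DeepMind's basic proof-search agent is literally a "Ralph loop" (huntley2025ralph): generate→compile→learn-lessons episodes run as parallel, independent subagents
- AlphaProof Nexus: its basic agent (A) is a framework for a fleet of Ralph loops; on most problems it is on par with the bespoke system
- Agent-Native Infrastructure: always-on loops operating through sensors and actuators are the runtime of Karpathy's agent-native world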
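Open questions#
- When the model schedules its own loops (the 4.7 behavior), who owns the budget? Boris answers "the model decides for itself," but that pushes the cost constraint into the model's training rather than the harness.
- Does a loop paired with a sufficiently smart model still need a Kanban backlog, or will the model pick the next task from the original goal on its own?
- Reviewing loop output is now the bottleneck, as Matt Pocock admits: "We just need to be ready to do more code review."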
Sources#
- Anthropic's Boris Cherny: Why Coding Is Solved, and What Comes Next (/loop and routines as the primary primitives)
- Full Walkthrough: Workflow for AI Coding — Matt Pocock (Ralph loop and Sandcastle architecture)
- How Anthropic's product team moves faster than anyone else | Cat Wu (Head of Product, Claude Code) (feature-level loops: CI healing, code review)
