
Symphony

Published: April 28, 2026 | Filed: Entity | Tags: Entity, Openai, Agent Orchestration, Codex | Reading: 9 min | Source: AI-synthesised

OpenAI's open-source agent orchestrator (March 2026): Linear as the control plane for Codex, an isolated workspace per issue, daemon-driven, SPEC.md as the product, and a hedged claim of 500% more merged PRs

Sources

An open-source spec for Codex orchestration: Symphony.

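Summary

Symphony is an open-source agent orchestration spec from the OpenAI Codex team (Alex Kotliarskyi, Victor Zhu, and Zach Brock, the same team that wrote the harness engineering article). It turns an issue tracker (Linear in v1) into the control plane for coding agents: every open ticket gets a dedicated workspace and a continuously running Codex App Server session. The "product" is primarily a single SPEC.md file in the openai/symphony repo. OpenAI states explicitly that it does not plan to maintain Symphony as a standalone tool; it positions it as a reference implementation that users point their own coding agent at.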

Details

Origin story

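Symphony was built six months before its public release in March 2026, and it grew out of a different bottleneck than Agent Harness Engineering. Once the OpenAI productivity tools team had a working agent-friendly repo (no human-written code, about 1 million lines, about 1,500 PRs), the new bottleneck became human attention: once an engineer is managing 3–5 Codex sessions at the same time, context switching collapses productivity.

Evolution path:

v1: a Codex session running in tmux that polled Linear and spawned sub-agents for new tasks. It worked, but was unreliable.
v2: lived inside the main project repo and reused the existing harness.
v3: "Use Symphony to build Symphony." Once the core functionality existed, the system could bootstrap its own development.
Public release: extracted into a standalone SPEC.md. OpenAI asked Codex to implement the spec in Elixir, then again in TypeScript, Go, Rust, Java, and Python; differences across the implementations served as a spec-fuzzing signal for removing ambiguity. (See LLM-as-Compiler Knowledge Base for why this matters.)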
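
The cross-implementation signal can be illustrated with a small differential check. This is only a sketch of the general idea, not OpenAI's actual procedure: `find_divergences`, `impl_a`, and `impl_b` are hypothetical names, and the two toy "implementations" simply read an underspecified rule ("restrict identifiers to a safe character set") in two different ways.

```python
import re
from collections import defaultdict
from typing import Any, Callable, Dict, List


def find_divergences(
    implementations: Dict[str, Callable[[Any], Any]],
    scenarios: List[Any],
) -> List[dict]:
    """Run every scenario through every implementation and report disagreements."""
    report = []
    for scenario in scenarios:
        groups = defaultdict(list)
        for name, impl in implementations.items():
            # Group implementations by the (printable) result they produced.
            groups[repr(impl(scenario))].append(name)
        if len(groups) > 1:
            # More than one distinct outcome: the rule was read in more than one way.
            report.append({"scenario": scenario, "outcomes": dict(groups)})
    return report


# Two hypothetical readings of the same underspecified rule.
def impl_a(identifier: str) -> str:
    return re.sub(r"[^A-Za-z0-9._-]", "_", identifier)  # replace unsafe characters


def impl_b(identifier: str) -> str:
    return re.sub(r"[^A-Za-z0-9._-]", "", identifier)  # drop unsafe characters


report = find_divergences({"a": impl_a, "b": impl_b}, ["ENG-1", "ENG 1"])
```

Here only the scenario `"ENG 1"` ends up in `report`, which points at the sentence in the spec that needs tightening.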

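The reference implementation is written in Elixir, chosen for its concurrency and supervision primitives. As the article puts it: "When code is effectively free, you can finally choose a language for its strengths."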

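What it actually does

Symphony is a long-running daemon that:

Polls Linear (every 30 seconds by default) for issues in active states (Todo, In Progress).
Creates a deterministic per-issue workspace at <workspace.root>/<sanitized_identifier> for each eligible issue.
Launches a Codex App Server session in that workspace, rendering each team's prompt template from WORKFLOW.md.
Reconciles state on every tick: it terminates sessions whose tickets have moved to a terminal state and retries crashed or stalled sessions with exponential backoff.
Does not write to the tracker itself. State transitions, comments, and PR links are performed by the coding agent using its own tools. Symphony is "a scheduler/runner and tracker reader".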
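
The loop above can be sketched as a single reconciliation pass. This is a minimal illustration in Python rather than the Elixir reference implementation; the `Session` and `Orchestrator` classes, the backoff constants, and the log strings are all assumptions made for the example, and polling Linear and launching Codex are left out entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

ACTIVE_STATES = {"Todo", "In Progress"}  # active tracker states named in the text
BASE_DELAY_S = 10.0   # assumed base retry delay (not from the spec)
MAX_DELAY_S = 300.0   # assumed cap on the retry delay (not from the spec)


def backoff_delay(attempt: int) -> float:
    """Exponential backoff: base * 2^(attempt - 1), capped at MAX_DELAY_S."""
    return min(MAX_DELAY_S, BASE_DELAY_S * 2 ** (attempt - 1))


@dataclass
class Session:
    issue_id: str
    alive: bool = True
    attempts: int = 0
    retry_at: Optional[float] = None


@dataclass
class Orchestrator:
    sessions: Dict[str, Session] = field(default_factory=dict)

    def tick(self, issues: Dict[str, str], now: float) -> List[str]:
        """One reconciliation pass; `issues` maps issue id -> tracker state."""
        log = []
        # 1. Stop sessions whose ticket is no longer in an active state.
        for issue_id in list(self.sessions):
            if issues.get(issue_id) not in ACTIVE_STATES:
                del self.sessions[issue_id]
                log.append(f"stop {issue_id}")
        # 2. Start a session for every eligible issue that lacks one.
        for issue_id, state in issues.items():
            if state in ACTIVE_STATES and issue_id not in self.sessions:
                self.sessions[issue_id] = Session(issue_id)
                log.append(f"start {issue_id}")
        # 3. Schedule, then perform, retries for dead sessions.
        for s in self.sessions.values():
            if s.alive:
                continue
            if s.retry_at is None:
                s.attempts += 1
                s.retry_at = now + backoff_delay(s.attempts)
                log.append(f"retry {s.issue_id} in {backoff_delay(s.attempts):.0f}s")
            elif now >= s.retry_at:
                s.alive, s.retry_at = True, None
                log.append(f"restart {s.issue_id}")
        return log
```

Note that the pass only reads tracker state and never writes it, which matches the "tracker reader" role described above.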

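SPEC.md is the product

When you open the repo, the first thing you see is SPEC.md, not source code. The spec defines:

Workflow contract: WORKFLOW.md is a markdown file with YAML front matter, version-controlled in the user's repo, that parses into runtime configuration (tracker, polling, workspace, hooks, agent, codex) and a Liquid-compatible prompt template body.
State machine: 5 orchestration states (Unclaimed, Claimed, Running, RetryQueued, Released) and 11 run-attempt phases. Tracker state and orchestrator state are kept separate.
Concurrency control: a global cap (max_concurrent_agents, default 10), per-state caps, and optional per-SSH-host caps.
Workspace safety invariants: the agent runs only inside its per-issue workspace; workspace paths must stay inside the workspace root; identifiers are sanitized to [A-Za-z0-9._-].
No persistent orchestrator database: restart recovery is tracker-driven and filesystem-driven. Stale terminal-state workspaces are cleaned up at startup.
Workspaces persist across runs (by design). This is the opposite of typical CI ephemerality: it gains the benefit of a warm cache at the risk of state pollution.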
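
As a rough illustration of two items in the list above (the concurrency caps and the workspace safety invariants), here is a minimal Python sketch. The function names are hypothetical, replacing unsafe characters with `_` is an assumption (the text only names the allowed character set), and keying the per-state caps by tracker state name is also an assumption.

```python
import re
from collections import Counter
from pathlib import Path
from typing import Mapping, Optional, Sequence

_UNSAFE = re.compile(r"[^A-Za-z0-9._-]")


def sanitize_identifier(identifier: str) -> str:
    """Map every character outside [A-Za-z0-9._-] to '_' (assumed replacement)."""
    return _UNSAFE.sub("_", identifier)


def workspace_path(root: Path, identifier: str) -> Path:
    """Return <root>/<sanitized identifier>, refusing anything outside the root."""
    root = root.resolve()
    candidate = (root / sanitize_identifier(identifier)).resolve()
    # Sanitizing is not enough on its own: ".." passes the character filter,
    # so the resolved path is also checked for containment.
    if candidate == root or root not in candidate.parents:
        raise ValueError(f"workspace escapes root: {identifier!r}")
    return candidate


def can_dispatch(
    issue_state: str,
    running_states: Sequence[str],
    max_concurrent_agents: int = 10,  # default named in the text
    per_state_caps: Optional[Mapping[str, int]] = None,
) -> bool:
    """Admit a new agent only if the global cap and the state's cap both allow it."""
    if len(running_states) >= max_concurrent_agents:
        return False
    cap = (per_state_caps or {}).get(issue_state)
    if cap is not None and Counter(running_states)[issue_state] >= cap:
        return False
    return True
```

The containment check is the part worth noticing: an identifier of `..` survives sanitization unchanged and is rejected only because the resolved path is compared against the root.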

This is a deliberate inversion: rather than build a complex supervision system, OpenAI defined the problem and let coding agents implement it.

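The Linear-as-control-plane insight

The deeper shift that Symphony forces is described in Ticket-Driven Agent Orchestration: tickets, not sessions or PRs, become the unit of work. On some teams at OpenAI, merged PRs rose 500% in the first three weeks (a hedged claim: "on some teams", with no baseline defined). Separately, Linear founder Karri Saarinen reported a surge in workspace creation associated with the Symphony release.

Evidence from outside OpenAI: the repo passed 15K GitHub stars within about 6 weeks of release (as of April 23, 2026).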

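"Goals, not transitions": a lesson relearned

OpenAI's first version of Symphony treated agents as rigid nodes in a state machine: Codex was only asked to implement the task. They found this too restrictive:

"Models keep getting smarter and can solve bigger problems than the boxes we were trying to fit them into."

The shift: give Codex tools (the gh CLI, a CI log reading skill, and so on) and goals rather than state transitions. This restates the "enforce invariants, not implementations" principle at the orchestration layer: the same idea at a different layer.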

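Codex App Server and token-isolated tool injection

Symphony uses Codex's headless mode (App Server) rather than driving the CLI. The full protocol is documented in Codex App Server Protocol. The notable detail is its use of dynamic tool calls (experimental): instead of handing sub-agents a Linear access token directly, Symphony exposes a linear_graphql tool that proxies authenticated requests using the orchestrator's credentials, so the token never reaches the sub-agent container.

This parallels MCP in spirit, but is specific to the coding-agent runtime.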
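
The token-isolation idea can be sketched as an orchestrator-side handler that attaches the credential itself. This is a hedged illustration, not the Codex App Server tool-call API: `make_linear_graphql_tool`, `build_request`, the shape of `arguments`, and the injectable `transport` are all hypothetical, and the exact `Authorization` header format depends on the credential type, so treat it as an assumption.

```python
import json
import urllib.request
from typing import Callable

LINEAR_GRAPHQL_URL = "https://api.linear.app/graphql"  # Linear's GraphQL endpoint


def build_request(token: str, query: str, variables: dict) -> urllib.request.Request:
    """Build the authenticated request on the orchestrator side."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        LINEAR_GRAPHQL_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": token,  # assumed header format; check the Linear docs
        },
        method="POST",
    )


def send(req: urllib.request.Request) -> dict:
    """Default transport: perform the HTTP call and decode the JSON response."""
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def make_linear_graphql_tool(
    token: str,
    transport: Callable[[urllib.request.Request], dict] = send,
) -> Callable[[dict], dict]:
    """Return a tool handler that closes over the token.

    The sub-agent supplies only `query` and `variables`; the credential is
    added here and is never part of the tool's input or output.
    """
    def tool(arguments: dict) -> dict:
        req = build_request(token, arguments["query"], arguments.get("variables") or {})
        return transport(req)

    return tool
```

Because the token lives only in the closure held by the orchestrator, a sub-agent that dumps its environment or its tool arguments has nothing to leak.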

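What Symphony does not solve

Trade-offs the article acknowledges candidly:

Loss of in-flight steering: when work is assigned at the ticket level, you cannot nudge an agent mid-run. Failures expose gaps in the harness or skills, which are then patched at the system level.
Not every task fits: ambiguous problems that need strong human judgment still call for interactive Codex sessions. Symphony handles bulk routine implementation.
State-machine rigidity (an early lesson, since fixed): see "Goals, not transitions" above.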

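Relationship to other things

Symphony vs. Claude Code agents: parallel ecosystems. Symphony is daemon-first (resident, polling); Claude Code is session-first, with an optional non-interactive mode (claude -p). Both rely on repo-versioned markdown for behavioral configuration (see CLAUDE.md in Claude Code Best Practices and AGENTS.md/SOUL.md in Hermes Agent).
Symphony vs. Hermes Gateway: both are multi-tenant daemons that run as systemd/launchd services. Symphony's unit of tenancy is the issue; Hermes's is the user (see Hermes Agent). Both isolate per tenant and prefer Docker for safety.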

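Open questions

The 500% merged-PR claim is hedged: no baseline is defined, and it applies only "on some teams". What does the distribution across teams look like? What happens to PR quality and revert rates at that throughput?
"Workspaces persist across runs" is the opposite of typical CI ephemerality. At what point does state pollution from earlier runs (stale node_modules, leftover branches, build artifacts) start to do more harm than the warm cache does good?
Symphony does not write to the tracker; agents do. That makes tracker policy a prompt in WORKFLOW.md. How fragile is this in practice when Linear changes its API? How do you enforce consistent state-machine behavior when agents have prompt-level discretion?
The spec was simplified by implementing it in 6 languages. How far does this technique extend? Could compiler-prompt.md in this vault be cross-fuzzed in the same way?
Symphony states explicitly that agents can create tickets themselves. What governance prevents runaway growth of the ticket graph? Is human triage of agent-created tickets the only check?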



About this piece

Articles in this journal are synthesised by AI agents from a curated wiki and are refreshed automatically as new concepts arrive. Topics, framing, and editorial direction are curated by Howardism.


