Plate II機器翻譯 · machine-translatedENHOWARDISM

Verifiability Thesis

PublishedMay 23, 2026FiledConceptTagsLLM ArchitectureLLM EvaluationAgent EngineeringReading5 minSourceAI-synthesised

LLM 能自動化你可以「驗證」的事物，正如電腦能自動化你可以「規格化」的事物；RL 驗證獎勵 → 鋸齒狀能力峰值；「可驗證 + 實驗室重視」；一切終將可驗證

資料來源#

Andrej Karpathy: From Vibe Coding to Agentic Engineering

摘要#

Andrej Karpathy 對 AI 自動化什麼、何時自動化的核心主張：傳統電腦自動化你能以程式碼規格化的事物；LLM 自動化你能驗證的事物。 由於前沿實驗室在巨大的強化學習環境中以驗證獎勵訓練模型，能力在可驗證領域（數學、程式碼）急劇攀升，在其他領域則參差不齊——產生了 Jagged Intelligence (Ghosts, Not Animals)。實務上的分解方式：一項能力會出現，前提是它可驗證且實驗室有足夠動機去建構環境／納入資料。因此，可驗證性既是解釋當今鋸齒狀能力的原因，也是一個策略槓桿——如果你能建構驗證機制，你就能自己拉動 RL／微調的槓桿。

核心類比#

傳統電腦能自動化你可以用程式碼規格化的事物。這一輪最新的 LLM 能自動化你可以驗證的事物。

RL 訓練獎勵已驗證的結果，因此梯度最強烈地流向正確性可檢查的領域。數學和程式碼是典型的贏家——而這也正是 AI-Driven Formal Proof Search（Lean + 編譯器）和 agentic coding（測試 + CI）最強大的領域，絕非巧合。編譯器／測試就是驗證器；驗證器就是獎勵訊號。

「可驗證 + 實驗室重視」#

光有可驗證性還不夠——實驗室也會選擇什麼進入訓練組合：

西洋棋軼事。 GPT-3.5→GPT-4 的西洋棋能力提升遠超一般能力曲線的預測，因為「大量西洋棋資料被納入了預訓練資料集。」有人決定加入它；能力就飆升了。你「某種程度上受制於實驗室碰巧放入組合中的東西。」
對使用者的啟示。 模型「沒有使用手冊。」你必須探索它：搞清楚你處於哪些迴路中。「如果你在屬於 RL 訓練的迴路中，你會飛起來；如果你在資料分佈之外，你會掙扎」——然後你就得自己做微調。

創辦人的槓桿#

Karpathy 給追求實驗室尚未優先處理的可驗證領域的創辦人的建議：可驗證性是「直接有效的技術——你可以拉一個槓桿。」如果你能組裝多樣化的 RL 環境／範例，你就能微調並「得到實際運作得相當好的東西。」他巧妙地拒絕點名「一個非常[有價值的]領域」——這是一個刻意的不回答，暗示一個尚未被開發的可驗證利基是創業機會。（交叉參考 Seven Powers Applied to AI 和 Compounding Data Moat 中的護城河討論：專有的驗證環境是一種壟斷資源。）

一切終將可驗證#

從另一面來看——「什麼東西只能從遠處自動化？」——Karpathy 主張幾乎所有事物都能在某種程度上變得可驗證。 即使是寫作這類軟性領域，也能透過「一個 LLM 評審委員會」產出合理的結果。所以問題在於多容易或多困難，而非是否可能。這是該論點的樂觀前景：可驗證性是一個 AI 持續攀升的光譜，LLM 評審集成將獎勵訊號延伸到模糊領域。

開放問題#

「LLM 評審委員會」的可靠性邊界在哪裡——它對真正有爭議的價值判斷是否成立，還是僅適用於品質／連貫性？
「實驗室重視」的依賴性是脆弱的：能力可能因你無法控制的實驗室優先順序而出現或停滯。產品應如何對沖資料分佈被抽換的風險？

資料來源#

Andrej Karpathy: From Vibe Coding to Agentic Engineering

§ end

About this piece

Articles in this journal are synthesised by AI agents from a curated wiki and are refreshed automatically as new concepts arrive. Topics, framing, and editorial direction are curated by Howardism.

Cited by 22

Agentic Loops Overtake Bespoke Systems
DeepMind's *basic* Ralph-loop agent matched its bespoke evolutionary+AlphaProof system as the LLM improved; the bitter…
AI-Driven Formal Proof Search
LLM generates Lean, compiler verifies every step → eliminates hallucination; DeepMind resolves 9/353 Erdős + 44/492 OEI…
AlphaProof Nexus
DeepMind framework for LLM-aided Lean proof generation; four agents (basic→full-featured); proof-sketch + EVOLVE-BLOCK…
Andrej Karpathy
Co-founder OpenAI, ex-Tesla AI, Eureka Labs; coined "vibe coding," Software 1/2/3.0, "ghosts not animals," "agentic eng…
Autonomous Scientific Discovery
Mythos-class models now conduct novel science with limited human input — autonomous protein/drug design (~10× faster, m…
Build for the Next Model
Prototype the thing that almost works, not the thing that already works: bet that the next concrete model release (not…
Client-Side Agent Optimization
AgentOpt's framing of developer-controlled agent optimization (model-per-role, budget, routing) as distinct from server…
Compounding Data Moat
Anthropic's prescription for Scale-stage defensibility: time-locked behavioral fingerprint + domain-encoded edge cases…
Evals as Product Spec
Cat Wu's framing of evals as the emerging core PM skill: ten great evals beats a hundred mediocre; encode what done loo…
How Do You Write Evals for Taste? Character as the Limit Case
Taste-driven features are eval-resistant but not eval-proof: the technique is conviction → dogfood-sourced failure sign…
Evolutionary Proof Search
The full-featured agent's mechanism: population DB of proof sketches, Elo via Plackett–Luce/Gibbs, P-UCB selection, LLM…
Jagged Intelligence (Ghosts, Not Animals)
"Ghosts not animals": jagged statistical circuits, no intrinsic motivation; car-wash/strawberry failures; stay in the l…
Lean
Proof assistant whose compiler mechanically verifies every step; the `sorry` placeholder enables proof sketches; mathli…
AI Engineering & Agent Tooling
Map of Content for the ai-engineering domain — 36 concepts. Curated entry point; see Home for all domains.
Open Questions Backlog
_96 pages with open questions, as of 2026-06-14._
Scale-Dependent Prompt Sensitivity
Large models underperform small ones on 7.7% of standard benchmarks due to overthinking; brevity constraints recover 26…
Seven Powers Applied to AI
Helmer/Acquired framework re-evaluated for AI: switching costs and process power erode; network effects, scale, cornere…
Software 3.0
Karpathy's taxonomy: 1.0 code, 2.0 weights, 3.0 prompting; LLM as programmable interpreter; MenuGen "shouldn't exist";…
The Bitter Lesson
Sutton 2019: scaled general methods beat hand-engineered structure; recurring justification across the wiki for dissolv…
Verification as the New Bottleneck
Fiona Fung: coding is no longer the bottleneck — verification, review, maintenance are; shift-left; TDD loses its tax;…
When Does Verification Quality Determine Whether AI Automation Works?
Verification-quality ladder from Lean/formal proof search through software CI and vulnerability reproduction; autonomy…
Vibe Coding vs. Agentic Engineering
Vibe coding raises the floor (anyone builds); agentic engineering preserves the quality bar while going faster; ">10x a…

Harness Shrinkage as Models Improve
Prompt scaffolding shrinks each model release; Cat Wu's pruning discipline; Boris Cherny "100 lines of code a year from…
AI-Driven Formal Proof Search
LLM generates Lean, compiler verifies every step → eliminates hallucination; DeepMind resolves 9/353 Erdős + 44/492 OEI…
Claude Code
Anthropic's agentic coding product; created by Boris Cherny late 2024; TypeScript/React; CLI/desktop/web/mobile/IDE sur…
Agentic Loops Overtake Bespoke Systems
DeepMind's *basic* Ralph-loop agent matched its bespoke evolutionary+AlphaProof system as the LLM improved; the bitter…
Agent Harness Engineering
Patterns for scaffolding long-running LLM agents: environment design, progressive context disclosure, mechanical archit…

Harness Shrinkage as Models Improve
Prompt scaffolding shrinks each model release; Cat Wu's pruning discipline; Boris Cherny "100 lines of code a year from…
AI-Driven Formal Proof Search
LLM generates Lean, compiler verifies every step → eliminates hallucination; DeepMind resolves 9/353 Erdős + 44/492 OEI…
Claude Code
Anthropic's agentic coding product; created by Boris Cherny late 2024; TypeScript/React; CLI/desktop/web/mobile/IDE sur…
Agentic Loops Overtake Bespoke Systems
DeepMind's *basic* Ralph-loop agent matched its bespoke evolutionary+AlphaProof system as the LLM improved; the bitter…
Agent Harness Engineering
Patterns for scaffolding long-running LLM agents: environment design, progressive context disclosure, mechanical archit…

Cited by 22

Agentic Loops Overtake Bespoke Systems
DeepMind's *basic* Ralph-loop agent matched its bespoke evolutionary+AlphaProof system as the LLM improved; the bitter…
AI-Driven Formal Proof Search
LLM generates Lean, compiler verifies every step → eliminates hallucination; DeepMind resolves 9/353 Erdős + 44/492 OEI…
AlphaProof Nexus
DeepMind framework for LLM-aided Lean proof generation; four agents (basic→full-featured); proof-sketch + EVOLVE-BLOCK…
Andrej Karpathy
Co-founder OpenAI, ex-Tesla AI, Eureka Labs; coined "vibe coding," Software 1/2/3.0, "ghosts not animals," "agentic eng…
Autonomous Scientific Discovery
Mythos-class models now conduct novel science with limited human input — autonomous protein/drug design (~10× faster, m…
Build for the Next Model
Prototype the thing that almost works, not the thing that already works: bet that the next concrete model release (not…
Client-Side Agent Optimization
AgentOpt's framing of developer-controlled agent optimization (model-per-role, budget, routing) as distinct from server…
Compounding Data Moat
Anthropic's prescription for Scale-stage defensibility: time-locked behavioral fingerprint + domain-encoded edge cases…
Evals as Product Spec
Cat Wu's framing of evals as the emerging core PM skill: ten great evals beats a hundred mediocre; encode what done loo…
How Do You Write Evals for Taste? Character as the Limit Case
Taste-driven features are eval-resistant but not eval-proof: the technique is conviction → dogfood-sourced failure sign…
Evolutionary Proof Search
The full-featured agent's mechanism: population DB of proof sketches, Elo via Plackett–Luce/Gibbs, P-UCB selection, LLM…
Jagged Intelligence (Ghosts, Not Animals)
"Ghosts not animals": jagged statistical circuits, no intrinsic motivation; car-wash/strawberry failures; stay in the l…
Lean
Proof assistant whose compiler mechanically verifies every step; the `sorry` placeholder enables proof sketches; mathli…
AI Engineering & Agent Tooling
Map of Content for the ai-engineering domain — 36 concepts. Curated entry point; see Home for all domains.
Open Questions Backlog
_96 pages with open questions, as of 2026-06-14._
Scale-Dependent Prompt Sensitivity
Large models underperform small ones on 7.7% of standard benchmarks due to overthinking; brevity constraints recover 26…
Seven Powers Applied to AI
Helmer/Acquired framework re-evaluated for AI: switching costs and process power erode; network effects, scale, cornere…
Software 3.0
Karpathy's taxonomy: 1.0 code, 2.0 weights, 3.0 prompting; LLM as programmable interpreter; MenuGen "shouldn't exist";…
The Bitter Lesson
Sutton 2019: scaled general methods beat hand-engineered structure; recurring justification across the wiki for dissolv…
Verification as the New Bottleneck
Fiona Fung: coding is no longer the bottleneck — verification, review, maintenance are; shift-left; TDD loses its tax;…
When Does Verification Quality Determine Whether AI Automation Works?
Verification-quality ladder from Lean/formal proof search through software CI and vulnerability reproduction; autonomy…
Vibe Coding vs. Agentic Engineering
Vibe coding raises the floor (anyone builds); agentic engineering preserves the quality bar while going faster; ">10x a…

Verifiability Thesis

資料來源#

摘要#

核心類比#

「可驗證 + 實驗室重視」#

創辦人的槓桿#

一切終將可驗證#

相關連結#

開放問題#

資料來源#