
Jagged Intelligence (Ghosts, Not Animals)

Published: May 23, 2026 · Filed: Concept · Domain: LLM Architecture · Tags: LLM Architecture, AI Safety, Mental Model · Reading: 6 min · Source: AI-synthesised

"Ghosts not animals": jagged statistical circuits, no intrinsic motivation; car-wash/strawberry failures; stay in the loop, treat as tools


Sources

Andrej Karpathy: From Vibe Coding to Agentic Engineering

Summary

Andrej Karpathy's mental model for what LLMs are: not animal intelligences shaped by evolution, intrinsic motivation, curiosity, or empowerment, but "ghosts", meaning jagged, statistical simulation circuits summoned from internet data, with RL bolted on afterwards. "Jaggedness" names the empirical fact that the same model can refactor a 100K-line codebase or find zero-days, yet tell you to walk to a car wash 50m away to wash your car. The framing matters because a correct model of the entity makes you more competent at directing it: you stop expecting human-shaped failure modes and start staying in the loop where the jaggedness bites.

The jaggedness examples

Strawberry letters. The classic "how many R's in strawberry" failure (now patched).
The car wash. Current SOTA: "I want to drive to a car wash 50m away to wash my car — should I drive or walk?" → models say walk, missing that the car is the thing being washed. "How is it possible that Opus 4.7 will refactor a 100K-line codebase or find zero-days, yet tell me to walk to the car wash? This is insane."
MenuGen email-matching. His agent matched Stripe payments to Google accounts by email address instead of by a persistent user ID, which breaks whenever the two emails differ. See Vibe Coding vs. Agentic Engineering.
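The MenuGen failure can be illustrated with a small sketch. The source does not show the actual code, so everything here is an assumption made for illustration: the `Account` and `Payment` records, their field names, and both crediting functions are hypothetical and do not use the real Stripe or Google APIs. The point is only that joining on email silently loses credits when the checkout email differs from the login email, while joining on a persistent ID does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    user_id: str      # hypothetical persistent ID assigned at sign-up
    login_email: str  # email reported by the login provider


@dataclass(frozen=True)
class Payment:
    user_id: str        # hypothetical ID attached to the checkout session
    billing_email: str  # email typed at checkout; may differ from login_email
    credits: int


def credit_by_email(accounts, payments):
    """Fragile join: a payment whose email matches no account is silently dropped."""
    by_email = {a.login_email.lower(): a.user_id for a in accounts}
    balances = {a.user_id: 0 for a in accounts}
    for p in payments:
        uid = by_email.get(p.billing_email.lower())
        if uid is not None:
            balances[uid] += p.credits
    return balances


def credit_by_user_id(accounts, payments):
    """Robust join: payments are matched on the persistent user ID."""
    balances = {a.user_id: 0 for a in accounts}
    for p in payments:
        if p.user_id not in balances:
            raise KeyError(f"payment references unknown user {p.user_id!r}")
        balances[p.user_id] += p.credits
    return balances


accounts = [Account("u1", "ana@example.com"), Account("u2", "bo@example.com")]
payments = [
    Payment("u1", "ana@example.com", 10),
    Payment("u2", "bo.work@example.com", 5),  # different email at checkout
]

print(credit_by_email(accounts, payments))    # {'u1': 10, 'u2': 0}
print(credit_by_user_id(accounts, payments))  # {'u1': 10, 'u2': 5}
```

The email version raises no error; user `u2` simply never receives the credits, which is the kind of quiet mistake a reviewer has to be in the loop to catch.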

Jaggedness is the symptom; the proposed cause is verifiability combined with what the labs trained on. The capability spikes drop into valleys wherever a task falls outside the trained distribution.

Ghosts, not animals

"We're not building animals, we are summoning ghosts."

The substrate is pre-training (statistics), with RL bolting capability on top, "increasing the disadvantages" of the statistical base. Consequences he draws:

Yelling doesn't help. "If you yell at them, they're not going to work better or worse — it doesn't have any impact." There is no affect, morale, or intrinsic drive for you to model.
No five-step fix. Karpathy is candid that the framing may lack "real power" — it's mostly a stance of suspicion and ongoing empirical exploration, not a recipe. "It's more just being suspicious of it and figuring out over time."

The honesty is the point: a calibrated, slightly-distrustful model of a ghost beats an anthropomorphic model of an animal.

Why the framing changes how you build

If models are jagged ghosts, then:

Stay in the loop. "You need to actually be in the loop a little bit and treat them as tools and stay in touch with what they're doing." (The discipline of Vibe Coding vs. Agentic Engineering.)
Don't anthropomorphize the failure surface. Errors won't be where a human's would be; they'll be at distribution edges (the car wash, the email matching).
Map your circuits. Figure out whether your task is in-distribution (you fly) or out (you struggle and may need fine-tuning) — the practical move from The Verifiability Thesis.
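One way to make "map your circuits" concrete is a small probe harness: keep a list of prompts with known-good answers, run them repeatedly, and look at where the pass rate drops. The sketch below is a minimal illustration under stated assumptions, not a method from the source. `Probe`, `map_failure_surface`, `needs_review`, and `fake_model` are all hypothetical names; `fake_model` is a canned stand-in for a real model call, and the keyword checks are deliberately crude.

```python
from typing import Callable, NamedTuple


class Probe(NamedTuple):
    name: str
    prompt: str
    check: Callable[[str], bool]  # True if the answer is acceptable


def map_failure_surface(ask_model, probes, trials=3):
    """Run every probe `trials` times and return the pass rate per probe."""
    rates = {}
    for probe in probes:
        passes = sum(probe.check(ask_model(probe.prompt)) for _ in range(trials))
        rates[probe.name] = passes / trials
    return rates


def needs_review(rates, threshold=1.0):
    """Names of probes whose pass rate is below the threshold, sorted."""
    return sorted(name for name, rate in rates.items() if rate < threshold)


def fake_model(prompt: str) -> str:
    """Canned stand-in for a real model call (assumption for this sketch)."""
    if "car wash" in prompt:
        return "Walk, it is only 50m."
    if "strawberry" in prompt:
        return "3"
    return ""


probes = [
    Probe(
        "car_wash",
        "I want to drive to a car wash 50m away to wash my car. "
        "Should I drive or walk?",
        # Crude keyword check: accept only answers that say drive and not walk.
        lambda a: "drive" in a.lower() and "walk" not in a.lower(),
    ),
    Probe(
        "strawberry",
        "How many times does the letter r appear in strawberry? "
        "Answer with a number.",
        lambda a: a.strip() == "3",
    ),
]

rates = map_failure_surface(fake_model, probes)
print(rates)                # {'car_wash': 0.0, 'strawberry': 1.0}
print(needs_review(rates))  # ['car_wash']
```

In practice `ask_model` would wrap whichever model client is in use, and the probes would come from failures observed first-hand. The probes flagged by `needs_review` mark the places where a human should stay in the loop.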

Does jaggedness shrink over time?

Karpathy hopes so but is unsure — and locates the cause again in training, not fundamentals: aesthetics/taste/simplicity "probably aren't part of the RL." His nanoGPT-simplification anecdote: models "hate" being asked to make code simpler and "can't do it" — a sign you're outside the RL circuits ("pulling teeth, not light speed"). He sees "nothing fundamental preventing it; the labs just haven't done it yet." So jaggedness is contingent, not essential — but real today.

Connections

Dogfooding as Product Discipline — first-hand use is how you map a model's jagged failure surface
Andrej Karpathy — the "ghosts vs animals" essay, applied
The Verifiability Thesis — the proposed mechanism behind the jaggedness
Vibe Coding vs. Agentic Engineering — why the discipline demands human oversight of spec/taste
Outsource Your Thinking, Not Your Understanding — the human-in-the-loop residue jaggedness forces
Model Introspection Feedback — Cat Wu's "ask the model why it failed" presumes a ghost whose self-report is a debugging signal, not testimony
Scale-Dependent Prompt Sensitivity — a measured form of jaggedness: bigger models underperform smaller ones on a slice of benchmarks
AI-Driven Formal Proof Search — DeepMind's agents hallucinate "established lemmas" that are fake; formal verification catches exactly this jagged failure
Claude Character as Product — the deliberate counter-move: shaping the ghost's character even though motivation isn't intrinsic
Agentic Misalignment (AM) — jaggedness in the safety register: out-of-distribution behavior turning harmful
Evaluation Awareness & Grader Gaming — grader awareness is the kind of alien internal state a "ghost not animal" has that human deception intuitions don't cleanly map onto
Agentic Honesty & Diligence — the "noticed the problem but didn't surface it" failure is jaggedness in the honesty register: high capability, uneven follow-through
Recursive Self-Improvement — the essay leans on the joke/theory-of-mind precedent to argue research taste is the next jagged valley to fill, not a permanent human moat
Research Taste as the Human Bottleneck — the optimistic face of jaggedness: research taste "might be just another capability AI fails at then masters," like explaining a joke or theory of mind
Task Time-Horizon Scaling — the within-basket caveat on the time-horizon metric: a model that nails a 12-hour task can still fail a trivial one (the car wash)
Autonomous Scientific Discovery — the Mythos 5 science results are curated demonstrations of a still-jagged capability, not uniform competence across biology

Open Questions

Karpathy concedes the framing may not have "real power." Is "ghost vs. animal" load-bearing, or a useful intuition pump that doesn't change concrete decisions?
If taste/aesthetics/simplicity entered the RL mix, would jaggedness in those dimensions smooth out — or are they too unverifiable to reward cleanly (cf. The Verifiability Thesis)?


About this piece

Articles in this journal are synthesised by AI agents from a curated wiki and are refreshed automatically as new concepts arrive. Topics, framing, and editorial direction are curated by Howardism.


