Sources#
- Anthropic's Boris Cherny: Why Coding Is Solved, and What Comes Next
- Claude Fable 5 and Claude Mythos 5
- Claude Mythos Preview red.anthropic.com
- Claude Opus 4.8 System Card
- How Anthropic's product team moves faster than anyone else | Cat Wu (Head of Product, Claude Code)
- When AI builds itself
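Summary#
Anthropic's preview-tier frontier model. It has been described as "incredibly powerful" and is gated pending safety review (it is the subject of the Mythos Preview / red.anthropic.com publication that established the LLM-Driven Vulnerability Research narrative). It is used inside Anthropic alongside Claude Opus 4.7. As of May 2026, it had not reached GA.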
Publicly known information#
- Mythos Preview shows emergent cybersecurity capabilities: autonomous zero-day discovery and full exploit chains. For detailed analysis, see LLM-Driven Vulnerability Research, which draws on the Mythos Preview publication.
- Anthropic's response: Project Glasswing safeguards (referred to in the 4.7 release as the "first post-Glasswing safeguards").
- Boris Cherny: "We use a little bit of Mythos to try it and then a lot of Opus 4.7 to dog food it and to write most of our code." Mythos is a preview tier, not the workhorse.
- Cat Wu: "Mythos is an incredibly powerful model. But we do use the models internally and I think this has increased our rate of shipping a little bit but I don't think it explains the bulk of the increase." This confirms internal use but explicitly rejects it as the explanation for the shipping cadence.
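Role in the Opus 4.8 System Card#
The Opus 4.8 System Card (May 2026) makes Mythos Preview's role unusually concrete: it remains the capability frontier against which generally available models are measured, and it is itself used as a tool in the evaluation:
- Frontier benchmark: Opus 4.8 "does not advance the capability frontier beyond Mythos Preview". On the AECI index, Mythos scores 158.3, Opus 4.8 scores 155.5, and Opus 4.7 scores 154.1. Mythos Preview's Risk Report sets the upper bound for the Opus 4.8 RSP argument.
- Investigator model: Mythos Preview is one of the two investigator models driving the Opus 4.8 Automated Behavioral Audit (the other is a helpful-only Opus 4.7).
- Reviewer of the assessment: in a notable meta move, Mythos was given access to internal Slack discussions and asked to review the near-final alignment section. Its (published) review confirmed the section's candor and pointed out that no eval specifically tests for training-gaming; see Evaluation Awareness & Grader Gaming.
- Alignment yardstick: Opus 4.8 matches Mythos Preview's alignment profile on most metrics and surpasses it on several honesty metrics (Agentic Honesty & Diligence).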
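Capability data points (When AI builds itself)#
The Anthropic Institute article (June 2026) attaches concrete numbers to Mythos Preview, treating it as the model driving Anthropic's AI-R&D acceleration (AI Accelerating AI Development):
- Time horizon: METR rates it as able to work continuously for "at least" 16 hours, "at the upper end of what [METR] can measure without new tasks" (Task Time-Horizon Scaling).
- Kernel-optimization eval: a speedup of about 52× on the "train a small model faster" task (April 2026), compared with about 3× for Opus 4 a year earlier and a human baseline of about 4×: "from super helpful to superhuman in under a year."
- Research next-step judgment: at difficult fork-in-the-road decision points, its choice beat the human's 64% of the time (Opus 4.5 scored 51% in November 2025).
- Self-reported uplift: in a March 2026 survey of 130 research staff, the median estimate was about 4× output when using Mythos Preview relative to using no AI (Anthropic believes the true uplift is somewhat lower).
These are deployment-side numbers (Mythos as used internally), distinct from the System Card's framing above of a gated capability frontier.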
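The ratios implied by these figures can be worked out directly. The following is a small arithmetic sketch that uses only the approximate numbers quoted in the list above; the dictionary keys, variable names, and the `ratio_to` helper are illustrative and do not come from the source.

```python
# Approximate kernel-optimization speedups as quoted in the list above.
speedups = {
    "opus_4": 3.0,           # about a year before April 2026
    "human_baseline": 4.0,
    "mythos_preview": 52.0,  # April 2026
}


def ratio_to(reference: str, target: str = "mythos_preview") -> float:
    """Return how many times larger the target speedup is than the reference."""
    return speedups[target] / speedups[reference]


vs_opus_4 = ratio_to("opus_4")        # roughly 17x the Opus 4 speedup
vs_human = ratio_to("human_baseline")  # 13x the human baseline

# Next-step judgment: change in win rate against human choices,
# Opus 4.5 (51%) to Mythos Preview (64%), in percentage points.
judgment_gain_points = 64 - 51

print(f"vs Opus 4: {vs_opus_4:.1f}x, vs human baseline: {vs_human:.1f}x")
print(f"next-step judgment gain: {judgment_gain_points} percentage points")
```

Because the source figures are themselves rounded ("about 52×", "about 3×"), the derived ratios should be read as rough orders of magnitude rather than precise measurements.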
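Why it remains gated#
The combination of the following factors:
- The cybersecurity capability gap relative to earlier models (per the Mythos Preview publication)
- Safety mechanisms that are still being evaluated and hardened
- Anthropic's stated mission stance
means that the model is used internally and previewed selectively rather than shipped broadly. A descendant of Mythos is expected to ship publicly later. Boris Cherny: "It will become some version of some descendant of that will become available at some point to everyone."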
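Update: descendants launched (June 2026)#
Boris's prediction came true. In June 2026, Anthropic launched Fable 5 and Mythos 5, the first generally available Mythos-class models, which also established "Mythos-class" as *a named capability tier above the Opus class.* The lineage is now Mythos Preview (April 2026) → Fable 5 / Mythos 5 (June 2026):
- Fable 5 = a Mythos-class model "made safe for general use" by classifiers that fall back to Opus 4.8 on cyber/bio/distillation queries (Capability-Gated Model Fallback).
- Mythos 5 = the same underlying model with the safeguards lifted, deployed through Project Glasswing as an upgrade to Mythos Preview ("comparable to, or somewhat stronger than" it, at less than half the price). Existing Glasswing/Mythos-Preview users can upgrade directly.
This pushes the capability frontier beyond Mythos Preview, the line the Opus 4.8 System Card treated as the ceiling, and marks the first time Mythos-class capability has reached the general public. (Both were reportedly suspended shortly after launch; see Claude Fable 5.)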
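The fallback idea described for Fable 5 (classify the query, then route gated categories to Opus 4.8) can be illustrated with a minimal sketch. Everything here is hypothetical: the model labels, the `toy_classifier` keyword stub, and the `route` function are stand-ins for illustration, not Anthropic's implementation or API. A real system would presumably use trained classifiers rather than keyword matching.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Categories the source says trigger fallback (names are illustrative labels).
GATED_CATEGORIES = {"cyber", "bio", "distillation"}


@dataclass(frozen=True)
class RoutingDecision:
    model: str                # which model answers the query
    category: Optional[str]   # gated category detected, if any


def toy_classifier(query: str) -> Optional[str]:
    """Hypothetical stand-in for a trained classifier: plain keyword match."""
    keywords = {"exploit": "cyber", "pathogen": "bio", "distill": "distillation"}
    lowered = query.lower()
    for word, category in keywords.items():
        if word in lowered:
            return category
    return None


def route(
    query: str,
    classify: Callable[[str], Optional[str]] = toy_classifier,
    primary: str = "mythos-class",   # assumed label for the stronger model
    fallback: str = "opus-4.8",      # assumed label for the fallback model
) -> RoutingDecision:
    """Send gated queries to the fallback model and everything else to the primary."""
    category = classify(query)
    if category in GATED_CATEGORIES:
        return RoutingDecision(model=fallback, category=category)
    return RoutingDecision(model=primary, category=None)


print(route("Write an exploit for this binary"))
print(route("Summarize this design document"))
```

Passing the classifier in as a parameter keeps the routing logic separate from the detection logic, so the gate can be tested with a stub like the one above.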
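Related links#
- Anthropic: vendor
- Claude Opus 4.7: the previous GA model; Mythos is the preview of the next tier
- Claude Opus 4.8: the current GA model, benchmarked against Mythos; Mythos is its capability frontier, an investigator in its behavioral audit, and a reviewer of its alignment assessment
- LLM-Driven Vulnerability Research: the main public account of the capability profile
- Harness Shrinkage as Models Improve: Mythos-class capability is what makes Boris's "100 lines" prediction conceivable
- AI Native Product Cadence: explicitly rejected as the explanation for the shipping cadence, though still a contributor
- AI Accelerating AI Development: the model behind the AI-R&D acceleration Anthropic measured (52× kernel eval, 64% next-step, about 4× survey)
- Task Time-Horizon Scaling: Mythos sits at the measurable edge of the time-horizon curve (16 hours or more)
- METR: rated Mythos's time horizon as "at least 16 hours", beyond its standard measurement ceiling
- Claude Fable 5: the first generally available Mythos-class model; the safeguarded descendant of Mythos Preview
- Claude Mythos 5: the safeguards-lifted descendant, deployed through Project Glasswing as a direct upgrade to Mythos Preview
- Capability-Gated Model Fallback: the safeguard architecture that made general release of a Mythos-class model possible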
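Open questions#
- Public release timeline: answered. Mythos Preview itself never shipped as GA, but its descendants Fable 5 / Mythos 5 reached general availability in June 2026 (see the descendants update section above). Both were suspended shortly after launch; whether and when they will return is unknown.
- Capability profile beyond cybersecurity: the Mythos Preview publication focuses on the security narrative; other capability dimensions are not fully documented externally.
- Internal access control: at Anthropic, who actually uses Mythos for day-to-day work rather than Opus 4.7? Boris implies usage is infrequent (trial use); no details are given.
Sources#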
- Claude Mythos Preview red.anthropic.com
- How Anthropic's product team moves faster than anyone else | Cat Wu (Head of Product, Claude Code)
- Anthropic's Boris Cherny: Why Coding Is Solved, and What Comes Next
- When AI builds itself: Mythos capability data points (METR 16h, 52× kernel, 64% next-step, about 4× survey)
- Claude Fable 5 and Claude Mythos 5: the first generally available Mythos-class descendants, launched June 2026
