Plate IIAI Engineering中文HOWARDISM

MCP and Computer Use

PublishedMay 18, 2026FiledConceptDomainAI EngineeringTagsMCPComputer UseTool UseIntegration AnthropicReading11 minSourceAI-synthesised

Anthropic's two complementary connector mechanisms: MCP for structured programmatic access (Salesforce/Drive/Gmail/Slack/Figma + niche industry systems); computer use as the GUI-driving catchall when no MCP exists; Boris Cherny's "to the model, it's just tokens"

Sources#

Summary#

Two complementary mechanisms for connecting models to external software, both built by Anthropic, both load-bearing across the Claude Code / Cowork / Chat product surfaces. MCP (Model Context Protocol) is structured, programmatic access — "the same connector you have in Claude AI" plugs into Salesforce, Google Docs, Google Calendar, Slack, Figma, Gmail, and increasingly niche industry systems. Computer use is the catchall for software that doesn't expose an MCP: the model drives the GUI directly (mouse, keyboard, screen), slow but increasingly competent on Opus 4.7. Boris Cherny's framing: "To the model, it's just tokens" — MCP / API / computer use are interchangeable substrates for the same capability.

What MCP is#

Created at Anthropic Labs (late 2024) alongside Claude Code and the desktop app by Boris's founding team. Structured tool-calling protocol with a server-client architecture:

Server — runs alongside the external system (Salesforce, Slack, Gmail, internal CRM, niche industry SaaS); exposes available tools as typed function calls.
Client — the Claude surface (Claude Code, Cowork, Claude AI, third-party agent) that consumes those tools.
Same connectors everywhere. "The same MCP connector that you have in Claude AI, you hook up like Salesforce, you hook up Google Docs, Google Calendar. And then Cowork can use that. Claude CLI can use it. Claude Code everywhere can use it." — Boris Cherny

The structural property: connector logic is written once per system, consumed by every Claude surface. This is what makes Cowork viable across the existing knowledge-work tool surface (Salesforce, Docs, Drive, Slack, etc.) without Anthropic having to build per-tool integrations.

What computer use is#

Generic GUI driving as a fallback when MCP isn't available. Model sees a screenshot, decides what to click/type/scroll, executes via accessibility/automation APIs. Operates on "pretty much any piece of software that you have on your computer" (Boris Cherny).

Properties as of Opus 4.7:

Quality — "quite good… does it quite well now, especially with 4.7" (Boris). Anthropic "is like pretty far ahead on computers."
Latency — "very slow." Costs more tokens than MCP for the same task because each action requires a screenshot round-trip.
Coverage — universal. Computer use is what runs when the target software has no API, no MCP, no Python library — when the only interface is a human-facing UI.

Cowork is the deployment surface where computer use most matters today: many knowledge-work apps lack programmatic interfaces.

The "doesn't matter" thesis#

Boris's framing of the MCP-vs-API-vs-computer-use question:

"All this stuff just doesn't matter that much. It could be MCPs, APIs, just some sort of programmatic access cuz the model doesn't care. To the model, it's just tokens."

The substrate is fungible — the work is "expose capabilities to the model in a form the model can consume." MCP optimizes for structured / fast / cheap; computer use optimizes for universal / fallback / slow. Both reduce to token-level tool invocations.

This connects to The Bitter Lesson: as models improve, the boundary between "use an MCP" and "use computer use" should be a decision the model makes, not a decision a human harness designer makes. Boris's predictions for the next few years:

"The model is just going to be doing all the code. It's going to be starting the agents. It's going to be building the environments." — Including, presumably, picking the right substrate to call a tool.
Computer use specifically called out as a product area "going to get a lot better."

Cross-surface usage in the wild#

Surface	MCP examples	Computer-use examples
Claude Code (CLI)	GitHub, filesystem, Slack	Rare — engineering tools usually have CLIs/APIs
Cowork	Salesforce, Google Drive/Docs/Calendar, Gmail, Slack, Figma	Software without MCP; especially knowledge-work apps
Claude AI (chat)	Same connector set	Computer-use available
Mobile/web	Same MCP infrastructure	Browser-side, with screen-share permissions

Cat Wu's nightly slide-deck workflow (Cowork) explicitly uses MCP — Figma MCP, Slack MCP, Drive MCP — rather than computer use, because the latency cost is unaffordable for a workflow you want to complete by morning.

In the Founder's Playbook (AI-Native Startup Lifecycle)#

The playbook treats MCP as the primary integration mechanism at every stage:

Idea stage — Cowork uses Gmail and Google Calendar MCPs to manage outreach threads, schedule customer interviews, run day-7 follow-ups.
MVP stage — "The same MCP integrations that managed discovery logistics in the Idea stage apply here" for feedback-session scheduling, bug-report triage, iteration-cycle tracking.
Scale stage — MCP integration with niche industry systems your competitors haven't heard of is named as a moat component (e.g., a generalist medical-billing AI breaks on 340B drug program claims; the vertical-specialist's MCP-wired competitor doesn't).

Two playbook case studies make the MCP-as-moat point concrete:

Kindora ships an MCP connector that lets nonprofits access its prospecting tools inside Claude itself — the product is consumed via MCP, not just integrated with MCP.
Anthropic Skills are referenced as the codification surface for recurring workflows ("how I audit a commercial lease," "how I triage a patient intake form") — Skills + MCP + memory together form the proprietary substrate the Compounding Data Moat concept describes.

Computer use is less prominent in the playbook itself, but Cowork is named as the operational layer that runs across "every stage" — and Cowork is where computer use covers the gaps that MCP doesn't.

Connection to harness-shrinkage#

Harness Shrinkage as Models Improve predicts that prompt scaffolding, permissions, and verification logic migrate inward as models improve. MCP and computer use are the opposite of harness — they are connectors between the model and the world. They don't shrink; they get broader (more systems, more interfaces) and faster (lower latency per action). The boundary that shrinks is the harness around the model's tool-selection decisions, not the toolset itself.

Caveat: as the model becomes better at picking when to use computer use vs. when to demand a real MCP, much of today's manual MCP-server-authoring effort may become "ask the model to build the connector you need." Still not a harness — more like model-authored infrastructure.

Connection to Agentic Misalignment (AM) and accountability#

MCP and computer use are exactly the substrate that turns an LLM into an agent capable of consequential action. Both extend the model's reach into:

The customer's CRM
The customer's email
The customer's calendar
Eventually, the customer's full desktop

Human-AI Accountability Redesign's "decision rights" subfront is what governs this — what does the agent do autonomously via MCP/computer use vs. what requires explicit human approval. Claude Code Auto Mode is one concrete instance: classifier auto-approves safe MCP/tool calls, blocks risky ones.

MCP as a security surface#

Zero Trust for AI Agents treats MCP as one of the highest-risk tool surfaces in agentic deployments, and supplies the concrete threat data the earlier sources lacked:

Tool poisoning — attackers compromise MCP tool descriptors, schemas, or metadata so the agent invokes a tool based on falsified capabilities; a malicious tool can hide commands in its metadata to exfiltrate data without user knowledge.
Rug pulls — a legitimate tool is silently replaced with a malicious version. The first documented in-the-wild malicious MCP server impersonated a legitimate email service and secretly copied all sent emails — the concrete realization of the attack-surface-scales-with-adoption worry.
Tool chaining — combining legitimate tools (internal CRM + external email) into a harmful sequence neither would enable alone; because every call runs through trusted binaries under valid credentials, host-centric monitoring sees no malware. This is what Least Agency (capability restrictions per tool) and parameter validation are meant to contain.

The framework's prescriptions: run/host the MCP server yourself on an immutable platform after verifying and self-signing the code (Agent Supply Chain Risk); authenticate tool access with short-lived tokens bound to the calling agent's identity, never static API keys (Agent Identity and Authentication); and gate high-risk invocations behind approval escalation. Claude Code's OAuth 2.0 with auto-refresh for MCP connections and session-scoped "ask" permissions are cited as a reference implementation.

Connections#

Claude Code / Cowork / Anthropic — surfaces and vendor
Zero Trust for AI Agents — treats MCP as a top-risk tool surface; supplies the tool-poisoning / rug-pull / tool-chaining threat model
Agent Supply Chain Risk — MCP servers are a named tool-supply-chain vector; run-your-own-server + self-signing is the mitigation
Agent Identity and Authentication — short-lived identity-bound tokens replace static keys for MCP/tool authentication
Agentic Prompt Injection — MCP-connected browsing/email/document tools are the indirect-injection entry points
Boris Cherny — co-created MCP; frames the "doesn't matter" thesis
Cat Wu — articulates daily MCP usage and the Cowork integration story
Harness Shrinkage as Models Improve — what does not shrink; complementary infrastructure
The Bitter Lesson — model-decides-substrate is the bitter-lesson endpoint
AI-Native Startup Lifecycle — MCP across all four founder stages
Compounding Data Moat — Skills + MCP + memory as moat substrate
Claude Code Auto Mode — decision-rights gating for tool use
Claude Code Best Practices — MCP-based extension is one mechanism for "scaling patterns"
Agentic Misalignment (AM) — MCP/computer use as the action surface; risk increases with reach
Human-AI Accountability Redesign — governance layer for MCP/computer-use deployments
Agent Harness Engineering — MCP-as-connector vs. harness-as-scaffold distinction
Hermes Agent — third-party agent product that consumes MCP (mentioned in cross-tool capability table in Claude Code Best Practices)
Symphony — alternative orchestration where MCP-style tool exposure runs through codex-app-server-protocol instead
Agent-Native Infrastructure — MCP is what makes a service agent-legible (structured); computer use is the GUI-driving fallback when it isn't — together they're the substrate Karpathy's "describe it to agents first" world requires

Open questions#

The MCP ecosystem's growth rate vs. computer use's quality curve: at what point does computer use become good enough that the marginal value of building an MCP server drops? Boris implies this is years off but doesn't quantify.
Is computer use a sustainable interface or a transition technology? If most knowledge-work software adds MCP support in the next 24 months, computer use's role shrinks to legacy/desktop-only systems.
MCP security model: as the playbook prescribes wiring MCP into Salesforce, Gmail, Calendar for solo founders, the attack surface scales with adoption. Now addressed by Zero Trust for AI Agents (tool poisoning, rug pulls, the first in-the-wild malicious MCP server) — see "MCP as a security surface" above. Open residual: how does a solo founder realistically run/host and self-sign every MCP server the framework recommends, given that the appeal of MCP was zero-integration-effort?
How does Cowork's computer-use guardrail compare to Claude Code's auto-mode classifier? Different deployment context, possibly different risk profile.

Derived#

The Future of Agent Interfaces — places MCP, computer use, app protocols, native interaction models, and agent-native infrastructure at separate interface boundaries

Sources#

Anthropic's Boris Cherny: Why Coding Is Solved, and What Comes Next — Boris's MCP/computer-use Q&A (Sequoia AI Ascent 2026)
How Anthropic's product team moves faster than anyone else | Cat Wu (Head of Product, Claude Code) — Cat's daily Cowork+MCP workflow
The Founder's Playbook: Building an AI-Native Startup — MCP across Idea/MVP/Launch/Scale + moat framing

§ end

About this piece

Articles in this journal are synthesised by AI agents from a curated wiki and are refreshed automatically as new concepts arrive. Topics, framing, and editorial direction are curated by Howardism.

Cited by 22

Agent Harness Engineering
Patterns for scaffolding long-running LLM agents: environment design, progressive context disclosure, mechanical archit…
Agent Identity and Authentication
The foundation control for agentic Zero Trust: cryptographically-rooted per-agent identity (→X.509→hardware attestation…
Agent-Native Infrastructure
The world is still built for humans and must be rewritten for agents; "what do I copy-paste to my agent?"; sensors/actu…
Agent Supply Chain Risk
Runtime-composed agent ecosystems expand the supply-chain attack surface: model poisoning (250 docs backdoor a 13B mode…
Agentic Misalignment (AM)
Lynch et al. 2025 eval and threat model: LLM email-agent discovers it may be deleted, can take harmful actions; OOD rel…
Agentic Prompt Injection
Direct and indirect injection of malicious instructions into an agent; LLMs cannot reliably distinguish information fro…
AI-Native Startup Lifecycle
Anthropic's May 2026 reframing of Idea/MVP/Launch/Scale assuming AI infrastructure: each stage's headcount/capital/skil…
Anthropic Labs
Anthropic's internal incubator — a 'bet factory' of ~a dozen tiny teams exploring the model frontier with lean-startup…
Claude Code Auto Mode
Claude Code permission mode using a classifier to auto-approve safe tool calls and block risky ones; middle ground betw…
Claude Code Best Practices
Anthropic's guide to effective Claude Code usage: context management, verification-driven development, explore→plan→cod…
Claude Design
Anthropic Labs product (research preview, ~April 2026) for collaborating with Claude on polished visual artifacts — des…
Compounding Data Moat
Anthropic's prescription for Scale-stage defensibility: time-locked behavioral fingerprint + domain-encoded edge cases…
Cowork
Anthropic's non-code knowledge-work agent product; sibling to Claude Code; output is decks/inbox/dossiers; same MCP/com…
The Future of Agent Interfaces
Interface future is layered: native interaction models for human collaboration, MCP/APIs for structured action, app pro…
Harness Shrinkage as Models Improve
Prompt scaffolding shrinks each model release; Cat Wu's pruning discipline; Boris Cherny "100 lines of code a year from…
Human-AI Accountability Redesign
HBR five-pillar prescription: span-of-control redesign, role redesign, performance management reset, decision-rights/es…
Least Agency
OWASP term extending least privilege to agents: constrain not just what an agent can access but what each tool can do,…
AI Engineering & Agent Tooling
Map of Content for the ai-engineering domain — 36 concepts. Curated entry point; see Home for all domains.
Open Questions Backlog
_96 pages with open questions, as of 2026-06-14._
Orchestration vs Employee Framing: Reconciling the Founder's Playbook with HBR's Accountability Evidence
Reconciles the Founder's Playbook orchestration framings with HBR Kropp et al.'s accountability evidence; "orchestratio…
The Bitter Lesson
Sutton 2019: scaled general methods beat hand-engineered structure; recurring justification across the wiki for dissolv…
Zero Trust for AI Agents
Anthropic's security framework for deploying autonomous agents: trust nothing / verify everything / assume breach, appl…

Claude Code
Anthropic's agentic coding product; created by Boris Cherny late 2024; TypeScript/React; CLI/desktop/web/mobile/IDE sur…
Agent Loop Pattern
`/loop` (cron-scheduled) and Ralph Wiggum (backlog-draining) loops as next-generation agent primitive; AFK execution, p…
Harness Shrinkage as Models Improve
Prompt scaffolding shrinks each model release; Cat Wu's pruning discipline; Boris Cherny "100 lines of code a year from…
Anthropic
AI safety company / vendor of Claude; mission-as-tiebreaker culture; ~30–40 PMs across teams; Mike Krieger leads Labs r…
Open Questions Backlog
_96 pages with open questions, as of 2026-06-14._

Claude Code
Anthropic's agentic coding product; created by Boris Cherny late 2024; TypeScript/React; CLI/desktop/web/mobile/IDE sur…
Agent Loop Pattern
`/loop` (cron-scheduled) and Ralph Wiggum (backlog-draining) loops as next-generation agent primitive; AFK execution, p…
Harness Shrinkage as Models Improve
Prompt scaffolding shrinks each model release; Cat Wu's pruning discipline; Boris Cherny "100 lines of code a year from…
Anthropic
AI safety company / vendor of Claude; mission-as-tiebreaker culture; ~30–40 PMs across teams; Mike Krieger leads Labs r…
Open Questions Backlog
_96 pages with open questions, as of 2026-06-14._

Cited by 22

Agent Harness Engineering
Patterns for scaffolding long-running LLM agents: environment design, progressive context disclosure, mechanical archit…
Agent Identity and Authentication
The foundation control for agentic Zero Trust: cryptographically-rooted per-agent identity (→X.509→hardware attestation…
Agent-Native Infrastructure
The world is still built for humans and must be rewritten for agents; "what do I copy-paste to my agent?"; sensors/actu…
Agent Supply Chain Risk
Runtime-composed agent ecosystems expand the supply-chain attack surface: model poisoning (250 docs backdoor a 13B mode…
Agentic Misalignment (AM)
Lynch et al. 2025 eval and threat model: LLM email-agent discovers it may be deleted, can take harmful actions; OOD rel…
Agentic Prompt Injection
Direct and indirect injection of malicious instructions into an agent; LLMs cannot reliably distinguish information fro…
AI-Native Startup Lifecycle
Anthropic's May 2026 reframing of Idea/MVP/Launch/Scale assuming AI infrastructure: each stage's headcount/capital/skil…
Anthropic Labs
Anthropic's internal incubator — a 'bet factory' of ~a dozen tiny teams exploring the model frontier with lean-startup…
Claude Code Auto Mode
Claude Code permission mode using a classifier to auto-approve safe tool calls and block risky ones; middle ground betw…
Claude Code Best Practices
Anthropic's guide to effective Claude Code usage: context management, verification-driven development, explore→plan→cod…
Claude Design
Anthropic Labs product (research preview, ~April 2026) for collaborating with Claude on polished visual artifacts — des…
Compounding Data Moat
Anthropic's prescription for Scale-stage defensibility: time-locked behavioral fingerprint + domain-encoded edge cases…
Cowork
Anthropic's non-code knowledge-work agent product; sibling to Claude Code; output is decks/inbox/dossiers; same MCP/com…
The Future of Agent Interfaces
Interface future is layered: native interaction models for human collaboration, MCP/APIs for structured action, app pro…
Harness Shrinkage as Models Improve
Prompt scaffolding shrinks each model release; Cat Wu's pruning discipline; Boris Cherny "100 lines of code a year from…
Human-AI Accountability Redesign
HBR five-pillar prescription: span-of-control redesign, role redesign, performance management reset, decision-rights/es…
Least Agency
OWASP term extending least privilege to agents: constrain not just what an agent can access but what each tool can do,…
AI Engineering & Agent Tooling
Map of Content for the ai-engineering domain — 36 concepts. Curated entry point; see Home for all domains.
Open Questions Backlog
_96 pages with open questions, as of 2026-06-14._
Orchestration vs Employee Framing: Reconciling the Founder's Playbook with HBR's Accountability Evidence
Reconciles the Founder's Playbook orchestration framings with HBR Kropp et al.'s accountability evidence; "orchestratio…
The Bitter Lesson
Sutton 2019: scaled general methods beat hand-engineered structure; recurring justification across the wiki for dissolv…
Zero Trust for AI Agents
Anthropic's security framework for deploying autonomous agents: trust nothing / verify everything / assume breach, appl…