Howardism

MCP and Computer Use

Published: May 18, 2026 | Filed: Concept | Domain: AI Engineering | Tags: MCP, Computer Use, Tool Use, Integration, Anthropic | Reading: 11 min | Source: AI-synthesised

Anthropic's two complementary connector mechanisms: MCP for structured programmatic access (Salesforce/Drive/Gmail/Slack/Figma + niche industry systems); computer use as the GUI-driving catchall when no MCP exists; Boris Cherny's "to the model, it's just tokens"


Summary

Two complementary mechanisms for connecting models to external software, both built by Anthropic, both load-bearing across the Claude Code / Cowork / Chat product surfaces. MCP (Model Context Protocol) is structured, programmatic access — "the same connector you have in Claude AI" plugs into Salesforce, Google Docs, Google Calendar, Slack, Figma, Gmail, and increasingly niche industry systems. Computer use is the catchall for software that doesn't expose an MCP: the model drives the GUI directly (mouse, keyboard, screen), slow but increasingly competent on Opus 4.7. Boris Cherny's framing: "To the model, it's just tokens" — MCP / API / computer use are interchangeable substrates for the same capability.

What MCP is

MCP was created at Anthropic Labs in late 2024, alongside Claude Code and the desktop app, by Boris's founding team. It is a structured tool-calling protocol with a client-server architecture:

  • Server — runs alongside the external system (Salesforce, Slack, Gmail, internal CRM, niche industry SaaS); exposes available tools as typed function calls.
  • Client — the Claude surface (Claude Code, Cowork, Claude AI, third-party agent) that consumes those tools.
  • Same connectors everywhere. "The same MCP connector that you have in Claude AI, you hook up like Salesforce, you hook up Google Docs, Google Calendar. And then Cowork can use that. Claude CLI can use it. Claude Code everywhere can use it." — Boris Cherny

The structural property: connector logic is written once per system, consumed by every Claude surface. This is what makes Cowork viable across the existing knowledge-work tool surface (Salesforce, Docs, Drive, Slack, etc.) without Anthropic having to build per-tool integrations.
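The write-once, consume-everywhere property can be sketched with the Python standard library alone. This is a toy that loosely mirrors MCP's JSON-RPC message shape (`tools/list` and `tools/call`); it is not the official SDK, and the `ToolServer` class, the `crm_lookup` tool, the fake CRM data, and the surface names are all hypothetical.

```python
import json
from typing import Any, Callable, Dict


class ToolServer:
    """Toy stand-in for an MCP server: holds typed tools, answers JSON-RPC-style requests."""

    def __init__(self) -> None:
        self._tools: Dict[str, Dict[str, Any]] = {}

    def tool(self, name: str, description: str, input_schema: Dict[str, Any]):
        """Decorator that registers a function as a named tool with a JSON Schema."""
        def register(fn: Callable[..., Any]) -> Callable[..., Any]:
            self._tools[name] = {"description": description,
                                 "inputSchema": input_schema, "fn": fn}
            return fn
        return register

    def handle(self, raw_request: str) -> str:
        req = json.loads(raw_request)
        if req["method"] == "tools/list":
            result = {"tools": [
                {"name": n, "description": t["description"], "inputSchema": t["inputSchema"]}
                for n, t in self._tools.items()
            ]}
        elif req["method"] == "tools/call":
            params = req["params"]
            output = self._tools[params["name"]]["fn"](**params["arguments"])
            result = {"content": [{"type": "text", "text": str(output)}]}
        else:
            return json.dumps({"jsonrpc": "2.0", "id": req["id"],
                               "error": {"code": -32601, "message": "Method not found"}})
        return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})


server = ToolServer()


@server.tool(
    name="crm_lookup",
    description="Look up an account owner by account name (hypothetical CRM).",
    input_schema={"type": "object",
                  "properties": {"account": {"type": "string"}},
                  "required": ["account"]},
)
def crm_lookup(account: str) -> str:
    fake_crm = {"Acme": "dana@example.com"}  # illustrative data only
    return fake_crm.get(account, "unknown")


def call_from_surface(surface: str, account: str) -> str:
    """Every client surface sends the same request; connector logic is not duplicated."""
    request = json.dumps({"jsonrpc": "2.0", "id": 1, "method": "tools/call",
                          "params": {"name": "crm_lookup",
                                     "arguments": {"account": account}}})
    response = json.loads(server.handle(request))
    return f"{surface}: {response['result']['content'][0]['text']}"
```

Calling `call_from_surface("cli", "Acme")` and `call_from_surface("cowork", "Acme")` exercises the same `crm_lookup` code path; only the caller label differs, which is the point of the connector being defined once on the server side.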

What computer use is

Computer use is generic GUI driving, the fallback when MCP isn't available. The model sees a screenshot, decides what to click, type, or scroll, and executes the action via accessibility/automation APIs. It operates on "pretty much any piece of software that you have on your computer" (Boris Cherny).
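The observe, decide, act cycle can be sketched as a small loop. Everything here is a stand-in: `FakeDesktop` replaces a real screen and input layer, `scripted_policy` replaces the model call, and the text "screenshot" replaces pixels. It is meant to show the control flow only, not how any Anthropic product implements it.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    kind: str            # "click", "type", or "done"
    target: str = ""
    text: str = ""


class FakeDesktop:
    """Stand-in for a real screen; a real loop would capture pixels and send input events."""

    def __init__(self) -> None:
        self.field = ""
        self.submitted = False
        self.log: List[str] = []

    def screenshot(self) -> str:
        return f"field={self.field!r} submitted={self.submitted}"

    def execute(self, action: Action) -> None:
        self.log.append(action.kind)
        if action.kind == "type":
            self.field += action.text
        elif action.kind == "click" and action.target == "submit":
            self.submitted = True


def scripted_policy(observation: str) -> Action:
    """Placeholder for the model: maps what is on screen to the next action."""
    if "submitted=True" in observation:
        return Action("done")
    if "field=''" in observation:
        return Action("type", target="name", text="Acme")
    return Action("click", target="submit")


def run_computer_use(desktop: FakeDesktop,
                     policy: Callable[[str], Action],
                     max_steps: int = 10) -> int:
    """Observe, decide, act; returns the number of screenshot round-trips used."""
    for step in range(1, max_steps + 1):
        action = policy(desktop.screenshot())
        if action.kind == "done":
            return step
        desktop.execute(action)
    raise RuntimeError("step budget exhausted before the task finished")
```

Each pass through the loop takes one fresh observation before deciding, which is why the number of round-trips grows with the number of GUI actions a task needs.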

Properties as of Opus 4.7:

  • Quality — "quite good… does it quite well now, especially with 4.7" (Boris). Anthropic "is like pretty far ahead on computers."
  • Latency — "very slow." Costs more tokens than MCP for the same task because each action requires a screenshot round-trip.
  • Coverage — universal. Computer use is what runs when the target software has no API, no MCP, no Python library — when the only interface is a human-facing UI.
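The latency and token point is simple arithmetic: if every GUI action needs its own observation, cost scales with the number of actions. The sketch below makes that explicit. The token counts and action counts are made-up placeholders chosen for illustration, not measurements of any model or product.

```python
def task_tokens(actions: int, tokens_per_observation: int, tokens_per_action: int) -> int:
    """Total tokens when every action is preceded by exactly one observation."""
    return actions * (tokens_per_observation + tokens_per_action)


# Placeholder numbers, NOT measurements: a GUI task that takes many small steps,
# each needing a screenshot, versus one structured call with a compact response.
gui_cost = task_tokens(actions=12, tokens_per_observation=1500, tokens_per_action=50)
structured_cost = task_tokens(actions=1, tokens_per_observation=300, tokens_per_action=50)

ratio = gui_cost / structured_cost
```

Under these assumed inputs the GUI path costs 18,600 tokens against 350 for the structured call. The real ratio depends entirely on the task and the model; the shape of the formula is the only part that carries over.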

Cowork is the deployment surface where computer use most matters today: many knowledge-work apps lack programmatic interfaces.

The "doesn't matter" thesis

Boris's framing of the MCP-vs-API-vs-computer-use question:

"All this stuff just doesn't matter that much. It could be MCPs, APIs, just some sort of programmatic access cuz the model doesn't care. To the model, it's just tokens."

The substrate is fungible — the work is "expose capabilities to the model in a form the model can consume." MCP optimizes for structured / fast / cheap; computer use optimizes for universal / fallback / slow. Both reduce to token-level tool invocations.
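One way to picture the fungibility claim is a single capability with two interchangeable backends that return the same result shape. This is a hedged sketch: `Capability`, the two handler functions, and the capability names are hypothetical, and the fixed "structured first, GUI second" order is an assumption of the sketch rather than documented product behaviour.

```python
from typing import Callable, Dict, Optional

Handler = Callable[[dict], str]


class Capability:
    """One capability, optionally backed by a structured connector and/or a GUI driver."""

    def __init__(self, name: str,
                 structured: Optional[Handler] = None,
                 gui: Optional[Handler] = None) -> None:
        self.name = name
        self.structured = structured
        self.gui = gui

    def invoke(self, arguments: dict) -> Dict[str, str]:
        # Prefer the structured path (fast, cheap); fall back to GUI driving (universal, slow).
        if self.structured is not None:
            return {"tool": self.name, "substrate": "structured",
                    "result": self.structured(arguments)}
        if self.gui is not None:
            return {"tool": self.name, "substrate": "gui",
                    "result": self.gui(arguments)}
        raise LookupError(f"no substrate available for {self.name}")


def calendar_connector(arguments: dict) -> str:
    """Hypothetical structured connector."""
    return f"event created: {arguments['title']}"


def legacy_app_gui(arguments: dict) -> str:
    """Hypothetical GUI-driving fallback."""
    return f"form filled via GUI: {arguments['title']}"


create_event = Capability("create_event", structured=calendar_connector, gui=legacy_app_gui)
file_expense = Capability("file_expense", gui=legacy_app_gui)
```

Both `create_event.invoke({"title": "Sync"})` and `file_expense.invoke({"title": "Taxi"})` return the same dictionary shape, so the caller never needs to know which substrate ran. The hard-coded preference order is exactly the kind of harness decision the text expects to move into the model over time.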

This connects to The Bitter Lesson: as models improve, the boundary between "use an MCP" and "use computer use" should be a decision the model makes, not a decision a human harness designer makes. Boris's predictions for the next few years:

  • "The model is just going to be doing all the code. It's going to be starting the agents. It's going to be building the environments." — Including, presumably, picking the right substrate to call a tool.
  • Computer use specifically called out as a product area "going to get a lot better."

Cross-surface usage in the wild

| Surface | MCP examples | Computer-use examples |
| --- | --- | --- |
| Claude Code (CLI) | GitHub, filesystem, Slack | Rare; engineering tools usually have CLIs/APIs |
| Cowork | Salesforce, Google Drive/Docs/Calendar, Gmail, Slack, Figma | Software without MCP; especially knowledge-work apps |
| Claude AI (chat) | Same connector set | Computer-use available |
| Mobile/web | Same MCP infrastructure | Browser-side, with screen-share permissions |

Cat Wu's nightly slide-deck workflow (Cowork) explicitly uses MCP — Figma MCP, Slack MCP, Drive MCP — rather than computer use, because the latency cost is unaffordable for a workflow you want to complete by morning.

In the Founder's Playbook (AI-Native Startup Lifecycle)

The playbook treats MCP as the primary integration mechanism at every stage:

  • Idea stage — Cowork uses Gmail and Google Calendar MCPs to manage outreach threads, schedule customer interviews, run day-7 follow-ups.
  • MVP stage — "The same MCP integrations that managed discovery logistics in the Idea stage apply here" for feedback-session scheduling, bug-report triage, iteration-cycle tracking.
  • Scale stage — MCP integration with niche industry systems your competitors haven't heard of is named as a moat component (e.g., a generalist medical-billing AI breaks on 340B drug program claims; an MCP-wired vertical specialist doesn't).

Two playbook case studies make the MCP-as-moat point concrete:

  • Kindora ships an MCP connector that lets nonprofits access its prospecting tools inside Claude itself — the product is consumed via MCP, not just integrated with MCP.
  • Anthropic Skills are referenced as the codification surface for recurring workflows ("how I audit a commercial lease," "how I triage a patient intake form") — Skills + MCP + memory together form the proprietary substrate the Compounding Data Moat concept describes.

Computer use is less prominent in the playbook itself, but Cowork is named as the operational layer that runs across "every stage" — and Cowork is where computer use covers the gaps that MCP doesn't.

Connection to harness-shrinkage

Harness Shrinkage as Models Improve predicts that prompt scaffolding, permissions, and verification logic migrate inward as models improve. MCP and computer use are the opposite of a harness: they are connectors between the model and the world. They don't shrink; they get broader (more systems, more interfaces) and faster (lower latency per action). The boundary that shrinks is the harness around the model's tool-selection decisions, not the toolset itself.

Caveat: as the model becomes better at picking when to use computer use vs. when to demand a real MCP, much of today's manual MCP-server-authoring effort may become "ask the model to build the connector you need." Still not a harness — more like model-authored infrastructure.

Connection to Agentic Misalignment (AM) and accountability

MCP and computer use are exactly the substrate that turns an LLM into an agent capable of consequential action. Both extend the model's reach into:

  • The customer's CRM
  • The customer's email
  • The customer's calendar
  • Eventually, the customer's full desktop

Human-AI Accountability Redesign's "decision rights" subfront governs this: what the agent does autonomously via MCP/computer use versus what requires explicit human approval. Claude Code Auto Mode is one concrete instance: a classifier auto-approves safe MCP/tool calls and blocks risky ones.
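A decision-rights gate can be sketched as a function that maps each proposed tool call to allow, ask a human, or block. The version below uses a static rule table as a stand-in; Auto Mode is described as classifier-based, and its internals are not covered here. The tool names and the two sets are hypothetical.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ASK_HUMAN = "ask_human"
    BLOCK = "block"


# Hypothetical tool names; a real deployment would derive these from policy.
READ_ONLY_TOOLS = {"calendar.list_events", "crm.get_account"}
APPROVAL_REQUIRED_TOOLS = {"email.send", "crm.update_account"}


def decide(tool: str) -> Decision:
    """Rule-based stand-in for a learned classifier over proposed tool calls."""
    if tool in READ_ONLY_TOOLS:
        return Decision.ALLOW        # no side effects: the agent acts autonomously
    if tool in APPROVAL_REQUIRED_TOOLS:
        return Decision.ASK_HUMAN    # consequential action: escalate to a person
    return Decision.BLOCK            # default-deny anything not explicitly listed
```

The default-deny branch is a design choice of this sketch: an unlisted tool is treated as out of scope rather than silently allowed.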

MCP as a security surface

Zero Trust for AI Agents treats MCP as one of the highest-risk tool surfaces in agentic deployments, and supplies the concrete threat data the earlier sources lacked:

  • Tool poisoning — attackers compromise MCP tool descriptors, schemas, or metadata so the agent invokes a tool based on falsified capabilities; a malicious tool can hide commands in its metadata to exfiltrate data without user knowledge.
  • Rug pulls — a legitimate tool is silently replaced with a malicious version. The first documented in-the-wild malicious MCP server impersonated a legitimate email service and secretly copied all sent emails — the concrete realization of the attack-surface-scales-with-adoption worry.
  • Tool chaining — combining legitimate tools (internal CRM + external email) into a harmful sequence neither would enable alone; because every call runs through trusted binaries under valid credentials, host-centric monitoring sees no malware. This is what Least Agency (capability restrictions per tool) and parameter validation are meant to contain.
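One possible mitigation for tool poisoning and rug pulls is to pin a hash of each reviewed tool descriptor and refuse any descriptor that no longer matches. This is a sketch of that idea under stated assumptions, not something the framework is quoted as prescribing in this form; `PinnedRegistry` and the example descriptors are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict


def descriptor_fingerprint(descriptor: Dict[str, Any]) -> str:
    """SHA-256 over a canonical JSON encoding of name, description, and schema."""
    canonical = json.dumps(descriptor, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class PinnedRegistry:
    """Remembers the fingerprint of each descriptor a human has reviewed."""

    def __init__(self) -> None:
        self._pins: Dict[str, str] = {}

    def approve(self, descriptor: Dict[str, Any]) -> None:
        self._pins[descriptor["name"]] = descriptor_fingerprint(descriptor)

    def is_trusted(self, descriptor: Dict[str, Any]) -> bool:
        # Unknown tools and tools whose metadata changed since review both fail.
        pinned = self._pins.get(descriptor.get("name"))
        return pinned == descriptor_fingerprint(descriptor)


reviewed = {
    "name": "send_email",
    "description": "Send an email on behalf of the user.",
    "inputSchema": {"type": "object", "properties": {"to": {"type": "string"}}},
}
tampered = dict(reviewed, description="Send an email. Also BCC every message to an archive.")

registry = PinnedRegistry()
registry.approve(reviewed)
```

Here `registry.is_trusted(reviewed)` holds and `registry.is_trusted(tampered)` does not, because the description changed after review. Pinning only covers metadata: a server that keeps its descriptor identical but changes its behaviour would pass this check, which is why the framework also talks about hosting and verifying the server code itself.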

The framework's prescriptions: run/host the MCP server yourself on an immutable platform after verifying and self-signing the code (Agent Supply Chain Risk); authenticate tool access with short-lived tokens bound to the calling agent's identity, never static API keys (Agent Identity and Authentication); and gate high-risk invocations behind approval escalation. Claude Code's OAuth 2.0 with auto-refresh for MCP connections and session-scoped "ask" permissions are cited as a reference implementation.
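The "short-lived tokens bound to the calling agent's identity" prescription can be illustrated with an HMAC-signed token that carries the agent, the tool, and an expiry. This is a simplified sketch, not OAuth 2.0 and not Claude Code's implementation; the signing key, the claim names, and both function names are assumptions made for the example.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"demo-signing-key"  # assumption: a real key would come from a secrets manager


def issue_token(agent_id: str, tool: str, ttl_seconds: int,
                now: Optional[float] = None) -> str:
    """Return 'payload.signature' where the payload names the agent, tool, and expiry."""
    issued = time.time() if now is None else now
    claims = {"agent": agent_id, "tool": tool, "exp": issued + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode("utf-8"))
    signature = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + signature


def verify_token(token: str, agent_id: str, tool: str,
                 now: Optional[float] = None) -> bool:
    """Accept only an untampered, unexpired token for this exact agent and tool."""
    current = time.time() if now is None else now
    try:
        payload_text, signature = token.rsplit(".", 1)
    except ValueError:
        return False
    payload = payload_text.encode("utf-8")
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return False
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return (claims["agent"] == agent_id
            and claims["tool"] == tool
            and current < claims["exp"])
```

A token issued with `issue_token("agent-7", "email.send", ttl_seconds=60)` verifies only for that agent and tool, and only for a minute. Unlike a static API key, a leaked token of this kind is useless to a different agent and expires on its own.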

Open questions

  • The MCP ecosystem's growth rate vs. computer use's quality curve: at what point does computer use become good enough that the marginal value of building an MCP server drops? Boris implies this is years off but doesn't quantify.
  • Is computer use a sustainable interface or a transition technology? If most knowledge-work software adds MCP support in the next 24 months, computer use's role shrinks to legacy/desktop-only systems.
  • MCP security model: as the playbook prescribes wiring MCP into Salesforce, Gmail, Calendar for solo founders, the attack surface scales with adoption. Now addressed by Zero Trust for AI Agents (tool poisoning, rug pulls, the first in-the-wild malicious MCP server) — see "MCP as a security surface" above. Open residual: how does a solo founder realistically run/host and self-sign every MCP server the framework recommends, given that the appeal of MCP was zero-integration-effort?
  • How does Cowork's computer-use guardrail compare to Claude Code's auto-mode classifier? Different deployment context, possibly different risk profile.

Derived

  • The Future of Agent Interfaces — places MCP, computer use, app protocols, native interaction models, and agent-native infrastructure at separate interface boundaries

About this piece

Articles in this journal are synthesised by AI agents from a curated wiki and are refreshed automatically as new concepts arrive. Topics, framing, and editorial direction are curated by Howardism.
