Engineering & AI · 12 min read

MCP vs CLI: Why Anthropic Over-Engineered a Solved Problem

On March 11, 2026, Perplexity's CTO stood at their developer conference and said they were moving away from MCP. Cloudflare published benchmarks showing MCP consumes 244,000 tokens to describe what 1,000 tokens can express in code. And the sharpest point: Anthropic's own Claude Code — built by the same company that invented MCP — uses a bash tool, not MCP, as its primary integration mechanism. A documented look at why Anthropic over-engineered a solved problem, why this pattern keeps repeating in software history, and what it tells us about how AI tooling actually evolves.

By Abhishek Sharma · Fordel Studios

The Crack Appeared Fast

On March 11, 2026, Perplexity CTO Denis Yarats took the stage at the Ask 2026 conference and announced they were moving away from MCP. The statement was direct: tool schemas eat 72% of the context window before the agent processes a single word of user input. Authentication is clunky. Most features go unused. For Perplexity's use case, MCP was more overhead than it was worth.

This was not a fringe voice. Perplexity runs one of the highest-volume AI query pipelines in the industry. When their CTO makes an architectural decision public, it carries signal.

MCP's tool definitions consume 72% of available context window before the agent processes a single word of user input.
Denis Yarats, Perplexity CTO

Cloudflare had published findings in the same period. Their Code Mode — which lets agents write and execute code rather than call pre-defined MCP tools — cut token usage by 81% compared to describing the same API surface as MCP tool definitions. For a complex integration of 2,500 API endpoints, MCP required roughly 244,000 tokens to express what Code Mode expressed in approximately 1,000.

Two major operators, same conclusion: the protocol has a cost problem. And the cost is context.
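The arithmetic behind those figures is easy to sanity-check. A quick back-of-the-envelope sketch using only the Cloudflare numbers quoted above (the per-endpoint figure is our own division, not a published statistic):

```python
# Back-of-the-envelope check on the Cloudflare figures quoted above.
mcp_tokens = 244_000      # ~tokens to describe 2,500 endpoints as MCP tool schemas
code_mode_tokens = 1_000  # ~tokens for the same surface via Code Mode
endpoints = 2_500

per_endpoint = mcp_tokens / endpoints   # schema cost per endpoint
ratio = mcp_tokens / code_mode_tokens   # overhead factor for this integration

print(f"~{per_endpoint:.0f} tokens of schema per endpoint, {ratio:.0f}x overall")
```

Roughly a hundred tokens of schema per endpoint, paid on every conversation, before any work happens.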

···

The M×N Argument

To be fair about MCP: it was designed to solve a real problem. Before the protocol existed, every AI integration was custom. You wanted your agent to query a database — you wrote a function, described it to the model, handled the call, parsed the response. You wanted it to interact with GitHub — same process, different implementation. M models multiplied by N tools produced M×N custom integrations. Every new model meant re-implementing every tool. Every new tool meant integrating it with every model.

MCP promised to reduce this to M+N. Implement the protocol once on each side. Any compliant model talks to any compliant tool server without custom glue code. Anthropic announced the protocol on November 25, 2024. OpenAI, Google DeepMind, and Microsoft followed within months. The argument sounded reasonable. On paper it still does.
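The combinatorics are simple enough to state in a few lines. A toy sketch, where the model and tool counts are our own illustrative picks, not figures from the announcement:

```python
models, tools = 10, 50  # illustrative counts, not from the announcement

custom_integrations = models * tools    # M x N: one custom adapter per pair
protocol_integrations = models + tools  # M + N: one client per model, one server per tool

print(custom_integrations, protocol_integrations)
```

At these counts the protocol collapses 500 bespoke adapters into 60 implementations, which is why the pitch sounded so reasonable.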

The Shell Was Right There

Large language models are trained on billions of shell interactions. Stack Overflow answers that show curl commands. GitHub repositories full of Makefiles and shell scripts. Man pages. README files. Decades of Unix knowledge, densely represented in the training corpus.

The practical consequence: models already know gh, git, stripe, aws, curl, jq, psql — not superficially, but deeply. They know the flags, the output formats, the pipe patterns, the error codes. This knowledge costs zero tokens to activate. There is no schema to load. No server to start. No protocol to negotiate. You give the model shell access, and it already knows how to use every mature CLI tool in existence.

CLI tools also compose natively. The model does not just know the individual tools — it knows the patterns for chaining them. `gh issue list --json number | jq '.[].number'` is not something the model needs to be taught. It is something the model has seen thousands of times. That composability is structural, not incidental.

Any tool with a CLI is immediately accessible. Most mature tools — Stripe, GitHub, AWS, Cloudflare, Kubernetes, PostgreSQL — have excellent CLIs with complete API coverage. The initialization cost is zero.

| Dimension | CLI | MCP |
|---|---|---|
| Initialization cost | Zero — model pre-trained on shell | Schema loading on every conversation |
| Model familiarity | Deep — billions of training examples | Protocol is 16 months old |
| Composability | Native via pipes and shell operators | Requires custom orchestration |
| Auth complexity | Standard credential files, env vars | OAuth flows, token management per server |
| Deployment | Tools already installed | MCP server must be running and reachable |
| Reliability | 100% in Scalekit benchmark | 72% — 7/25 runs failed (TCP timeouts) |
···

The Numbers Don't Lie

Scalekit ran 75 head-to-head comparisons for token efficiency and a separate 25-run reliability test for MCP against GitHub's Copilot server. The results were not close.

- 32x: token overhead, MCP vs CLI, for the same simple task. Scalekit benchmark: 44,026 tokens (MCP) vs 1,365 tokens (CLI). The difference is almost entirely schema — 43 tool definitions injected into every conversation.
- 72%: MCP reliability in the production benchmark. CLI achieved 100%; MCP failed 7 of 25 runs with TCP-level connection timeouts against GitHub's Copilot MCP server.
- 81%: token reduction, Code Mode vs MCP, for complex APIs. Cloudflare benchmark: 2,500 API endpoints as MCP tools = ~244,000 tokens; via Code Mode = ~1,000 tokens.

The reliability gap matters as much as the token gap. A 72% success rate is not a production-viable reliability posture for any synchronous workflow. The failures were not application errors — they were TCP-level connection timeouts, which means the underlying transport was the failure point. This is a structural problem with long-lived MCP server connections, not a configuration issue.
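One way to see why: if each call independently succeeds 72% of the time, chained calls compound the failure. A small sketch under a simplifying independence assumption (the chain lengths are illustrative, not part of the Scalekit benchmark):

```python
p = 0.72  # per-call success rate observed in the Scalekit MCP benchmark

# Probability that a chain of sequential calls all succeed, assuming independence.
for steps in (1, 3, 5):
    print(f"{steps}-step chain completes {p ** steps:.0%} of the time")
```

A five-step workflow at 72% per call completes roughly one run in five. That is not a posture you can retry your way out of cheaply.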

The token numbers explain why Perplexity moved away. At scale, that 32x difference compounds into meaningful inference cost and, more importantly, meaningful reduction in the context available for actual task work.
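At volume, the token gap becomes a line item. A rough cost sketch, where the per-token price and request volume are our own illustrative assumptions, not Scalekit figures:

```python
mcp_tokens, cli_tokens = 44_026, 1_365  # Scalekit: same task, MCP vs CLI
price_per_mtok = 3.00                   # assumed $ per million input tokens
requests_per_day = 100_000              # assumed agent traffic

def daily_cost(tokens_per_request: int) -> float:
    """Daily input-token spend for one request shape at the assumed price."""
    return tokens_per_request * requests_per_day * price_per_mtok / 1_000_000

print(f"MCP ${daily_cost(mcp_tokens):,.0f}/day vs CLI ${daily_cost(cli_tokens):,.0f}/day")
```

And the dollars are the smaller half of the bill: every schema token is also context the agent no longer has for the task itself.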

···

CORBA, SOAP, and Now MCP

This pattern has a history. In the 1990s, a committee of enterprise software companies designed CORBA — the Common Object Request Broker Architecture — to solve distributed object communication. The problem they identified was real: heterogeneous systems needed to call each other's methods across language and network boundaries. The solution they built was elaborate. CORBA's object adapter API required 200+ lines of interface definitions for functionality that needed approximately 30 lines. ACM Queue documented this in 2006, noting the ceremony-to-function ratio as a primary reason for CORBA's eventual abandonment.

SOAP repeated the pattern in the early 2000s. Microsoft's answer to web services: XML envelopes, WSDL interface description files, strict schemas, code generation pipelines. The problem SOAP addressed — cross-system method invocation over HTTP — was genuine. The solution was ceremonial.

Roy Fielding published his PhD dissertation in 2000. It described REST: use HTTP as it was designed, treat resources as URLs, use verbs as operations. HTTP was already there. REST won.

Three Honest Hypotheses

Why did MCP end up this way? Three hypotheses, none of them flattering, all of them plausible.

Why Anthropic Built a Protocol Instead of Using the Shell

01
The Unix gap

The engineers who designed MCP came predominantly from ML and research backgrounds, not systems and Unix backgrounds. They did not think instinctively in terms of shell pipelines, tool composition, and the Unix philosophy of small tools that do one thing well. They thought in terms of APIs, schemas, and protocols — the vocabulary of the environments they knew. The shell was not invisible to them; it simply was not their first instinct for the integration layer.

02
Protocol as moat

A proprietary protocol, even an "open" one, creates ecosystem gravity. If every tool implements MCP for Claude, switching to another model mid-workflow introduces friction — the new model needs MCP client support. The Linux Foundation donation in December 2025 neutralised this concern in practice, but the incentive existed at design time. A protocol with Claude as the primary client has different strategic value than a bash tool that works with any model.

03
The M×N framing was real, but the solution was wrong

The combinatorial integration explosion problem that MCP was designed to solve is genuine. The mistake was in the solution: build a new protocol layer instead of asking what primitive already solves this. The answer was the shell. Any model with bash access can call any CLI tool. The M×N problem dissolves not through a new protocol but through a shared execution environment that all models can already reason about.

···

Anthropic's Own Product Proves the Point

Claude Code is Anthropic's flagship developer product. It is the company's most visible bet on agentic AI. It ships with a bash tool — direct shell access — as its primary mechanism for interacting with the developer's environment.

Claude Code can run `gh pr create`, `stripe customers list`, `git log --oneline`, `kubectl get pods`. It does all of this without MCP servers, without JSON-RPC, without schema loading, without protocol negotiation. It opens a shell, runs commands, reads output, and reasons about what to do next.
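The mechanism is architecturally tiny. A minimal sketch of what a bash tool amounts to — our own illustration of the pattern, not Claude Code's actual implementation:

```python
import subprocess

def bash_tool(command: str, timeout: int = 60) -> str:
    """Run one shell command and return what the model sees: exit code plus output."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    return f"(exit {result.returncode})\n{result.stdout}{result.stderr}"

# The agent loop is just: model emits a command, tool returns output,
# model reasons about it, repeat.
print(bash_tool("echo hello"))
```

No schema registry, no transport negotiation — the entire integration surface is a string in and a string out.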

This is not a minor implementation detail. This is the company that invented MCP, in their most-used developer product, making an explicit architectural choice to use the shell instead of their own protocol.

The company that invented MCP built their flagship developer product on a bash tool.
Fordel Studios

The most charitable interpretation is that MCP and bash serve different use cases, and Anthropic chose the right tool for each. That may be correct. The less charitable interpretation is that the engineers building Claude Code — who are closer to the daily reality of agent tool use than the team that designed MCP — made a pragmatic judgment that the protocol they inherited was not the right abstraction for their product.

···

Where MCP Survives and Where It Doesn't

MCP has genuine strengths in specific scenarios. Multi-tenant SaaS is the clearest case: when an agent needs to act on behalf of different users, each with their own credentials and access scopes, MCP's OAuth-per-user model is structurally correct. The CLI alternative — switching credential files per user — is workable but clunky at scale.

Dynamic tool discovery is another legitimate use case. If an agent needs to discover new tools at runtime without a redeploy, MCP's discovery mechanism has no obvious CLI equivalent. APIs with no CLI coverage are a third case where MCP may be the only practical option.

Where MCP fails: production agent pipelines where token cost compounds at volume, latency-sensitive workflows where server startup and schema loading add measurable overhead, and any deployment with a fixed, known toolset where the dynamic discovery benefit does not apply.

| Scenario | Better Approach | Reason |
|---|---|---|
| Single-agent, known tools with CLI | CLI | Zero initialization cost, full model familiarity |
| Multi-tenant SaaS, per-user auth | MCP | OAuth-per-user is structurally correct |
| Latency-sensitive pipeline | CLI | No server startup, no schema loading |
| Fixed toolset, no new tools at runtime | CLI | Dynamic discovery adds cost with no benefit |
| API with no CLI coverage | MCP or direct API call | No CLI alternative exists |
| Dynamic tool discovery | MCP | Protocol handles this; CLI does not |
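These scenarios reduce to a short decision rule. A toy sketch of how the choice might be encoded — the function name and flags are ours, purely illustrative:

```python
def choose_integration(has_cli: bool, per_user_oauth: bool, dynamic_discovery: bool) -> str:
    """Toy decision rule mirroring the scenarios above; CLI is the default."""
    if per_user_oauth:
        return "MCP"              # multi-tenant SaaS, per-user scopes
    if dynamic_discovery:
        return "MCP"              # toolset unknown until runtime
    if not has_cli:
        return "direct API call"  # no CLI surface exists
    return "CLI"                  # everything else

print(choose_integration(has_cli=True, per_user_oauth=False, dynamic_discovery=False))
```

The order of the checks is the point: MCP has to claim an exception before the shell loses by default.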

The security picture adds weight to the CLI side. Docker's analysis of open-source MCP servers (Docker blog: MCP Security Issues Threatening AI Infrastructure) found that 43% have command injection vulnerabilities and 43% have flawed OAuth authentication flows. These are not edge cases — they represent structural problems with how MCP server authors are handling two of the hardest security problems in software. CLI security is not perfect, but its threat model is well-understood and its failure modes are documented by decades of practice.

How We Actually Build

At Fordel, CLI is the default. The burden of proof is on MCP, not on the shell. If a tool has a mature CLI — GitHub, Stripe, AWS, Cloudflare, PostgreSQL, Kubernetes — we use the CLI. Zero initialization cost, full model familiarity, high reliability. The model already knows these tools. We do not need to teach it.

MCP earns its way in on two conditions, both of which are genuine exceptions rather than defaults. First: multi-tenant SaaS where the agent acts on behalf of distinct end users, each with their own OAuth scope — at that point, CLI credential-switching becomes clunky enough that MCP's per-user auth model is structurally correct. Second: a system that has no CLI at all, where neither a shell command nor a direct API call is practical. Both cases exist. Neither is common.

Every other integration decision starts and ends with the shell. Not because it is familiar, but because it is the simplest mechanism that meets the security and reliability requirements. Simplest wins. It always has.

The engineers building the best AI agents in 2026 know their Unix tools as well as their LLM APIs. That is not a coincidence.
