The Crack Appeared Fast
On March 11, 2026, Perplexity CTO Denis Yarats took the stage at the Ask 2026 conference and announced they were moving away from MCP. The statement was direct: tool schemas eat 72% of the context window before the agent processes a single word of user input. Authentication is clunky. Most features go unused. For Perplexity's use case, MCP was more overhead than it was worth.
This was not a fringe voice. Perplexity runs one of the highest-volume AI query pipelines in the industry. When their CTO makes an architectural decision public, it carries signal.
“MCP's tool definitions consume 72% of available context window before the agent processes a single word of user input.”
Cloudflare had published findings in the same period. Their Code Mode — which lets agents write and execute code rather than calling pre-defined MCP tools — cut token usage by 81% compared to describing the same API surface as MCP tool definitions. For a complex integration spanning 2,500 API endpoints, MCP required roughly 244,000 tokens to express what Code Mode expressed in approximately 1,000 tokens.
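To make the difference concrete, here is a minimal sketch of the code-execution style. The API response is stubbed locally so the snippet runs offline, and the endpoint and field names are illustrative; in a real agent the `response=` line would be a live `curl` call:

```shell
# Code-execution style: the agent emits a few lines of ordinary script
# instead of selecting from thousands of pre-loaded tool schemas.
# The response is stubbed here; a real run would look like:
#   response=$(curl -s https://api.example.com/v1/customers?limit=1)
response='{"data":[{"id":"cus_123","email":"a@example.com"}]}'

# Extract exactly what the task needs and nothing more.
echo "$response" | jq -r '.data[0].id'
```

The point is not the specific command but the shape: a handful of tokens of script replaces the schema describing every endpoint the agent might conceivably call.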
Two major operators, same conclusion: the protocol has a cost problem. And the cost is context.
The M×N Argument
To be fair about MCP: it was designed to solve a real problem. Before the protocol existed, every AI integration was custom. You wanted your agent to query a database — you wrote a function, described it to the model, handled the call, parsed the response. You wanted it to interact with GitHub — same process, different implementation. M models multiplied by N tools produced M×N custom integrations. Every new model meant re-implementing every tool. Every new tool meant integrating it with every model.
MCP promised to reduce this to M+N. Implement the protocol once on each side. Any compliant model talks to any compliant tool server without custom glue code. Anthropic announced the protocol on November 25, 2024. OpenAI, Google DeepMind, and Microsoft followed within months. The argument sounded reasonable. On paper it still does.
The Shell Was Right There
Large language models are trained on billions of shell interactions. Stack Overflow answers that show curl commands. GitHub repositories full of Makefiles and shell scripts. Man pages. README files. Decades of Unix knowledge, densely represented in the training corpus.
The practical consequence: models already know gh, git, stripe, aws, curl, jq, psql — not superficially, but deeply. They know the flags, the output formats, the pipe patterns, the error codes. This knowledge costs zero tokens to activate. There is no schema to load. No server to start. No protocol to negotiate. You give the model shell access, and it already knows how to use every mature CLI tool in existence.
CLI tools also compose natively. The model does not just know the individual tools — it knows the patterns for chaining them. `gh issue list --json number | jq '.[].number'` is not something the model needs to be taught. It is something the model has seen thousands of times. That composability is structural, not incidental.
Any tool with a CLI is immediately accessible. Most mature tools — Stripe, GitHub, AWS, Cloudflare, Kubernetes, PostgreSQL — have excellent CLIs with complete API coverage. The initialization cost is zero.
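As a concrete instance of that composability — the GitHub output is stubbed here so the snippet runs without credentials; in practice the first line would be a live `gh issue list --json number,title`:

```shell
# Stub of `gh issue list --json number,title` output; a live run would
# pipe the real command instead of this variable.
issues='[{"number":101,"title":"Fix login"},{"number":102,"title":"Update docs"}]'

# The pipe patterns the model has seen thousands of times:
# project a field, count results — no orchestration layer required.
echo "$issues" | jq -r '.[].number'   # issue numbers, one per line
echo "$issues" | jq 'length'          # how many open issues
```

No schema was loaded to make this work; the composition vocabulary is already in the weights.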
| Dimension | CLI | MCP |
|---|---|---|
| Initialization cost | Zero — model pre-trained on shell | Schema loading on every conversation |
| Model familiarity | Deep — billions of training examples | Protocol is 16 months old |
| Composability | Native via pipes and shell operators | Requires custom orchestration |
| Auth complexity | Standard credential files, env vars | OAuth flows, token management per server |
| Deployment | Tools already installed | MCP server must be running and reachable |
| Reliability | 100% in Scalekit benchmark | 72% — 7/25 runs failed (TCP timeouts) |
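The auth row of the table can be illustrated in two lines. The token value is a placeholder; the pattern itself — one environment variable read by both `gh` and raw `curl` calls — is standard:

```shell
# Standard CLI auth: one environment variable, no OAuth flow to build.
# The token value below is a placeholder for illustration.
export GITHUB_TOKEN="ghp_placeholder"

# gh picks up GITHUB_TOKEN automatically; a raw API call passes it
# as a header instead:
#   curl -H "Authorization: Bearer $GITHUB_TOKEN" https://api.github.com/user
echo "credentials configured: ${GITHUB_TOKEN:+yes}"
```

Compare that with standing up an OAuth flow and token refresh logic per MCP server.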
The Numbers Don't Lie
Scalekit ran 75 head-to-head comparisons for token efficiency and a separate 25-run reliability test of MCP against GitHub's Copilot server. The results were not close.
The reliability gap matters as much as the token gap. A 72% success rate is not a production-viable reliability posture for any synchronous workflow. The failures were not application errors — they were TCP-level connection timeouts, which means the underlying transport was the failure point. This is a structural problem with long-lived MCP server connections, not a configuration issue.
The token numbers explain why Perplexity moved away. At scale, the 32x token difference Scalekit measured compounds into meaningful inference cost and, more importantly, a meaningful reduction in the context available for actual task work.
CORBA, SOAP, and Now MCP
This pattern has a history. In the 1990s, a committee of enterprise software companies designed CORBA — the Common Object Request Broker Architecture — to solve distributed object communication. The problem they identified was real: heterogeneous systems needed to call each other's methods across language and network boundaries. The solution they built was elaborate. CORBA's object adapter API required 200+ lines of interface definitions for functionality that needed approximately 30 lines. ACM Queue documented this in 2006, noting the ceremony-to-function ratio as a primary reason for CORBA's eventual abandonment.
SOAP repeated the pattern in the early 2000s. Microsoft's answer to web services: XML envelopes, WSDL interface description files, strict schemas, code generation pipelines. The problem SOAP addressed — cross-system method invocation over HTTP — was genuine. The solution was ceremonial.
Roy Fielding published his PhD dissertation in 2000. It described REST: use HTTP as it was designed, treat resources as URLs, use verbs as operations. HTTP was already there. REST won.
Three Honest Hypotheses
Why did MCP end up this way? Three hypotheses, none of them flattering, all of them plausible.
Why Anthropic Built a Protocol Instead of Using the Shell
The engineers who designed MCP came predominantly from ML and research backgrounds, not systems and Unix backgrounds. They did not think instinctively in terms of shell pipelines, tool composition, and the Unix philosophy of small tools that do one thing well. They thought in terms of APIs, schemas, and protocols — the vocabulary of the environments they knew. The shell was not invisible to them; it simply was not their first instinct for the integration layer.
A proprietary protocol, even an "open" one, creates ecosystem gravity. If every tool implements MCP for Claude, switching to another model mid-workflow introduces friction — the new model needs MCP client support. The Linux Foundation donation in December 2025 neutralized this concern in practice, but the incentive existed at design time. A protocol with Claude as the primary client has different strategic value than a bash tool that works with any model.
The combinatorial integration explosion problem that MCP was designed to solve is genuine. The mistake was in the solution: build a new protocol layer instead of asking what primitive already solves this. The answer was the shell. Any model with bash access can call any CLI tool. The M×N problem dissolves not through a new protocol but through a shared execution environment that all models can already reason about.
Anthropic's Own Product Proves the Point
Claude Code is Anthropic's flagship developer product. It is the company's most visible bet on agentic AI. It ships with a bash tool — direct shell access — as its primary mechanism for interacting with the developer's environment.
Claude Code can run `gh pr create`, `stripe customers list`, `git log --oneline`, `kubectl get pods`. It does all of this without MCP servers, without JSON-RPC, without schema loading, without protocol negotiation. It opens a shell, runs commands, reads output, and reasons about what to do next.
This is not a minor implementation detail. This is the company that invented MCP, in their most-used developer product, making an explicit architectural choice to use the shell instead of their own protocol.
“The company that invented MCP built their flagship developer product on a bash tool.”
The most charitable interpretation is that MCP and bash serve different use cases, and Anthropic chose the right tool for each. That may be correct. The less charitable interpretation is that the engineers building Claude Code — who are closer to the daily reality of agent tool use than the team that designed MCP — made a pragmatic judgment that the protocol they inherited was not the right abstraction for their product.
Where MCP Survives and Where It Doesn't
MCP has genuine strengths in specific scenarios. Multi-tenant SaaS is the clearest case: when an agent needs to act on behalf of different users, each with their own credentials and access scopes, MCP's OAuth-per-user model is structurally correct. The CLI alternative — switching credential files per user — is workable but clunky at scale.
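The clunky-but-workable CLI version looks roughly like this. `GH_CONFIG_DIR` is a real `gh` environment variable; the per-tenant directory layout is a hypothetical convention for the sketch:

```shell
# Hypothetical per-tenant layout: each tenant gets its own gh config
# directory holding its own token. gh reads GH_CONFIG_DIR natively.
tenant="acme-corp"
export GH_CONFIG_DIR="${TMPDIR:-/tmp}/gh-tenants/$tenant"
mkdir -p "$GH_CONFIG_DIR"

# Every gh invocation from here on uses this tenant's credentials.
echo "gh config dir: $GH_CONFIG_DIR"
```

This works, but every tenant switch is a process-environment mutation — exactly the kind of state juggling MCP's per-user OAuth model avoids.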
Dynamic tool discovery is another legitimate use case. If an agent needs to discover new tools at runtime without a redeploy, MCP's discovery mechanism has no obvious CLI equivalent. APIs with no CLI coverage are a third case where MCP may be the only practical option.
Where MCP fails: production agent pipelines where token cost compounds at volume, latency-sensitive workflows where server startup and schema loading add measurable overhead, and any deployment with a fixed, known toolset where the dynamic discovery benefit does not apply.
| Scenario | Better Approach | Reason |
|---|---|---|
| Single-agent, known tools with CLI | CLI | Zero initialization cost, full model familiarity |
| Multi-tenant SaaS, per-user auth | MCP | OAuth-per-user is structurally correct |
| Latency-sensitive pipeline | CLI | No server startup, no schema loading |
| Fixed toolset, no new tools at runtime | CLI | Dynamic discovery adds cost with no benefit |
| API with no CLI coverage | MCP or direct API call | No CLI alternative exists |
| Dynamic tool discovery | MCP | Protocol handles this; CLI does not |
The security picture adds weight to the CLI side. Docker's analysis of open-source MCP servers (Docker blog: MCP Security Issues Threatening AI Infrastructure) found that 43% have command injection vulnerabilities and 43% have flawed OAuth authentication flows. These are not edge cases — they represent structural problems with how MCP server authors are handling two of the hardest security problems in software. CLI security is not perfect, but its threat model is well-understood and its failure modes are documented by decades of practice.
How We Actually Build
At Fordel, CLI is the default. The burden of proof is on MCP, not on the shell. If a tool has a mature CLI — GitHub, Stripe, AWS, Cloudflare, PostgreSQL, Kubernetes — we use the CLI. Zero initialization cost, full model familiarity, high reliability. The model already knows these tools. We do not need to teach it.
MCP earns its way in on two conditions, both of which are genuine exceptions rather than defaults. First: multi-tenant SaaS where the agent acts on behalf of distinct end users, each with their own OAuth scope — at that point, CLI credential-switching becomes clunky enough that MCP's per-user auth model is structurally correct. Second: a system that has no CLI at all, where neither a shell command nor a direct API call is practical. Both cases exist. Neither is common.
Every other integration decision starts and ends with the shell. Not because it is familiar, but because it is the simplest mechanism that meets the security and reliability requirements. Simplest wins. It always has.
The engineers building the best AI agents in 2026 know their Unix tools as well as their LLM APIs. That is not a coincidence.