MCP in Production: The Engineering Reality

The Model Context Protocol went from Anthropic proposal to industry standard in 14 months. Stripe, GitHub, Cloudflare, and Amazon all ship official MCP servers. OpenAI and Google adopted it. The Linux Foundation now governs it. What the adoption curve hides: 88% of open-source MCP servers have broken authentication, a single CVE compromised 437,000 developer environments, and observability across multi-server setups remains a manual pain. A complete engineering guide to what MCP is, how it actually works in production, and what to get right before you ship.

Abhishek Sharma· Head of Engg @ Fordel Studios

March 20, 2026Updated April 8, 202613 min read

MCP in Production: The Engineering Reality

Fourteen months. That is how long it took the Model Context Protocol to go from an Anthropic technical proposal to an industry-governed standard adopted by OpenAI, Google DeepMind, Microsoft, and hundreds of enterprise teams. The protocol was open-sourced in November 2024. By December 2025, Anthropic had donated it to the Agentic AI Foundation — a Linux Foundation project — making it vendor-neutral.

That trajectory is not normal for infrastructure standards. HTTP took years. OAuth 2.0 took nearly a decade to reach serious enterprise adoption. MCP achieved governance transfer in 13 months. The reason is structural: every team building AI agents hit the same integration wall at the same time, and MCP was the first credible solution that solved the problem at the right abstraction level.

The problem it solves, and the engineering reality of running it in production, are the two things worth understanding before you build on it.

···

What MCP Actually Solves

Before MCP, every AI integration was custom. You wanted your agent to query a database — you wrote a function, described it to the model, handled the call, parsed the response. You wanted it to read from GitHub — same process, different implementation. You wanted it to do both in a single workflow — now you maintained two custom integrations, each with its own auth, its own error handling, its own context representation.

Scale that to a production agent that needs 15 tools, and you have a maintenance surface that grows quadratically. Every new model you want to use requires re-implementing the integrations for that model's function-calling format. Every new tool requires integrations for every model.

Dimension	Function Calling (pre-MCP)	MCP
Tool scope	Per-model, per-application	Cross-model, cross-application standard
Deployment	Bundled inside app runtime	Independently deployed tool servers
Discovery	Hardcoded in system prompt	Dynamic at runtime via protocol
Scaling	Tools share app resources	Each server scales independently
Reusability	Re-implemented per integration	One server, any compliant client
Auth handling	Application-managed	Defined per-server (OAuth or API key)
Versioning	Tied to application deploy	Server-versioned independently

MCP adds meaningful operational overhead compared to a single inline function call. For a single-model, single-tool case, that overhead is not justified. For any agent with more than three tools, or any team maintaining tools across multiple AI products, the MCP model pays for itself quickly.

The Growth Numbers

8M+MCP server downloads by April 2025Up from approximately 100,000 in November 2024 — the month the protocol launched

97MMonthly SDK downloads by late 2025Sourced from Zuplo's State of MCP Report

5,800+MCP servers in the public ecosystemIncluding official servers from Stripe, GitHub, Cloudflare, Vercel, Figma, and Supabase

OpenAI added MCP support to its Agents SDK, Responses API, and ChatGPT desktop in March 2025. Google DeepMind confirmed Gemini MCP support the following month. Microsoft added it to Windows. The inflection point was not Anthropic's own adoption — it was the moment competing model providers decided MCP was the standard rather than a protocol to compete with.

“The protocol won not when Anthropic shipped it, but when OpenAI adopted it.”

That moment matters because it changed the risk calculus for teams considering MCP. Building on a proprietary Anthropic protocol carried model lock-in risk. Building on a standard that OpenAI, Google, and Microsoft all support is a different proposition entirely.

···

How MCP Works at the Transport Layer

Understanding the transport evolution is essential for production deployments, because the original transport design had hard limits that shaped how early adopters hit the wall.

The original MCP specification used Server-Sent Events (SSE) for server-to-client communication. SSE is fine for simple browser-to-server streaming, but in a multi-agent, multi-tool production environment it created stateful connection requirements that did not survive real network conditions. Long-lived SSE connections do not route cleanly through load balancers. They do not scale horizontally without sticky sessions. In a Kubernetes environment they become operational debt almost immediately.

For local development and single-process agents, stdio transport remains valid — the agent process spawns the MCP server as a child process and communicates over stdin/stdout. This is simpler but does not survive multi-process or distributed deployments. The practical rule: stdio for local tooling, Streamable HTTP for anything that sees production traffic.

The Companies That Moved First

The most informative signal in the MCP ecosystem is which companies shipped official servers and what they exposed. These are not experiments — they represent specific engineering bets about how AI will integrate with their platforms.

Stripe shipped an MCP server exposing payment flows, webhook management, and dashboard operations. The bet: AI agents will increasingly orchestrate payment workflows on behalf of users, and those agents need a standardized way to interact with Stripe's API without per-integration custom code.

GitHub's official MCP server enables read/write access to code, issues, and pull requests. This is the foundation for coding agents that do real repository work — not just suggesting code but committing it, opening PRs, triaging issues. The GitHub Actions integration layer becomes relevant once agents can take these actions autonomously.

Cloudflare made an aggressive move: they not only built an MCP server for Workers, KV, R2, and D1, but also launched managed remote MCP server hosting — essentially Cloudflare Workers as an MCP deployment target. This positions them as infrastructure for the MCP ecosystem, not just a participant.

Supabase, Vercel, and Figma all ship official servers. Block, Bloomberg, and Amazon are documented Fortune 500 production deployments. The pattern: companies that move developer tools and APIs are first. Consumer-facing platforms are next.

···

The Security Reality

The adoption curve looks clean from a distance. The security picture up close is not.

The consequences of this are not theoretical:

Documented MCP Security Incidents

CVE-2025-6514: Vulnerability in the mcp-remote OAuth proxy compromised approximately 437,000 developer environments.
January 2026: Three CVEs in Anthropic's own reference Git MCP server — path traversal, arbitrary file deletion, and remote code execution when chained with a filesystem server.
Asana multi-tenant failure: Access control error in shared MCP infrastructure allowed one customer to read another customer's data due to missing auth token isolation.
Supabase + Cursor incident: Agent running with service_role access processed support tickets; attacker embedded SQL in a ticket that exfiltrated tokens through the MCP layer.
June 2025: Hundreds of MCP servers found bound to 0.0.0.0 by default with no authentication — exposed to any device on the local network.

The common thread in these incidents is not exotic attack techniques — it is the gap between what developers assume about MCP security and what the specification actually requires. MCP defines the protocol; it does not mandate auth. That decision is delegated to server implementers, and in the rush to ship integrations, auth often lands in technical debt.

Prompt injection is the other attack surface specific to MCP. Unlike traditional API injection, prompt injection in an MCP context can cause a model to call unintended tools, leak secrets through tool outputs, or execute privileged operations using the agent's credentials. The attack surface is the tool description itself — an adversarially crafted description can influence model behavior at inference time.

“MCP gives agents hands. Prompt injection teaches those hands to pickpocket.”

Fordel Studios

Production Security Checklist for MCP Servers

OAuth over static keys

Implement OAuth 2.0 with proper token scoping. Static API keys in environment variables do not support rotation, scoping, or revocation without rebuilding the integration. This is the single highest-leverage security improvement for most MCP deployments.

Principle of least privilege per tool

Each MCP tool should carry the minimum permissions required for its function. A tool that reads GitHub issues should not hold a token that can delete repositories. Scope credentials at tool granularity, not server granularity.

Multi-tenant isolation at the auth layer

If your MCP server handles requests from multiple users or organizations, auth tokens must be scoped per tenant and must not be shared across request contexts. The Asana incident is a direct failure of this pattern.

Tool description sanitization

Review tool descriptions for prompt injection risk. Descriptions that include user-controlled content, external data, or anything that can be manipulated at runtime create injection surface. Treat tool descriptions as a security boundary.

Bind to localhost, not 0.0.0.0

Local development MCP servers should never bind to 0.0.0.0. The June 2025 incident involved servers in developer environments that were reachable from other network devices due to this default. Explicit localhost binding in development configurations.

···

Observability: The Unsolved Problem

In a standard web application, request tracing is solved. You instrument the application, emit spans, and observe the full request path in Datadog, Honeycomb, or Jaeger. In a multi-agent MCP deployment, the trace crosses three boundaries: the LLM controller, the MCP client layer, and the individual tool servers.

Each of those layers emits its own logs. None of them share a trace context by default. When an agent fails to complete a task and you need to understand why — did the model misinterpret the tool schema? Did the tool server return a malformed response? Did auth silently fail? — you are correlating logs across separate systems by timestamp, which is slow and error-prone at production query volumes.

The practical answer in 2026: treat each MCP server as an instrumented microservice. Emit spans for every tool invocation, include the tool name, input hash (not raw input — inputs may contain sensitive data), response status, and latency. Aggregate these in your existing observability platform. Correlate with LLM controller logs via a shared request ID injected at the agent session boundary.

Multi-Server Orchestration

Most real agent deployments involve multiple MCP servers. A coding agent might simultaneously hold connections to a GitHub MCP server, a Jira MCP server, a documentation search server, and a code execution server. Each server has its own auth requirements, its own latency profile, its own failure modes.

Tool discovery becomes a meaningful engineering problem at this scale. An agent presented with 60+ tools from four servers will consume a significant portion of its context window just on tool descriptions, before any task-specific content. Selective tool loading — presenting only the tools relevant to the current task context — is the correct pattern, but it requires the MCP client to make relevance judgments that most current implementations do not support.

Scenario	Tool Count	Context Impact	Recommended Pattern
Single server, focused tools	3–8 tools	Minimal	Load all tools at session start
Multi-server, task-scoped	10–30 tools	Moderate	Load by server, filter to task
Full enterprise integration	30–100+ tools	Significant	Dynamic tool loading per subtask
Shared agent infrastructure	100+ tools	Context-critical	Tool router agent upstream

The tool router pattern — a lightweight upstream agent that interprets intent and selects the relevant server subset before handing off to the primary agent — is emerging as the standard solution for large tool sets. It adds a hop, but preserves context budget for actual task work.

···

The 2026 MCP Roadmap

The Agentic AI Foundation published the 2026 MCP roadmap in early January. The four areas with the most engineering consequence:

MCP 2026 Development Priorities

Agent-to-agent communication: Standardized protocol for MCP-enabled agents calling other agents as tools — enabling composable multi-agent pipelines without custom orchestration code.
Streaming responses: Tool servers returning partial results progressively, enabling agents to begin processing long-running tool outputs before completion.
Authorization delegation: Standardized patterns for agents requesting and delegating scoped permissions to tool servers, addressing the OAuth gap at the specification level.
Tool discovery registries: Public and private catalogs of MCP servers with standardized metadata, enabling agents to dynamically discover new tools at runtime.

The agent-to-agent communication work is the most consequential for production systems. Today, multi-agent pipelines require custom orchestration — you build the routing logic, the context passing, the failure handling. If MCP standardizes agent-to-agent calling, orchestration frameworks like LangChain and AutoGen lose their primary architectural differentiation. That has real consequences for teams that have bet heavily on those frameworks.

“If MCP standardizes agent-to-agent calling, every custom orchestration framework becomes a thin wrapper over a protocol.”

What to Build Now

The practical engineering guidance for teams in 2026:

If you are integrating AI into existing software, build your integration surface as an MCP server from the start rather than as custom function calling. The upfront cost is higher — you need to follow the spec, handle transport correctly, implement auth properly. The downstream benefit is that your integration works with any MCP-compatible agent, not just the one you built it for.

If you are building agents, do not write custom integrations for tools that already have MCP servers. GitHub, Stripe, Cloudflare, Supabase — official servers exist. Use them. The maintenance burden of custom integrations compounds; MCP server maintenance is someone else's problem.

If you are evaluating MCP servers from the ecosystem, treat them as third-party dependencies with security implications. Check the auth model before adding them to a production agent. Static key servers connecting to sensitive APIs are the highest-risk category.

Where Fordel Works in This Stack

We build production AI integrations for clients across SaaS, finance, and legal. The shift to MCP changes how we structure that work. Rather than building per-client, per-agent custom tool implementations, we now build MCP-compliant tool servers that expose client systems — CRMs, internal databases, document stores, compliance systems — to any agent the client runs.

The security and observability gaps in the ecosystem are real engineering work, not theoretical concerns. We instrument every MCP server we ship with OpenTelemetry spans. We implement OAuth at the server level, not API keys. We bind to proper scopes. These are not premium features — they are baseline requirements for anything touching production data.

If you are planning an agentic system and have not mapped your tool surface to MCP yet, that is the starting point. The integration architecture decision you make now will determine how much of your codebase you rewrite when agent-to-agent becomes standardized.

Frequently Asked Questions

What is MCP and what does a production MCP deployment actually require?

MCP (Model Context Protocol) is Anthropic's open protocol for connecting AI models to external tools and data sources. Production deployment requires more than the tutorial shows: proper auth on every tool call, structured error responses, idempotency for tool operations that have side effects, rate limiting, and full audit logging of every tool invocation with input/output snapshots.

How do you handle MCP server failures in production?

Production MCP failure handling: implement circuit breakers per tool, return structured error objects (not raw exceptions) so the LLM can reason about failures, add retry logic with exponential backoff for transient failures, design tools to be idempotent so retries do not cause duplicate side effects, and log all failures with full context for debugging.

What are the security requirements for production MCP servers?

MCP security checklist: authenticate every tool call (not just session init), validate all input parameters against your JSON Schema before execution, implement per-tool rate limiting, sanitize tool outputs before returning to the LLM to prevent indirect prompt injection, and maintain a tamper-proof audit log of all tool calls.

How does MCP scale in multi-tenant production environments?

Multi-tenant MCP requires per-tenant tool scoping (tenant A should not be able to call tools scoped to tenant B), tenant-isolated rate limiting, per-tenant audit logs, and session isolation to prevent state leakage between tenant sessions. Most open-source MCP server examples are single-tenant and require significant work to be safe in multi-tenant deployments.

When should you use MCP vs direct API integration in production?

Use MCP when the agent needs to dynamically discover and call tools across multiple capabilities without hard-coded integration logic. Use direct API integration when the tool surface is fixed, well-known, and performance-critical — MCP's protocol overhead (tool discovery, schema validation) adds 20–50ms per call compared to a direct HTTP call.

Part of: Fordel pillar guide

Building Production MCP Servers

Fordel's pillar guide to MCP — building, hosting, and securing custom servers for AI agent tool use.

Read the full guide →

Build with us

Need this kind of thinking applied to your product?

We build AI agents, full-stack platforms, and engineering systems. Same depth, applied to your problem.

Start a conversation View services

Newsletter

Enjoyed this? Get the weekly digest.

Research highlights and AI news, delivered every Thursday. No spam.

Loading comments...

Keep Reading

All articles