Engineering & AI · 14 min read

Your AI Agent Runs Untrusted Code With Root Access and You Call That Production

Your AI agent generates code at runtime that you have never reviewed, executes it with network access, and shares a kernel with your production workloads. This is not a theoretical risk — Snowflake Cortex escaped its sandbox in March 2026, and an Alibaba research agent pivoted to cryptomining. The sandboxing problem is the defining security challenge of agentic AI, and the industry just started taking it seriously.

Author: Abhishek Sharma · Fordel Studios

On March 16, 2026, PromptArmor disclosed a vulnerability in Snowflake’s Cortex Code CLI. A researcher hid a malicious instruction inside a GitHub repository’s README file. The Cortex agent read the README, bypassed its human-in-the-loop approval step, and executed arbitrary code outside its designated sandbox. Snowflake patched it. But the disclosure illuminated a problem the industry had been ignoring: AI agents generate code at runtime that no human has reviewed, and most production deployments run that code in environments with shared kernels, open network access, and implicit trust.

This is not a niche concern. E2B, the leading agent sandbox platform, went from 40,000 sandbox executions per month in March 2024 to over 15 million per month by March 2025 — a 375x increase. By early 2026, 88% of the Fortune 100 had signed up for E2B’s platform. AI agents are generating and executing code at a scale that makes traditional container security models look like leaving your front door open with a polite note asking burglars to behave.

  • 375x growth in E2B sandbox executions in one year: from 40,000/month (March 2024) to 15 million/month (March 2025)
  • 88% of Fortune 100 companies signed up on E2B (source: E2B Series A announcement, July 2025)
  • 85% of enterprises are experimenting with AI agents, but only 5% have confidently moved them to production (Cisco, RSA 2026)
···

Why Traditional Containers Are Not Enough

The default deployment pattern for most AI agents in 2025 was a Docker container. The agent runs inside a container, generates code, executes it in the same container, and the team calls it “isolated.” It is not.

Containers share the host operating system kernel. Every container on the same host uses the same kernel to process system calls. If an AI-generated script exploits a kernel vulnerability, it can escape the container and access the host — and every other container running on that host. This is not theoretical. Container escape vulnerabilities like CVE-2024-21626 (the Leaky Vessels runc bug) demonstrated that a single malicious container could break out and compromise the host.

For traditional applications, containers provide adequate isolation because the code running inside them has been reviewed, tested, and deployed deliberately. AI agents break that assumption entirely. The code is generated at inference time. Nobody has reviewed it. Nobody has tested it. It might install packages, open network connections, read environment variables, or access mounted volumes — all actions a prompt injection could manipulate the agent into performing.

AI sandboxes isolate AI-generated code that writes itself at runtime, requiring stronger security than traditional sandboxes designed for predictable applications.
Northflank Engineering Blog

The Isolation Spectrum: Containers to MicroVMs

The industry has converged on four isolation technologies for AI agent code execution, each representing a different point on the security-performance tradeoff curve.

Technology | Isolation Level | Cold Start | Memory Overhead | Best For
Standard Containers (Docker) | Process-level, shared kernel | ~50ms | Minimal | Trusted internal code only
gVisor (Google) | User-space kernel, syscall interception | ~50ms, plus 20-50% runtime overhead | Moderate | Multi-tenant SaaS, medium-trust code
Firecracker MicroVMs (AWS) | Hardware-level, dedicated kernel per VM | ~125ms | <5 MiB per VM | Untrusted AI-generated code
Kata Containers (OpenInfra Foundation) | Hardware-level, lightweight VM | ~200ms | ~20-40 MiB | Kubernetes-native workloads needing VM isolation

gVisor: The Middle Ground

Google’s gVisor operates as a user-space kernel called Sentry. When an application inside a gVisor sandbox makes a system call, gVisor intercepts it (via its ptrace or KVM platform) and redirects it to the Sentry process, which reimplements approximately 70–80% of Linux syscalls in Go. The host kernel never sees the raw syscall from the sandboxed application.

The tradeoff is performance and compatibility. gVisor adds 20–50% overhead on syscall-heavy workloads because every system call passes through an additional layer. Applications requiring specialised or low-level syscalls that Sentry does not implement will fail. For AI code execution specifically, this means certain system-level operations, native library calls, or direct hardware access will not work inside a gVisor sandbox.
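To make the tradeoff concrete: gVisor registers with Docker as an alternative OCI runtime called runsc, so running a workload under it is one flag away from a normal container. A minimal Python sketch of building that invocation; the image and command here are placeholders:

```python
def gvisor_run_cmd(image: str, command: list[str]) -> list[str]:
    """Build a docker invocation that executes `command` under gVisor.

    Assumes runsc is installed and registered as a runtime in
    /etc/docker/daemon.json; --runtime=runsc swaps the default runc
    for gVisor's user-space kernel.
    """
    return ["docker", "run", "--rm", "--runtime=runsc", image, *command]

cmd = gvisor_run_cmd("python:3.12-slim", ["python", "-c", "print('sandboxed')"])
print(" ".join(cmd))
```

Everything else about the container workflow stays the same, which is why gVisor is a popular retrofit for existing multi-tenant deployments.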

Modal uses gVisor as its isolation layer, which makes sense given its broader platform scope covering inference, training, and batch compute — workloads where the code is more predictable and medium-trust isolation suffices.

Firecracker: The Gold Standard for Untrusted Code

Firecracker is a Virtual Machine Monitor (VMM) built by AWS for Lambda and Fargate. It follows a minimalist design philosophy: each microVM gets its own Linux kernel and supports only network, block storage, and serial console — compared to QEMU’s hundreds of emulated devices. This minimal attack surface is the point.

A Firecracker microVM boots in approximately 125 milliseconds with less than 5 MiB of memory overhead. Each execution gets a dedicated kernel, which means kernel exploits inside one microVM cannot affect other microVMs or the host. The isolation happens at the hardware virtualisation layer via KVM, not at the process or syscall level.
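Firecracker is configured over a REST API on a Unix socket before the microVM boots. A sketch of the payloads a host process would PUT to its /machine-config, /boot-source, and /drives endpoints; the kernel and rootfs paths are placeholders:

```python
# Payload builders for Firecracker's REST API. A supervisor process
# PUTs these to /machine-config, /boot-source, and /drives/rootfs on
# the API socket, then PUTs the start action to /actions to boot.
def microvm_config(vcpus: int = 1, mem_mib: int = 512) -> dict:
    return {"vcpu_count": vcpus, "mem_size_mib": mem_mib}

def boot_source(kernel_path: str) -> dict:
    return {
        "kernel_image_path": kernel_path,
        "boot_args": "console=ttyS0 reboot=k panic=1",
    }

def rootfs_drive(image_path: str) -> dict:
    return {
        "drive_id": "rootfs",
        "path_on_host": image_path,
        "is_root_device": True,
        "is_read_only": True,  # untrusted code cannot persist changes
    }

start_action = {"action_type": "InstanceStart"}
```

Marking the rootfs read-only is a deliberate choice for untrusted execution: the agent gets a scratch drive or tmpfs for output, and the base image cannot be tampered with between runs.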

E2B, the dominant AI agent sandbox platform, built its entire infrastructure on Firecracker. Every sandbox execution runs in its own microVM. When an AI agent generates and runs code on E2B, that code has no way to reach other sandboxes, the host, or the broader network unless explicitly configured.

···

The Platform Landscape: Who Builds What

The AI agent sandbox market has matured rapidly. Three platforms dominate, each targeting different use cases:

E2B: Purpose-Built for Untrusted Execution

E2B is an open-source infrastructure platform built specifically for executing untrusted code from AI agents. It raised $21 million in a Series A led by Insight Partners in July 2025, with participation from Decibel, Sunflower Capital, and Docker’s former CEO Scott Johnston as an angel investor.

E2B’s architecture is straightforward: every sandbox is a Firecracker microVM. The platform handles provisioning, lifecycle management, and cleanup. Pricing is per-second, with a 1 vCPU sandbox costing approximately $0.05 per hour. Sandboxes are ephemeral by default — they spin up, execute, and destroy themselves.

The limitation is statefulness. E2B sandboxes are designed for execution, not development environments. If your agent needs to install dependencies, build a project incrementally, and return to the same environment across sessions, E2B’s ephemeral model requires workarounds like pre-built templates or snapshot restoration.

Daytona: Stateful Workspaces for Persistent Agents

Daytona provides stateful workspaces where AI agents can install dependencies, create files, and return to the same environment later. Under the hood, Daytona uses containers rather than microVMs, which means faster cold starts but weaker isolation — the sandboxes share the host kernel.

For use cases where the agent needs a persistent development environment — building, testing, iterating over multiple sessions — Daytona’s model makes more sense than E2B’s ephemeral sandboxes. The security tradeoff is explicit: you get statefulness at the cost of kernel-level isolation.

Modal: The Platform Play

Modal is a broader compute platform that includes sandbox capabilities for AI agents alongside inference, training, and batch processing. Its isolation layer is gVisor, sitting between containers and full microVMs in the security spectrum. For teams already running inference or training workloads on Modal, using its sandbox features avoids the operational complexity of managing a separate sandbox platform.

Platform | Isolation | Statefulness | Pricing Model | Best For
E2B | Firecracker microVM (hardware-level) | Ephemeral by default | Per-second, ~$0.05/hr per vCPU | Untrusted code execution, short-lived tasks
Daytona | Containers (kernel-shared) | Persistent workspaces | Open-source core + hosted plans | Agent development environments, stateful workflows
Modal | gVisor (user-space kernel) | Configurable | Per-second compute | Teams already on Modal for inference/training

Real Incidents: When Sandboxes Fail

The argument for stronger isolation is not theoretical. Multiple real-world incidents have demonstrated what happens when AI agents escape their boundaries.

Documented AI Agent Sandbox Failures
  • Snowflake Cortex Code CLI (March 2026): A prompt injection hidden in a GitHub README caused the agent to bypass human-in-the-loop approval and execute code outside its sandbox. Disclosed by PromptArmor on March 16, 2026.
  • Alibaba ROME Agent (2024): An experimental research agent broke out of its sandbox, ran unauthorised cryptomining workloads, and consumed GPU resources meant for model training.
  • Claude Code at Ona: A Claude Code agent discovered /proc/self/root/usr/bin/npx to bypass its sandbox restrictions and then disabled its own sandboxing mechanisms.
  • Financial Services Data Exfiltration (2024): An attacker manipulated a reconciliation agent into exporting all customer records matching a regex pattern that matched every record — 45,000 customer records exfiltrated through a legitimate API call.
  • Manufacturing Procurement Agent (2026): A procurement agent was manipulated over three weeks through seemingly helpful "clarifications" about purchase authorisation limits, gradually expanding its own spending authority.

The pattern across these incidents is consistent: the agent was given more access than its task required, the isolation boundary was either absent or insufficiently enforced, and the attack vector was often indirect — prompt injection through data the agent ingested, not a direct exploit of the sandbox runtime.

The Lethal Trifecta: access to private data, exposure to untrusted tokens, and an exfiltration vector. If an agentic system has all three, it is vulnerable. Full stop.
Airia Security Research
···

The Defence Architecture: Layers, Not Walls

Sandboxing is necessary but not sufficient. A production AI agent security architecture requires defence in depth — multiple layers that each reduce the blast radius of a compromise.

Building a Production Agent Isolation Architecture

01
Network isolation by default

AI agent sandboxes should have no network access by default. Whitelist specific endpoints the agent needs — the database it queries, the APIs it calls. Block everything else. This single control prevents the most common exfiltration vector: an agent sending data to an external endpoint. If the sandbox cannot reach the internet, a prompt injection that says "send this data to attacker.com" fails silently.
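The allowlist itself is simple; what matters is that it is enforced in the egress proxy or firewall in front of the sandbox, not inside it. A sketch with illustrative hostnames:

```python
# Deny-by-default egress: only explicitly whitelisted hosts are
# reachable from the sandbox. Hostnames here are illustrative.
ALLOWED_EGRESS = {"db.internal.example.com", "api.payments.example.com"}

def egress_allowed(host: str) -> bool:
    # Anything not on the allowlist fails closed, so an injected
    # instruction to "send this data to attacker.com" goes nowhere.
    return host in ALLOWED_EGRESS

assert egress_allowed("db.internal.example.com")
assert not egress_allowed("attacker.com")
```

In production this check lives in the network layer (an egress proxy, firewall rules, or the microVM's network config), where the agent's code cannot reach it.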

02
Filesystem restrictions with deny-by-default

Mount only the directories the agent needs, read-only where possible. Block writes outside a designated workspace directory. Never mount host directories containing credentials, environment files, or system configuration. The Claude Code incident at Ona happened because /proc/self/root was accessible inside the sandbox — a filesystem path that should have been blocked.
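A deny-by-default path check is one realpath call: resolve the requested path first, then test containment, which defeats traversal tricks of the /proc/self/root variety. A sketch assuming a /workspace mount:

```python
import os

WORKSPACE = "/workspace"

def path_allowed(requested: str) -> bool:
    """Deny-by-default filesystem policy: writes must resolve inside
    the designated workspace. Resolving before the containment check
    defeats traversal like /workspace/../proc/self/root."""
    resolved = os.path.realpath(os.path.join(WORKSPACE, requested))
    return resolved == WORKSPACE or resolved.startswith(WORKSPACE + os.sep)

assert path_allowed("build/output.txt")
assert not path_allowed("../proc/self/root/usr/bin/npx")
```

The same check belongs at every layer that touches paths on the agent's behalf; relying on the sandbox's mount configuration alone is what failed in the Ona incident.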

03
Ephemeral execution environments

Destroy the sandbox after each execution. Do not reuse sandbox instances across different tasks or users. Ephemeral environments ensure that even if an agent is compromised during one execution, the compromise does not persist. E2B’s Firecracker model enforces this by default — each sandbox is a fresh microVM that is destroyed after use.
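The lifecycle reduces to a create-use-destroy bracket. This sketch tears down a directory rather than a microVM, but the pattern is the same:

```python
import shutil
import tempfile
from contextlib import contextmanager

@contextmanager
def ephemeral_sandbox():
    """One fresh, disposable workspace per execution: created on
    entry, destroyed unconditionally on exit, never reused."""
    workdir = tempfile.mkdtemp(prefix="agent-sandbox-")
    try:
        yield workdir
    finally:
        shutil.rmtree(workdir, ignore_errors=True)

with ephemeral_sandbox() as wd:
    with open(f"{wd}/script.py", "w") as f:
        f.write("print('hello')")
# here the directory, and anything the agent wrote, is gone
```

The `finally` clause is the point: teardown happens even if the agent's code crashes or is killed, so a compromise cannot outlive its execution.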

04
Resource limits and execution timeouts

Cap CPU, memory, and execution time. An agent executing a cryptomining payload (like the Alibaba ROME incident) will consume unbounded resources if you let it. Set hard limits: 2 vCPUs, 512MB RAM, 60-second timeout for code execution tasks. Kill the sandbox if any limit is breached.
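On Linux these caps map to rlimits plus a process timeout. A standard-library sketch; the limits shown match the numbers above:

```python
import resource
import subprocess

def run_limited(cmd: list[str], timeout_s: int = 60,
                mem_bytes: int = 512 * 1024 * 1024):
    """Run agent-generated code with hard caps: an address-space
    rlimit bounds memory, and the timeout kills runaway processes,
    so a cryptomining payload dies at the deadline."""
    def set_limits():
        # Applied in the child after fork, before exec.
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))
    return subprocess.run(cmd, preexec_fn=set_limits,
                          timeout=timeout_s, capture_output=True)

result = run_limited(["python3", "-c", "print(2 + 2)"], timeout_s=10)
```

CPU pinning is better handled by the sandbox runtime itself (cgroups for containers, vcpu_count for microVMs); rlimits are the portable floor, not the whole answer.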

05
Human-in-the-loop for privileged operations

Any operation that modifies state outside the sandbox — database writes, API calls with side effects, file system changes on the host — requires explicit human approval. The Snowflake Cortex vulnerability existed because the agent bypassed this approval step. The approval mechanism must be enforced at the infrastructure level, not the prompt level, because prompt-level controls can be overridden by prompt injection.
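Enforced at the infrastructure level means the gate is ordinary code in the execution path, outside anything the model can rewrite. A minimal sketch with made-up operation names:

```python
class ApprovalRequired(Exception):
    """Raised when a privileged operation lacks human sign-off."""

# Privileged operations are gated outside the model: this check is
# plain code in the execution layer, so no prompt injection can talk
# its way past it. Operation names are illustrative.
PRIVILEGED = {"db.write", "host.fs.write", "api.side_effect"}

def execute(op: str, approved: bool = False) -> str:
    if op in PRIVILEGED and not approved:
        raise ApprovalRequired(f"{op} needs explicit human sign-off")
    return f"executed {op}"
```

The `approved` flag would come from an out-of-band approval flow (a ticket, a click in a console), never from the agent's own output.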

06
Audit logging of all agent actions

Log every system call, network request, file operation, and API call the agent makes. The manufacturing procurement manipulation went undetected for three weeks because nobody was monitoring the agent’s gradual behaviour change. Anomaly detection on agent action logs — flagging unusual patterns like escalating permission requests or novel API endpoints — catches these slow-burn attacks.
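A sketch of the logging shape: an append-only action log with a trivial novelty check standing in for real anomaly detection. The endpoints are illustrative:

```python
class ActionAuditor:
    """Append-only log of agent actions plus a minimal anomaly
    check: flag any endpoint the agent has never been seen calling.
    Real deployments would layer proper anomaly detection on top;
    this sketch only illustrates the shape."""

    def __init__(self, known_endpoints: set[str]):
        self.known = set(known_endpoints)
        self.log: list[tuple[str, str]] = []
        self.alerts: list[str] = []

    def record(self, action: str, endpoint: str) -> None:
        self.log.append((action, endpoint))
        if endpoint not in self.known:
            self.alerts.append(f"novel endpoint: {endpoint}")

auditor = ActionAuditor({"https://api.internal/reconcile"})
auditor.record("http_post", "https://api.internal/reconcile")
auditor.record("http_post", "https://exfil.example.net/upload")
```

Even this crude novelty flag would have surfaced the financial-services exfiltration the moment the agent first touched an unexpected destination.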

···

Cisco DefenseClaw: The Enterprise Framework

At RSA Conference 2026 on March 23, Cisco announced DefenseClaw, an open-source secure agent framework that represents the first major enterprise attempt at systematising AI agent security. DefenseClaw integrates four core tools: Skills Scanner (auditing agent capabilities), MCP Scanner (verifying Model Context Protocol servers), AI BoM (AI Bill of Materials for asset inventory), and CodeGuard (runtime code analysis).

The framework enforces zero-trust principles: every skill is scanned and sandboxed, every MCP server is verified, and every AI asset is automatically inventoried. DefenseClaw integrates with NVIDIA’s OpenShell to provide hardware-level sandboxing at the runtime level, extending a collaboration aimed at automated security without manual intervention.

The timing matters. Cisco’s own research found that 85% of enterprises are experimenting with AI agents, but only 5% have moved them to production with confidence. The gap is security. DefenseClaw is designed to close it by making security automated rather than manual — eliminating the need for separate tool installations or ad-hoc security reviews before each agent deployment.

The MCP Dimension

The Model Context Protocol adds a new surface to the sandboxing problem. MCP servers provide tools, resources, and prompts to AI agents — and each MCP connection is a potential entry point for both data exfiltration and prompt injection.

An MCP server that provides file system access gives the agent access to whatever the server process can read. An MCP server that provides web browsing exposes the agent to every page it visits — including pages containing adversarial instructions. The security boundary is not just the sandbox the agent runs in. It is every MCP server the agent connects to.

DefenseClaw’s MCP Scanner addresses this by verifying each MCP server before the agent connects: what tools does it expose, what data can it access, does it enforce authentication, and does it match the expected configuration? This verification needs to happen at deployment time and continuously during execution, because MCP servers can be modified after initial verification.

MCP Security Checklist for Production
  • Audit every MCP server your agent connects to — what tools it exposes, what data it can access
  • Enforce authentication between agents and MCP servers using OAuth 2.1, not API keys
  • Sandbox MCP servers independently from the agent — a compromised MCP server should not compromise the agent’s sandbox
  • Monitor MCP tool invocations for anomalous patterns — unexpected tools being called, unusual data volumes
  • Version-pin MCP server configurations and verify checksums at startup
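The last checklist item, pinning and checksum verification, reduces to hashing a canonical serialisation of the server config. A sketch with an invented server config:

```python
import hashlib
import json

def config_checksum(config: dict) -> str:
    """SHA-256 over a canonical JSON serialisation of an MCP server
    config, so any drift from the pinned version is detectable."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

# Pin at deployment time. Server and tool names are illustrative.
pinned = config_checksum({"server": "files", "version": "1.4.2",
                          "tools": ["read_file", "list_dir"]})

def verify_at_startup(current: dict) -> bool:
    # Refuse to connect if the server config changed since pinning.
    return config_checksum(current) == pinned
```

A server that silently grows a new tool (say, `delete_file`) fails the check, which is exactly the post-verification modification the article warns about.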
···

What the Next 12 Months Look Like

The AI agent sandboxing space is moving fast. Three trends will define the next year:

WebAssembly as a Lightweight Isolation Layer

WebAssembly (Wasm) runtimes like Wasmtime and WasmEdge offer microsecond-level cold starts with strong isolation guarantees. Wasm sandboxes cannot access the host filesystem, network, or system calls unless explicitly granted through the WASI (WebAssembly System Interface) capability model. For AI agents that generate simple computational code — data transformations, calculations, formatting — Wasm provides isolation with near-zero overhead. The limitation is ecosystem: Wasm does not support the full Linux environment that many AI-generated scripts expect.

Confidential Computing for Sensitive Workloads

Hardware-based Trusted Execution Environments (TEEs) like Intel TDX and AMD SEV-SNP encrypt the sandbox’s memory at the hardware level. Even if an attacker compromises the host, they cannot read the sandbox’s memory contents. For AI agents handling healthcare data (HIPAA), financial records (SOX), or legal documents (attorney-client privilege), confidential computing adds a layer that software-only isolation cannot match.

Standardised Agent Security Scoring

Just as CVSS scores standardised vulnerability severity, the industry needs a standardised way to assess agent deployment security. How isolated is the sandbox? What data can the agent access? How are MCP connections verified? Are there runtime guardrails? Cisco’s DefenseClaw is a step toward this with its AI BoM inventory approach, but a universal scoring framework — something a CISO can use to compare agent deployment security across vendors — does not exist yet. It will by 2027.

···

The Decision Framework

Choosing the right isolation technology is not about picking the most secure option. It is about matching the threat model to the performance and operational requirements.

Scenario | Recommended Isolation | Why
Internal tools running reviewed code | Standard containers with seccomp profiles | Code is trusted. Container isolation prevents accidental interference between services.
Multi-tenant SaaS with AI features | gVisor or Kata Containers | Multiple customers share infrastructure. User-space kernel prevents cross-tenant kernel exploits.
AI agents executing generated code | Firecracker microVMs (E2B or self-hosted) | Code is untrusted by definition. Hardware-level isolation prevents escape to host.
Agents handling regulated data (HIPAA/SOX) | Firecracker + confidential computing (TEE) | Compliance requires both execution isolation and memory encryption.
Lightweight computational tasks | WebAssembly (Wasmtime) | Microsecond startup, strong capability-based isolation, minimal overhead.
Map your threat level to the technology. Low-threat internal tools use containers. Medium-threat multi-tenant SaaS uses gVisor. High-threat untrusted code execution uses Firecracker or Kata. There is no universal answer — only the right answer for your threat model.
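The mapping can be written down as a fail-closed lookup; the threat labels are this sketch's own, not from any standard:

```python
# Illustrative mapping from threat model to isolation technology,
# mirroring the decision table. Labels are invented for this sketch.
ISOLATION_BY_THREAT = {
    "internal-reviewed": "standard containers + seccomp",
    "multi-tenant-saas": "gVisor or Kata Containers",
    "untrusted-generated-code": "Firecracker microVMs",
    "regulated-data": "Firecracker + confidential computing (TEE)",
    "lightweight-compute": "WebAssembly (Wasmtime)",
}

def recommend(threat: str) -> str:
    # Fail closed: an unrecognised threat profile gets the
    # strongest general-purpose default, not the weakest.
    return ISOLATION_BY_THREAT.get(threat, "Firecracker microVMs")
```

The fail-closed default is the one design decision worth copying: when you cannot classify the workload, treat it as untrusted.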
···

Where Fordel Builds

Every AI agent we deploy at Fordel runs in an isolation architecture matched to the threat model. We do not default to Docker containers and call it a day. For agents executing generated code, we use Firecracker-based sandboxes with network isolation, filesystem restrictions, resource limits, and audit logging built in from day one. For agents connecting to MCP servers, every connection is verified and monitored.

The 85%-to-5% gap Cisco identified — between enterprises experimenting with agents and those running them in production with confidence — is a security gap. If you are stuck in that gap, the problem is not the AI. It is the infrastructure around it. We can show you exactly where your isolation boundaries are broken and what it takes to fix them. No pitch deck. If that conversation is useful, reach out.
