Traditional security architecture draws a perimeter around trusted systems and assumes that traffic inside the perimeter is safe. AI-native applications demolish this assumption. An LLM processing user input can be manipulated through prompt injection to execute unintended actions. An AI agent with tool access can be tricked into accessing resources it should not touch. A RAG system can leak sensitive documents through carefully crafted queries. Every component in an AI system is both a potential entry point and a potential exfiltration channel.
Zero-trust architecture — "never trust, always verify" — provides the right security model for this reality. Every request is authenticated and authorized regardless of its origin. Every data access is logged and auditable. Every component has minimum necessary permissions. These principles, applied to AI-specific attack surfaces, create a defensible security posture.
## AI-Specific Attack Surfaces
| Attack Surface | Description | Zero-Trust Mitigation |
|---|---|---|
| Prompt injection | User manipulates LLM via crafted input | Input validation, output filtering, least-privilege tool access |
| Data exfiltration via model | Model outputs leak training data or RAG docs | Output filtering, document-level access control in RAG |
| Agent tool abuse | Agent uses tools beyond intended scope | Per-tool authorization, action logging, human-in-the-loop for sensitive ops |
| Model poisoning | Manipulated training data affects model behavior | Data provenance tracking, validation pipeline |
| Credential theft via AI | AI system stores/processes credentials insecurely | Short-lived tokens, no credential storage in prompts or context |
## NIST 800-207 and AI
NIST Special Publication 800-207 defines the zero-trust architecture framework that federal agencies and an increasing number of private organizations are adopting. Its core tenets — identity-based access, micro-segmentation, continuous verification, and least privilege — apply directly to AI systems, but require adaptation for AI-specific patterns.
## Securing AI Agents
AI agents that can use tools — execute code, call APIs, access databases, send messages — represent the highest-risk AI attack surface. An agent with database access, compromised via prompt injection, can exfiltrate data. An agent with email access can send phishing messages. The zero-trust approach to agent security is multi-layered.
### Zero-Trust Agent Security
Each agent session gets a unique set of tool permissions based on the user context and task. A customer support agent should not have access to financial tools. Define tool permission sets as policy, not code.
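As a minimal sketch of policy-as-data (the role and tool names here are hypothetical), a permission policy can map agent roles to tool sets that are resolved once at session start:

```python
# Hypothetical policy table: tool permission sets defined as data, not code.
TOOL_POLICY = {
    "customer_support": {"search_kb", "create_ticket", "send_reply"},
    "finance_analyst": {"query_ledger", "generate_report"},
}

def permissions_for_session(agent_role: str) -> frozenset[str]:
    """Resolve the tool set for a new agent session from policy."""
    return frozenset(TOOL_POLICY.get(agent_role, set()))

perms = permissions_for_session("customer_support")
assert "query_ledger" not in perms  # support agents never see financial tools
```

Because the mapping lives in data, adding or revoking a tool is a policy change and an audit event, not a code deployment.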
Every tool call is individually authorized against a policy engine. Read operations may be auto-approved; write operations require explicit authorization (or human approval for sensitive actions).
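A per-call authorization check against such a policy might be sketched as follows (the tool names and the read/write split are illustrative assumptions, not a specific engine's API):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ToolCall:
    tool: str
    operation: str  # "read" or "write"

# Hypothetical set of writes that always escalate to a human.
SENSITIVE_WRITES = {"delete_record", "send_email"}

def authorize(call: ToolCall, granted_tools: set[str]) -> str:
    """Return 'allow', 'needs_human_approval', or 'deny' for one tool call."""
    if call.tool not in granted_tools:
        return "deny"                  # outside the session's permission set
    if call.operation == "read":
        return "allow"                 # reads may be auto-approved
    if call.tool in SENSITIVE_WRITES:
        return "needs_human_approval"  # sensitive writes require a human
    return "allow"
```

The key property is that the decision is made per call, at call time, so a mid-session compromise cannot reuse an earlier approval.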
Never give an agent long-lived API keys or database credentials. Use short-lived tokens scoped to the minimum permissions needed. Tokens expire at the end of the session.
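A sketch of minting session-scoped, short-lived tokens — assuming an in-process token structure and hypothetical scope strings, not a specific identity provider:

```python
import secrets
import time

def issue_session_token(scopes: list[str], ttl_seconds: int = 900) -> dict:
    """Mint a short-lived token scoped to the minimum permissions needed."""
    return {
        "token": secrets.token_urlsafe(32),
        "scopes": scopes,
        "expires_at": time.time() + ttl_seconds,
    }

def is_valid(tok: dict, required_scope: str) -> bool:
    """A token is usable only within its TTL and declared scopes."""
    return required_scope in tok["scopes"] and time.time() < tok["expires_at"]
```

In production this role is typically played by an identity provider issuing signed tokens; the sketch only illustrates the shape of the constraint.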
Treat agent outputs as untrusted. Validate tool call parameters against expected schemas before execution. This catches prompt injection attempts that try to manipulate tool arguments.
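One way to sketch that validation, assuming a simple field-name-to-type schema (the `send_reply` schema is hypothetical):

```python
def validate_args(args: dict, schema: dict) -> bool:
    """Reject tool arguments that do not match the expected schema exactly."""
    if set(args) != set(schema):  # no missing or smuggled-in fields
        return False
    return all(isinstance(args[k], t) for k, t in schema.items())

SEND_REPLY_SCHEMA = {"ticket_id": int, "body": str}

# A prompt-injected call smuggling an extra field is refused before execution.
assert validate_args({"ticket_id": 7, "body": "hi"}, SEND_REPLY_SCHEMA)
assert not validate_args({"ticket_id": 7, "body": "hi", "bcc": "x"}, SEND_REPLY_SCHEMA)
```

Real systems usually express these schemas in something like JSON Schema, but the principle is the same: the model proposes, the validator disposes.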
Log every tool call, its parameters, the result, and the user/session context. This audit trail is essential for incident investigation and compliance.
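A minimal structured audit record, one JSON line per tool call (field names are illustrative):

```python
import json
import time

def log_tool_call(log_file, *, session_id, user_id, tool, params, result_status):
    """Append one structured, machine-parseable audit record per tool call."""
    record = {
        "ts": time.time(),
        "session_id": session_id,
        "user_id": user_id,
        "tool": tool,
        "params": params,
        "result": result_status,
    }
    log_file.write(json.dumps(record) + "\n")
```

Structured records make the trail queryable during an incident ("every write this session performed") rather than a pile of free-text lines.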
## RAG Security: Document-Level Access Control
A RAG system that does not implement document-level access control is a data leak waiting to happen. If a user asks a question and the retrieval layer returns a document they should not have access to, the LLM will happily include that information in its response. The user does not even need to know the document exists — the semantic search does the discovery.
Implementing document-level access control in RAG requires tagging every document (and every chunk) with access control metadata at indexing time, and filtering retrieval results against the requesting user's permissions before passing context to the model. This is architecturally straightforward but operationally complex — document permissions change, users' roles change, and the access control metadata in the vector store must stay synchronized with the source-of-truth authorization system.
- Tag every document chunk with source document ID and access control metadata
- Filter retrieval results against user permissions before LLM context injection
- Implement query audit logging — track what each user searches for and what documents were retrieved
- Test for cross-tenant data leakage — semantic similarity can return documents from other tenants if access controls are not enforced
- Monitor for prompt injection patterns in user queries that attempt to bypass access controls
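The permission-filtering step in the list above can be sketched as follows (the group names and chunk metadata fields are hypothetical; a real system would resolve groups from the source-of-truth authorization system):

```python
def filter_retrieved(chunks: list[dict], user_groups: set[str]) -> list[dict]:
    """Drop retrieved chunks the requesting user is not entitled to see.

    Must run AFTER vector search and BEFORE the chunks reach LLM context.
    """
    return [c for c in chunks if user_groups & set(c["allowed_groups"])]

retrieved = [
    {"doc_id": "handbook", "allowed_groups": ["all_staff"], "text": "..."},
    {"doc_id": "m_and_a",  "allowed_groups": ["exec"],      "text": "..."},
]
visible = filter_retrieved(retrieved, {"all_staff"})
assert [c["doc_id"] for c in visible] == ["handbook"]
```

Many vector stores can push this filter into the query itself via metadata filtering, which is preferable: chunks the user cannot read are never retrieved at all.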
“In AI-native applications, the model is not a trusted component — it is an untrusted processor that transforms untrusted input into untrusted output. Zero-trust means treating every interaction with the model as a potential security boundary.”