Last week, LiteLLM — a package with millions of downloads used to proxy calls to every major LLM provider — had a malicious version slip through. If you had it in your prod stack unmonitored, your API keys had a bad week. Here is exactly what we changed at Fordel and how you can replicate it in a day.
Why Does This Keep Happening to AI Packages?
AI tooling packages are disproportionately attractive targets. They sit between your code and your most sensitive credentials: OpenAI keys, Anthropic tokens, AWS Bedrock IAM roles, internal database connections. A compromised LLM client library does not need to escalate privileges — it already has them. It proxies your requests, so malicious exfiltration looks identical to normal traffic. And most teams pull these packages with no integrity checks at all.
The attack surface has grown fast. Eighteen months ago your AI stack was maybe the OpenAI SDK and a vector client. Today a typical production AI service has 8–15 direct AI-related dependencies: an LLM gateway, an embedding client, a vector store SDK, an orchestration framework, an eval runner, an observability SDK. Each one is a potential injection point.
We have seen three distinct attack patterns in the last six months across our client stack audits: typosquatted packages (litellm vs litelm), malicious version bumps on legitimate packages, and postinstall scripts that phone home. The LiteLLM incident was the second type — a legitimate, popular package with a poisoned release.
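Typosquats are cheap to screen for automatically. Here is a minimal sketch (the known-good list is illustrative — substitute your own dependency manifest, and the function name is ours) that flags installed package names suspiciously close to, but not identical to, the ones you actually depend on:

```python
import difflib

# Canonical names of AI packages we actually depend on (illustrative list).
KNOWN_GOOD = {"litellm", "openai", "anthropic", "langchain", "chromadb"}

def flag_typosquats(installed, known_good=KNOWN_GOOD, cutoff=0.85):
    """Return (installed_name, lookalike) pairs where an installed package
    is suspiciously similar to, but not exactly, a known-good name."""
    suspicious = []
    for name in installed:
        if name in known_good:
            continue  # exact match: legitimate
        close = difflib.get_close_matches(name, known_good, n=1, cutoff=cutoff)
        if close:
            suspicious.append((name, close[0]))
    return suspicious

# 'litelm' is one character away from 'litellm' and gets flagged:
print(flag_typosquats(["litelm", "requests", "openai"]))
```

Run it against the output of `pip freeze` or the lockfile package list; anything it flags deserves a manual look before the next deploy.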
“Your LLM client already has your API keys, your prompt templates, and usually direct database access. A compromised version does not need to do anything clever.”
- Node.js or Python project with at least one AI SDK dependency
- Access to your CI/CD pipeline (GitHub Actions, GitLab CI, or equivalent)
- A package manager lockfile committed to the repo (package-lock.json, yarn.lock, or poetry.lock)
- Egress firewall or cloud security group you can modify (AWS SG, GCP VPC firewall, or Cloudflare Gateway)
- 15 minutes per service to run the initial audit
Step 1: Map Your Actual AI Dependency Tree
You probably know your direct dependencies. You do not know your transitive ones. Run this first.
For Node.js projects:
npm ls --all 2>/dev/null | grep -E "(openai|anthropic|langchain|litellm|llama|cohere|mistral|groq|bedrock|pinecone|weaviate|chroma|qdrant|langfuse|braintrust|instructor)"
For Python:
pip show $(pip freeze | grep -iE 'openai|anthropic|langchain|litellm|llama|cohere|mistral|groq|bedrock|pinecone|weaviate|chromadb|qdrant|langfuse|instructor' | cut -d= -f1) 2>/dev/null | grep -E 'Name|Version|Location'
Do this for every service in your stack, not just the ones you think are AI-related. We found LiteLLM as a transitive dependency in a Node service that had no direct AI usage — it was pulled in by an eval framework the team had forgotten was installed.
Export the full output to a file. This is your baseline. Commit it as docs/ai-dependency-manifest.txt and update it on every dependency bump.
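One way to make that baseline actionable is to diff a fresh scan against the committed manifest on every CI run — any new line is a dependency that appeared without a conscious decision. A minimal sketch (the function name is ours; the file format is the plain-text manifest described above):

```python
def manifest_diff(baseline_text: str, current_text: str):
    """Compare the committed AI-dependency manifest against a fresh scan.
    Returns (added, removed) sets of lines; any addition warrants review."""
    baseline = {line.strip() for line in baseline_text.splitlines() if line.strip()}
    current = {line.strip() for line in current_text.splitlines() if line.strip()}
    return current - baseline, baseline - current

# A package that shows up in the fresh scan but not in the baseline:
added, removed = manifest_diff(
    "openai==1.40.0\n",
    "openai==1.40.0\nlitellm==1.2.1\n",
)
print(added)  # the new arrival since the baseline was committed
```

Fail the CI job when `added` is non-empty and you have a cheap tripwire for dependencies sneaking in transitively.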
Step 2: Lock Versions and Verify Integrity Hashes
This is the most impactful change you can make in ten minutes. Lockfiles give you version pinning. Integrity hashes give you tamper detection.
Node.js — your package-lock.json already contains integrity hashes. The problem is that most teams run npm install in CI, which can silently update the lockfile instead of enforcing it. Replace it with npm ci in every CI pipeline:
# Replace npm install with:
npm ci
# npm ci:
# - Fails if package-lock.json is missing or inconsistent
# - Never updates the lockfile
# - Verifies the integrity hash of every installed package
Python — pip does not verify hashes unless you explicitly ask it to. Add this to your requirements install step (pip-compile comes from the pip-tools package):
# Generate hashes for your current lockfile
pip-compile --generate-hashes requirements.in -o requirements.txt
# Install with hash verification
pip install --require-hashes -r requirements.txt
If you are using Poetry, run poetry lock and commit poetry.lock. In CI, run poetry check --lock to verify the lockfile still matches pyproject.toml, then poetry install --no-root --sync. Together, that is the closest Poetry gets to npm ci.
The critical thing: if a malicious version is published and your lockfile has the correct hash of the legitimate version, the install will fail loudly in CI before it ever reaches production. That is the whole point.
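To make the mechanism concrete: npm records each package's hash in package-lock.json as a Subresource Integrity string — `sha512-` followed by the base64-encoded SHA-512 digest of the tarball bytes. A simplified sketch of the comparison npm ci performs (the real client also handles legacy sha1 entries; function names are ours):

```python
import base64
import hashlib

def sri_sha512(data: bytes) -> str:
    """Compute a Subresource Integrity string in the format npm stores in
    package-lock.json: 'sha512-' + base64(sha512(tarball bytes))."""
    digest = hashlib.sha512(data).digest()
    return "sha512-" + base64.b64encode(digest).decode()

def verify_tarball(tarball_bytes: bytes, lock_entry: dict) -> bool:
    """True only if the downloaded tarball matches the hash the lockfile
    recorded for the legitimate release."""
    return sri_sha512(tarball_bytes) == lock_entry["integrity"]

# Any tampering with the bytes produces a completely different digest:
legit = b"legitimate release contents"
entry = {"integrity": sri_sha512(legit)}
print(verify_tarball(legit, entry))        # True
print(verify_tarball(b"poisoned", entry))  # False
```

This is why a poisoned re-release of the same version number cannot slip past `npm ci` or `pip install --require-hashes`: the attacker would need a hash collision, not just publish access.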
Step 3: Add Automated CVE Scanning Scoped to AI Packages
General dependency scanners like Dependabot generate noise. When you are triaging 40 low-severity CVEs, the one that matters gets lost. We scope our AI-specific scanning separately so it gets its own alert channel and its own failure mode in CI.
Here is the GitHub Actions workflow we use at Fordel:
# .github/workflows/ai-deps-audit.yml
name: AI Dependency Audit
on:
  push:
    paths: ['package-lock.json', 'poetry.lock', 'requirements*.txt']
  schedule:
    - cron: '0 8 * * 1' # Monday morning, not Friday afternoon
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - run: npm ci
      - name: Extract and audit AI packages
        run: |
          node -e "
          const lock = require('./package-lock.json');
          const ai = Object.keys(lock.packages || {})
            .filter(p => /openai|anthropic|langchain|litellm|llamaindex|cohere|pinecone|weaviate|chromadb/.test(p))
            .map(p => p.replace('node_modules/', ''))
            .filter(Boolean);
          console.log('AI packages found:', ai);
          "
          npm audit --audit-level=high 2>&1 | grep -E '(high|critical)' || echo 'No high/critical CVEs'
We also maintain a small internal blocklist — a JSON file listing specific package@version strings we have flagged as compromised — and check against it in every CI run. After the LiteLLM incident we added the affected version range to it within the hour.
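The blocklist check itself is only a few lines. A sketch against a parsed package-lock.json, with the blocklist as a set of package@version strings as described (function name is ours; nested node_modules paths are flattened naively, which is fine for a tripwire):

```python
def check_blocklist(lock: dict, blocklist: set) -> list:
    """Scan a parsed package-lock.json for any package@version that appears
    on the internal blocklist. CI fails if the returned list is non-empty."""
    hits = []
    for path, meta in (lock.get("packages") or {}).items():
        # Lockfile keys look like 'node_modules/<name>'; the root entry is ''.
        name = path.replace("node_modules/", "")
        version = meta.get("version")
        if name and version and f"{name}@{version}" in blocklist:
            hits.append(f"{name}@{version}")
    return hits

lock = {"packages": {"": {}, "node_modules/litellm": {"version": "1.2.1"}}}
print(check_blocklist(lock, {"litellm@1.2.1"}))  # ['litellm@1.2.1']
```

Because the blocklist is a plain JSON file in the repo, adding a newly disclosed bad version is a one-line PR that immediately fails every pipeline still carrying it.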
Step 4: Restrict Network Egress for LLM Clients
This is the change that stops the damage even when a compromised package is already running. A malicious LLM client library wants to exfiltrate your API keys somewhere unexpected. If your service can only make outbound connections to approved domains, exfiltration fails at the network layer.
The approved egress list for a typical AI service is smaller than you think:
# Approved outbound domains for an AI service
api.anthropic.com
api.openai.com
api.groq.com
bedrock-runtime.us-east-1.amazonaws.com # your region
your-vector-db.svc.cluster.local # internal
your-postgres.rds.amazonaws.com # internal
# Block everything else outbound at the security group or firewall level
In AWS, this means a security group with explicit outbound rules to those CIDR ranges, plus a VPC endpoint for S3 and Bedrock so internal traffic never leaves your VPC. In GCP, use VPC firewall egress rules. In Kubernetes, use a NetworkPolicy with an explicit egress allowlist per namespace.
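For the Kubernetes case, a per-namespace egress allowlist might look like the following sketch (names, labels, and CIDRs are placeholders). One caveat: NetworkPolicy matches IPs, not domains, so you either allowlist the providers' published IP ranges or front traffic with an egress proxy or a CNI that supports FQDN rules, such as Cilium.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: ai-service-egress-allowlist
  namespace: ai-services          # placeholder namespace
spec:
  podSelector:
    matchLabels:
      app: rag-service            # placeholder label
  policyTypes:
    - Egress
  egress:
    # Allow DNS so the allowlisted hostnames still resolve
    - to:
        - namespaceSelector: {}
      ports:
        - protocol: UDP
          port: 53
    # Allow HTTPS only to the approved provider ranges
    - to:
        - ipBlock:
            cidr: 203.0.113.0/24  # placeholder — substitute real provider CIDRs
      ports:
        - protocol: TCP
          port: 443
```

Everything not matched by an egress rule is dropped, which is exactly the failure mode you want for an exfiltration attempt.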
Most teams skip this because it feels like over-engineering. It is not. It is the only mitigation that works when the malicious code is running inside your process with all your env vars already loaded.
Step 5: Monitor for Credential Anomalies at the API Level
Every major LLM provider has usage dashboards and configurable spend limits. Most engineers never configure them. Set up alerts for anomalous spend — the kind that looks like a fire drill at 9am on a Monday.
# Anthropic: set workspace spend limits and notification thresholds in the
#   Console — as of this writing there is no public alerts API, so it is a
#   dashboard setting
# OpenAI: set a monthly budget with an email alert threshold in the billing
#   limits settings
# Groq: scope API keys per environment (read-only vs write)
More useful than any of these: use a separate API key per environment and per service. When you see anomalous spend on the staging Anthropic key, you immediately know which service is affected without combing logs. Name them descriptively in the dashboard: fordel-prod-rag-service, fordel-staging-chat, fordel-dev-eval. A compromised key becomes self-identifying.
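If your provider exposes per-key usage data, the anomaly check itself is trivial to run on a schedule. A sketch (function name and thresholds are ours — tune them to your actual traffic):

```python
from statistics import mean

def spend_anomaly(hourly_usd, current_usd, multiplier=5.0, floor_usd=10.0):
    """Flag the current hour's spend on one API key if it exceeds both an
    absolute floor and a multiple of the trailing hourly average.
    Thresholds are illustrative, not recommendations."""
    baseline = mean(hourly_usd) if hourly_usd else 0.0
    return current_usd >= floor_usd and current_usd >= multiplier * baseline

# A key that normally burns ~$1/hour suddenly burning $42/hour is flagged:
print(spend_anomaly([1.2, 0.8, 1.5, 1.0], 42.0))  # True
print(spend_anomaly([1.2, 0.8, 1.5, 1.0], 2.0))   # False
```

With per-service keys, a True here tells you which service to pull the plug on before you have read a single log line.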
Rotate all AI API keys on a 90-day schedule minimum. Add a calendar event. Do it now if you are not already. Key rotation is annoying. Getting locked out of production while you rebuild trust with your LLM provider is more annoying.
Step 6: Write a One-Page Incident Response Runbook
The LiteLLM incident was announced on a Monday morning. Teams that had a runbook were rotating keys and deploying patched versions within an hour. Teams that did not were still in Slack threads trying to figure out whether they were affected at 3pm.
Your runbook needs exactly five sections. Keep it in your repo at docs/ai-incident-response.md:
- 1. Detection: How you found out (alert, public disclosure, direct notification) — include links to your monitoring dashboards
- 2. Scope check: Which services use the affected package? Run: npm ls <package> or pip show <package> across all repos
- 3. Immediate containment: Disable the affected API keys in the provider dashboard within 5 minutes — before you fully understand the scope
- 4. Patched deploy: Pin to a known-good version in the lockfile, deploy with --frozen or npm ci, verify hashes pass in CI
- 5. Post-incident: Issue new keys, update your internal blocklist, add the version range to your CI audit check, write a 3-line blameless postmortem
The key insight: step 3 (disable keys) comes before you have fully diagnosed the problem. You lose maybe a few hours of LLM functionality. You do not lose your API budget, your customer data, or your reputation.
What Are the Most Common Mistakes Teams Make Here?
The most common mistake: treating AI SDK packages the same as utility packages. A compromised lodash is bad. A compromised litellm hands an attacker the keys to every AI provider you use and the ability to intercept every prompt and response in your system. The blast radius is categorically different.
Second: pinning the major version but not the patch. ^1.2.0 in your package.json tells npm that any 1.x release is acceptable, so the moment the lockfile is missing, regenerated, or quietly rewritten by a stray npm install, the newly published 1.2.1 comes along for the ride. That is exactly how the LiteLLM incident propagated to teams who thought they were pinned. Use exact versions: no caret, no tilde. Running npm config set save-exact true makes exact pinning the default.
// Bad — will auto-update to any 1.x.x
"litellm": "^1.2.0"
// Good — locked to this exact version until you consciously bump it
"litellm": "1.2.0"
Third: not having a lockfile committed. This still happens. The lockfile is not a build artefact — it is a security control. Commit it. If it is in your .gitignore, remove it right now.
Fourth: running npm audit and calling it done. npm audit only reports published advisories (sourced from the GitHub Advisory Database), and a malicious version published this morning will not be in any advisory feed yet. The integrity hash check is the only thing that catches it, because the hash will not match what your lockfile recorded for the legitimate version.
Fifth: one API key shared across all environments and services. When everything uses the same key, you cannot scope or rotate per service, and you cannot tell from anomalous usage which part of your system is affected.
What Do You Have Now?
If you followed these six steps, here is what you have built:
Defense layer 1: Lockfile + integrity hashes
→ catches tampered packages at install time in CI
Defense layer 2: Automated weekly CVE scan scoped to AI packages
→ catches known bad versions before Monday morning
Defense layer 3: Network egress restriction
→ contains damage when a compromised package is already running
Defense layer 4: Per-service API keys with spend alerts
→ gives you 5-minute detection of credential abuse
Defense layer 5: Incident runbook
→ converts a chaotic 6-hour incident into a 45-minute contained response
None of this is exotic. It is the same defense-in-depth you would apply to any critical dependency. The only reason it is worth spelling out in 2026 is that AI toolchain packages have been under-secured while their blast radius has grown faster than any other category in the stack.
The LiteLLM incident will not be the last one. The packages sitting between your code and your API keys are too valuable a target. Set this up today when it is a calm afternoon exercise. Not at 9am on a Monday when it is an incident.
“We run the same audit tooling on AI SDKs that we run on authentication libraries. Because in 2026, they carry the same weight.”