Last week, LiteLLM — a package with millions of downloads used to proxy calls to every major LLM provider — had a malicious version slip through. If you had it in your prod stack unmonitored, your API keys had a bad week. Here is exactly what we changed at Fordel and how you can replicate it in a day.
Why Does This Keep Happening to AI Packages?
AI tooling packages are disproportionately attractive targets. They sit between your code and your most sensitive credentials: OpenAI keys, Anthropic tokens, AWS Bedrock IAM roles, internal database connections. A compromised LLM client library does not need to escalate privileges — it already has them. It proxies your requests, so malicious exfiltration looks identical to normal traffic. And most teams pull these packages with no integrity checks at all.
The attack surface has grown fast. Eighteen months ago your AI stack was maybe the OpenAI SDK and a vector client. Today a typical production AI service has 8–15 direct AI-related dependencies: an LLM gateway, an embedding client, a vector store SDK, an orchestration framework, an eval runner, an observability SDK. Each one is a potential injection point.
We have seen three distinct attack patterns in the last six months across our client stack audits: typosquatted packages (litellm vs litelm), malicious version bumps on legitimate packages, and postinstall scripts that phone home. The LiteLLM incident was the second type — a legitimate, popular package with a poisoned release.
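Typosquats are cheap to screen for automatically. Here is a minimal sketch (the known-good list is illustrative — substitute your own dependency manifest, and the function name is ours) that flags installed package names suspiciously close to, but not identical to, the ones you actually depend on:

```python
import difflib

# Canonical names of AI packages we actually depend on (illustrative list).
KNOWN_GOOD = {"litellm", "openai", "anthropic", "langchain", "chromadb"}

def flag_typosquats(installed, known_good=KNOWN_GOOD, cutoff=0.85):
    """Return (installed_name, lookalike) pairs where an installed package
    is suspiciously similar to, but not exactly, a known-good name."""
    suspicious = []
    for name in installed:
        if name in known_good:
            continue  # exact match: legitimate
        close = difflib.get_close_matches(name, known_good, n=1, cutoff=cutoff)
        if close:
            suspicious.append((name, close[0]))
    return suspicious

# 'litelm' is one character away from 'litellm' and gets flagged:
print(flag_typosquats(["litelm", "requests", "openai"]))
```

Run it against the output of `pip freeze` or the lockfile package list; anything it flags deserves a manual look before the next deploy.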
“Your LLM client already has your API keys, your prompt templates, and usually direct database access. A compromised version does not need to do anything clever.”
- Node.js or Python project with at least one AI SDK dependency
- Access to your CI/CD pipeline (GitHub Actions, GitLab CI, or equivalent)
- A package manager lockfile committed to the repo (package-lock.json, yarn.lock, or poetry.lock)
- Egress firewall or cloud security group you can modify (AWS SG, GCP VPC firewall, or Cloudflare Gateway)
- 15 minutes per service to run the initial audit
Step 1: Map Your Actual AI Dependency Tree
You probably know your direct dependencies. You do not know your transitive ones. Run this first.
For Node.js projects:
npm ls --all 2>/dev/null | grep -E "(openai|anthropic|langchain|litellm|llama|cohere|mistral|groq|bedrock|pinecone|weaviate|chroma|qdrant|langfuse|braintrust|instructor)"
For Python:
pip show $(pip freeze | grep -iE 'openai|anthropic|langchain|litellm|llama|cohere|mistral|groq|bedrock|pinecone|weaviate|chromadb|qdrant|langfuse|instructor' | cut -d= -f1) 2>/dev/null | grep -E 'Name|Version|Location'
Do this for every service in your stack, not just the ones you think are AI-related. We found LiteLLM as a transitive dependency in a Node service that had no direct AI usage — it was pulled in by an eval framework the team had forgotten was installed.
Export the full output to a file. This is your baseline. Commit it as docs/ai-dependency-manifest.txt and update it on every dependency bump.
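One way to make that baseline actionable is to diff a fresh scan against the committed manifest on every CI run — any new line is a dependency that appeared without a conscious decision. A minimal sketch (the function name is ours; the file format is the plain-text manifest described above):

```python
def manifest_diff(baseline_text: str, current_text: str):
    """Compare the committed AI-dependency manifest against a fresh scan.
    Returns (added, removed) sets of lines; any addition warrants review."""
    baseline = {line.strip() for line in baseline_text.splitlines() if line.strip()}
    current = {line.strip() for line in current_text.splitlines() if line.strip()}
    return current - baseline, baseline - current

# A package that shows up in the fresh scan but not in the baseline:
added, removed = manifest_diff(
    "openai==1.40.0\n",
    "openai==1.40.0\nlitellm==1.2.1\n",
)
print(added)  # the new arrival since the baseline was committed
```

Fail the CI job when `added` is non-empty and you have a cheap tripwire for dependencies sneaking in transitively.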
Step 2: Lock Versions and Verify Integrity Hashes
This is the most impactful change you can make in ten minutes. Lockfiles give you version pinning. Integrity hashes give you tamper detection.
Node.js — your package-lock.json already contains integrity hashes. The problem is that most teams run npm install in CI, which can silently update the lockfile instead of enforcing it. Replace it with npm ci in every CI pipeline:
# Replace npm install with:
npm ci
# npm ci:
# - Fails if package-lock.json is missing or inconsistent
# - Never updates the lockfile
# - Verifies the integrity hash of every installed package
Python — pip does not verify hashes unless you explicitly ask it to. Add this to your requirements install step (pip-compile comes from the pip-tools package):
# Generate hashes for your current lockfile
pip-compile --generate-hashes requirements.in -o requirements.txt
# Install with hash verification
pip install --require-hashes -r requirements.txt
If you are using Poetry, run poetry lock and commit poetry.lock. In CI, run poetry check --lock to verify the lockfile still matches pyproject.toml, then poetry install --no-root --sync. Together, that is the closest Poetry gets to npm ci.
The critical thing: if a malicious version is published and your lockfile has the correct hash of the legitimate version, the install will fail loudly in CI before it ever reaches production. That is the whole point.
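To make the mechanism concrete: npm records each package's hash in package-lock.json as a Subresource Integrity string — `sha512-` followed by the base64-encoded SHA-512 digest of the tarball bytes. A simplified sketch of the comparison npm ci performs (the real client also handles legacy sha1 entries; function names are ours):

```python
import base64
import hashlib

def sri_sha512(data: bytes) -> str:
    """Compute a Subresource Integrity string in the format npm stores in
    package-lock.json: 'sha512-' + base64(sha512(tarball bytes))."""
    digest = hashlib.sha512(data).digest()
    return "sha512-" + base64.b64encode(digest).decode()

def verify_tarball(tarball_bytes: bytes, lock_entry: dict) -> bool:
    """True only if the downloaded tarball matches the hash the lockfile
    recorded for the legitimate release."""
    return sri_sha512(tarball_bytes) == lock_entry["integrity"]

# Any tampering with the bytes produces a completely different digest:
legit = b"legitimate release contents"
entry = {"integrity": sri_sha512(legit)}
print(verify_tarball(legit, entry))        # True
print(verify_tarball(b"poisoned", entry))  # False
```

This is why a poisoned re-release of the same version number cannot slip past `npm ci` or `pip install --require-hashes`: the attacker would need a hash collision, not just publish access.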
Step 3: Add Automated CVE Scanning Scoped to AI Packages
General dependency scanners like Dependabot generate noise. When you are triaging 40 low-severity CVEs, the one that matters gets lost. We scope our AI-specific scanning separately so it gets its own alert channel and its own failure mode in CI.
Here is the GitHub Actions workflow we use at Fordel:
# .github/workflows/ai-deps-audit.yml
name: AI Dependency Audit
on:
  push:
    paths: ['package-lock.json', 'poetry.lock', 'requirements*.txt']
  schedule:
    - cron: '0 8 * * 1' # Monday morning, not Friday afternoon
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - run: npm ci
      - name: Extract and audit AI packages
        run: |
          node -e "
          const lock = require('./package-lock.json');
          const ai = Object.keys(lock.packages || {})
            .filter(p => /openai|anthropic|langchain|litellm|llamaindex|cohere|pinecone|weaviate|chromadb/.test(p))
            .map(p => p.replace('node_modules/', ''))
            .filter(Boolean);
          console.log('AI packages found:', ai);
          "
          npm audit --audit-level=high 2>&1 | grep -E '(high|critical)' || echo 'No high/critical CVEs'
We also maintain a small internal blocklist — a JSON file listing specific package@version strings we have flagged as compromised — and check against it in every CI run. After the LiteLLM incident we added the affected version range to it within the hour.
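The blocklist check itself is only a few lines. A sketch against a parsed package-lock.json, with the blocklist as a set of package@version strings as described (function name is ours; nested node_modules paths are flattened naively, which is fine for a tripwire):

```python
def check_blocklist(lock: dict, blocklist: set) -> list:
    """Scan a parsed package-lock.json for any package@version that appears
    on the internal blocklist. CI fails if the returned list is non-empty."""
    hits = []
    for path, meta in (lock.get("packages") or {}).items():
        # Lockfile keys look like 'node_modules/<name>'; the root entry is ''.
        name = path.replace("node_modules/", "")
        version = meta.get("version")
        if name and version and f"{name}@{version}" in blocklist:
            hits.append(f"{name}@{version}")
    return hits

lock = {"packages": {"": {}, "node_modules/litellm": {"version": "1.2.1"}}}
print(check_blocklist(lock, {"litellm@1.2.1"}))  # ['litellm@1.2.1']
```

Because the blocklist is a plain JSON file in the repo, adding a newly disclosed bad version is a one-line PR that immediately fails every pipeline still carrying it.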
Step 4: Restrict Network Egress for LLM Clients
This is the change that stops the damage even when a compromised package is already running. A malicious LLM client library wants to exfiltrate your API keys somewhere unexpected. If your service can only make outbound connections to approved domains, exfiltration fails at the network layer.
The approved egress list for a typical AI service is smaller than you think:
# Approved outbound domains for an AI service
api.anthropic.com
api.openai.com
api.groq.com
bedrock-runtime.us-east-1.amazonaws.com # your region
your-vector-db.svc.cluster.local # internal
your-postgres.rds.amazonaws.com # internal
# Block everything else outbound at the security group or firewall level
In AWS, this means a security group with explicit outbound rules to those CIDR ranges, plus a VPC endpoint for S3 and Bedrock so internal traffic never leaves your VPC. In GCP, use VPC firewall egress rules. In Kubernetes, use a NetworkPolicy with an explicit egress allowlist per namespace.
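For the Kubernetes case, a per-namespace egress allowlist might look like the following sketch (names, labels, and CIDRs are placeholders). One caveat: NetworkPolicy matches IPs, not domains, so you either allowlist the providers' published IP ranges or front traffic with an egress proxy or a CNI that supports FQDN rules, such as Cilium.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: ai-service-egress-allowlist
  namespace: ai-services          # placeholder namespace
spec:
  podSelector:
    matchLabels:
      app: rag-service            # placeholder label
  policyTypes:
    - Egress
  egress:
    # Allow DNS so the allowlisted hostnames still resolve
    - to:
        - namespaceSelector: {}
      ports:
        - protocol: UDP
          port: 53
    # Allow HTTPS only to the approved provider ranges
    - to:
        - ipBlock:
            cidr: 203.0.113.0/24  # placeholder — substitute real provider CIDRs
      ports:
        - protocol: TCP
          port: 443
```

Everything not matched by an egress rule is dropped, which is exactly the failure mode you want for an exfiltration attempt.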
Most teams skip this because it feels like over-engineering. It is not. It is the only mitigation that works when the malicious code is running inside your process with all your env vars already loaded.
Step 5: Monitor for Credential Anomalies at the API Level
Every major LLM provider has usage dashboards and configurable spend limits. Most engineers never configure them. Set up alerts for anomalous spend — the kind that looks like a fire drill at 9am on a Monday.
# Anthropic: set workspace spend limits and notification thresholds in the
#   Console — as of this writing there is no public alerts API, so it is a
#   dashboard setting
# OpenAI: set a monthly budget with an email alert threshold in the billing
#   limits settings
# Groq: scope API keys per environment (read-only vs write)
More useful than any of these: use a separate API key per environment and per service. When you see anomalous spend on the staging Anthropic key, you immediately know which service is affected without combing logs. Name them descriptively in the dashboard: fordel-prod-rag-service, fordel-staging-chat, fordel-dev-eval. A compromised key becomes self-identifying.
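If your provider exposes per-key usage data, the anomaly check itself is trivial to run on a schedule. A sketch (function name and thresholds are ours — tune them to your actual traffic):

```python
from statistics import mean

def spend_anomaly(hourly_usd, current_usd, multiplier=5.0, floor_usd=10.0):
    """Flag the current hour's spend on one API key if it exceeds both an
    absolute floor and a multiple of the trailing hourly average.
    Thresholds are illustrative, not recommendations."""
    baseline = mean(hourly_usd) if hourly_usd else 0.0
    return current_usd >= floor_usd and current_usd >= multiplier * baseline

# A key that normally burns ~$1/hour suddenly burning $42/hour is flagged:
print(spend_anomaly([1.2, 0.8, 1.5, 1.0], 42.0))  # True
print(spend_anomaly([1.2, 0.8, 1.5, 1.0], 2.0))   # False
```

With per-service keys, a True here tells you which service to pull the plug on before you have read a single log line.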
Rotate all AI API keys on a 90-day schedule minimum. Add a calendar event. Do it now if you are not already. Key rotation is annoying. Getting locked out of production while you rebuild trust with your LLM provider is more annoying.
Step 6: Write a One-Page Incident Response Runbook
The LiteLLM incident was announced on a Monday morning. Teams that had a runbook were rotating keys and deploying patched versions within an hour. Teams that did not were still in Slack threads trying to figure out whether they were affected at 3pm.
Your runbook needs exactly five sections. Keep it in your repo at docs/ai-incident-response.md:
- 1. Detection: How you found out (alert, public disclosure, direct notification) — include links to your monitoring dashboards
- 2. Scope check: Which services use the affected package? Run: npm ls <package> or pip show <package> across all repos
- 3. Immediate containment: Disable the affected API keys in the provider dashboard within 5 minutes — before you fully understand the scope
- 4. Patched deploy: Pin to a known-good version in the lockfile, deploy with --frozen or npm ci, verify hashes pass in CI
- 5. Post-incident: Issue new keys, update your internal blocklist, add the version range to your CI audit check, write a 3-line blameless postmortem
The key insight: step 3 (disable keys) comes before you have fully diagnosed the problem. You lose maybe a few hours of LLM functionality. You do not lose your API budget, your customer data, or your reputation.
What Are the Most Common Mistakes Teams Make Here?
The most common mistake: treating AI SDK packages the same as utility packages. A compromised lodash is bad. A compromised litellm hands an attacker the keys to every AI provider you use and the ability to intercept every prompt and response in your system. The blast radius is categorically different.
Second: pinning the major version but not the patch. ^1.2.0 in your package.json tells npm that any 1.x release is acceptable, so the moment the lockfile is missing, regenerated, or quietly rewritten by a stray npm install, the newly published 1.2.1 comes along for the ride. That is exactly how the LiteLLM incident propagated to teams who thought they were pinned. Use exact versions: no caret, no tilde. Running npm config set save-exact true makes exact pinning the default.
// Bad — will auto-update to any 1.x.x
"litellm": "^1.2.0"
// Good — locked to this exact version until you consciously bump it
"litellm": "1.2.0"
Third: not having a lockfile committed. This still happens. The lockfile is not a build artefact — it is a security control. Commit it. If it is in your .gitignore, remove it right now.
Fourth: running npm audit and calling it done. npm audit only reports published advisories (sourced from the GitHub Advisory Database), and a malicious version published this morning will not be in any advisory feed yet. The integrity hash check is the only thing that catches it, because the hash will not match what your lockfile recorded for the legitimate version.
Fifth: one API key shared across all environments and services. When everything uses the same key, you cannot scope or rotate per service, and you cannot tell from anomalous usage which part of your system is affected.
What Do You Have Now?
If you followed these six steps, here is what you have built:
Defense layer 1: Lockfile + integrity hashes
→ catches tampered packages at install time in CI
Defense layer 2: Automated weekly CVE scan scoped to AI packages
→ catches known bad versions before Monday morning
Defense layer 3: Network egress restriction
→ contains damage when a compromised package is already running
Defense layer 4: Per-service API keys with spend alerts
→ gives you 5-minute detection of credential abuse
Defense layer 5: Incident runbook
→ converts a chaotic 6-hour incident into a 45-minute contained response
None of this is exotic. It is the same defense-in-depth you would apply to any critical dependency. The only reason it is worth spelling out in 2026 is that AI toolchain packages have been under-secured while their blast radius has grown faster than any other category in the stack.
The LiteLLM incident will not be the last one. The packages sitting between your code and your API keys are too valuable a target. Set this up today when it is a calm afternoon exercise. Not at 9am on a Monday when it is an incident.
“We run the same audit tooling on AI SDKs that we run on authentication libraries. Because in 2026, they carry the same weight.”