In July 2025, Veracode published its GenAI Code Security Report after testing more than 100 large language models across 80 curated coding tasks. The headline finding: 45% of AI-generated code samples failed security tests and introduced OWASP Top 10 vulnerabilities. When given a choice between a secure and insecure method, the models chose the insecure option nearly half the time.
Four months later, Endor Labs released its fourth annual State of Dependency Management report. It found that only 1 in 5 dependency versions recommended by AI coding assistants were both safe and free from hallucination. The other 80% introduced risk — known vulnerabilities, non-existent packages, or third-party modules with unclear provenance.
These are not edge cases from contrived benchmarks. These are the tools that 95% of software engineers now use at least weekly, generating code that ships to production. The supply chain attack surface has not just expanded — it has been automated.
The Anatomy of AI-Assisted Supply Chain Compromise
Traditional supply chain attacks required a threat actor to compromise an existing package, typosquat a popular library name, or infiltrate a maintainer account. These attacks were manual, targeted, and relatively slow. AI code generation has introduced three new attack vectors that are faster, broader, and harder to detect.
Slopsquatting: When the Model Hallucinates Your Next Dependency
Slopsquatting is a supply chain attack that exploits a quirk of large language models: they hallucinate package names. An LLM asked to generate code that parses CSV files might recommend importing a package called csv-parser-utils — a package that does not exist in any registry. The name sounds plausible. It follows naming conventions. But it was invented by the model.
A research team studying 576,000 AI-generated code samples found that nearly 20% recommended non-existent packages. When the same prompts were repeated across 10 queries, 43% of the hallucinated package names reappeared every time. The hallucinations are not random; they are reproducible.
This reproducibility is what makes slopsquatting viable. An attacker monitors which package names LLMs consistently hallucinate, registers those names on npm or PyPI, and publishes packages containing malicious code. The next developer who accepts the AI suggestion runs npm install, and the malicious package enters their dependency tree.
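The first line of defence is mechanical: before installing, confirm that the suggested name actually resolves in the registry. A minimal sketch against PyPI's public JSON API (the helper names are ours, and note the caveat in the comments — existence is necessary but not sufficient, because a slopsquatted package will exist):

```python
import re
import urllib.error
import urllib.request


def normalize(name: str) -> str:
    """Canonicalize a package name per PEP 503, so an AI-suggested
    name can be compared against the registry's canonical form."""
    return re.sub(r"[-_.]+", "-", name).lower()


def exists_on_pypi(name: str) -> bool:
    """True if the name resolves on PyPI. A hallucinated package
    returns 404. The converse does not hold: a slopsquatted package
    registered by an attacker WILL exist, so treat existence as a
    necessary condition, not a safety signal."""
    url = f"https://pypi.org/pypi/{normalize(name)}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as exc:
        if exc.code == 404:
            return False
        raise
```

A check like this belongs in the install path (a wrapper script or pre-commit hook), not in the developer's memory.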
Insecure Code Patterns at Scale
Beyond dependency hallucination, AI models generate insecure code patterns with alarming consistency. The Veracode study found that Java was the riskiest language for AI code generation, with a security failure rate exceeding 70%. Python, C#, and JavaScript followed with failure rates between 38% and 45%.
The specific vulnerability that AI models handle worst is Cross-Site Scripting (CWE-80). AI tools failed to defend against XSS in 86% of relevant code samples. This is not a subtle, hard-to-detect vulnerability. XSS is one of the oldest and most well-documented web security issues, and AI models still generate vulnerable code for it the vast majority of the time.
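The failure mode is concrete. In Python terms, the pattern models most often emit interpolates user input straight into markup; the fix is a single escaping call, or a templating engine with autoescaping enabled. A minimal illustration (function names are ours):

```python
import html


def render_unsafe(user_input: str) -> str:
    # The pattern AI models commonly emit: raw interpolation into HTML.
    # A <script> payload in user_input executes in the victim's browser.
    return f"<p>Hello, {user_input}!</p>"


def render_safe(user_input: str) -> str:
    # CWE-80 mitigation: escape HTML metacharacters before interpolation
    # (or use a templating engine with autoescaping, e.g. Jinja2 defaults).
    return f"<p>Hello, {html.escape(user_input)}!</p>"
```

The difference is one function call, which is precisely why its absence in 86% of relevant AI-generated samples is so striking.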
Georgetown University’s Center for Security and Emerging Technology (CSET) independently confirmed these findings. Their evaluation of five LLMs found that almost half of all generated code snippets contained bugs that were “often impactful and could potentially lead to malicious exploitation.” Earlier research on GitHub Copilot specifically found that approximately 40% of its 1,689 generated programs were vulnerable to MITRE’s CWE Top 25 Most Dangerous Software Weaknesses.
> “When given a choice between a secure and insecure method to write code, generative AI models chose the insecure option 45% of the time. This rate has remained largely unchanged even as models have dramatically improved in generating syntactically correct code.” (Veracode, GenAI Code Security Report, 2025)
Dependency Version Roulette
Even when AI models recommend real packages, they frequently recommend the wrong version. Endor Labs found that between 44% and 49% of AI-imported dependency versions had known vulnerabilities. The model does not check the CVE database before suggesting a version. It recommends whatever version appeared most frequently in its training data — which, given the age distribution of open source code, is often an outdated version with known security issues.
This creates a perverse dynamic: the more popular a package was at a particular version, the more likely the model is to recommend that version, regardless of whether it has since been patched. Developers who trust the AI’s version recommendation without checking are importing yesterday’s vulnerabilities into today’s code.
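Checking a suggested version against a vulnerability database is a one-request operation. A sketch using OSV.dev's public query API — the helper names are ours, and in practice tools like pip-audit automate exactly this lookup:

```python
import json
import urllib.request


def osv_query(name: str, version: str, ecosystem: str = "PyPI") -> dict:
    """Build an OSV.dev /v1/query payload for one package version."""
    return {"package": {"name": name, "ecosystem": ecosystem},
            "version": version}


def known_vulnerabilities(name: str, version: str) -> list:
    """POST the query to OSV.dev and return matching advisories.
    An AI-suggested version that yields a non-empty list has known
    vulnerabilities and should not enter the lockfile."""
    req = urllib.request.Request(
        "https://api.osv.dev/v1/query",
        data=json.dumps(osv_query(name, version)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp).get("vulns", [])
```

Because the AI optimises for training-data frequency rather than CVE status, this check has to happen after the suggestion and before the install.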
Why This Is Getting Worse, Not Better
The intuitive assumption is that newer, larger models should generate more secure code. The data shows otherwise. Veracode’s key finding is that security performance has remained largely unchanged over time, even as models have dramatically improved at generating syntactically correct and functionally complete code. The models are getting better at writing code that works. They are not getting better at writing code that is safe.
| Factor | Why It Amplifies Risk |
|---|---|
| Scale of adoption | 95% of engineers use AI tools weekly; 56% report doing 70%+ of their work with AI. Every insecure pattern propagates faster. |
| Speed of generation | A developer using AI generates code 3–10x faster. Security review processes designed for human-speed development cannot keep up. |
| Trust calibration | Developers treat AI suggestions like senior developer recommendations. But unlike a senior developer, the model has no concept of security posture. |
| Training data lag | Models are trained on historical code that includes years of unpatched vulnerabilities, deprecated APIs, and pre-disclosure CVEs. |
| Feedback loops | AI-generated code enters public repositories, becomes training data for the next model generation, and reinforces insecure patterns. |
| MCP server proliferation | Endor Labs catalogued 10,663 MCP server repositories, many of which rely on AI-suggested dependencies, concentrating supply chain risk at integration points. |
The feedback loop is especially dangerous. When AI-generated code — including its insecure patterns — gets committed to public repositories on GitHub, it becomes training data for the next generation of models. Georgetown’s CSET report identified this as a systemic risk: models training on their own insecure outputs creates a degenerative cycle where insecure code patterns become statistically dominant in the training distribution.
The OWASP Perspective
The OWASP Top 10 for LLM Applications (2025 edition) addresses AI-generated code risks under two categories: Supply Chain Vulnerabilities and Improper Output Handling.
Supply Chain Vulnerabilities cover the full dependency lifecycle — from AI models recommending malicious or hallucinated packages to compromised model weights and training data poisoning. Improper Output Handling covers the case where LLM-generated code runs unsanitized in downstream systems, which is precisely what happens when a developer accepts an AI code suggestion and ships it without security review.
- Supply Chain Vulnerabilities: hallucinated packages, outdated dependencies, compromised third-party models, unvetted MCP server integrations
- Improper Output Handling: AI-generated code executed without sanitisation, leading to injection, XSS, or privilege escalation
- Data and Model Poisoning: adversarial training data that causes models to consistently recommend specific malicious packages or insecure patterns
- Excessive Agency: AI coding agents with write access to filesystems, package managers, and CI/CD pipelines operating without adequate guardrails
The 2025 edition added Vector and Embedding Weaknesses as a new category, acknowledging that RAG-based coding assistants that retrieve code snippets from vector stores inherit the security posture of whatever code was indexed. If your vector store contains insecure code patterns, your AI assistant will recommend them with high confidence.
What the Malware Numbers Actually Show
The open source supply chain was under attack before AI code generation existed. But the scale has changed dramatically.
Snyk identified over 3,000 malicious npm packages in 2024, with more than 3,600 malicious packages total across npm and PyPI. By Q4 2025, Sonatype was blocking 120,612 malware attacks in a single quarter. Socket’s mid-year 2025 threat report documented a steady rise in destructive malware using delayed execution and remotely controlled kill switches to evade early detection.
The intersection of this existing malware landscape with AI-generated code hallucinations creates a multiplier effect. Attackers no longer need to guess which package names developers might mistype. They can query the same AI models developers use, identify which non-existent packages the models consistently recommend, and register those names. The AI model becomes an unwitting accomplice in the supply chain attack.
The SBOM Question
Software Bills of Materials have been a federal requirement for software sold to the US government since Executive Order 14028 in 2021. CISA updated its SBOM guidance in 2025, requiring machine-readable formats like SPDX or CycloneDX. In January 2026, the OMB shifted from prescriptive mandates to a risk-based approach, but agencies can still require SBOMs as part of their risk assessment.
For AI-generated code, SBOMs face a fundamental challenge: they document what is in the software, but they cannot document what should not be there. An SBOM will faithfully record that your application depends on csv-parser-utils version 1.0.0. It will not tell you that csv-parser-utils was hallucinated by an AI model, registered by a threat actor two weeks ago, and contains a reverse shell.
This is not a failure of SBOMs. It is a limitation of post-hoc documentation when the code generation process itself is compromised. SBOMs remain essential for transparency and incident response. But they are a detection mechanism, not a prevention mechanism. The prevention has to happen earlier in the pipeline — before the dependency is installed, before the code is committed, before the AI suggestion is accepted.
What Engineering Teams Need to Do
The response to AI-generated code security risks is not to stop using AI tools. That ship has sailed — 95% adoption means these tools are infrastructure, not optional. The response is to treat AI-generated code with the same rigour you would apply to code from an untrusted contributor.
Securing Your AI-Assisted Development Pipeline
Never run npm install or pip install on an AI-suggested package without first confirming that the package exists in the public registry and checking its publish date, download count, and maintainer history. Tools like Socket, Snyk, and Endor Labs provide automated checks for package provenance and known malicious indicators. If a package was published in the last 30 days with minimal downloads, treat it as suspicious regardless of how confidently the AI recommended it.
AI models default to whatever version was most common in training data, which is rarely the most current or secure version. Use lockfiles (package-lock.json, poetry.lock) and audit tools (npm audit, pip-audit, Snyk) to catch known vulnerabilities in AI-suggested versions before they enter your dependency tree. Consider running automated version checks as a pre-commit hook.
Integrate SAST tools (Semgrep, CodeQL, Veracode) into your CI pipeline and run them on every commit, not just periodic scans. Given that 45% of AI-generated code introduces OWASP Top 10 vulnerabilities, static analysis is no longer optional — it is the minimum viable security practice for AI-assisted development. Pay particular attention to XSS, injection, and authentication bypass patterns.
AI coding agents with MCP integrations, filesystem access, and terminal execution capabilities should run in sandboxed environments with minimal permissions. An AI agent that can write files, install packages, and execute code has the same attack surface as an untrusted script. Apply the principle of least privilege: read access to the codebase, write access only to designated directories, no direct access to production credentials or deployment pipelines.
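Least privilege for a filesystem-writing agent can start as a path allowlist enforced at the tool layer. A minimal sketch — the directory names are hypothetical, and a real sandbox would add OS-level isolation (containers, seccomp) underneath this check:

```python
from pathlib import Path

# Hypothetical sandbox layout: the agent may write here and nowhere else.
ALLOWED_WRITE_ROOTS = (Path("/workspace/src"), Path("/workspace/tests"))


def write_permitted(target: str) -> bool:
    """Resolve the path first so `..` traversal cannot escape the
    allowlist, then require it to sit under an approved root."""
    resolved = Path(target).resolve()
    return any(resolved.is_relative_to(root) for root in ALLOWED_WRITE_ROOTS)
```

The same gate pattern applies to the agent's other capabilities: package installs go through a validation wrapper, and shell execution gets a command allowlist rather than a raw terminal.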
Use CycloneDX or SPDX to generate SBOMs for every release. Where possible, annotate which dependencies were AI-suggested versus developer-chosen. This provenance information accelerates incident response when a supply chain compromise is discovered — you can immediately identify which AI-suggested dependencies need review rather than auditing the entire dependency tree.
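CycloneDX has no standard field for "AI-suggested", but its `properties` extension point carries arbitrary name/value pairs. A sketch of the annotation — the property name is a project convention of our own, not part of the spec, and the purl here assumes a PyPI package:

```python
def sbom_component(name: str, version: str, ai_suggested: bool) -> dict:
    """Minimal CycloneDX-style component entry. When the dependency came
    from an AI suggestion, record that provenance as a custom property
    so incident response can filter on it later."""
    component = {
        "type": "library",
        "name": name,
        "version": version,
        "purl": f"pkg:pypi/{name}@{version}",
    }
    if ai_suggested:
        component["properties"] = [
            {"name": "internal:dependency-provenance", "value": "ai-suggested"}
        ]
    return component
```

When the next slopsquatting advisory lands, a query over these properties turns "audit everything" into "audit the AI-suggested subset".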
Treat AI-generated code as untrusted contributor code in your review process: no auto-merging of AI-generated PRs; mandatory security-focused review for any code that handles authentication, authorization, data access, or external API calls; and explicit sign-off that dependencies have been validated. The goal is not to slow development down. It is to ensure the 45% of insecure suggestions get caught before they ship.
The Tooling Landscape for AI Code Security
| Tool / Platform | What It Does | When to Use It |
|---|---|---|
| Socket | Deep package inspection, detecting supply chain attacks via behavioural analysis rather than just CVE matching | Pre-install validation of every new dependency, especially AI-suggested ones |
| Snyk | Vulnerability scanning across dependencies, containers, and IaC with AI-specific package risk scoring | Continuous monitoring of your dependency tree and CI/CD integration |
| Endor Labs | Dependency risk scoring that accounts for AI hallucination, version safety, and provenance | Evaluating AI-suggested dependencies before committing them to your lockfile |
| Semgrep / CodeQL | Static analysis for security anti-patterns in AI-generated code | Every commit in CI, with rules tuned for the vulnerabilities AI models most commonly introduce |
| Veracode | Comprehensive application security testing including SAST, DAST, and SCA | Enterprise security programmes requiring compliance-grade scanning of AI-generated code |
| Sonatype Nexus | Repository firewall that blocks known-malicious packages before they enter your build | Protecting your package manager from installing slopsquatted or typosquatted dependencies |
No single tool covers the full attack surface. The production pattern is layered: a repository firewall (Sonatype) blocks known-bad packages at the registry level, a dependency scanner (Snyk, Endor Labs, Socket) validates packages at install time, a SAST tool (Semgrep, CodeQL) catches insecure patterns in generated code, and an SBOM generator documents everything for audit and incident response.
What Happens Next
The AI code generation security problem has three possible trajectories:
- Models improve their security awareness: Possible but not happening yet. Veracode’s data shows no meaningful improvement in security outcomes across model generations. The models are optimised for functionality, not safety. Until security metrics are weighted equally with correctness in model training, this trajectory is unlikely.
- Tooling catches up and provides guardrails: This is the most likely near-term outcome. Socket, Snyk, Endor Labs, and others are building AI-specific security capabilities. The challenge is adoption — most teams have not yet updated their security tooling to account for AI-generated code patterns.
- A major incident forces industry-wide change: The most probable catalyst for systemic improvement. When an AI-hallucinated dependency leads to a significant breach at a well-known company, the resulting regulatory and reputational pressure will accelerate adoption of the practices described in this article. The question is not whether this will happen, but when.
The US Department of Defense published an AI/ML Supply Chain Risks and Mitigations advisory in March 2026, explicitly acknowledging that AI-generated code and AI-suggested dependencies create novel supply chain risks that existing frameworks do not adequately address. ENISA, the EU’s cybersecurity agency, published a package manager advisory in March 2026 addressing similar concerns for European organisations.
> “The burden of ensuring that AI-generated code outputs are secure should not rest solely on individual users, but also on AI developers, organizations producing code at scale, and those who can improve security at large, such as policymaking bodies or industry leaders.”
Where Fordel Builds
We build production software for clients in finance, healthcare, insurance, and SaaS — industries where a supply chain compromise is not an inconvenience but a regulatory incident. Every project we deliver includes dependency auditing, SAST integration, SBOM generation, and security-focused code review as standard practice, not premium add-ons.
If you are using AI coding tools in production and have not updated your security pipeline to account for AI-specific risks — hallucinated dependencies, insecure code patterns, version roulette — you have a gap. We can audit your current pipeline, identify where AI-generated code is introducing risk, and implement the tooling and processes to close it. That conversation costs nothing. The alternative costs more.