We had this argument in the ’90s. We settled it. Lines of code is not productivity. Then AI happened and apparently we forgot everything.
How Did Lines of Code Become a Flex Again?
This week, Y Combinator’s CEO Garry Tan told the world he ships 37,000 lines of AI-generated code per day. Fast Company ran the story. Tech Twitter amplified it. A developer actually looked under the hood and found what anyone who has shipped production software would expect: massive boilerplate, duplicated patterns, auto-generated configuration files, and code that looks impressive in a line count but does not look impressive when you read it.
This is not a new phenomenon. Fred Brooks wrote about it in The Mythical Man-Month in 1975. Bill Gates said it in the 1980s. Every serious engineering organization on the planet moved away from measuring lines of code decades ago because the incentive structure is backwards. You reward volume, you get volume. You do not get quality. You do not get maintainability. You do not get software that works six months from now when somebody who did not write it has to fix a bug at 2 AM.
But here we are in 2026, and the metric is back. Not because anyone proved it correlates with value — but because AI tools produce code so fast that the number sounds impressive to people who do not write software for a living. And those people happen to be investors.
What Does 37,000 Lines of Code Actually Look Like?
Let me put this in perspective. The entire Linux kernel, one of the most complex software systems ever built, maintained by thousands of engineers over three decades, is roughly 30 million lines of code. At 37,000 lines per day, you would write the Linux kernel from scratch in about 810 days. Just over two years.
If that does not sound absurd to you, you have never maintained a codebase.
Here is what 37,000 lines per day actually looks like in practice: scaffolded components where 90% is boilerplate. Configuration files that could be ten lines but are two hundred because the AI generated every possible option with default values. Test files where the same assertion pattern is copy-pasted forty times with slightly different inputs. CSS files with vendor prefixes for browsers that died in 2019.
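To make the test-file pattern concrete, here is a hypothetical sketch (the `slugify` function and its cases are invented for illustration) showing the copy-pasted assertion style next to the table-driven form a reviewer would collapse it into:

```python
def slugify(title: str) -> str:
    """Toy function under test: lowercase a title and hyphenate it."""
    return "-".join(title.lower().split())

# Bloated style: the same assertion copy-pasted for every input.
assert slugify("Hello World") == "hello-world"
assert slugify("Hello") == "hello"
# ...plus thirty-eight more near-identical lines.

# Compact style: one table of cases and one loop, identical coverage.
CASES = [
    ("Hello World", "hello-world"),
    ("Hello", "hello"),
    ("A B C", "a-b-c"),
]
for title, expected in CASES:
    assert slugify(title) == expected, (title, expected)
```

In a real suite the table-driven version is what `pytest.mark.parametrize` formalizes. Either way, forty copy-pasted blocks inflate the line count without adding a single unit of coverage.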
GitClear’s 2024 analysis of 153 million lines of code found that AI-assisted development produces 39% more “moved” and “copy-paste” code compared to human-only development. The code churn rate — the percentage of code that gets rewritten or deleted within two weeks of being committed — increased by 15% in AI-heavy codebases. You are not building faster. You are accumulating material that future engineers will have to understand, maintain, and eventually rewrite.
A separate study from METR in early 2026, studying experienced open-source developers working on their own repositories, found that AI coding tools made them 19% slower on real-world tasks. Not faster. Slower. The overhead of reviewing, correcting, and integrating AI suggestions ate more time than it saved.
Why Are Smart People Falling for a Discredited Metric?
Because the incentive structure rewards it.
If you are a founder pitching investors, “I ship 37,000 lines of code per day” sounds like a superpower. It sounds like you have a 50-person engineering team compressed into one person with a subscription to Cursor. The investor who has never debugged a null pointer exception in production hears this and thinks: this person moves fast.
If you are a developer tools company, lines of code per day is the number you want people to optimize for, because it directly correlates with how much they use your tool. Cursor, Claude Code, GitHub Copilot — every AI coding assistant has an incentive to make you generate more code, not better code. More code means more API calls. More API calls means more revenue. The business model is volume, not value.
If you are an engineering manager being pressured to demonstrate “AI adoption” to leadership, lines-of-code metrics let you show a graph that goes up and to the right. Nobody asks whether the code is good. Nobody asks what it costs to maintain. The graph goes up. Job secured.
“Measuring programming progress by lines of code is like measuring aircraft building progress by weight.” — Bill Gates
And if you are a developer who just watched your CEO publicly brag about shipping 37K lines per day, you now know what metric you are being implicitly measured against. Good luck explaining that you spent your day deleting 2,000 lines and the system is better for it.
Is There a Legitimate Case for High Code Volume?
Yes, and I will steel-man it honestly.
There are situations where generating large volumes of code is genuinely productive. Database migrations and schema changes often require verbose, repetitive DDL statements. Internationalization files are inherently high-volume. Infrastructure-as-code with Terraform or Pulumi can legitimately produce thousands of lines for a single environment setup. Test suites, when well-structured, can be repetitive by design.
AI tools are genuinely excellent at this kind of work. If Garry Tan is scaffolding new projects, generating boilerplate configuration, or creating initial test harnesses, then high line counts make sense as a natural side effect of the work. Nobody is arguing that these tasks should be done by hand.
The problem is not that AI generates a lot of code. The problem is celebrating the volume as the achievement. It is conflating the exhaust with the engine. The valuable part of software engineering has never been typing. It is deciding what to type, understanding why, anticipating what breaks, and knowing what to leave out.
When you strip away the scaffolding, the configuration, the boilerplate, and the auto-generated tests, how many of those 37,000 lines represent actual decisions? How many represent architecture? How many represent understanding the problem deeply enough to solve it in 50 lines instead of 500?
If the answer is “I don’t know, I didn’t review it line by line,” then you have just told me everything I need to know about the quality of that codebase.
What Is the Actual Cost of Unreviewed AI Code?
This week, multiple outlets reported on developers experiencing cognitive overload and even sleep disorders from AI coding tool usage. The New Stack ran a piece titled “I started to lose my ability to code.” These are not people who failed to adapt. These are experienced engineers describing the psychological cost of reviewing code they did not write but are responsible for.
Here is the math nobody at YC is doing.
If you generate 37,000 lines per day and spend zero time reviewing it, you have 37,000 lines of unaudited code in your system. Over a month, that is more than a million lines. In a year, it is more than 13 million lines with zero quality gates.
If you generate 37,000 lines per day and spend 30 seconds reviewing each line (which is fast for meaningful review), that is 308 hours of review time. Per day. Even if 90% of the code needs no review because it is boilerplate, you are still looking at 30+ hours of review for the remaining 3,700 lines of novel code.
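For anyone who wants to check that arithmetic, here is a quick sketch using only the figures quoted above:

```python
# Back-of-the-envelope review cost, using the numbers from the text.
LINES_PER_DAY = 37_000
SECONDS_PER_LINE = 30          # fast for a meaningful review
BOILERPLATE_FRACTION = 0.90    # generous: assume 90% needs no review

full_review_hours = LINES_PER_DAY * SECONDS_PER_LINE / 3600
novel_lines = LINES_PER_DAY * (1 - BOILERPLATE_FRACTION)
novel_review_hours = novel_lines * SECONDS_PER_LINE / 3600

print(f"Full review: {full_review_hours:.0f} hours/day")        # ~308
print(f"Novel code only: {novel_review_hours:.0f} hours/day")   # ~31
```

Even under the most generous assumption about boilerplate, the review load alone exceeds a full work week. Per day.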
So either the code is not being reviewed, or the number is misleading. There is no third option.
The downstream cost is real. Sonar’s 2026 developer survey found that 67% of developers spend more time maintaining AI-generated code than they saved generating it. Not because the code does not work initially — but because it works just well enough to pass a demo and just poorly enough to break under real-world conditions.
- Comprehension debt: future engineers must understand code nobody deliberately wrote
- Churn rate: 15% more code rewritten within 2 weeks in AI-heavy repos (GitClear)
- Security surface: every line of unreviewed code is a potential vulnerability
- Onboarding time: new team members face codebases 3-5x larger than necessary
- Test maintenance: auto-generated tests that test implementation, not behavior
How Should We Measure AI-Augmented Engineering?
If lines of code is wrong, what is right? Here is what I track at Fordel Studios.
Cycle time from commit to production.
How fast does a change move from a developer’s branch to a live system? This measures the entire pipeline: code quality, review speed, CI reliability, and deployment confidence. AI tools should make this number go down, not up.
Change failure rate.
What percentage of deployments cause incidents? If you are shipping faster but breaking more, you are not being productive. You are creating work for your future self and your on-call teammates.
Code review turnaround.
How long do PRs sit before getting meaningful review? AI can help here by pre-screening for obvious issues, but the review still needs a human who understands the system and the business context.
Deletion ratio.
What percentage of your commits are net-negative in line count? A healthy codebase should regularly shrink. If every sprint adds 37,000 lines and removes zero, your repository is a landfill.
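As a sketch of how you might measure this (the helper names and the `%x00` commit-separator trick are my assumptions, not a standard tool), per-commit net line deltas can be pulled from `git log --numstat`:

```python
# Sketch: fraction of recent commits that are net-negative in line count.
import subprocess

def net_delta(numstat: str) -> int:
    """Sum (added - deleted) over one commit's `--numstat` lines."""
    delta = 0
    for line in numstat.strip().splitlines():
        parts = line.split("\t")
        # Binary files report '-' for both counts and are skipped.
        if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
            delta += int(parts[0]) - int(parts[1])
    return delta

def commit_deltas(rev_range: str = "HEAD~100..HEAD") -> list[int]:
    """Net line delta per commit in the range; requires a git checkout."""
    out = subprocess.run(
        ["git", "log", "--pretty=format:%x00", "--numstat", rev_range],
        capture_output=True, text=True, check=True,
    ).stdout
    # Each commit's numstat block sits after a NUL separator.
    return [net_delta(block) for block in out.split("\x00") if block.strip()]

def deletion_ratio(deltas: list[int]) -> float:
    """Fraction of commits that shrank the codebase."""
    return sum(d < 0 for d in deltas) / len(deltas) if deltas else 0.0
```

Run `deletion_ratio(commit_deltas())` inside a repository to see how often your recent history actually removes code. A healthy number is well above zero.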
Time-to-tenth-commit.
How quickly can a new team member make ten meaningful contributions? This measures whether your code is understandable, your documentation is functional, and your architecture is navigable. A million-line codebase nobody can onboard into is not an asset. It is a liability.
“The best code is no code at all. Every line of code you write is a liability that needs to be tested, maintained, and understood by the next person.”
What Should the Industry Actually Do About This?
Stop letting founders set engineering culture. That is my first and loudest recommendation. A CEO who brags about personal code volume is sending a signal to every engineer in the organization: we value throughput over thoughtfulness. That signal cascades. The senior engineers start rubber-stamping PRs to keep pace. The junior engineers stop asking questions because the culture rewards shipping, not understanding. The tech lead stops pushing back on architecture decisions because there is no time — we have 37,000 lines to ship today.
AI coding tool vendors need to stop showing lines-generated in their marketing. I have seen dashboards in Cursor, Copilot, and Claude Code that prominently display how many lines of code the AI wrote for you in a session. This is the equivalent of a fast food restaurant putting calorie counts on a scoreboard. You are gamifying the wrong metric.
Engineering leaders need to adopt the DORA metrics framework and stop inventing vanity metrics that make AI adoption look successful. Deployment frequency, lead time for changes, change failure rate, and time to restore service. These four metrics, validated by a decade of research from Google’s DevOps Research and Assessment team, actually correlate with organizational performance. Lines of code does not.
And developers — you need to stop feeling inadequate because some VC-funded founder claims to outcode you by 100x. The developers reporting sleep disorders and cognitive overload are experiencing the natural consequence of trying to keep up with a metric that was never designed to be kept up with. Nobody can meaningfully review 37,000 lines of code per day. The claim is not aspirational. It is a red flag.
Is This Actually About AI or Is It About Something Deeper?
This is about the tech industry’s oldest pathology: confusing activity with progress.
We did it with hours worked in the 2000s. We did it with story points in the 2010s. We are doing it with lines of AI-generated code in the 2020s. Every generation of technologists invents a proxy metric for productivity, optimizes the metric instead of the outcome, and then acts surprised when the outcome does not improve.
AI coding tools are genuinely transformative. I use them every day. My team uses them every day. They make certain categories of work dramatically faster and more pleasant. But the value is not in the volume they produce. The value is in the decisions they free you to focus on. The thinking. The architecture. The “should we build this at all” conversation that never happens when everyone is too busy generating code to question why.
Garry Tan shipping 37,000 lines per day is not a vision of the future of software engineering. It is a vision of the future of software accumulation. And if you have ever inherited a codebase that someone “moved fast” on, you know exactly how that story ends.
You spend six months figuring out what it does. Then you rewrite it. In fewer lines.