A client asked us to "improve developer productivity." They measured productivity as story points per sprint: averaging forty-two, targeting sixty. Meanwhile, three of five releases introduced regressions, median deploy lead time was eleven days, and customer-reported bugs were trending up twenty percent. "Productive" by their metric, failing by every measure that mattered.
Story points measure estimation accuracy, not productivity. Lines of code measure typing speed. Velocity measures throughput of an arbitrary unit. These are vanity metrics.
We track eight metrics in four categories.
Delivery performance (DORA metrics): Deployment frequency -- how often you ship. Our target: three to five times per week. Lead time for changes -- commit to production. Our target: under twenty-four hours; actual average: six hours. Change failure rate -- the percentage of deployments causing production failures. Our target: under five percent; actual: three percent. Mean time to recovery -- from detecting a production failure to restoring service. Our target: under one hour.
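As a minimal sketch of how these numbers fall out of deploy records, here is a computation over hypothetical data; in practice the records would be parsed from CI/CD logs, and the field layout shown is an assumption, not our actual schema:

```python
from datetime import datetime
from statistics import mean

# Hypothetical deploy records: (commit_time, deploy_time, caused_failure,
# recovery_minutes or None). Real data would come from CI/CD logs.
deploys = [
    (datetime(2024, 3, 4, 9, 0), datetime(2024, 3, 4, 14, 0), False, None),
    (datetime(2024, 3, 5, 10, 0), datetime(2024, 3, 5, 18, 0), True, 45),
    (datetime(2024, 3, 6, 8, 0), datetime(2024, 3, 6, 13, 0), False, None),
]

# Lead time: commit to production, in hours.
lead_times = [(d - c).total_seconds() / 3600 for c, d, _, _ in deploys]
# Change failure rate: share of deploys that caused a production failure.
change_failure_rate = sum(1 for *_, failed, _ in deploys if failed) / len(deploys)
# MTTR: average recovery time across failed deploys.
recoveries = [r for *_, r in deploys if r is not None]

print(f"avg lead time: {mean(lead_times):.1f}h")          # 6.0h
print(f"change failure rate: {change_failure_rate:.0%}")  # 33%
print(f"MTTR: {mean(recoveries):.0f} min")                # 45 min
```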
Code health: PR review turnaround -- target: two-hour median. Beyond four hours, developers are blocked and context-switching. Test suite reliability -- the percentage of CI runs passing on the first attempt, excluding legitimate failures. Target: ninety-eight percent. Below ninety-five percent, flaky tests are eroding confidence in CI.
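The turnaround calculation itself is simple once you have the timestamps. A sketch, assuming ISO-8601 opened/first-review pairs such as you might assemble from GitHub's pull request and review endpoints (the sample data is invented):

```python
from datetime import datetime
from statistics import median

def review_turnaround_hours(prs):
    """Median hours from PR open to first review.

    Each item: (opened_at, first_review_at) as ISO-8601 strings,
    e.g. joined from GitHub's /pulls and /pulls/{n}/reviews responses.
    """
    hours = []
    for opened, reviewed in prs:
        delta = datetime.fromisoformat(reviewed) - datetime.fromisoformat(opened)
        hours.append(delta.total_seconds() / 3600)
    return median(hours)

sample = [
    ("2024-03-04T09:00:00", "2024-03-04T10:30:00"),  # 1.5 h
    ("2024-03-04T11:00:00", "2024-03-04T16:00:00"),  # 5.0 h
    ("2024-03-05T09:00:00", "2024-03-05T11:00:00"),  # 2.0 h
]
print(review_turnaround_hours(sample))  # 2.0
```

Using the median rather than the mean keeps one stale PR from masking an otherwise healthy review cadence.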
Product impact: Feature adoption rate -- what percentage of shipped features are used by over ten percent of users? In our experience, roughly forty percent of features clear that bar. Tracking this forces focus on fewer, higher-impact features. Customer-reported bugs per release, normalized by release size, indicate whether quality keeps pace with shipping speed.
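A minimal sketch of the adoption-rate calculation, assuming you can map each feature to the set of users who triggered it (for example, from product analytics events; the feature names and counts below are illustrative):

```python
def adoption_rate(feature_users, total_users, threshold=0.10):
    """Share of shipped features used by more than `threshold` of all users."""
    adopted = sum(
        1 for users in feature_users.values()
        if len(users) / total_users > threshold
    )
    return adopted / len(feature_users)

# Hypothetical feature -> set of user IDs who used it, out of 10 total users.
usage = {
    "bulk-export": {1, 2, 3, 4, 5},  # 50% of users
    "dark-mode":   {1, 2},           # 20%
    "api-keys":    {7},              # 10% -> not over the threshold
    "webhooks":    set(),            # 0%
    "sso":         {1, 3, 9},        # 30%
}
print(adoption_rate(usage, total_users=10))  # 0.6
```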
Team health: two questions, asked biweekly. "On a scale of one to ten, how sustainable is the pace?" and "What is the biggest thing slowing you down?" If sustainability drops below seven for two consecutive periods, reduce scope and investigate.
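The trigger condition can be checked mechanically. A sketch, assuming each check-in yields a list of one-to-ten answers and the team score is the median (the function name and sample history are invented for illustration):

```python
from statistics import median

def flag_sustainability(period_scores, floor=7, streak=2):
    """period_scores: one list of 1-10 answers per check-in, oldest first.

    Returns True when the team median has been below `floor` for the
    last `streak` check-ins -- the signal to reduce scope and investigate.
    """
    medians = [median(scores) for scores in period_scores[-streak:]]
    return len(period_scores) >= streak and all(m < floor for m in medians)

history = [[8, 7, 9], [7, 6, 8], [6, 5, 7], [6, 6, 5]]
print(flag_sustainability(history))  # True: medians of 6 and 6 in the last two periods
```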
Implementation: DORA from CI/CD logs via a lightweight script. Code health from GitHub API. Product impact from PostHog. Team health from a retro form. Two hours per month total.
Pick four metrics, automate collection, review monthly, and act on negative trends. Never measure more than eight things. Never measure anything you will not act on. And never use metrics to compare individual developers -- it is the fastest way to destroy trust and invite gaming the numbers.
About the Author
Fordel Studios
AI-native app development for startups and growing teams. 14+ years of experience shipping production software.