DORA Metrics: The 4 Numbers That Predict Engineering Team Success
A decade of research across tens of thousands of developers revealed what makes high-performing teams different. Here are the 4 metrics that predict success.
If you work in software, you’ve probably heard someone mention DORA metrics. Maybe in a standup, maybe in a retrospective, maybe in a thread about why deploys take forever.
But what are they, actually?
DORA stands for DevOps Research and Assessment. Started in 2014 by Nicole Forsgren, Jez Humble, and Gene Kim, this research program has surveyed tens of thousands of software professionals across thousands of organizations. (By the 2024 report, more than 39,000 respondents had taken part over the program's history.) The question they wanted to answer: What separates high-performing software teams from everyone else?
Not opinions. Not hunches. Data that actually correlates with business outcomes - profitability, market share, productivity.
They found four metrics that consistently predict success.
The 4 Metrics
Deployment Frequency
How often do you ship code to production?
Top teams deploy multiple times per day - sometimes dozens of times. Struggling teams might ship once a month, or even less frequently. The gap between the best and worst is enormous.
Frequent deployments mean you’re delivering value continuously instead of stockpiling changes into big, risky releases. It’s also a signal that your pipeline isn’t broken - that you can ship when you need to.
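Measuring this one is mostly a counting exercise. Here's a minimal Python sketch, assuming you can export production deploy timestamps from your CI/CD tool - the timestamps and variable names below are made up for illustration:

from datetime import datetime

# Hypothetical export of production deploy timestamps for the last 30 days.
deploys = [
    datetime(2024, 6, 3, 9, 15),
    datetime(2024, 6, 3, 14, 40),
    datetime(2024, 6, 5, 11, 2),
    datetime(2024, 6, 10, 16, 30),
]

days_in_window = 30
deploys_per_day = len(deploys) / days_in_window
print(f"{deploys_per_day:.2f} production deploys per day")

Count deploys in a window, divide by the number of days - that's the whole metric. The hard part is usually agreeing on what counts as a deploy to production, not the math.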
Lead Time for Changes
This is the clock from first commit to code running in production.
High-performing teams get changes out in under a day. Low performers can take a month or longer.
Think about what that means. On a high-performing team, you fix a bug in the morning and users have it by afternoon. On a struggling team, that same fix sits in review, waits for a release train, gets bundled with 50 other changes, and finally ships weeks later - if nothing blocks it.
Short lead times mean fast feedback. You ship, you learn, you iterate. Long lead times mean you’re guessing.
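The calculation itself is simple once you can pair each deployed change with its first commit. A sketch, assuming you can pull those pairs from your version control and deploy history (the data here is invented):

from datetime import datetime
from statistics import median

# Hypothetical (first_commit, deployed_to_production) pairs for recent changes.
changes = [
    (datetime(2024, 6, 3, 9, 0), datetime(2024, 6, 3, 15, 30)),
    (datetime(2024, 6, 4, 10, 0), datetime(2024, 6, 6, 11, 0)),
    (datetime(2024, 6, 5, 13, 0), datetime(2024, 6, 12, 9, 0)),
]

lead_times_hours = [
    (deployed - committed).total_seconds() / 3600
    for committed, deployed in changes
]
print(f"Median lead time: {median(lead_times_hours):.1f} hours")

The median matters more than the average here - one change that sat in a branch for three months shouldn't define your whole team's number.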
Change Failure Rate
What percentage of your deployments cause problems? Outages, bugs, rollbacks, emergency hotfixes - any of it counts.
The best teams keep this in single digits. Low performers? The 2023 data showed them averaging around 64% - meaning nearly two out of every three deployments caused issues.
This is the counterweight to speed. You can deploy 100 times a day, but if half those deployments break something, you’re not fast - you’re reckless. The best teams move quickly without leaving a trail of incidents behind them.
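The arithmetic is trivial; the work is in deciding what counts as a failure and recording it consistently. A sketch, assuming you flag each deploy that led to a rollback, hotfix, or incident (the flags below are illustrative):

# Hypothetical deploy log: True means the deploy led to a rollback,
# hotfix, or incident; False means it shipped cleanly.
deploy_outcomes = [False, False, True, False, False, False, True, False]

failures = sum(deploy_outcomes)
change_failure_rate = failures / len(deploy_outcomes) * 100
print(f"Change failure rate: {change_failure_rate:.0f}%")  # 2 of 8 deploys failed: 25%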
Recovery Time
When something breaks in production, how long until it’s fixed? (DORA now calls this “Failed Deployment Recovery Time” to focus specifically on deployment-caused failures.)
Things will break. That's not pessimism; it's reality. 100% uptime doesn't exist. What matters is how fast you bounce back.
Top teams recover in under a day - often within hours. Low performers can take a week or more. By the time a struggling team finishes their incident triage meeting, a high-performing team has already fixed the problem and moved on.
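Tracking it follows the same pattern, assuming your incident records capture when a deployment-caused failure was detected and when service was restored (the names and times below are illustrative):

from datetime import datetime
from statistics import median

# Hypothetical (failure_detected, service_restored) pairs for
# deployment-caused incidents.
incidents = [
    (datetime(2024, 6, 3, 14, 50), datetime(2024, 6, 3, 16, 10)),
    (datetime(2024, 6, 12, 9, 0), datetime(2024, 6, 12, 13, 30)),
]

recovery_hours = [
    (restored - detected).total_seconds() / 3600
    for detected, restored in incidents
]
print(f"Median recovery time: {median(recovery_hours):.1f} hours")

If you're not capturing those two timestamps today, that's the place to start.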
Why These 4?
They balance each other.
Deployment Frequency and Lead Time measure speed - how fast you move. Change Failure Rate and Recovery Time measure stability - how well you move.
You can’t game this by optimizing just one. Deploy constantly but break everything? Your Change Failure Rate exposes you. Slow down to avoid failures? Your Deployment Frequency drops.
Two metrics for velocity, two for stability. Together they answer the question that matters: how fast can you ship and how reliably can you do it?
The Benchmarks
DORA groups teams into performance tiers using cluster analysis on survey data. Here’s the important thing to understand: these benchmarks shift every year based on how the industry is actually performing. Some years there’s an “Elite” tier, some years there isn’t (2022 had none). The clusters aren’t fixed targets - they’re snapshots of where teams currently stand.
That said, the general pattern holds:
Top performers deploy on-demand (often multiple times daily), with lead times under a day, failure rates in single digits, and recovery measured in hours.
Low performers deploy infrequently (monthly or less), with lead times stretching to weeks or months, failure rates that can exceed 60%, and recovery times that drag past a week.
Here’s the part that got everyone’s attention: high performers are 2x more likely to exceed their organizational goals - profitability, productivity, market share.
These aren’t vanity metrics. They correlate with business outcomes.
What These Metrics Won’t Tell You
A few things to keep in mind before you start measuring everything:
Don’t use these to rank developers. DORA metrics measure teams and systems, not individuals. The moment you start comparing Alice’s deployment frequency to Bob’s, you’ve missed the point entirely - and probably created some resentment.
Context matters. If you’re shipping medical devices with six-month regulatory cycles, “deploy multiple times per day” isn’t a realistic target. Same goes for embedded systems, hardware, anything with physical constraints. The research applies broadly, but not universally.
These aren’t targets to hit by next quarter. The goal is understanding where you are and what’s slowing you down. Rushing to match elite benchmarks usually means cutting corners that show up later.
Delivery isn’t everything. DORA focuses on how well you ship software. It doesn’t measure whether users actually want what you’re shipping, whether your codebase is maintainable, or whether your team is burning out. Important stuff lives outside these four numbers.
Why These Caught On
DORA metrics spread because they solved a real problem: teams arguing in circles about how to get better.
Before DORA, “we should deploy more often” was just an opinion. Now it’s a conversation grounded in data. You can look at your deployment frequency, compare it to benchmarks, and have an actual discussion about what’s blocking you.
They also work regardless of your stack. Monolith or microservices, Python or Go, AWS or your own servers - the metrics apply. And they’re backed by years of research, not some consultant’s slide deck.
Here’s the thing to remember: DORA metrics aren’t about comparing yourself to Google. They’re about comparing your team to itself last month.
Where are you now? What’s slowing you down? What would it take to get a little better?
That’s it. Improvement over time, not chasing someone else’s numbers.