The Definition of Done (DoD)
“Done” Is the Most Dangerous Word in Software Development
Every team has a Definition of Done. Most teams just haven’t written it down, which means every developer has a different one. That disagreement — silent, assumed, never examined — is responsible for a remarkable amount of the rework, the bugs, the “but I thought you said it was done” arguments, and the features that technically shipped but practically didn’t work.
Let me describe two teams.
Team A says a story is done when the code is merged to main. Testing happens later. Documentation is optional. The feature might work in development environments. It might not work in production. It definitely hasn’t been reviewed against the acceptance criteria because that’s what QA does.
Team B defines done as: code reviewed by a peer, unit tests passing, integration tests passing, deployed to staging, QA verified against acceptance criteria, performance benchmarks met, documentation updated for any changed user-facing behavior, no known defects that block users from completing the core workflow.
Same word. Completely different realities. Team A’s sprints consistently end with velocity numbers that look good and sprint goals that aren’t met. Team B’s velocity is lower but their releases are real.
Why Teams Have Weak Definitions of Done
It’s uncomfortable to enforce. A strict DoD means stories that look done need to go back. Developers feel blamed. Sprint goals get missed. Velocity drops temporarily. Managers get nervous. There’s significant social pressure to keep DoD loose.
It exposes problems. A strict DoD on testing will immediately reveal that the team doesn’t have adequate test coverage. A strict DoD on deployment will reveal that your deployment process is painful and unreliable. Teams that have been hiding technical debt behind a loose DoD will see their sprint velocity crater when they start enforcing real standards. This is accurate information, and it feels bad.
“Done enough” culture is self-reinforcing. Once a team has established that 90% done counts as done, the technical debt accumulates quietly. Features shipped without proper testing generate bugs. Bugs generate patches. Patches generate more technical debt. The team gets slower every quarter. The DoD doesn’t feel like the cause because the relationship is indirect and delayed.
What a Good Definition of Done Actually Contains
A DoD should reflect your team’s actual quality standards, not an aspirational list of practices you don’t currently do. Build it from the real checkpoints that make the difference between features that work reliably and features that cause support tickets.
For most software development teams, a minimum viable DoD includes:
Functional verification: The code does what the acceptance criteria say. This should be verified by someone other than the author — either a QA specialist or a peer, depending on team structure.
Automated test coverage: The new or modified functionality has automated tests. Not 100% coverage on every line — that’s theater — but meaningful tests that would catch obvious regressions.
Integration verification: The code works in the integrated system, not just in isolation. “Works on my machine” doesn’t satisfy the DoD.
Code review: At least one other developer has read the code and had the opportunity to identify problems. Not rubber-stamping, actual review.
Deployment confirmation: The feature has been deployed to a staging or test environment that resembles production. Stories that are “done” in development but not deployed to staging are not done.
Known defects cleared: No known defects that block users from completing the primary workflow. Bug-riddled features are not done; they’re shipments of problems.
Non-functional requirements: Performance, accessibility, security — whatever your product requires. Define these for your context, not as a generic list.
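Some teams make these criteria harder to skip by encoding them as a machine-checkable gate on each story. A minimal sketch of that idea in Python — every field name and criterion here is illustrative, not a standard tool, and should be adapted to your team's actual DoD:

```python
from dataclasses import dataclass, fields

# Each field mirrors one DoD criterion from the checklist above.
# All names are illustrative; rename them to match your team's real standard.
@dataclass
class DefinitionOfDone:
    functionally_verified: bool = False   # verified by someone other than the author
    automated_tests_added: bool = False   # meaningful tests, not coverage theater
    integration_verified: bool = False    # works in the integrated system
    code_reviewed: bool = False           # at least one real peer review
    deployed_to_staging: bool = False     # running in a production-like environment
    no_blocking_defects: bool = False     # core workflow unblocked
    nfrs_met: bool = False                # perf/accessibility/security as defined

    def unmet(self) -> list[str]:
        """Return the names of criteria this story has not yet satisfied."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def is_done(self) -> bool:
        return not self.unmet()

# A story with review and tests finished but nothing else:
story = DefinitionOfDone(code_reviewed=True, automated_tests_added=True)
print(story.is_done())   # False — five criteria remain
print(story.unmet())
```

The point of the sketch is not automation for its own sake: listing the unmet criteria by name turns "is it done?" from an opinion into a diff against the agreed standard.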
The Common Mistakes
Making the DoD aspirational. A DoD that describes where you want to be in six months rather than your current standards is useless today and creates confusion. Build your DoD from what you can actually do now. Upgrade it as your capabilities improve.
Making the DoD too granular. A 25-item checklist that nobody reads is worse than a 6-item checklist that everyone takes seriously. Focus on the criteria that actually distinguish “done” from “done enough.”
Not making the DoD public. If the DoD exists in a Confluence page that nobody’s visited since it was created, it’s not functioning as a shared standard. Put it somewhere visible — on the wall, at the top of the sprint board, in the story template.
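One common way to keep the DoD visible is to embed it in the story or pull request template, so every item renders as an unchecked box the author has to confront. A hypothetical template fragment (wording adapted from the checklist earlier in this piece):

```markdown
## Definition of Done
- [ ] Acceptance criteria verified by someone other than the author
- [ ] Automated tests cover the new or changed behavior
- [ ] Works in the integrated system, not just locally
- [ ] Peer code review completed
- [ ] Deployed to staging
- [ ] No known defects blocking the core workflow
- [ ] Non-functional requirements met (performance, accessibility, security as applicable)
```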
Not updating the DoD. Quality standards should evolve as the team learns and as the product matures. A DoD that made sense for a pre-production product might be inadequate once real users depend on the software. Schedule a quarterly review.
Treating DoD as a QA gate rather than a team standard. When “done” means “approved by QA” rather than “the team has verified it meets our quality standards,” quality becomes someone else’s responsibility. The whole team should be accountable to the DoD.
How to Fix a Dysfunctional DoD
Start by making the current implicit definition explicit. In your next retrospective, ask each team member to independently write down what “done” means to them for a typical story. Put the answers on the board. The variation will be instructive.
Then have the conversation about what “done” should mean — specifically, what checkpoints prevent the problems you’re actually experiencing. If bugs are your biggest pain point, focus on testing criteria. If deployment is painful, focus on deployment verification. Ground the DoD in your real pain, not in a template.
Introduce the new DoD at the start of a sprint with the explicit acknowledgment that some stories may not make it to done under the new standard. This is honest. A sprint where 8 stories meet the real DoD is more valuable than a sprint where 14 stories meet a fake one.
The Payoff
Teams with rigorous Definitions of Done have predictable release cadences because “done” actually means shippable. They spend less time on bug fixes because bugs are caught before the story closes. Their velocity stabilizes at a lower but more honest number, and then typically increases as the technical debt stops accumulating.
More importantly: the conversation about DoD forces explicit agreement on quality standards that most teams never have. That agreement is worth having regardless of what specific criteria you land on.
What your team agrees it means to be done tells you more about your engineering culture than any metric. Make the agreement deliberate.