As we try to improve the flow of value to the end-user, the first item that usually gains focus is the productivity of development teams and how to measure it. I’d like to propose that productivity is measured by customer value delivery, not team output. However, that reality is often lost as we rush to find easy numbers to get a handle on measuring teams. Misusing metrics undermines the goals of improvement efforts, and playing Whack-a-Mole with metrics anti-patterns is tedious. Hopefully, this anti-patterns cheat sheet will help.
Story Points:
Myth: Story points tell us how long it will take to complete a story.
Reality: Story points are an abstraction of how complicated, how uncertain, and how big something is, expressed as an arbitrary number that is only meaningful to the team, kinda. It’s mostly a holdover from Waterfall estimation, and there are better ways to get this done.
If you measure me by it: If you want more story points, I’ll increase the number of points by creating more stories or increasing the number of points per story. Agile: where the stories are made up and the points don’t matter. Ron Jeffries said, “I like to say that I may have invented story points, and if I did, I’m sorry now.” The original intent of story points was to create an estimation abstraction instead of explaining to stakeholders, “yes, I said 3 days, but that’s three days where I only need to develop and everything goes correctly”. Listen to Ron: split stories to a day or less and stop estimating.
Velocity/Throughput:
Myth: Higher velocity/throughput means we are more productive!
Reality: Velocity or throughput gives us an average of how many things of “about this size” we have completed during a time box so we can plan future work.
If you measure me by it: If you want more completed in the same amount of time, I’ll create more tasks or increase the number of story points per task. “OK, I’ll just standardize the size of those!”, you say. Sorry, you can’t. You can make story points mean days, but that just means you’re faking agile delivery. If you want day estimates, then use days. Just be aware that estimation is wasteful. That time is better spent delivering and establishing a reliable history of delivery that makes us predictable. Also, consider that every change we make is new work that has never been done before. We aren’t building manufactured housing where each unit is the same. We are architecting something bespoke. We only have general ideas on time.
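To illustrate the legitimate use of throughput as a planning aid rather than a productivity score, here is a minimal sketch. The sprint history and backlog size are made-up numbers for illustration:

```python
# Hypothetical history: completed item counts for the last five time boxes.
completed_per_sprint = [7, 9, 6, 8, 10]

# Throughput is just the average number of "about this size" items
# finished per time box; it helps us forecast, not grade the team.
average_throughput = sum(completed_per_sprint) / len(completed_per_sprint)

# Rough forecast: how many time boxes to work through a 40-item backlog.
backlog_size = 40
sprints_needed = backlog_size / average_throughput

print(average_throughput)  # 8.0
print(sprints_needed)      # 5.0
```

Note that the forecast only holds if the items stay roughly the same size, which is exactly what gaming the metric destroys.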
Code Coverage %:
Myth: Code coverage means we are testing more, so we should have minimum code coverage standards.
Reality: Code coverage indicates the percentage of code that is executed by test code. Test code has two important functions:
- It executes the code we want to test.
- It asserts some expectation about the results of that code.
Code coverage reporters measure the first activity. The second cannot be measured by robots because the efficacy of an assertion is knowledge work.
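A minimal sketch of a test doing both jobs, using a hypothetical `add` function:

```python
# Hypothetical code under test.
def add(a, b):
    return a + b

def test_add():
    result = add(2, 3)  # 1. Executes the code we want to test.
    assert result == 5  # 2. Asserts an expectation about the result.

test_add()
```

A coverage reporter only sees that line 1 ran; whether the assertion on line 2 checks anything meaningful is the knowledge work it cannot measure.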
If you measure me by it: I’ll make sure more code is covered. That’s good, right? Maybe, but maybe not. It costs money to write and maintain test code, so we should eliminate tests that are not providing value. One example of meaningless code coverage is testing anemic public setter methods that contain no business logic. Why these are bad is too deep a subject for this article, but the result is that we have methods that do nothing but access data. Testing them means we are only testing the underlying language. We cannot fix the language, so we should not test it. Another example is just as wasteful but far more dangerous.
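For instance, consider a test like the following, sketched with a hypothetical `calculate_discount` function:

```python
# Hypothetical code under test.
def calculate_discount(price, percent):
    return price - price * percent / 100

def test_calculate_discount():
    # This call executes every line of calculate_discount, so the
    # coverage reporter shows it as fully covered...
    calculate_discount(100, 10)
    # ...but there is no assertion, so this test can never fail,
    # no matter how broken calculate_discount becomes.

test_calculate_discount()
```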
This test increases code coverage and hides the fact that the code is not tested. It’s safer to delete this test than to leave it in place.
Committed vs. Completed:
Myth: If we measure teams by how much they completed vs. what they committed to, they will work hard to meet the commitments and we’ll deliver more!
Reality: No battle plan ever survives contact with the enemy. The reason we are not using Waterfall anymore is that the industry has come to terms with the fact that life is uncertain and we cannot make hard plans for the next month, much less the next quarter or longer. One or more of the following is always true:
- The requirements are wrong
- We will misunderstand them
- They will change before we deliver
If you measure me by it: There are two actions I will take to ensure I meet this goal. First, I will commit to less work. This might be good because we tend to over-commit in the first place. It won’t make me work faster though. The other thing that I will do is to stick to the plan, no matter what. Defects will be deferred. If higher priority work is discovered, I’ll defer that too. Adherence to a plan ignores value delivery, but measuring me this way means you care more about the plan than the value. I care about the same things you care about if you’re paying me.
Lines of Code:
Myth: If I am typing more lines of code per day, I’m more productive, so we need to measure lines of code or the number of commits per day for each developer.
Reality: Development is mostly not a physical activity. A top developer does more thinking than typing. The code is only the documentation of the proposed solution in a format that a computer can understand. The work is coming up with the solution and writing it in a way that is clearly understandable by future developers. I’ve delivered small enhancements before that resulted in the deletion of over 5,000 lines of code.
If you measure me by it: I’ll certainly make sure more code is added. I’ll have no real incentive to make the code easy to read or to reuse existing code. Copying and pasting code is much easier than spending the time to develop a well-designed business solution.
Number of Defects Fixed:
Myth: If we measure the number of defects fixed, there will be fewer defects.
Reality: Defect reduction comes from working to improve delivered outcomes, not by focusing on the number of defect tickets we’ve closed.
If you measure me by it: The worst case is that you pay me a bonus for defects fixed. Introducing minor defects that I know where to find and fix is pretty simple. The next worst case is that I’ll prioritize minor defects over critical feature work. I’ll also be scheduling meetings to argue about whether a defect is really a feature, because moving it from “defect” to “enhancement” counts as defect reduction as well.
What should we measure?
We need to measure things that reduce waste and improve the flow of value. In the next installment, we will discuss some better metrics.