I wrote a paper a couple of years ago called “How to Misuse and Abuse DORA Metrics” based on real-world experiences using metrics to help teams identify constraints and the appalling and destructive ways others used them in the industry.
Recently, McKinsey released their “Developer Productivity” framework that reads like they reviewed my paper and said,
Don’t get me wrong, some of their metrics can be useful if used correctly. Measuring the number of handoffs is an example. However, their main focus seems to be “coders need to code.” That’s not how value is produced. Like many other organizations I’ve seen, it appears the McKinsey team read the CliffsNotes for DORA and SPACE metrics, missed the nuance and warnings, and went all in on recommending the wrong things to their customers. Here’s my feedback.
McKinsey’s metrics framework:
Code Review Velocity has nothing to do with measuring outcomes. Of course, neither do deployment frequency, lead time for change, or MTTR. Those all relate to optimizing processes for small batches and rapid feedback and response. The SPACE framework uses code reviews, and the sample metrics around them, only as an example of how to apply the framework to improve something. Outcomes should be goal-focused, not activity-focused.
- Code Review Timing: Yes, code review can create a lot of handoff waste. It doesn’t make sense to have this as a separate measure from “code review velocity.” If you value stream map the code review process, you’ll find that async code review is killing your productivity. Pairing improves that dramatically. Instead of trying to sub-optimize for code review, measure the thing we actually want to improve. Focus on the trend for the time required to complete and deliver a unit of work. Reducing that time requires reducing the size of every story and the size of every commit and optimizing every process upstream and downstream of coding.
- Story Points Completed: A “story point” is a made-up number. It was conceived as yet another way to obfuscate estimates for thought work that is difficult to estimate. As originally conceived, it represented the number of mythical “ideal days” of effort. There’s so much time wasted on getting better at “story pointing,” arguing about the Fibonacci sequence, “planning poker,” and other story point nonsense. Frameworks like SAFe recommend even more nonsense with “normalized story points” so that management can compare teams’ velocities. However, the volume of story points doesn’t mean anything if the goal, as shown in McKinsey’s chart above, is to optimize how work is done. Use metrics that track batch size and defect rates if you want to drive waste out of the system. Value stream maps should be used to find and remove handoffs and the wait times they create. Story points are useless for anything and even more useless for this goal. Track throughput instead. However, also make sure to track defect rates and team burnout. The goal is smaller and higher quality, not more volume and longer hours.
- Handoffs: I like this one. Good job, McKinsey. Stop using testing teams, use pairing instead of code review, operate what you build, and don’t have any people doing anything manual to the right of development.
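To make “measure the thing we actually want to improve” concrete, here is a minimal Python sketch of team-level flow metrics: weekly throughput, median time to deliver a unit of work, and defect rate. Everything in it is an assumption for illustration: the `WorkItem` fields, the `weekly_flow` function, and the choice to define defect rate as the share of delivered items that were defect fixes. Your tracker will have its own data, and you may prefer a different defect definition.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, timedelta
from statistics import median


@dataclass(frozen=True)
class WorkItem:
    """Hypothetical record for one unit of work. No assignee field, on purpose."""
    started: date            # when the team began working on it
    delivered: date          # when it reached production
    is_defect: bool = False  # True if the item was a defect fix (assumed definition)


def week_start(d: date) -> date:
    """Return the Monday of the week containing d."""
    return d - timedelta(days=d.weekday())


def weekly_flow(items):
    """Group delivered items by week and report team-level trends."""
    by_week = defaultdict(list)
    for item in items:
        by_week[week_start(item.delivered)].append(item)

    report = []
    for wk in sorted(by_week):
        batch = by_week[wk]
        cycle_days = [(i.delivered - i.started).days for i in batch]
        report.append({
            "week": wk,
            "throughput": len(batch),                 # items delivered that week
            "median_cycle_days": median(cycle_days),  # time to deliver a unit of work
            "defect_rate": sum(i.is_defect for i in batch) / len(batch),
        })
    return report


if __name__ == "__main__":
    sample = [
        WorkItem(date(2023, 9, 4), date(2023, 9, 6)),
        WorkItem(date(2023, 9, 5), date(2023, 9, 8), is_defect=True),
        WorkItem(date(2023, 9, 11), date(2023, 9, 12)),
    ]
    for row in weekly_flow(sample):
        print(row)
```

Watch the trend, not the absolute numbers, and read the three together: throughput going up while defect rate also climbs is not an improvement. Keeping individuals out of the data model is deliberate.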
In the other focus areas, they list metrics at the individual level. Some of them, such as “developer satisfaction,” “retention,” and “interruptions,” can be useful, but not when measured at the individual level. Those should be measured in larger aggregates, and even then only very carefully, to prevent them from becoming targets. Things start getting really toxic in the “Opportunities focus” section, though.
Here’s what McKinsey has to say about “Contribution analysis”:
Contribution analysis. Assessing contributions by individuals to a team’s backlog (starting with data from backlog management tools such as Jira, and normalizing data using a proprietary algorithm to account for nuances) can help surface trends that inhibit the optimization of that team’s capacity. This kind of insight can enable team leaders to manage clear expectations for output and improve performance as a result. Additionally, it can help identify opportunities for individual upskilling or training and rethinking role distribution within a team (for instance, if a quality assurance tester has enough work to do).
I know exactly what happens when you have people focus on their individual output. I’ve measured the outcomes. I can only assume that McKinsey doesn’t understand “systems thinking.” The outcome is what I call a “pandemonium of developers.” You have a group of people all working on the same backlog but not acting as a team. Code review suffers, mentoring suffers, pairing is impossible, work decomposition suffers, etc. Anything that requires more than one person, including helping someone get unstuck, will be deprioritized. Never measure individual output: ever.
As for “if a quality assurance tester has enough work to do,” tell me you don’t understand what QA should be doing on a team without telling me. Their job isn’t to test. Their job is to improve how testing is done.
The nonsense continues:
For example, one company found that its most talented developers were spending excessive time on noncoding activities such as design sessions or managing interdependencies across teams. In response, the company changed its operating model and clarified roles and responsibilities to enable those highest-value developers to do what they do best: code.
Oh. My. God!
Yes, that’s what a developer’s job is: typing. So, after McKinsey pointed out how much time developers were spending on non-typing things like design, the company siloed design from coding? I suppose they created a new role to manage dependencies rather than designing engineering solutions to handle dependencies. Oh, that’s right, developers shouldn’t be bothered with design. Perfect. Code, code monkey!
Later on, they ALMOST identify the real problem.
To truly benefit from measuring productivity, leaders and developers alike need to move past the outdated notion that leaders “cannot” understand the intricacies of software engineering, or that engineering is too complex to measure.
The real problem is that too many in management don’t understand the work they manage. Management can understand the intricacies of software engineering if they become leaders and study the work they manage. Not all managers are leaders. Some simply want a framework to hold people accountable. Good job, McKinsey, you’ve delivered!
And buried at the bottom:
Learn the basics. All C-suite leaders who are not engineers or who have been in management for a long time will need a primer on the software development process and how it is evolving.
Exactly. The reason management struggles to measure the right thing is that they don’t understand the work they want to measure. Those who do understand tend to measure the right things. The one thing McKinsey doesn’t do with this framework is help fix this problem. They are making it worse.
This new approach has been implemented at nearly 20 tech, finance, and pharmaceutical companies…
I’ve some predictions about the “nearly 20” companies that use McKinsey’s framework:
- They have change advisory boards
- They have testing teams
- They have architecture review boards
- They use SAFe or some other agile scaling framework
- They have poor architecture and no plan to improve it
- They have feature factory teams
- They claim “CD won’t work here”
They may not have all of these, but I’ll wager that they have more than half of these problems. All of these reduce productivity, and none of them are improved by measuring how much time people are heads down coding and how many Jira tickets they complete. Measure the system, not the people.
This McKinsey article was written by Chandra Gnanasambandam, Martin Harrysson, Alharith Hussin, Jason Keovichit, and Shivam Srivastava. If y’all want better information, I’m happy to consult for McKinsey. I promise you’ll have better information to share with your clients when I’m done.
Update: Other responses to McKinsey’s nonsense.
- John Cutler: The Ultimate Guide to Developer Counter-productivity
- Dave Farley: My Response To The NONSENSE McKinsey Article On Developer Productivity
- Gergely Orosz & Kent Beck: Measuring developer productivity? A response to McKinsey
- Dan North: The Worst Programmer I Know