How to Misuse and Abuse DORA Metrics

Since the 2018 release of Accelerate, DORA metrics have grown in popularity to measure development. However, there has also been a growth in using them for the wrong reasons resulting in poor outcomes. In this talk, I’ll be discussing common problems we are seeing with the spread of DORA metrics, how to use them appropriately, and try to dispel the myth that there are only 4.


Bryan Finster

Distinguished Engineer & Value Stream Architect, USAF Platform 1



Hi everyone. Welcome. Today we're going to be talking about how to misuse and abuse DORA metrics. I'm Bryan Finster, a distinguished engineer for Defense Unicorns, working as a value stream architect for the United States Air Force's Platform One. I'm also a board advisor for the Value Stream Management Consortium. They're doing some good work there; I suggest you look them up. And I firmly believe that if we deploy more, we will absolutely sleep better. Back in 2014, in a previous organization, we had an ask from our leadership to take our very large and complex system that was deploying every quarter and find out how we could deploy it biweekly so we could better serve the business need. And so we started digging in and bought copies of Continuous Delivery to figure out how we could get that done. We had some ideas, but we really wanted to get into the nuts and bolts.


And once we started really looking into it, we said, well, why can't we release every single day? In fact, we had a little bit of pushback on this, but we were like, no, this is really the goal that we should be going after. And so we did. The first thing we did was align teams to business domains. We did some deep domain decomposition of the capabilities we wanted to implement first, built some team structures around that, and said, okay, we need these teams building a loosely coupled architecture that would allow them to deploy in any sequence, at any time, on demand. Then we also built a CD platform team that would build the platform the teams could consume, so that the teams could construct pipelines to meet their particular needs while the platform team implemented some core capabilities for the entire area.


And we had some lessons from this. The first thing was that previously we'd had a scaled agile implementation that was just not giving us the outcomes we wanted. What we learned was that as we started decoupling the system and letting these teams run as fast as each one of them could run, it scaled much better, and it gave us much better outcomes than trying to use a process framework. We also learned that just asking the question, "Why can't we deliver today?" and uncovering those problems was really the best tool for finding the organizational issues, both within the teams and within the area, that were keeping the teams from getting to where we wanted to be for that daily delivery. As we started driving down that batch size, we'd say, well, really, why can't we deliver today?


I mean, we deployed last week. Why can't we do it every single day? It was very good at uncovering those problems and then letting us methodically go after and correct those issues. We also learned that CD absolutely improves outcomes, but more importantly, it improved morale: the teams that were implementing a continuous delivery flow were much happier, and other teams were really looking forward to getting better at that as well. And we spent a lot of time doing education about what we found that works and what doesn't work, and spreading that message.


We also found we needed better metrics. The things that we were measuring were not giving us what we wanted, because anytime you're measuring things that have to do with people, it changes people's behavior. And the metrics we chose didn't change behavior in the way that we intended. The first example of this was that we knew that for continuous delivery to work well, we needed to improve how we were testing. We couldn't wait to test at the end. It had to happen integral to the flow of delivery. And so we thought, well, if we measure test coverage, then what we should see is improved testing. What we actually saw was an increased number of poor tests. We saw people doing testing sprints, delaying features to work on tests, to meet a minimum number that we picked, but the tests themselves weren't very good at all, because we hadn't really focused on how to test better. We just said, test more. And so they did. What we really should have been focusing on is the outcomes we wanted and not compliance to a number. We should have been focusing on how to improve deploy frequency while also improving our defect rates, so that we have smaller batches of higher-quality work, and doing that would have improved testing.


Something else we were looking at was completion rates. We wanted to be more predictable. And so we thought, well, if we focus on completed versus committed work, the teams would be better at keeping those commitments. What we really found was that it focused people on planning instead of working: to meet the number, they would spend a lot of time in very detailed planning to make sure the timing would fit exactly where they wanted it for the sprint or the quarter to meet their metric. What we really should have been focusing on, again, is outcomes. We wanted to improve the flow of delivery of smaller batches of work that could be more consistently delivered, so that we would be predictable. So starting to shrink the development cycle time, and then the lead time, while focusing on improving defect rates at the same time, to have balance and not drive off the rails, really would have given us better outcomes.


And honestly, the very first time I went to DevOps Enterprise Summit, I was there to try to learn about metrics, to try to improve these things. So in 2017, we had an enterprise goal to expand this to all the teams. We knew from our experience that continuous delivery was a really effective tool for improving everything around the enterprise, so let's just do that for everyone. What we did was create a global platform that was very opinionated around continuous delivery flow: the tool was easier to use when you were doing a continuous delivery flow, where you're using trunk-based development and continuous integration, than it would be if you were using something like Git Flow or some other legacy process that was not compatible with CD. We also gamified metrics: we brought in Hygieia and then started putting scores on some of these signals. How are you doing at trunk-based development and continuous integration? How's the daily deploy coming? Are the pipelines stable?


Do you have good static analysis coming in, so that you know you're generating cleaner code? All these things. And the gamification proved to be pretty effective. What we learned was that the gamified metrics were really effective for the first half, the people who were interested in learning about continuous delivery, or just improving delivery at all. And the leading-edge people were very competitive about it. They would come and challenge us all the time about how we were scoring things, to try to make that better. We also learned that the opinionated pipeline encouraged the late adopters, because it was a little bit painful for them to automate their process. There were a lot of hoops they'd have to jump through, and workarounds, all sorts of things, to try to get their old bad process implemented with the CD pipeline. And so they would come to us for help. That actually led to us creating a dojo to help teams execute continuous delivery, and I wound up leading that dojo for several years.


We also needed a better way to communicate why we were doing this. We knew at the CTO level, because that was the goal: we're going to improve the enterprise with continuous delivery. That was a top-down request from the CTO. And we knew in the CD platform team, because we'd executed this way in other areas. What we didn't have, though, was a good way to communicate to everybody else what we were trying to get done. We needed some better ways to do that. Then in 2018, Accelerate came out, and it validated what we'd learned: that continuous delivery was absolutely a good tool for improving the performance and quality of what we're delivering, improving the organization as a whole, and improving how people felt about working here, with less burnout and less pain. And it gave us a way to communicate that broadly.


I mean, we literally bought six cases of Accelerate and handed it out globally to executive leadership and senior leadership with a flyer inside: "Please read this book. This will explain what it is we're trying to get done and really help you along your path as you try to help the teams in your area." It also gave us four key metrics. These should be familiar to anybody who's really been focusing on this area: lead time, deploy frequency, mean time to restore, and change fail percentage, to talk about how healthy the system of delivery is and how it's improving. And we said, look, here are these four key metrics we can look at, and we can also track these. And it caused a problem. What happened was that those four metrics really make sense in the context of the rest of the book, but you have to read the book for those four metrics to really make sense, because measuring behaviors is a complex problem.
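As an illustration of what tracking the four key metrics might look like in practice, here is a minimal sketch. It is not the tooling described in the talk: the `Deployment` record, its field names, and the choice of median for lead time are all assumptions made for the example, and real definitions of "failure" and "restore" depend on each team's context.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    # Hypothetical record shape; real data would come from pipeline and incident tools.
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did the change cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def four_key_metrics(deploys: list[Deployment], window_days: int) -> dict:
    """Summarize the four metrics for one team over an explicit time window."""
    if not deploys:
        raise ValueError("no deployments in the window")
    lead_times = [d.deployed_at - d.committed_at for d in deploys]
    failures = [d for d in deploys if d.failed]
    restores = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "window_days": window_days,
        "median_lead_time_hours": median(lt.total_seconds() for lt in lead_times) / 3600,
        "deploys_per_day": len(deploys) / window_days,
        "change_fail_percentage": 100 * len(failures) / len(deploys),
        "mean_time_to_restore_hours": (
            sum(r.total_seconds() for r in restores) / len(restores) / 3600
            if restores
            else None
        ),
    }


# Made-up sample data for one team over one week.
t0 = datetime(2024, 1, 1, 9, 0)
day, hour = timedelta(days=1), timedelta(hours=1)
deploys = [
    Deployment(t0, t0 + 2 * hour),
    Deployment(t0 + day, t0 + day + 4 * hour, failed=True,
               restored_at=t0 + day + 5 * hour),
    Deployment(t0 + 2 * day, t0 + 2 * day + 6 * hour),
    Deployment(t0 + 3 * day, t0 + 3 * day + 4 * hour),
]
metrics = four_key_metrics(deploys, window_days=7)
```

The numbers this produces are only a trend signal for one team in its own context; as the talk goes on to argue, they are health indicators and not goals.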


If you are measuring people, anything around people is complex, and you always have to be inspecting and adapting and verifying that what you're doing isn't causing adverse behavior. And people didn't read the book. Some people did, and we would have really good conversations with those people, and they became really good advocates for what we were trying to do. But others didn't; the books gathered dust. So we weren't being effective at doing the communication we needed so that they would understand how these things work. And that caused the metrics to be oversimplified. The way we were communicating things made it sound like all we have to do is measure our way to improvement. Some people started doing that, because the purpose got lost in translation. And we wound up in chaos because we'd oversimplified the problem.


And this led to people using metrics in ways they absolutely should not have been used. Here's an example of using metrics as goals. I've seen areas where they had OKRs for DORA metrics, and that's not what these are for. The reality is that high-performing organizations are not high-performing because they focus on the DORA metrics. High-performing organizations focused on improving value delivery to their end users, and methodically got better at that. Any goals we have should be focused on our business outcomes and not on metrics that focus on health. They're also used for productivity: "Now we can find out which teams are less productive and go figure out how to fix them." But the fact of the matter is that every team has its own context, and the absolute value of the DORA metrics is very dependent on that context.


So as a team is delivering, we should see improvement, but we should not try to use these metrics to stack-rank teams. Also, this is not about productivity. This is about measuring the health of the system of delivery. And if you don't have the right culture anyway, then using these metrics will be destructive to our goals. Or speed: "We need to deliver faster, so we need to have a higher deploy frequency and everything. That'd be great." The problem is that these are not speed metrics; they're batch size metrics. The goal is not to go faster. The goal is to reduce the size of delivered batches so that we have higher quality that's more stable, so that we can recover faster when things fail, which they will, and so that reducing batch size exposes waste that we can remove.


And it makes things more efficient: we get more efficient quality processes, a more efficient process in general, and doing that does accrue speed. But if we just focus on delivery frequency, we won't get speed, because the quality will decrease. We also saw vanity radiators. These are dashboards people would create. This is actually an example from a different organization than the one I was in, where it shows "look how good we are at DORA metrics." You can see here that deploy frequency is obviously the most important thing. But if you are really deep into continuous delivery, there are all sorts of red flags in here that aren't really apparent if you don't have a deep knowledge of it. More troubling: how big is each one of these orgs? There's no context here. What's the time range we're looking at? For the first one, is that 437 deploys in the last two years or the last week? We don't know. And we don't know if this is better or worse, or what action to take from this data. It's just "look how great we are."
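To make the missing-context point concrete, here is a small sketch of a report that cannot be built without a time range and a team count. The `DeployReport` type, the organization name, and the figure of 50 teams are hypothetical; only the count of 437 comes from the dashboard example above.

```python
from dataclasses import dataclass


@dataclass
class DeployReport:
    org: str
    deploys: int
    window_days: int  # the time range the count covers
    teams: int        # how many teams contributed to the count

    def per_team_per_week(self) -> float:
        """Normalize a raw deploy count so it can be read in context."""
        if self.window_days <= 0 or self.teams <= 0:
            raise ValueError("a deploy count needs a time range and a team count")
        return self.deploys / self.teams / (self.window_days / 7)


# The same raw number tells two very different stories.
last_week = DeployReport("Org A", deploys=437, window_days=7, teams=50)
two_years = DeployReport("Org A", deploys=437, window_days=730, teams=50)

rate_last_week = last_week.per_team_per_week()  # several deploys per team per week
rate_two_years = two_years.per_team_per_week()  # well under one per team per week
```

Even normalized, a single snapshot still says nothing about whether things are getting better or worse; a trend over successive windows would be needed for that.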


These metrics are health indicators. That's what they're for. We're trying to find out: are we reducing the batch size, because smaller batches make things better? Are we improving our quality and reliability? Those things are incredibly important, and shrinking batch size without making those things better isn't going to be helpful. We need to track that as well. We're also trying to reduce toil, because not only does it make us more efficient, but it makes people happier not to do that. We're accelerating the feedback: we're getting the information back to the team, the product owner, or the organization to inform our business goals. And do we have happier customers? Because none of the others matter if our customers aren't getting better value. And do we have happier teams? Because if we have happier teams, we're going to have better solutions that are more stable with higher uptime, because people can focus on that instead of just trying to meet the next metric.


And are there only four metrics? No. The continuous delivery metrics that we have here are for continuous delivery, but they are not a complete view of our entire organization. And they're not the only metrics DORA recommends, if you look through Accelerate. I did that briefly to prepare for this talk: I went through and pulled out some things I knew were in there. Around flow, we need to improve the flow of delivery and look at where we have constraints through the organization. Continuous integration is key, because we're not going to make CD better if CI is not better, and these are metrics that are really easy for the team to control. We have to have the right culture. We need to make sure that people are happy, that they want to work here, that we have trust, and we need to avoid burnout. We need to make sure the customer outcomes are good. Again, it doesn't matter if we improve everything else if the customer is unhappy.


We need good information radiators, things that cause actionable motion with the data. So here we actually have an example of something we're building inside Platform One that we will be open-sourcing as part of our platform. On the left, we have WIP load. We're talking about how many things we have in progress compared to how many people we have in the organization or, in this case, a team. Because if we have too much WIP, then there's lots of context switching going on and nothing really gets finished. It just causes stress and burnout instead of us actually finishing things. As Dominica DeGrandis will say: stop starting, start finishing. On the right, we're showing how well we're doing around continuous integration, because this is something that we can actually gamify and show people: this is what CI looks like, and this is how you're doing on CI.
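The WIP load idea, items in progress compared to the number of people on the team, can be sketched in a few lines. This is not the Platform One radiator itself; the function names and the default limit of one item per person are assumptions chosen for illustration, and a team would set its own limit.

```python
def wip_load(items_in_progress: int, people: int) -> float:
    """Work items in progress per person on the team."""
    if people <= 0:
        raise ValueError("people must be positive")
    return items_in_progress / people


def wip_signal(items_in_progress: int, people: int,
               limit_per_person: float = 1.0) -> str:
    """Turn the ratio into an actionable message (the limit is an assumed default)."""
    load = wip_load(items_in_progress, people)
    if load > limit_per_person:
        return f"WIP load {load:.1f} per person: stop starting, start finishing"
    return f"WIP load {load:.1f} per person: within the team's limit"


# Made-up example: 14 items in progress for a team of 6.
team_load = wip_load(items_in_progress=14, people=6)
message = wip_signal(14, 6)
```

Pairing the number with a suggested action is the point: a radiator that only shows the ratio gives the team nothing to do with it.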


And this is what good looks like, because if we don't show what good looks like, no one has anything to compare it to, and there's no incentive to improve. But it's not good enough to just say, "This is what good looks like. This is where you are." That's just cruel. What we should do is provide tips for improvement that are down to earth. These are some good ways to break down work better, test better, and do evolutionary coding, all these things required for continuous integration, so the teams can self-improve, and so we don't have to talk to everybody in the organization directly about how to get that done.


We need to understand the relationship these metrics have. We've talked a lot about how there are more than these four metrics and how we need to look at all of them, but how, and for what reasons? Number one, we need to be focusing on business and customer value. Those are the most important metrics we have, and we need to identify what high value is and how we are going to measure it and track it. And we need to have goals. We need to have objectives with key results that tell us whether we're getting closer to our goals, and those objectives should actually be improving our business and customer value. Then we need flow metrics to verify that we don't have any constraints, or if we do have constraints, to figure out ways to clear them. Do we need to reorganize some of the teams in the organization?


Do we need to find ways to automate things that aren't automated today, or fix other sorts of constraints we have? Those flow metrics are really important. And of course, continuous delivery is the driver of everything that's going to make things better. These DORA metrics are key for measuring the health of our CD processes at the team level, but they are not goals. And all of that lies on the foundation of continuous integration. These are things that the team has absolute control of, and things that we can provide tips to improve at the team level, which will help improve everything else and help uncover issues with CD and flow and why we can't meet our OKRs. With all of these metrics, you need to understand how the data is being generated so that you can make the right decisions. The CI metrics are coming from the tools themselves; they're coming from people actually making changes to code.


And that data is very high fidelity. As you start moving up the stack, things get a little bit more fuzzy. Flow metrics are coming from people moving cards. My experience has been that the quality of that data is going to be tightly tied to how useful people find moving cards for communicating the work that's being done at the team level. So we have to find ways to make sure that they understand how to use this to help themselves, so that they will focus on doing it well. We should also be reporting data quality issues on this. And as you get up to the top, it gets much more abstract. Profit is pretty concrete, but when you're talking about value, it becomes more abstract, and you need to understand where the data is coming from so you can make the right decisions and not just look at a number and say, "Oh, this number means something very concrete," when it may not. You also need the right culture.


You need to make sure that we have trust, because a lack of trust within the organization is the first impediment to delivery. Along with that, we need a shared mission that people care about, because we can't just hand things off to teams and expect them to deliver good results if they don't care about the mission. So we need to understand that mission and focus on it. We also need to have a culture of learning and improving, so that we don't have a fear-based culture where failure is not an option. We need to make sure that failure is not something that's happening every day, but when we fail, we take the appropriate steps to learn from it and move forward without pointing fingers at people. We need to look at how the metrics stack up and understand that if I focus on improving flow but don't focus on improving continuous integration, I can only improve flow to the point at which CI becomes a constraint.


And with teams not focusing on CI, that can happen really, really quickly. We need to make sure that we're focusing on how to improve CI if we want to improve flow. But we also need to make sure that we're investing in the education of the teams, so the teams know how to do that, and not have our business depend on hobbyists learning in their spare time how to do continuous integration and hopefully finding the right information on how to get it done well. We really need to invest in providing good education to them and give them the time to practice and get better at this and all of the skills required. Metrics require balance. We need smaller batches because they uncover pain, but we also need to make sure that we are focusing on quality as a guardrail, so that the smaller batches aren't incentivizing people to skip quality steps. And we still need to make sure that we're getting feedback from the teams that they're working in a sustainable way, that we're not seeing a metrics improvement because they're working 16 hours a day to get it done. We need to be sustainably improving, because if we aren't, we can absolutely see a metrics improvement for a short period of time, but then people get burned out, and that was all fake.


It was all through heroics. We don't want heroics.
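The balance described above, smaller batches guarded by quality and sustainability, can be sketched as a simple comparison between two periods. Everything here is an assumption for illustration: the `PeriodSnapshot` fields, the self-reported hours signal, and the 9-hour threshold are hypothetical and are not taken from the talk's tooling.

```python
from dataclasses import dataclass


@dataclass
class PeriodSnapshot:
    deploys_per_week: float        # proxy for batch size: more deploys, smaller batches
    change_fail_percentage: float  # quality guardrail
    avg_hours_per_day: float       # sustainability signal, self-reported by the team


def guardrail_warnings(before: PeriodSnapshot, after: PeriodSnapshot,
                       max_hours: float = 9.0) -> list[str]:
    """Flag an apparent improvement that quality or sustainability does not support."""
    warnings = []
    if after.deploys_per_week > before.deploys_per_week:
        if after.change_fail_percentage > before.change_fail_percentage:
            warnings.append("deploy frequency rose but change fail percentage got worse")
        if after.avg_hours_per_day > max_hours:
            warnings.append("deploy frequency rose but the team is working unsustainable hours")
    return warnings


# Made-up example: the headline metric improved, but only through heroics.
before = PeriodSnapshot(deploys_per_week=2, change_fail_percentage=10, avg_hours_per_day=8)
after = PeriodSnapshot(deploys_per_week=10, change_fail_percentage=10, avg_hours_per_day=16)
warnings = guardrail_warnings(before, after)
```

A check like this only prompts a conversation with the team; it is not a substitute for asking them directly how the work feels.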


So, some closing thoughts. There aren't four DORA metrics. Yes, there are four outcome metrics from continuous delivery, and continuous delivery is incredibly important, but there are more than 14 metrics in Accelerate. I encourage everyone to actually read the book a few times, because there's a lot of subtlety in there. Also, product development is a very complex interaction of people, products, and process. There are no simple metrics. I can't give you one or two metrics to tell you how things are doing. You need to go and really understand what it is you're measuring. And measures require guardrails to prevent perverse incentives. Always ask: we're measuring this thing; now, is there anything else going wrong? What could go wrong? And track that as well. Metrics are a critical part of any improvement toolbox, but we can't measure our way to improvement. We can use them to monitor improvement and inform our next improvement experiment, but the numbers themselves won't get us there.


And don't measure people. Don't focus on individual productivity. The lowest level you should measure is the team level. If you go below that, it's going to cause destructive outcomes. I've seen it multiple times. Invest in our people, because they're our most valuable asset, and they are the ones who are going to cause us to have success. So, some resources that you can pick up. Of course, Accelerate. I highly recommend Sooner Safer Happier by Jon Smart. It's a really good book that talks about patterns and antipatterns for enterprise improvement. And then Gary Gruver's Engineering the Digital Transformation is an excellent nuts-and-bolts guide that does a really good job of translating from manufacturing to software. It talks about what's the same and what's different, and gives you really good ways to methodically improve your quality process. And also Dave Farley, one of the coauthors of Continuous Delivery, has been releasing a ton of really good content on YouTube.


And I created a short link here for his YouTube channel. I recommend everybody check it out and just spend a few days watching his videos. It's really entertaining. So, help that I need: how are you measuring value? I know very well how to measure waste, but trying to measure value in a way that can indicate, at the abstract level of the entire organization, the value we're delivering, well, it's not a simple problem. I'm really interested in this, so please reach out to me afterwards and let's have a conversation about how to do that better. And let's talk more about this. I love talking about improving. I'm very passionate about it, because I know what it does for organizations, but more importantly for the people doing the work. You can look at my Medium blog, 5 Minute DevOps, where I work on simplifying DevOps. Also, I'm Bryan Finster on LinkedIn, or just email me at Bryan dot Finster defense, and let's talk about how we might be able to help. Thanks very much.