OKRs & DevOps: From Micromanagement Misery to Finding Flow

Dr. Mik Kersten

Founder and CEO, Tasktop

Full transcript

The complete talk, organized by section.

Host Intro (Gene Kim)

Thank you, Admiral Richardson.

Okay. The next speaker is someone who should be familiar to most of us in this community. Dr. Mik Kersten wrote the amazing book, Project to Product, three years ago, and it summarizes his 20-year journey understanding how to truly unlock developer productivity.

And so over the years, he's taught us how to properly diagnose flow problems in the technology value stream so that technology can help best achieve our business goals, and how organizations can move from the world of project management to product management.

Throughout this past year, we've had numerous conversations about how so many organizations are having trouble rolling out OKRs, or objectives and key results, especially when top leaders overreach into the planning process, using OKRs to bludgeon their favorite project or feature into the plan. Dr. Kersten will talk about his eight years of experience using OKRs and what he's seen go right and wrong in large, complex organizations. Here's Mik.

Dr. Mik Kersten

Hello, everyone. I'm coming to you here from Vancouver and thrilled to be participating in the DevOps Enterprise Summit, my favorite conference of the year, both in London and in Las Vegas.

And the topic for today is OKRs and DevOps: how we can go from micromanagement misery to finding flow.

So I've now had eight years of experience managing my own company with OKRs. Over the past couple of years, and especially the past few months, DevOps has become a CEO, executive-suite, and boardroom topic, and the shift from project to product has really helped more organizations focus on technology and understand how to start innovating, how to start measuring and tracking for value. As that has happened, OKRs have become a really key topic: some organizations are using them very effectively to better understand how to connect technology and business and deliver more for their customers, whereas others are failing in what, to me, were very surprising ways.

So let me quickly talk to you about what I've learned within my company, supporting other organizations who are using this methodology, and then how you can apply some of these lessons to your own initiatives.

So first of all, objectives and key results are just one way of tracking plans and goals, and some of these concepts will apply to any other ways that you're measuring value. There are OGSMs and other approaches, but we're going to dig in specifically to these OKRs.

So the objectives are the what, and these inform actions, and they're really about finding a way of measuring how much value we're delivering for customers, the market, or colleagues. And the KRs, the key results, are the benchmarks for how we're doing that. So the point of OKRs is that you have around three to five of those. They're a self-focusing factor in how they cascade and how they spread across the organization.

And they're really summarized very well by John Doerr's book, Measure What Matters. It came out in 2018. I'd actually been applying them for years. It was very helpful for me to see this book. I had my whole leadership team read it.

But I found that with this book, there was a really significant gap, and I think this gap is where a lot of the deployments of OKRs fall into, and that's the how. How do we measure value for software delivery?

Because if we don't have a consistent and clear way of measuring value, then what happens is we start measuring the wrong things. If we measure the wrong things, we often get into these pitfalls of actually slowing down our teams, of inspecting at the wrong levels, of cascading far too low, mucking and micromanaging where we shouldn't be.

And so in Project to Product, I introduced the Flow Framework, which was really meant to capture a way of measuring flow and connecting that to the business results at the top-right part of the Flow Framework. We've been using flow metrics for many years now, for eight years now at Tasktop, for measuring key results related to flow, related to how we're delivering value for our customers, and then combining those with the business objectives around the financial metrics and all the other metrics in the organization.

So I'll show you how we can combine these things and have them work effectively and how, in the end, it really is all about making sure that you include flow metrics in your key results in order to connect business and technology.

So first of all, the question of why flow? I think in recent news, we've had really interesting examples of why flow is so important and how problematic bottlenecks can be. So we saw the entire shipping industry grind to a halt and our products all get delayed in terms of getting to consumers because the Suez Canal became a bottleneck.

And if we look at this, we realize that when we have this kind of bottleneck, the entire system slows down. There've been these amazing visualizations that show how it wasn't just the ships around the canal, but the global shipping system that ground to a halt. And yet we're seeing these kinds of things all over organizations, and the question is, are we actually seeing them and improving them or not?

So I think if we take, as leaders of teams, of organizations, of value streams, if we take our job as improving flow, really focusing in on that second ideal of focus, flow, and joy for all our staff, all our colleagues, and for what we deliver to our customers, we need to know where our bottlenecks are.

As Gene Kim channeled Goldratt in The Phoenix Project, any improvement made anywhere besides the bottleneck is actually an illusion. So if we're investing elsewhere other than the bottleneck, we're not helping flow. We're not helping that focus or that joy.

And the thing that made me so excited about OKRs is that I found them to be a really effective tool for speeding up flow, for changing organizational conditions, for investing in bottlenecks. But what I've also found fascinating is that a lot of my colleagues at large organizations who have been deploying OKRs have been finding that OKRs are actually slowing them down. And so this seems horribly ironic and something that we really need to fix.

So let's look at how these things have gone wrong. I've had a lot of conversations, studied a lot of organizations since they've been deploying these OKRs and having things go sideways. And so these are some of the key problematic things that I've seen.

First of all, OKRs are being used for micromanaging teams and value streams. So rather than setting these organizational directions and paths and really helping everyone navigate towards that common goal, they're actually being used to track specific features, muck with roadmaps, and reprioritize things for people. And I didn't really think they were meant for that, because, in the end, if that's what's happening, we're actually reintroducing waterfall planning into our DevOps transformations and our agile processes, which is, again, kind of antithetical to the goal of why we're all here.

Another really fascinating thing is using the business and financial metrics as the only key results. So what will often happen is that financial metrics such as a cost metric or a revenue metric could be quite divorced from what a team is doing. You might have a platform team creating the new cloud infrastructure, ramping up a whole lot of SREs, who might not map directly into a revenue metric, but are still going to help this organization succeed and eventually drive to faster revenues and more market leadership.

Another key pitfall that we'll go into is using only team or proxy metrics as the key results, that being all that we measure. So teams have a lot of detailed metrics. There are a lot of different KPIs that they need to use. We'll have proxy metrics that are not metrics of value or flow, but are very specific to teams' work that can really derail OKR efforts.

And the key thing is that these practices, using OKRs for something they weren't intended for and not measuring value, result in conditions that make OKRs work against you. If they really make you feel like things are slowing down, chances are they are slowing you and your teams down.

So let's look at what bad OKRs really come from and what they do. So again, if we look at a specific value stream, you can see this value stream has a pretty significant bottleneck in it. A lot of features and other flow items are trying to get through, but they're not getting through. And what will often happen is leadership will actually be using OKRs to micromanage what gets through, to muck with the roadmaps, to prioritize features, to ask, "Where's my feature?" As that's happening, and the OKRs are being used to manage deliverables, the bottleneck is not visible because everyone's just wanting more features more quickly.

And we're not seeing these dynamics of flow. And of course, if these OKRs are coming top-down, being pushed on the value streams, what's happening to those value streams is that their capacity is being overloaded: there's more work in progress, the flow load increases, and even less makes it to the customer, even less value is delivered.
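Editor's illustration: the effect of overloading a value stream can be sketched with Little's Law, which the talk does not name but which describes the same dynamic for a stable system: average flow time equals work in progress divided by throughput. The function name and all numbers below are hypothetical, chosen only to show that when the bottleneck fixes throughput, pushing in more work lengthens flow time rather than delivering more.

```python
# Little's Law for a stable system: average flow time = WIP / throughput.
# Every number here is hypothetical and only illustrates the effect.

def average_flow_time_days(flow_load: float, throughput_per_day: float) -> float:
    """Average days an item spends in the value stream, per Little's Law."""
    if throughput_per_day <= 0:
        raise ValueError("throughput must be positive")
    return flow_load / throughput_per_day


# Assume the bottleneck lets only 2 flow items through per day.
BOTTLENECK_THROUGHPUT = 2.0

for flow_load in (20, 40, 80):
    days = average_flow_time_days(flow_load, BOTTLENECK_THROUGHPUT)
    print(f"flow load {flow_load} items: average flow time {days:.0f} days")
```

In this simplified model throughput stays constant as load grows; the talk goes further and observes that overloaded value streams deliver even less.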

Contrast that with good OKRs, which should not touch any of what's being prioritized. We've got prioritization frameworks, agile frameworks for that. But instead, those OKRs focus on reducing that bottleneck, removing the impediments, and allowing value to flow more quickly, providing more focus and flow.

So these OKRs are actually about looking at how we get more value through and remove those wait states, those impediments, and surface those bottlenecks, because sometimes the bottlenecks will be within a value stream. There's a lack of some continuous integration, deployment automation, some overly slow security reviews. Sometimes they'll actually span value streams. So OKRs could surface those bottlenecks if multiple value streams have dependencies on, say, one platform or lack of APIs.

And of course, the key thing is that they can help balance with the features that we need to deliver. So OKRs can help balance roadmaps because they allow room for improvement, which the key results can drive, while of course roadmaps are about delivering more value to the customer more quickly.

So the key thing to understand is that we've got different planning dynamics that we need to keep separated. So we've got roadmaps and we've got plans, and those really define what gets delivered and in what order and how those things are prioritized. So that's analogous to the order of the container ships trying to get through that Suez Canal.

We have value-stream-specific OKRs, and this is where the value streams themselves, this team-of-teams construct, can accelerate the flow and can surface these bottlenecks, for example a shortage of staff in one area, a lack of automation, or some other process impediment. This is really the value streams themselves trying to widen that canal, and they need the autonomy to do so. No one understands better how much investment should be made in automation, tech debt reduction, and process improvement than the people working on that value stream themselves.

As you've heard Admiral Richardson say, we need to delegate. We need radical delegation to the value streams to set their own OKRs.

And then we do have the organization, the company-wide OKRs, the ones specific to a large line of business. And really there, the OKRs are an opportunity to create the conditions and organizational structures to enable flow. For example, if we've got a lack of investment in a common cloud delivery platform or automation on that front, these are an opportunity to accelerate that. And this is really analogous to building a new canal, to making sure that we're not reliant on a single canal and to really changing the conditions to enable faster flow across the entire organization.

So to understand how we do that, we really need to understand the different types of metrics that we're going to measure, because each of these OKRs needs to use different metrics. So the business metrics are generally well understood. Those are costs, those are pipeline conversions, those are retention rates, those are profitability numbers. But usually they're a lagging indicator of technology work. If we deliver these features, those tend not to turn into revenue overnight.

The flow metrics help tremendously here because they track the improvement, they track our ability to deliver more value, to delight our users with even more of the features that they're looking for. And so they're a leading indicator of value delivery that then turns into a business metric.

And very importantly, team metrics, the telemetry for teams, need to actually be specific to the teams and not conflated with the OKRs. Teams need a lot of telemetry to do their own decision-making. That decision-making does not need to percolate up through organization levels. It needs to allow the teams to move very quickly, respond very quickly to incidents, to outages, and to other needs, and again be a completely separate set of telemetry rather than the ones that support organizational improvement, such as the business-level OKRs.

So let's just quickly go through an example. And this example is an insurance organization that's deployed OKRs. We're actually using some real data sets here and some example OKRs, where what they want to do is become the most innovative insurer in their industry. This is great because the objective actually links to their core values around innovation for their customers.

They want to see that turn into 30% market growth and a 50% reduction in the time that it takes to provision an insurance policy. That's really important because that key result cascades down to how prioritization's done. If a roadmap item is going to make it easier to move customers through faster, to remove calls or places where things are being dropped, we'll know that we're on track for this OKR.
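Editor's illustration: one way to picture this objective and its key results is as a small data structure. The class names, the `progress` calculation, and every `current` value below are hypothetical assumptions for the sketch; only the objective wording and the 30% and 50% targets come from the talk.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class KeyResult:
    description: str
    baseline: float  # value when the OKR was set
    target: float    # value we want to reach
    current: float   # latest measurement (hypothetical in this sketch)

    def progress(self) -> float:
        """Fraction of the way from baseline to target, clamped to [0, 1]."""
        span = self.target - self.baseline
        if span == 0:
            return 1.0
        return max(0.0, min(1.0, (self.current - self.baseline) / span))


@dataclass
class Objective:
    title: str
    key_results: List[KeyResult] = field(default_factory=list)

    def progress(self) -> float:
        """Unweighted average of key-result progress."""
        if not self.key_results:
            return 0.0
        return sum(kr.progress() for kr in self.key_results) / len(self.key_results)


objective = Objective(
    title="Become the most innovative insurer in the industry",
    key_results=[
        # Market growth in percent: 0 -> 30, currently 15 (hypothetical).
        KeyResult("30% market growth", baseline=0, target=30, current=15),
        # Provisioning time as a percentage of the original time: 100 -> 50.
        KeyResult("50% faster policy provisioning", baseline=100, target=50, current=75),
    ],
)

print(f"{objective.title}: {objective.progress():.0%} of the way to target")
```

Because progress is measured from baseline to target, the same formula handles key results that should go up (market growth) and ones that should go down (provisioning time).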

And then the key thing is these two key results are balanced with a flow efficiency improvement. So flow efficiency is the ratio of time spent in work states to total flow time, work states plus wait states. This actually helps fuel that third ideal from The Unicorn Project around improvement of daily work because it allows everyone to take time in support of this OKR to actually improve how quickly they can get value to customers and drive that efficiency.
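Editor's illustration: in the Flow Framework, flow efficiency is computed as active (work-state) time divided by total flow time, where total flow time is active time plus wait time. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
def flow_efficiency(active_days: float, wait_days: float) -> float:
    """Share of total flow time spent actively working (0.0 to 1.0)."""
    if active_days < 0 or wait_days < 0:
        raise ValueError("durations must be non-negative")
    total_days = active_days + wait_days
    if total_days == 0:
        raise ValueError("total flow time must be positive")
    return active_days / total_days


# Hypothetical feature: 6 days of active work, 24 days spent waiting.
efficiency = flow_efficiency(active_days=6, wait_days=24)
print(f"flow efficiency: {efficiency:.0%}")
```

A low value means most of the elapsed time is waiting, which is where the talk suggests looking for the bottleneck.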

So let's take a look at how the platform value stream, which really supports all of these new applications, all of this new innovation, actually interpreted those OKRs and turned them into some very significant change for the organization.

So what had happened is, in terms of moving fast and wanting to grab more market share, there was a whole lot of technical debt incurred in the shift to cloud and the shift of all of the different customer views, the policies, and all of that infrastructure. As a result, the costs had ballooned. So there was some lifting and shifting, more than anyone wanted to admit, that was going on, and it was clear that it was no longer scalable to keep building these applications in this way.

The team themselves realized what had happened: they were not making efficient use of storage services. They realized that there were much better ways if they could just focus in on technical debt. So they actually set their own OKR, saying, "If we focus in on technical debt, we can dramatically change this hosting profile, have finance stop worrying about this, and have everyone realize we really can scale to 30% market growth, as is our goal." They did that over the course of three months of investment in technical debt. They dramatically reduced this hosting cost bubble: they reduced hosting costs by 75%, and they proved that this can be done and that the entire portfolio can actually be hosted in this modern way.

Now let's take a look at the actual customer-facing application. So this is the policy value stream, where really the focus was making products that these customers love. There's a lot of competition from insurtech organizations, and so a 20% improvement in NPS was really a key part of hitting that objective. That key result was set by the value stream itself, the one producing all of the mobile and web experiences.

And to do that, it really was about delivering those features that delighted customers more: better authentication experiences, making sure that they can make it through the entire provisioning process quickly and with some joy.

So thankfully, the organization already had a flow efficiency experiment underway, so they knew where to target the process improvements. It turned out that the bottleneck, the only one at the time, was in verification. The business was needing to verify all those features making it to customers. They already had good automation of feature delivery in place, but all these wait states came from waiting on the business and all the meetings that were happening.

So the team targeted and was very public in their own OKRs around this: zero days wait state on business input for shipping those features. Once that happened, flow time was cut dramatically. We can see it go down from almost 30 days to averaging under 10 days. New features were being delivered to customers, mobile and web, in under 10 days for this organization.
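Editor's illustration: flow time is the elapsed calendar time from when work on an item starts to when it reaches the customer, wait states included. The sketch below averages it over two sets of features; the dates are invented to resemble the "almost 30 days" and "under 10 days" figures in the talk and are not the organization's data.

```python
from datetime import date
from statistics import mean


def flow_time_days(started: date, delivered: date) -> int:
    """Elapsed calendar days from start of work to delivery, waits included."""
    return (delivered - started).days


# Hypothetical (started, delivered) pairs before the wait state was removed.
features_before = [
    (date(2021, 1, 4), date(2021, 2, 2)),
    (date(2021, 1, 11), date(2021, 2, 8)),
    (date(2021, 1, 18), date(2021, 2, 17)),
]

# Hypothetical pairs after the zero-day wait target on business input.
features_after = [
    (date(2021, 4, 5), date(2021, 4, 13)),
    (date(2021, 4, 12), date(2021, 4, 21)),
    (date(2021, 4, 19), date(2021, 4, 26)),
]

mean_before = mean(flow_time_days(s, d) for s, d in features_before)
mean_after = mean(flow_time_days(s, d) for s, d in features_after)

print(f"average flow time before: {mean_before} days")
print(f"average flow time after:  {mean_after} days")
```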

From that flow time reduction, we actually did see the net promoter score, how happy the customers were with the applications, rise. It rose as a lagging indicator, of course, because it took time for customers to start adopting and using those features. And this really helped the company on that objective of cutting the time to provision policies in half, because they had happy, engaged customers who were getting to successful policies.

So the whole point here is, as you're planning your OKRs, as you're looking at rolling them out, make sure that you're measuring the flow of value and the flow of value by that fifth and most important ideal of customer centricity. It really is all around what you're delivering to the customer.

To do that, use flow metrics as the key results to remove those impediments to the teams because fundamentally the teams know how to deliver that value. Do some radical empowerment of the value streams to set their own KRs and allow them to deliver on those business objectives, remove the bottlenecks, and signal to the organization as a whole if there are these systemic bottlenecks slowing them down, such as missing platform API capabilities or other infrastructure.

Let the teams use their own metrics. Let the value streams use their own roadmaps, and decouple these things completely. OKRs are really for that organizational learning and improvement. They are the North Star, and they should be kept separate from team metrics and roadmaps.

In the end, our organizations live and die by the agility, the quality, and the speed at which we deliver software, and OKRs are a great tool for accelerating that.

So in terms of help that I'm looking for, I would love to hear more about your experiences in terms of what's gone wrong with your OKRs. I gave some examples of things that I've learned. Again, I think they're an extremely effective tool. If you're using them successfully, please share this out on the DevOps Enterprise Summit Slack, and I look forward to learning more and reporting on the next set of experiences on how to accelerate flow through OKRs.