Las Vegas 2020

The CIO Who Mistook Tech Debt for a Hat (and other Flow Diagnostics)

Over the past year, Dr. Kersten and his Flow Advisory team have been collecting value stream data sets from enterprise IT organizations undergoing digital transformation treatments. They used the Flow Metrics defined in Project to Product to trace the path that hundreds of thousands of software artifacts travel from inception to running software. As they analyzed and correlated Flow Metrics to business results, some fascinating clinical tales emerged from these unique new data sets. The team discussed the findings with the patients to gain a more personal perspective on the outcome of the treatment plans.


In this talk, Dr. Kersten will take us through the most common, the most problematic, and the most bizarre flow diagnoses that he has encountered. By holistically considering the data as well as the human and organizational aspects of the diagnosis, he will follow the approach of the neurological case histories recounted by Oliver Sacks in The Man Who Mistook His Wife for a Hat. For each flow diagnosis, Dr. Kersten will recount misdiagnoses that were previously applied, followed by a detailed case history, summary of the remedies, and the often-surprising results. From common maladies to rare pathologies, each of the stories offers powerful lessons to help us understand the biggest impediments to achieving DevOps at scale.

Dr. Mik Kersten

Founder and CEO, Tasktop

Transcript

00:00:07

The next speaker is someone who you should all be familiar with within this community. Dr. Mik Kersten wrote the awesome book Project to Product two years ago, and it summarized his 20-year journey to understand how to make developers truly productive. I think this is such an important book because it helps define the common language between business and technology that we need to be able to compete and win in the marketplace. His Flow Framework defines the four types of work that software engineers work on: features, defects, debts, and risks. But his book also defines the Flow Metrics, which I must admit I love, but I'm not sure I could actually explain them to someone else. And this is despite knowing how powerful a diagnostic tool they are, because I've seen Mik use them to reach astounding conclusions about technology dysfunctions with a precision and ease that I've always admired, which is why I've asked him to give this presentation. I love it because I can now finally understand the cause and effect of certain symptomologies that I've seen in my career for decades, and can now better follow along and see how and why Mik makes his diagnoses and the resulting treatment plans. I trust you will find it as dazzling as I do. Here's Mik.

00:01:29

Hello everyone. I'm Dr. Mik Kersten. I need to admit that I am not a real doctor. My PhD is in computer science, in software architecture and flow, but I do promise to make some very specific prescriptions for you today. I'm happy to be joining you from Vancouver, Canada. This is my favorite conference of the year. And to me, the most important part of it is actually these experience reports, the case histories that we learn from as organizations share their transformations, their journeys through DevOps over the years. So for me, that learning accelerated about two years ago, when at this very conference I released Project to Product with IT Revolution Press. Ever since that time, I've been helping executives understand value in software delivery and spot some very common pathologies and dysfunctions, which I'll be sharing with you today. All of this data is grounded in some very detailed clinical study that was done by our flow team, including Carmen DeArdo, Dominica DeGrandis,

00:02:18

and Naomi Lurie, who analyzed and dissected all of these different data sets across many enterprise organizations. So I'm going to share with you some of the most interesting of these very high-stakes transformations, where the futures of companies and jobs were on the line, but where technology was a bottleneck and where transformation was moving too slowly. For me, really, some of the inspiration behind this was this book by Oliver Sacks, The Man Who Mistook His Wife for a Hat. The headline story of this book was Dr. P. He had a problem with his vision. He went to the eye doctor because he was having trouble recognizing faces. Now, he was getting increasingly embarrassed about this. Oliver Sacks visited him and noticed that there was actually a deeper problem than just his eyesight, as was originally thought. He actually tried to pick up his wife's head and put it on as a hat, which Sacks obviously noticed was kind of a profoundly wrong thing to do.

00:03:06

Next, he brought a rose to Dr. P. He went to his home, handed the rose to Dr. P, and said, what are you holding? Dr. P said, it's a convoluted red form with a linear green attachment. And to me, this was fascinating, because this is exactly how I've seen people describe technology problems, with a similar perspective that seems so foreign to me as someone who has spent a lot of time on these technology problems: basically seeing parts, but never seeing the whole. So you can only imagine the kind of treatments that would have happened without the understanding that Sacks built up. Sacks realized that this was not an eyesight problem. This was a visual agnosia. He encouraged Dr. P to use his other senses. The moment Dr. P smelled the rose, he instantly realized that he was holding a rose, and he was delighted by the rose.

00:03:49

The moment he saw his wife moving and talking, he realized she was his wife. Yet for so many of the executives I've worked with, who've never had a chance to code, never honed those senses of what it's like to be solving technical problems way down at those lower levels, I've noticed they have this inability to see technical debt for what it is, but they badly want to. So the question to me was, are there other senses that organizations, that leadership has, that we can use to make tech debt visible to those who cannot see it directly? And here's my quick story around that. So this is the story of the CIO who mistook tech debt for a hat. This is a highly capable technology executive that I've been working with closely. They're inspirational to me; I've been learning a lot from these interactions.

00:04:27

And interestingly, they took some of the concepts from the enterprise DevOps community and the project to product approach very seriously, and they started making a massive investment in technical debt. Now, the interesting thing, and the first red warning around this patient for me, was that they were investing over 50% of all teams' capacity, dozens of teams' capacity, in tech debt reduction because of the problems that they were seeing in their platforms and their delivery. And I thought that was actually quite high. This was also putting a lot of pressure on development teams. Feature capacity of course becomes dramatically reduced because of this balance that you need to have in the four flow items, features, defects, risks, and debts, that comes from the Flow Framework. And the assumption was that there was this huge bottleneck that was purely located in development. Yet when we took a different lens on this, when we actually used our flow diagnostics approaches and tools, we saw a very different picture when we analyzed flow times.

00:05:21

So, how long it took to get to value. We noticed that there were all sorts of things happening on the business side, because this technical debt investment was actually causing all of these scope changes to be made all the time. And those scope changes had been happening for years. And those scope changes were actually the source of that technical debt. So while they were working on it, there continued to be work being fast-tracked and made urgent, and it was a never-ending battle. There was no way to take down this bottleneck in the middle, because that was not the cause of the bottleneck. It was those scope changes and the interactions with customers and the business, and how these platforms were basically anchored to old sales models and customizations of the software. There was also another downstream bottleneck on the operations side. So those features and the tech debt improvements were actually never making it to customers.

00:06:05

So of course, over-focusing on the middle was the wrong thing to do. This organization needed to focus on both the downstream problem with the outsourcing and infrastructure model that they had on that side and, of course, change to investing in the platform rather than all these bespoke software deliveries, which was actually the entire goal of this digital transformation. So the effort was going into the wrong place, in the wrong amount, and was all being placed on dev, where the problem wasn't. So the key thing here is that this executive started to see the real picture, that software is not static. They started seeing the dynamics. They started being able to visualize this flow and understand where to invest. And that was really done by helping them use their other senses of business results. Because when we focused the conversation just on time to market, it was obvious that some business processes and outsourcing relationships had to change.

00:06:51

And so the key thing that we realized through this is that measuring flow actually opens the door to these kinds of clinical diagnoses of digital transformations, and to some very specific actions on how to treat them. And these treatments need to be guided by measurement. Now, we know how to do measurement in terms of the human body: we've got vital signs like temperature and heart rate and blood pressure. So what the Project to Product book introduces is using these Flow Metrics, this small set of Flow Metrics, for both business leaders and technologists to actually measure the health of our value streams. So the first one is flow velocity: how much work is being done in each unit of time. But done here means "done done": the work actually got to the customer, and all that matters is the customer's perspective of what value they received through the value stream.

00:07:36

The next one is flow efficiency. So, how much of the time is work in active work states rather than waiting states? When flow efficiency is high, teams are getting a ton of work done. They're very productive. They're very happy, and the customers are receiving a ton of value, because of course all these Flow Metrics are completely end-to-end, all the way from work intake, from a feature request or business strategy, to running software. But we've seen common pathologies where flow efficiency is actually heading down toward zero, below 30% and down, as you'll see in an upcoming diagnostic, where nothing's actually making it to the customer and efficiency is getting worse and worse. Flow time: and this is a key thing, the time is measured completely end-to-end, in just calendar days, all the way from work intake to when the customer received the software. We've noticed that great organizations have flow times that can be measured in days.

00:08:22

Whereas organizations with these dysfunctions have flow times that are measured in months, and so can't learn fast enough to adapt to the rapidly changing economic landscape and pace of business today. Flow load: this is how much work is currently being actively worked on in your value stream. So, how much work has started but has not yet been completed and has not yet been delivered to the customer. We know that when flow load goes too high, productivity and flow efficiency go out the door, and nothing gets done. And finally, flow distribution. So you heard Scott Prugh mention what a great win it was to get some of his value streams to go from 15% to 55% feature delivery in terms of their flow distribution, where they're always balancing features, defects, risks, and debts, the four flow items in the Flow Framework.

00:09:05

That's exactly what we want to see. At Tasktop, we always target over 50% feature delivery. And the question is how we get there. And also, ironically, one of the pathologies we see is an over-focus on features, starving the other flow items and creating these death spirals. So the point here is to give you the diagnostic tools to measure these things for yourself. The Flow Framework provides those. The Flow Framework has a Creative Commons license. You can grab it, get it from the book. And right now I'm going to show you some of the key learnings that we've seen over the past couple of years, which have been really unearthed and solved through these measurements and metrics. So the first one is the tech debt death spiral. Keep in mind, these are all enterprise patients. We're going to look at the first patient's chart, and this is all real data from live value streams that really span delivery teams, from a customer perspective.
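
As a rough illustration of the five Flow Metrics just described, here is a minimal Python sketch. It is not the Flow Framework's reference implementation. The `FlowItem` record, its field names, and the choice to count velocity and distribution by delivery date are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional

# The four flow items named in the talk.
FLOW_ITEM_KINDS = ("feature", "defect", "risk", "debt")


@dataclass
class FlowItem:
    """Hypothetical record for one unit of work in a value stream."""
    kind: str                  # one of FLOW_ITEM_KINDS
    started: date              # work intake (assumed start of flow time)
    delivered: Optional[date]  # when the customer received it; None if not yet
    active_days: float         # days spent in active (non-waiting) states


def flow_time_days(item: FlowItem) -> Optional[int]:
    """Calendar days from intake to customer delivery, or None if undelivered."""
    if item.delivered is None:
        return None
    return (item.delivered - item.started).days


def flow_efficiency(item: FlowItem) -> Optional[float]:
    """Share of flow time spent in active work; None if undefined."""
    total = flow_time_days(item)
    if not total:  # undelivered, or delivered on the intake day
        return None
    return item.active_days / total


def delivered_between(items: List[FlowItem], start: date, end: date) -> List[FlowItem]:
    """Items the customer received within [start, end]."""
    return [i for i in items if i.delivered is not None and start <= i.delivered <= end]


def flow_velocity(items: List[FlowItem], start: date, end: date) -> int:
    """Number of items delivered to the customer in the period."""
    return len(delivered_between(items, start, end))


def flow_load(items: List[FlowItem], on: date) -> int:
    """Items started but not yet delivered as of the given day."""
    return sum(
        1 for i in items
        if i.started <= on and (i.delivered is None or i.delivered > on)
    )


def flow_distribution(items: List[FlowItem], start: date, end: date) -> Dict[str, float]:
    """Fraction of delivered items per flow item kind in the period."""
    done = delivered_between(items, start, end)
    if not done:
        return {kind: 0.0 for kind in FLOW_ITEM_KINDS}
    return {
        kind: sum(1 for i in done if i.kind == kind) / len(done)
        for kind in FLOW_ITEM_KINDS
    }
```

The detail that matters, and the one the talk stresses, is that `delivered` marks when the customer received the work and not when a team closed its ticket.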

00:09:52

So let's take a look at this. What we see here is that this patient is a financial services company. And the interesting thing is they've got a very mature agile rollout and CI/CD pipeline, and some actually exceptional enterprise DevOps practices. However, the business is still complaining that feature delivery is painfully slow, and that there's a lack of innovation, that they're being out-innovated by others in the FinTech sector. So let's take a look at their chart. And what we see in the top right, if we squint there, is that there's not enough green. Green is the features in the flow distribution, how many features are being done. So that's the net new business value being delivered for customers. And there's not enough. Even with those large investments and the successes they've had, they can't budge it. They can't really move the needle on net new feature delivery to customers.

00:10:41

So now let's actually take a look at their flow load chart. This is the work in progress. And remember, when flow load is higher, when WIP gets too high, everything slows down. Lower is better. Meanwhile, these backlogs are just growing and growing. The light green is the size of the backlogs, the dark green is how much is being actively worked on, and they'll never catch up this way. So no wonder the business is unhappy about this. Let's actually look at the cause of that. So let's analyze the flow within this organization's value stream and look at what was going on. Let's get deeper into this diagnostic. What we see here is that where everything waits, where you've got 73 different user stories stuck, is on something called core backend services. So all of these things are trying to deliver to customers, and this is a banking platform.
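
The kind of "where does everything wait" analysis described here can be sketched as a simple tally of waiting work items by the component they are blocked on. This is only an illustration. The dictionary keys (`state`, `blocked_on`) and the component names in the usage note are hypothetical, not fields from any particular tool.

```python
from collections import Counter
from typing import Dict, List, Tuple


def waiting_hotspots(stories: List[Dict[str, str]]) -> List[Tuple[str, int]]:
    """Count waiting stories by the component they are blocked on, most blocked first."""
    blocked = Counter(
        story["blocked_on"]
        for story in stories
        if story.get("state") == "waiting" and story.get("blocked_on")
    )
    return blocked.most_common()
```

Called on a list of stories where most waiting items name the same dependency, the first entry of the result is that dependency and its count, which is the shape of the "73 stories stuck on core backend services" finding.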

00:11:26

So it's got mobile applications, web applications. All of this innovation, the new user experiences, keep getting stuck on core backend services, that painful legacy constraint. And it actually turns out that it's a monolith. And the technology teams have known this for years. They've actually understood that if they don't invest in breaking apart the monolith, strangling the monolith and turning it incrementally into microservices, and moving the business applications to those microservices, they'll never improve. But finally, what happened is the executives saw the right picture. They saw the same thing that the technology teams had been seeing for over two years. And they finally had the data to help them make the investment that, frankly, should have been made two years ago. They've got it now, and they've been able to improve. So let's look at this tech debt death spiral, where delivery is basically slowing from a crawl to a standstill, even while all these agile teams are actually improving their practices. The symptoms that we can measure,

00:12:19

and these are actually quite easy to measure, are that flow time keeps increasing. So basically the time to value, how long it takes to get new features to customers, is just getting longer. Flow velocity is decreasing, so less and less is getting to customers. Technical debt investment is not visible. So is it not being worked on, or is it being worked on with heroics? What's happening is, when it's not visible, we know there's a problem, because it's not being managed. Teams are not taking pride in it. And defect rates and incident rates are just increasing. They're taking more and more of the flow velocity. There are very real and very problematic business symptoms from this. So time to market is unacceptably long, innovation is too slow, the cost of delay is massive, and all of this, of course, because features aren't making it to market.

00:13:00

They aren't helping the business. Team happiness is decreasing because those teams don't get to work on features. They keep being blocked on, in this case, this massive architecture problem, and new hire onboarding is slowing because it's so hard to ramp up on this kind of system. So, mistreatments. This organization had two years of mistreatment: unsustainable work and heroics, and adding dev headcount to those business applications. Adding developers there is the exact wrong thing to do. You actually need to flow that talent to the platforms, and stop investing the top players elsewhere while putting the lowest-cost resources on the platform. In this case, the organization was putting all these contractors on the back end while putting the top talent on the business applications. So completely, completely the wrong treatment plan. The actual treatment plan that worked here was, first, make all tech debt work visible. Take pride in it, showcase it to the executives, show that you're actually tackling this tech debt, but measure it in a way that's meaningful to the organization.

00:13:56

So measure it with flow time. Basically, the only reason to invest in tech debt is to reduce feature flow time, which just means improving time to market, getting things to market faster. The business will see that very quickly. All of a sudden, executives will understand the tech debt investment. At Tasktop, we've now gone to the point where, when we're doing our release planning, we actually don't make tech debt investments that don't have a feature flow payoff that's six months or shorter. That's how aggressive we've become, because we realize that when you've got that faster feedback loop, you do much more of the right things faster than with these big bang, boil-the-ocean approaches. Other organizations I know, of course, are using the strangler pattern. And when, let's say, business applications have direct access to a database, get that service stood up, measure how many things are using that service, and then measure how that changes flow time.

00:14:42

All of a sudden, you'll have your business case to shut down direct access to the DB and move to services for that. The key thing is that you're giving the organization a way of measuring those benefits, and that you have these frequent regular checkups, every sprint, every month, every quarterly business planning, where you're measuring the flow time improvement from your investments. You're seeing lower defect rates and a better flow distribution, which means more investment is actually going into value delivery for the customer. And you'll actually see happiness increase across teams as they're able to get more features out the door faster. Of course, the Unicorn Project ideal here is locality and simplicity. That was the bottleneck here. There was this painful legacy constraint. Once there was an investment in locality and simplicity and in extracting these things into services, everything was much faster. And the quote from the patient here, this is the executive responsible, is: your bottleneck is staring you in the face and waving at you, to add insult to injury,

00:15:34

once you start looking at your flow metrics. But of course, after that initial shock, you then know where to invest, and you start moving faster and faster. The next flow diagnostic is neglected work in progress. And this, I have to say, across all of the patients that we've studied, is the most common value stream health problem that we see. It's also the one with the easiest remedy, but for many executives, for many organizations, it's a very counterintuitive remedy. So let's take a closer look. This patient is a healthcare company, a very large public healthcare company, somewhat ironically. They're rolling out the Scaled Agile Framework. And the interesting thing is they've already been championing tech debt work. They're doing a lot of technical debt work already. However, the thing I keep hearing from executive meeting to executive meeting is that IT can't keep up with the business.

00:16:24

So, a common pathology that we hear. Let's look at why this is happening. And what we see is the flow load. Remember, the flow load is the work in progress. The more of that we have, the more work the teams have, and the more context switching there is. And there's a mismatch here between the flow load and the flow velocity. That means this team is taking on more and more and getting less and less done. As they take on even more, they'll get even less done, and everything is about to get deadlocked. We can actually measure this. Using queuing theory and Little's law, it becomes very stark that basically the productivity here is trending to zero. We can even see it in the flow efficiency chart of this particular patient, where they're currently at 30% flow efficiency, but because we see the flow load rising, they're actually going to trend down to zero.
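
To make the Little's law argument concrete, here is a small hedged sketch. Little's law relates long-run averages in a stable system: average items in the system equals average completion rate times average time in the system. Applying it per period to flow load and flow velocity, as below, is a simplifying assumption of this example, and the numbers in the usage note are invented.

```python
def expected_flow_time(flow_load: float, flow_velocity: float) -> float:
    """Little's law estimate: average flow time = flow load / flow velocity.

    flow_load is the number of items in progress; flow_velocity is the number
    of items delivered per period. The result is in the same period units.
    """
    if flow_velocity <= 0:
        # Nothing is being delivered, so work waits indefinitely.
        return float("inf")
    return flow_load / flow_velocity
```

For example, with 40 items in progress and 20 delivered per sprint the estimate is 2 sprints. With 90 in progress and 10 delivered per sprint it is 9 sprints. Rising load combined with falling velocity compounds quickly, which is the deadlock trend described above.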

00:17:06

So they're taking on more work than they can ever deliver. And the very difficult thing for everyone, including for these teams, is that any promise that they now make is unlikely to be met. So how do we fix this? How do we get rid of what Dominica DeGrandis calls one of the main time thieves, too much work in progress, and actually remedy this problem? And the answer is not by taking more work on, right? The symptoms are obvious: flow load is going up and up, backlogs are rising, thrashing is increasing, and you can measure that flow efficiency is just starting to drop like a stone. The business symptoms I think we all know: feature work is waiting indefinitely, you start to lose credibility with customers, and unplanned work is being chronically fast-tracked, which of course is then just adding more flow load. Add the scope changes, and you're pouring fuel on the fire of this pathology.

00:17:52

Mistreatments: push even more work to the teams, get more heroics going, suggest that teams get better at multitasking and take more on, which is exactly the wrong thing to do. The Goal, by Goldratt, should have taught us that we can't overload teams in this way. You actually get less out, not more. The treatment plan is: stop starting, start finishing. Give the teams a chance to catch up on their work. Reduce the flow load. Make sure everyone's in the same boat and understands that when you reduce flow load, you'll get more done. Both the business side and the technology teams need to understand this. You can give them a quarter or two if you need to. We've had to do that in the past when we've gotten into a corner on this. But reducing the flow load will actually make sure that you get more done. As you do that, of course, make sure you find the right work in progress limit, the right flow load limit for your team.

00:18:35

It'll be different for different value streams, and you have to find it. When it goes too high, you'll see efficiency decrease. And our goal, of course, is to be able to do more. So we actually want to increase the team's ability to deal with a higher flow load. To do that, you have to find where the constraint is, as you'll see from the next story, and then invest in that constraint. And you'll actually be able to make sure that the value stream is able to effectively take on more work. So for the regular checkups, you should see improvement in flow velocity, and it will be very quick. You'll actually see this very counterintuitive thing, which will help the executives see using their other senses: when flow load went down, velocity went up, they were able to deliver more to customers, and the teams were a little more efficient.
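
One simple way to look for a value stream's flow load limit from its own history is sketched below. It assumes you have a list of (flow load, flow velocity) observations per sprint or month, groups them into load buckets, and reports the bucket with the highest average velocity. The bucketing heuristic and the function name are assumptions of this example, not a method prescribed in the talk.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple


def suggest_flow_load_range(
    history: List[Tuple[int, float]], bucket_size: int = 10
) -> Tuple[int, int]:
    """Return the (low, high) flow load range with the best average velocity.

    history holds one (flow_load, flow_velocity) pair per period.
    """
    buckets: Dict[int, List[float]] = defaultdict(list)
    for load, velocity in history:
        # Group observations by load, e.g. 20-29 when bucket_size is 10.
        buckets[(load // bucket_size) * bucket_size].append(velocity)
    best = max(buckets, key=lambda low: mean(buckets[low]))
    return best, best + bucket_size - 1
```

This treats the range as a starting point for an experiment, to be rechecked at each regular checkup. It is not a fixed rule, since the right limit differs per value stream and should rise as constraints are removed.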

00:19:14

And the key business thing is you've now got better predictability, because you're able to know that work will actually get finished. There are two key Unicorn Project ideals here to target all of this improvement. The main one is the improvement of daily work. It's not enough just to focus on tech debt. You have to allocate time across your teams, across your value streams, to improvement of daily work, and find these process problems, because they're not found in dev alone. Oftentimes this will actually come from pathologies that span the development and the business side and the way that work is coming into value streams. However, once you do this, you'll actually get to this point of focus, flow, and joy from all the productivity benefits of having teams focused on delivery and being able to deliver more, rather than constantly being stuck, overloaded, and in wait states.

00:19:57

So the quote here from the patient was: flow metrics exposed that our backlog was growing; we now see the dynamics of how to manage the balance between backlog and other work. And what you see here is that once this is visible, you're able to actually manage the WIP rather than having it manage you. Now the last one, and to me at least a more interesting and more subtle flow diagnostic that, again, we're seeing across enterprise patients, is workflow obscurity. So this particular patient is a health insurance company, a very large health insurance company, and they have a mature agile deployment. And they've actually been quite effective in the shift from project to product. But just a little spoiler alert: I didn't say they're amazing in their DevOps practices just yet. So the ailment they see is that dev is moving fast.

00:20:44

People are very pleased with that, but the business is just not seeing the results that they expected to see from this. Now let's look at what's going on here. And if we look at this patient's charts, we see the flow load. The first time I saw this, it was just very stark to me, because this is about six different agile teams and the flow load is over a thousand. So it looks from this like they're working on a thousand features, which can be many user stories each, given the way that the work items are mapped into the four flow items. They're working on over a thousand large features concurrently. So this already seems like a red light. This patient's got some malady, but let's look at what's happening with that malady. Let's actually analyze the load and see how that work is flowing.

00:21:27

And what we see raises some suspicions about what done means here. Is done "done" when the development team is done, as it looks like here, or is it "done done", done when the customer receives the value? If we now change the mapping of done, if we update this flow state to be customer-centric, not team-centric, we see a completely different picture. And what we see here is that there was a completely false sense of what was going on. All of a sudden, instantly, when we changed the model, the flow load went down from over a thousand to 260. What's actually happening is that the dev teams are extremely effective at getting work done. They've managed their work in progress. They've managed their technical debt. They've got a great flow distribution that we see here in terms of what they're taking on. But all of that work is queuing up behind some serious infrastructure problems and a lack of DevOps automation that should have been done, that was thought to be done, but never was. The dev team is not the bottleneck.
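
The gap between a team-centric and a customer-centric definition of done can be illustrated by measuring the same work item to both endpoints. This is a minimal sketch with hypothetical field names (`started`, `dev_done`, `customer_received`). It does not reproduce the state mapping used in the case described here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple


@dataclass
class TrackedItem:
    """Hypothetical work item with both 'done' timestamps recorded."""
    started: date
    dev_done: date           # when the development team marked it done
    customer_received: date  # when it was actually running for customers


def done_gap(item: TrackedItem) -> Tuple[int, int, int]:
    """Return (days to dev-done, days to customer-done, days spent downstream of dev)."""
    to_dev_done = (item.dev_done - item.started).days
    to_customer = (item.customer_received - item.started).days
    return to_dev_done, to_customer, to_customer - to_dev_done
```

An item that dev finishes in a week but that reaches customers two months after intake shows a small first number and a large third one, which points at the handoffs downstream of development as the place to look.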

00:22:23

The bottleneck here is downstream of the dev team. And of course the dev team was being blamed for all of this. We saw this through the flow symptoms, where flow load was artificially high, which made no sense, and the business results were not improving. The business symptoms, again, were quite severe. Customers were perceiving a lack of innovation, even though it seemed like the work was being done, and the bottleneck was not visible. It was not understood how big a problem the lack of automation and infrastructure maturity was at this organization. And then there was this really interesting, subtle twist, where the psychological safety to make that work visible was not quite present in the organization. So the dev teams were not exposing the fact that their done state was not matching a done state from the customer's or the business's point of view.

00:23:04

So of course the mistreatment is just to keep doing this: focus on dev being done, focus on the infrastructure and operations team getting faster. And that's just the wrong thing to do, because the problem here was a big handoff. So if you now shift from a team focus, a silo focus, to a customer focus, and you measure that end-to-end flow, you identify those manual handoffs. All of a sudden you'll see where the bottleneck is: that infrastructure automation is critical, that there was a much too slow security review, and that there's a big testing problem. Another organization that had flow diagnostics almost identical to this recently fired over a hundred manual testers in their efforts to move to automated testing. And they actually did the exact opposite of what they should've done. They completely slowed down their value stream by doing this.

00:23:47

So the key thing is you need to slow down on this. And for the checkups, you need flow time to be dramatically reduced by actually seeing the right kind of workflow and making those bottlenecks visible. Once you do that, team satisfaction will increase. And of course the key thing is customer satisfaction will increase. So the key thing here is that customer focus and psychological safety have to guide everything that we do, that teams need to expose their work so you can see the bottlenecks, and that everything needs to be structured in your value stream and your delivery pipeline from a customer's perspective. Once you do that, you're able to actually get the resources allocated. This particular organization got an allocation of resources outside of their budget cycle for the first time to remedy this bottleneck that was downstream of development, which is exactly the kind of result that we all need to see.

00:24:35

So in conclusion, you can't change the system from within the system. We have to measure these value streams from the outside. We can't focus on one particular subset of the value stream. We have to measure more holistically and take that approach. These are complex dynamics. That means you need to understand the trends. Oftentimes, bottlenecks are a whack-a-mole problem, where you relieve one and another one pops up over here. There's another dependency that you didn't see. So you're always measuring dynamics and always using these flow metrics, in particular flow time, which is the time to market metric, to guide your decisions. And of course, frame all your decisions, frame all of this, in a language that your organization and your leadership can understand, in terms of using their other senses. So stop talking cycle time, these very important things that are specific to the teams, and actually start looking at end-to-end flow time and measuring that as your time to value metric, as your time to market metric, which will actually help you get the investment that you need for the next step of your transformation and elevate all of these discussions to where they belong, which is in the boardroom.

00:25:30

So with that, you can learn more on flowframework.org. And the help that I'm looking for is really to learn more of these diagnostics, to learn more of your stories. We've created a new portal on flowframework.org where we can share these, where they'll be publicly shared, and where we can discuss and help our organizations really take those next steps in learning about and diagnosing these common flow problems. So with that, thank you very much. And I look forward to hearing from you.