The CIO Who Mistook Tech Debt for a Hat (and other Flow Diagnostics) - Tasktop | Las Vegas 2020

The CIO Who Mistook Tech Debt for a Hat (and other Flow Diagnostics)

Over the past year, Dr. Kersten and his Flow Advisory team have been collecting value stream data sets from enterprise IT organizations undergoing digital transformation treatments. They used the Flow Metrics defined in Project to Product to trace the path that hundreds of thousands of software artifacts travel from inception to running software. As they analyzed and correlated Flow Metrics to business results, some fascinating clinical tales emerged from these unique new data sets. The team discussed the findings with the patients to gain a more personal perspective on the outcome of the treatment plans.

In this talk, Dr. Kersten will take us through the most common, the most problematic, and the most bizarre flow diagnoses that he has encountered. By holistically considering the data as well as the human and organizational aspects of the diagnosis, he will follow the approach of the neurological case histories recounted by Oliver Sacks in The Man Who Mistook His Wife for a Hat. For each flow diagnosis, Dr. Kersten will recount misdiagnoses that were previously applied, followed by a detailed case history, summary of the remedies, and the often-surprising results. From common maladies to rare pathologies, each of the stories offers powerful lessons to help us understand the biggest impediments to achieving DevOps at scale.

Dr. Mik Kersten

Founder and CEO, Tasktop

Full transcript

The complete talk, organized by section.

Host Intro (Gene Kim)

Gene Kim introduces Dr. Mik Kersten as someone the community should know well. Kersten wrote "Project to Product" two years earlier, summarizing a 20-year journey to understand how to make developers truly productive. Gene says the book defines a common language between business and technology that organizations need in order to compete and win.

Gene highlights the Flow Framework's four types of software work: features, defects, risks, and debt. He says he loves the flow metrics, even though he is not sure he could explain them to someone else, because he has seen Mik use them as a diagnostic tool to identify technology dysfunctions with precision and ease. Gene says the talk will help explain the causes and effects of symptoms he has seen for decades, and how Mik arrives at diagnoses and treatment plans.

Mik Kersten

Mik Kersten opens by joking that he is not a real doctor: his PhD is in computer science, software architecture, and flow, but he promises specific prescriptions. He says DevOps Enterprise Summit is his favorite conference because of the experience reports and case histories organizations share about their DevOps transformations.

Kersten says his learning accelerated when he released "Project to Product" at the same conference two years earlier. Since then, he has helped executives understand value in software delivery and has seen common pathologies and dysfunctions. The data behind the talk comes from detailed clinical study by the Tasktop flow team, including Carmen DeArdo, Dominica DeGrandis, and Naomi Lonergan, who analyzed data sets across many enterprise organizations. Kersten says the cases are high-stakes transformations where companies and jobs were on the line, technology was a bottleneck, and transformation was moving too slowly.

Kersten frames the talk through Oliver Sacks's "The Man Who Mistook His Wife for a Hat." In Sacks's story, Dr. P had trouble recognizing faces and at one point tried to pick up his wife's head as though it were a hat. When handed a rose, Dr. P described it as "a convoluted red form with a linear green attachment" until smelling it helped him recognize it as a rose. Kersten says executives who have never coded can similarly see parts of technology problems without seeing the whole. They may not directly see technical debt, but they want to. The question is whether flow metrics can give leadership other senses that make tech debt visible.

Kersten's first story is the CIO who mistook tech debt for a hat. A capable technology executive had taken "Project to Product" and the Enterprise DevOps community seriously and started investing heavily in technical debt reduction. The warning sign was that more than 50% of dozens of teams' capacity was going into tech debt reduction, reducing feature capacity and putting pressure on development teams. The assumed bottleneck was in development.

When Kersten's team used flow diagnostics, they saw a different picture. Flow time analysis showed that scope changes on the business side had been happening for years and were themselves a source of technical debt. Work kept being fast-tracked and made urgent, creating a never-ending battle. A downstream operations bottleneck also meant features and tech debt improvements were not reaching customers. The organization needed to address business process, outsourcing, infrastructure, and platform investment issues, not simply put more effort on development.

Kersten says measuring flow opens the door to clinical diagnoses for digital transformations and specific treatments guided by measurement. Just as medicine uses vital signs such as temperature, heart rate, and blood pressure, "Project to Product" introduces flow metrics as vital signs for value streams. Flow velocity measures how much work is completed over time, where done means delivered to the customer. Flow efficiency measures the proportion of a flow item's total flow time that is spent in active work rather than in waiting states. Flow time measures end-to-end calendar time from intake to customer receipt. Flow load measures work started but not completed or delivered. Flow distribution measures the balance across features, defects, risks, and debt.
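The five vital signs described above can be sketched in a few lines of Python. This is a minimal illustration, not the Flow Framework's reference implementation: the `FlowItem` record, its field names, and the `flow_metrics` function are hypothetical, and the sketch assumes each item carries a start date, an optional customer-delivery date, and a count of actively worked days.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class FlowItem:
    # Hypothetical record shape; a real tool would derive these from its own data.
    kind: str                  # "feature", "defect", "risk", or "debt"
    started: date              # day the item was accepted into the value stream
    delivered: Optional[date]  # day the customer received it; None if not yet delivered
    active_days: float         # days the item was actively worked on (not waiting)


def flow_metrics(items, period_start, period_end):
    """Compute the five flow metrics for one reporting period."""
    # "Done" means delivered to the customer within the period.
    done = [i for i in items
            if i.delivered is not None and period_start <= i.delivered <= period_end]
    flow_times = [(i.delivered - i.started).days for i in done]
    total_flow_time = sum(flow_times)
    counts = Counter(i.kind for i in done)
    return {
        # Flow velocity: items delivered in the period.
        "flow_velocity": len(done),
        # Flow time: average calendar days from start to customer receipt.
        "flow_time_days": total_flow_time / len(done) if done else None,
        # Flow efficiency: active time as a share of total flow time.
        "flow_efficiency": (sum(i.active_days for i in done) / total_flow_time
                            if total_flow_time else None),
        # Flow load: items started but not yet delivered at period end.
        "flow_load": sum(1 for i in items
                         if i.started <= period_end
                         and (i.delivered is None or i.delivered > period_end)),
        # Flow distribution: share of delivered items per type.
        "flow_distribution": {k: n / len(done) for k, n in counts.items()},
    }


items = [
    FlowItem("feature", date(2020, 1, 6), date(2020, 1, 16), 4),
    FlowItem("defect", date(2020, 1, 10), date(2020, 1, 20), 2),
    FlowItem("debt", date(2020, 1, 15), None, 1),
    FlowItem("feature", date(2020, 1, 20), date(2020, 2, 10), 5),
]
print(flow_metrics(items, date(2020, 1, 1), date(2020, 1, 31)))
```

With this made-up data, two items are delivered in January, each after 10 calendar days with 6 active days between them, so flow efficiency comes out at 30% and two items remain in flight at month end.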

Kersten says good organizations can have flow times measured in days, while dysfunctional organizations have flow times measured in months and cannot learn fast enough for the pace of business. When flow load is too high, productivity and flow efficiency drop. He notes that Tasktop targets more than 50% feature delivery, while also warning that an over-focus on features can starve other flow items and create death spirals. The Flow Framework is Creative Commons licensed, and the goal is to let organizations measure these pathologies themselves.

The first diagnostic is the tech debt death spiral. The patient is a financial services company with mature Agile, CI/CD, and strong enterprise DevOps practices, but the business still complains that feature delivery is painfully slow and innovation is lagging fintech competitors. The flow distribution chart shows too little green: not enough net new feature value reaching customers. Flow load and backlogs are growing, so the teams will never catch up.

Deeper analysis shows 73 user stories stuck on core backend services. The banking platform has mobile and web applications, but new customer experiences keep getting blocked by a painful legacy constraint: a monolith. Technology teams had known for years that they needed to break the monolith apart incrementally into microservices. Flow data finally let executives see what technologists had been seeing and justified the investment that should have been made two years earlier.

Kersten describes the symptoms of the tech debt death spiral: flow time increases, velocity decreases, technical debt investment is not visible, defect and incident rates rise, time to market becomes unacceptable, innovation slows, cost of delay grows, team happiness drops, and onboarding slows because the system is hard to understand. Mistreatments include unsustainable work, heroics, adding developers to business applications, and putting low-cost contractors on the backend constraint.

The treatment is to make all tech debt work visible, take pride in it, show it to executives, and measure it in a way meaningful to the business: feature flow time. Kersten says the only reason to invest in tech debt is to reduce feature flow time, meaning better time to market. At Tasktop, they avoid tech debt investments that do not have a feature-flow payoff within six months or less. He gives the strangler pattern as an example: stand up a service, measure adoption, and use flow-time improvement to justify removing direct database access.

Regular checkups should measure flow-time improvement every sprint, month, and quarterly business planning cycle. Organizations should see lower defect rates in flow distribution and increased team happiness as more features get out faster. Kersten maps the remedy to the Unicorn Project ideal of locality and simplicity: once the organization invested in the legacy constraint and extracted services, everything moved faster. One executive said that flow metrics make bottlenecks stare you in the face.
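A checkup of this kind could be approximated by tracking median feature flow time per quarter. The sketch below is illustrative only: the function name, the `(started, delivered)` input shape, and the sample dates are assumptions, and the talk does not specify whether the median or another statistic should be used.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def median_feature_flow_time_by_quarter(features):
    """features: iterable of (started, delivered) date pairs for delivered features.

    Returns {(year, quarter): median flow time in calendar days}, keyed by the
    quarter in which each feature reached the customer.
    """
    by_quarter = defaultdict(list)
    for started, delivered in features:
        quarter = (delivered.year, (delivered.month - 1) // 3 + 1)
        by_quarter[quarter].append((delivered - started).days)
    return {q: median(days) for q, days in sorted(by_quarter.items())}


# Hypothetical data: flow times before and after a tech debt investment.
features = [
    (date(2020, 1, 2), date(2020, 3, 2)),   # 60 days, delivered in Q1
    (date(2020, 2, 1), date(2020, 3, 22)),  # 50 days, delivered in Q1
    (date(2020, 4, 1), date(2020, 5, 1)),   # 30 days, delivered in Q2
    (date(2020, 5, 1), date(2020, 5, 21)),  # 20 days, delivered in Q2
]
print(median_feature_flow_time_by_quarter(features))
```

A falling median from one quarter to the next is the kind of signal the talk suggests using to show that a tech debt investment is paying off in time to market.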

The second diagnostic is neglected WIP, or neglected work in progress. Kersten says it is the most common value stream health problem across the patients studied, and also the one with the easiest remedy, though that remedy is counterintuitive. The patient is a large public healthcare company rolling out the Flow Framework. They already champion technical debt work, but executives repeatedly say IT cannot keep up with the business.

The flow load shows why. Flow load is rising while flow velocity is falling, meaning the team takes on more and more but gets less and less done. Queueing theory and Little's Law show productivity trending toward zero. Flow efficiency is currently about 30%, but rising flow load means it will trend down toward zero. The team is taking on more work than it can ever deliver, so any promise it makes is unlikely to be met.
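The Little's Law argument can be made concrete with a small calculation. In flow terms, the law relates the long-run averages: average flow time equals flow load divided by flow velocity. The sketch below assumes a reasonably stable value stream, and the quarterly figures are invented for illustration rather than taken from the patient's data.

```python
def expected_flow_time(flow_load, flow_velocity):
    """Little's Law: average time in the system = items in the system / completion rate.

    flow_load is the number of items in progress; flow_velocity is items
    delivered per period. The result is expressed in the same periods.
    """
    if flow_velocity <= 0:
        raise ValueError("flow_velocity must be positive")
    return flow_load / flow_velocity


# Hypothetical quarters: load rises while velocity (items per month) falls.
quarters = [("Q1", 120, 40), ("Q2", 180, 36), ("Q3", 300, 30)]
for name, load, velocity in quarters:
    months = expected_flow_time(load, velocity)
    print(f"{name}: load={load}, velocity={velocity}/month, "
          f"expected flow time={months:.1f} months")
```

In this made-up series the expected flow time grows from 3 to 5 to 10 months, which is why a team whose load keeps rising while velocity falls cannot keep new promises.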

The symptoms are rising flow load, rising backlogs, increasing thrashing, and flow efficiency dropping like a stone. Business symptoms include feature work waiting indefinitely, declining customer credibility, and unplanned work being chronically fast-tracked, which adds even more flow load. Mistreatments include pushing more work to teams, demanding heroics, and suggesting better multitasking. Kersten says Goldratt's "The Goal" should have taught organizations that overloading teams gets less output, not more.

The treatment plan is stop starting, start finishing. Give teams a chance to catch up, reduce flow load, and make sure both business and technology understand that reducing flow load will produce more output. Organizations may need a quarter or two to recover. Each value stream must find its right WIP or flow-load limit; when load goes too high, efficiency decreases. To increase capacity, find the constraint and invest in it so the value stream can handle more work.

Regular checkups should show flow velocity improving quickly. The counterintuitive lesson for executives is that when flow load goes down, velocity goes up: teams deliver more to customers, become more efficient, and regain predictability. Kersten connects this to the Unicorn Project ideals of improvement of daily work and focus, flow, and joy. Improvement work must span value streams, not just development, because many process problems come from how work enters the value stream from the business. The patient quote is that flow metrics exposed backlog growth and made it possible to manage WIP instead of being managed by it.

The final diagnostic is workflow obscurity. The patient is a large health insurance company with mature Agile deployment and a successful shift from project to product, but not yet strong DevOps practices. Development appears to be moving fast, and people are pleased with that, but the business is not seeing expected results.

The flow load initially appears to be more than 1,000 across about six Agile teams, suggesting the teams are working on more than 1,000 large features concurrently. Kersten investigates what "done" means. Is work done when development is done, or when the customer receives value? When the mapping is changed to be customer-centric instead of team-centric, the flow load drops from more than 1,000 to 260.
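The talk does not spell out how the mapping was changed, so the following is only one plausible reading, sketched with hypothetical names and data: flow load depends both on which state counts as "done" and on whether the unit being counted is a team-level story or a customer-facing feature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Story:
    # Hypothetical record: a team-level story that rolls up to a customer-facing feature.
    feature_id: str
    state: str  # "in_dev", "dev_done", or "released"


def story_load(stories, done_states):
    """Team-centric view: count every started story that is not yet 'done'."""
    return sum(1 for s in stories if s.state not in done_states)


def feature_load(stories, done_states):
    """Customer-centric view: count features with at least one story not yet 'done'."""
    return len({s.feature_id for s in stories if s.state not in done_states})


stories = [
    Story("F1", "released"), Story("F1", "dev_done"), Story("F1", "in_dev"),
    Story("F2", "dev_done"), Story("F2", "dev_done"),
    Story("F3", "released"), Story("F3", "released"),
]

# Development's view of done: only one story looks in progress.
print(story_load(stories, {"dev_done", "released"}))
# Done means released, counted per story: four stories are in flight.
print(story_load(stories, {"released"}))
# Done means released, counted per customer-facing feature: two features are in flight.
print(feature_load(stories, {"released"}))
```

The same underlying work yields three different load figures, which is the point of asking what "done" means before trusting the number.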

The revised picture shows development teams are effective: they manage WIP, manage technical debt, and have healthy flow distribution. But their work queues up downstream because of serious infrastructure problems and missing DevOps automation that were assumed to be done but were not. The development team is not the bottleneck; the bottleneck is downstream, yet development had been blamed. Business symptoms include customers perceiving a lack of innovation even though work seems to be done.

Kersten says a subtle problem was the lack of psychological safety to make the work visible. Development teams were not exposing that their done state did not match the customer's or business's view of done. The mistreatment is to keep focusing on development being done and to push infrastructure and operations teams to go faster. The real issue is a big handoff.

The treatment is to shift from team or silo focus to customer focus, measure end-to-end flow, and identify manual handoffs. This reveals where infrastructure automation is critical, where security reviews are too slow, and where testing problems exist. Kersten warns that another organization with nearly identical flow diagnostics fired more than 100 manual testers while trying to move to automated testing, which was the opposite of what they should have done and slowed flow further.

The checkup is that flow time should drop dramatically once the right workflow is visible and bottlenecks are addressed. Team satisfaction and customer satisfaction should increase. Kersten says customer focus and psychological safety must guide the work. Teams need to expose work so bottlenecks can be seen, and value streams and delivery pipelines must be structured from the customer's perspective. In this case, the organization received resources outside the budget cycle for the first time to fix the downstream bottleneck.

Kersten concludes that you cannot change a system from within the system. Organizations must measure value streams from the outside and holistically, rather than focusing on one subset. These are complex dynamics, so leaders need to understand trends. Bottlenecks can behave like whack-a-mole: when one is relieved, another dependency appears. Flow metrics, especially flow time as the time-to-market metric, should guide decisions.

He closes by urging leaders to frame decisions in language the organization can understand. Stop talking only about cycle time and team-specific metrics; start looking at end-to-end flow time as time to value and time to market. That framing helps secure investment for the next step of transformation and elevates the discussion to the boardroom. Kersten points the audience to flowframework.org and asks for more diagnostics and stories through a new portal where organizations can share and discuss common flow problems.