Las Vegas 2020

Are We Really Moving Faster? How Visualizing Flow Changed the Way We Work

The automotive industry is currently undergoing a disruptive change. Driven by electrification, automated driving, and connectivity, the classic vehicle is being transformed into a software-defined internet-of-things (IoT) device.


Elektrobit (EB) has been a global supplier of embedded and connected software products and services for the automotive industry for more than 30 years. EB’s software powers over 1 billion devices in more than 100 million vehicles. Since 2015, Elektrobit has been part of the German automotive giant Continental AG.


After working as a CTO for a successful start-up for several years and having developed a DevOps mindset, I (Roman Pickl) joined Elektrobit in 2018, shortly after it had reorganized to become a Product Company (Project to Product). As technical project manager in EB Assist, which provides state-of-the-art hardware and software products to successfully develop, test, visualize, validate, and build ADAS and automated driving functions and systems, I am in charge of a distributed 30-person product development team and am part of the company-wide DevOps Community of Practice.


The main business problem we faced was delivering value flexibly, at speed, and at high quality to our internal and external customers. We were hindered by long development cycles, six-month budgeting periods, high workloads, and frequently changing priorities. A full cycle of building and testing one of our software and hardware products took more than 24 hours. So when you did something in the afternoon, you sometimes didn’t get feedback until the following day or even the day after. This had a negative impact on developer morale and felt like quicksand: the more we fought it, the more it pulled us in. We knew there must be a better way.


Applying the Three Ways of DevOps, especially by experimenting and by identifying bottlenecks in the build and test run, we were able to cut the full build/test cycle by a factor of 3 in the first few months of 2018. Moving our code to a Git monorepo and containerizing our build environment in 2019 allowed us to provide feedback to our developers on every commit within minutes, not hours. Furthermore, automating our delivery allowed us to provide a new version of our software with the click of a button. This was great and we felt happier and so freaking agile.


Shortly after moving to a new office in 2019, knowing the importance of making work visible and having learned about the Flow Framework, I implemented a dashboard using an open source solution (Smashing) that automatically gathered and visualized, among other things, Flow Metrics (Flow Load, Flow Time, Flow Efficiency, Flow Distribution, Flow Velocity) for our value stream. After putting in countless hours eliminating waste, improving the deployment pipeline, investing in automation and deploying new technologies, I wanted to answer a fundamental question: “Are we really moving faster?”


It took me a while, and listening to Beyond the Phoenix Project and reading The Goal, to understand:

-We were creating a lot of inventory.

-We had a fast lane for fixes, but it still took us too long to ship features.

-We delivered more often, but the new bottleneck shifted to testing.


It became clear that we were trapped in local optimization (now described by Jon Smart as “Local Optimisation and the Urgency Paradox”): we had a limited system focus and needed buy-in to influence the process upstream and downstream. At the same time, the organization identified three focus programs to improve flow: Delivery Performance, Organizational Development, and Empowerment, which will be rolled out in 2020.


Dr. Manja Lohse

Head of Incubators and Demonstrators, Elektrobit


Roman Pickl

Technical Project Manager and Continuous Improvement Agent, Elektrobit


Full transcript

The complete talk, organized by section.

Roman Pickl

Hello everyone, and thanks for taking the time to watch our presentation for DevOps Enterprise Summit 2020, Las Vegas Virtual. We are so excited to tell you our story.

My name is Roman Pickl, and for the last two and a half years I have been a technical project manager at Elektrobit in the advanced driver assistance systems domain. After being a process manager at the Austrian Postal Service and being the CTO of a medium-sized company, which was sold to a major industry player, I decided to escape the startup roller coaster and look for a position where I have more knobs to turn.

I am now working on hardware and software for data logging and replay, as well as hardware-in-the-loop simulation solutions. Lately, I have also joined Elektrobit's continuous improvement initiative as a continuous improvement agent.

I have a background in software engineering, business administration, and computer and electronics engineering. CI/CD and DevOps is really the sweet spot for me, as I love how the things that I learned in my production management and operations research courses are nowadays applied in the IT domain.

Last but not least, I am here to learn, which means that I am really looking forward to the questions and discussions afterwards, especially if your opinions and experience differ from mine. You can catch us on Slack throughout the presentation if you have any questions. Today, I am joined by Manja Lohse. Hi, Manja.

Dr. Manja Lohse

Hi. Thank you, Roman, and hello everybody. I am also really excited to be here and really looking forward to learning a lot from the presentations and from your questions, of course.

I am at Elektrobit in the role of the continuous improvement team lead. Roman mentioned that he is involved in this continuous improvement program. My role is to lead it. I am also the head of incubators and demonstrators, so this is where we try to do something new, where we try to figure out where Elektrobit has to go in the future.

Both these roles go well together, and I am here today to tell you something about the bigger frame that our story of really moving faster, how visualizing flow changed the way we work, is embedded in.

What is the situation that we are in right now in the automotive market? I am pretty sure you have heard that we are facing disruptive change. The automotive industry is really undergoing a big change right now. We are moving from combustion engines to electrical vehicles. One consequence is that fewer parts in the cars are needed, so of course there is also less to supply.

Elektrobit particularly is in the software area, so we supply software to automotive players. For us, actually, this change is a really big chance, as software is getting more and more important in the cars. But also for us, at the same time, it is a big challenge.

Last week I heard a number which really alarmed me. In the next five years, from what we foresee right now, there will be one complete year of car production lost due to corona and the changing economic situation. This means that every year, about 20 percent fewer cars than usually assumed will be produced. Of course this will also have an effect on all the companies connected to the industry and all the companies supplying it.

What we try to do to deal with the situation is to drive technology further, to produce or to supply technology that will really help the automotive industry in the future. And at the same time, we will have to improve the way we work. This is why we call this whole initiative continuous improvement.

Let me start with the technology part first. The areas we are in are automated driving, a really hot topic; vehicle infrastructure, steering the car. The car is basically now a computer on wheels, right? It is an IoT device. We deal with topics like how can we steer these cars? How can the infrastructure work?

Connected vehicle is a really big topic. I guess most of you, if not all of you, have a smartphone, and this thing is so connected and it is a smartphone. It is so smart, and you expect it to update once something is broken or once there is a safety or security issue. The same thing goes for cars. We want to make sure that cars are connected and get these capabilities.

Of course, all this is really closely connected to user experience, because if you go back to the smartphone example, you really expect a lot from these devices, from how they are handled and so forth. We have to take this to cars as well.

Of course, this is not a European topic or an American topic. This is a worldwide topic. This is why we have locations all over the world, in Asia, the US, and Europe as well. Our headquarters is in Europe, and Elektrobit has about 3,400 employees.

In the different regions in the world, we see different ways of handling the change, different ways of dealing with the disruption in the automotive industry. But I think all of them, all the different producers or OEMs, have in common that they see a strong move towards software, and they see a strong urge to improve all the time.

We started our improvement adventure last year with what we call time-framed initiatives. We had certain programs, one of them called delivery performance. We knew we had to improve our delivery performance. It did not take long for us to notice that it does not make sense to have something that is restricted in time and that people try to do basically next to their daily business.

We had a group of people working on this topic that had completely different jobs, and they had a timeframe to get the topic done. Then again, when they started working on the topic, they did not even know what it would require in the future. That is where we moved to continuous improvement.

What we want to achieve with continuous improvement is the from-customer-projects-to-products view. The name of this slide or title of this slide is no coincidence. Of course, we were really inspired also by Mik Kersten's book on the topic.

What does this mean for us? Elektrobit, traditionally from our history, is really a project company. We work for big OEMs. We supply what they ask us to supply. But of course this view does not really scale. We were producing really specific solutions or specific projects for an OEM, and then we had a really hard time translating this and selling this to other OEMs. Of course this approach does not really scale well. We want to deliver value to all customers that we could potentially have in the market and that help us with our strategy.

Another thing that we realized in the last, let's say, year is that we need to have a stronger focus on optimizing the overall value stream. We often optimize locally. There are teams finding out that there is something in the process that could be improved. Then again, if a team improves something and is moving faster now, but is still building the wrong feature, this will not create a whole lot of value for the customers. This is also a perspective that we are trying to take much more strongly now, to look at these whole value streams.

Since we started this journey of improvement, there are really some challenges that are becoming more and more visible. For us as a company, it might be hard to imagine, but Elektrobit worked for nearly 30 years without having a really pronounced vision that people were working towards.

Why did this work well? Because we were mainly serving our customers in projects, and they told us what to do anyway, so we did not put a strong focus on developing our own vision. Really, it is a challenge for us to learn this.

Also, this is reflected in contradicting goals. If you are a project company, one of the strong KPIs is utilization. We make sure that people are working, that they are busy all the time. Are they working on the right things? Yet another question. This is something that we still really struggle with.

Also, a strong focus on daily business is something that is really inherent in our company culture and that we are struggling with because we need to take more time to reflect and to learn. Of course, a lot of these things can be achieved also with DevOps practices to deal with these problems.

As you see in this red circle, this is an observation place. Assume that this is management looking down at the big valley. What is really important for us to learn now is that the things are happening in the valley. Actually, these are the arrows in the valley, and these are the flashes. This is really where things happen. This is where people are who understand what is going on and where people are who can come up with strategies for improving things.

Actually, we should be getting away from monitoring from above and telling people what to do. Instead, we need to listen to the stories that are told in the organization, and we need to enable these stories to spread so that people can learn from them. So I now want to hand back to Roman because he is actually one of the people really active in our company creating such stories, and he will tell you his story now.

Roman Pickl

Okay. Thank you, Manja. When I joined Elektrobit in 2018, the main business problem we faced was delivering value in a flexible way at speed, at high quality to our internal and external customers.

A full cycle of building and testing one of our software and hardware products took more than 24 hours. When you did something in the afternoon, you sometimes did not get feedback until the following day or the day after. It had a negative impact on developer morale, and it felt really like quicksand. The more we fought it, the more it pulled us in.

We knew there must be a better way. Applying the three ways of DevOps, especially by experimentation and by identifying bottlenecks in the build and test run, we were able to cut the full build-test cycle by a factor of three in the first few months of 2018.

Moving code to a monorepo and containerizing our build environment in 2019 allowed us to provide feedback to our developers on every commit within minutes, not hours. Furthermore, automating our delivery allowed us to provide a new version of our software with the click of a button. This was great, and we felt happier and so freaking agile.

But after putting in so many hours and improving the deployment pipeline, investing in automation and deploying new technologies, it was time to ask a fundamental question: are we really moving faster?

One aspect that I really liked about my job in the operations department of the Austrian Parcel Service back in 2009 was the fast physical feedback and visibility of problems. There were more subtle and hard-to-find process errors, of course, but if one of the main systems or processes did not work as expected, boxes started to pile up at the bottleneck, providing a hard-to-ignore indicator of a problem.

I moved on after about one and a half years, but since then I always missed this clear feedback signal in my IT jobs. I was missing ambient awareness. I think I first read about this concept in Michael Nygard's 2007 book, Release It! The idea is to create an ambient display, an interface between people and digital information, which represents data, for example the health of a system, with the help of sound, visuals, movement, or other cues.

There are various ideas out there, from simple displays to ambient lamps, lava lamps, beer lamps, USB rocket launchers, traffic lights, you name it. These kinds of information radiators should be put in a highly visible location to promote responsibility in the team. They show that you have nothing to hide. They provoke conversation, and they can be traced back to, you may have already guessed it, the Toyota production system.

Steve Poole from IBM held an inspiring keynote about dashboards and culture at DevOps Enterprise Europe 2018. He told a story about how sharing insights on the dashboard closes communication gaps, forces discussion on how to generate accurate data, and changes your culture. Putting data on a dashboard makes a problem real again. Before that, it is just data in a spreadsheet.

I had already collected the data that I wanted to show in a wiki for our weekly status meeting, but I wanted to collect them automatically and have them up to date all the time. What I really wanted was an automated dashboard.

What is more, I remembered the quote from Winston Churchill: we shape our buildings, and afterwards our buildings shape us. It also reminded me of a Skoda plant and the BMW project house, discussed in Thomas J. Allen and Gunter Henn's book, The Organization and Architecture of Innovation, where the employees have to pass certain points of the assembly line or up-to-date prototypes before arriving at their workplace.

Working in a distributed team, I wanted to have the data available on our intranet, but also on a highly visible screen in the entrance area where everyone passes by a few times a day.

I already had a Raspberry Pi at hand, but I quickly learned that getting my private device on the company network was, of course, impossible. What was even more startling for me was that it was very difficult to get an additional monitor to show the dashboard. I asked for it every other month, but given that we were growing, there was always a scarcity of time, also my own, and resources.

In retrospect, I was very successful last year with piggybacking on a planned change. Part of our company moved from one office to a new office down the street. The new office was renovated and a wish list was created. I told the IT staff that I needed the equipment for my dashboard, basically the Raspberry Pi and a monitor, and I think that given that they were in a change mode of solving problems, buying hardware and setting up the network, it was easier to get the equipment approved, and I got these devices.

There were also some logistical problems, which I guess is normal with setting up the new office space, and I had some time to play around with a framework called Smashing.

Based on the metrics I had collected by hand for a few months, I wanted to visualize the following things: the next milestones and important dates of releases; open pull requests and their age; open support tickets; Jira tickets per status, as also visible on our Kanban board; and the status of Jenkins jobs, including build time and failing tests.
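To make the status view more concrete, here is a minimal Python sketch of the kind of aggregation such a dashboard needs: tickets per status and the age of open pull requests. It is not the dashboard's actual code (Smashing itself is Ruby-based). The record fields (`status`, `title`, `open`, `opened`) and the function names are assumptions made for illustration, and the data would in practice come from the issue tracker and code hosting APIs rather than from hard-coded lists.

```python
from collections import Counter
from datetime import datetime


def tickets_per_status(tickets):
    """Count tickets per workflow status (one number per Kanban column)."""
    return Counter(ticket["status"] for ticket in tickets)


def pull_request_ages(pull_requests, now):
    """Return (title, age in whole days) for open pull requests, oldest first."""
    ages = [
        (pr["title"], (now - pr["opened"]).days)
        for pr in pull_requests
        if pr["open"]
    ]
    return sorted(ages, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical sample data, standing in for API responses.
    sample_tickets = [
        {"status": "In Progress"},
        {"status": "In Test"},
        {"status": "In Test"},
    ]
    sample_prs = [
        {"title": "Fix replay timing", "open": True, "opened": datetime(2020, 9, 29)},
        {"title": "Add logger config", "open": True, "opened": datetime(2020, 9, 1)},
        {"title": "Old refactoring", "open": False, "opened": datetime(2020, 8, 1)},
    ]
    print(dict(tickets_per_status(sample_tickets)))
    print(pull_request_ages(sample_prs, now=datetime(2020, 10, 1)))
```

Sorting pull requests oldest first is a deliberate choice in this sketch: on a hallway screen, the items that have waited longest are the ones worth a conversation.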

I was ready to put some time into implementing a first version of the dashboard, and I set it up in the hallway, ready for the official opening of the office. The dashboard sparked a lot of interesting discussion during the opening party, and ever since, the dashboard has been part of the new office and evolved into an important indicator of the current status of the product and project and the source of new change initiatives.

I had succeeded in bringing back ambient awareness. But that is when I noticed a problem. There is a scene in Eliyahu M. Goldratt's classic book, The Goal, that I read after listening to the Beyond the Phoenix Project audiobook. Alex, the leading character of the story, is very proud of the increased productivity they get in the plant by applying robots. Then Jonah, the management guru, asks him some questions.

In summary, the dialogue goes something like this. Is the company now making more money? No. Did you ship even one more product? No. Are plant inventories down? No. Are employee expenses down? No. And then Jonah says, then you did not really increase productivity. Your inventories are going through the roof, are they not?

Looking at the dashboard, the inventory was staring me in the face. Imagine all these tickets were boxes lying around in the hallway. They would have been way harder to ignore. They do not have any value as long as they are not released. Furthermore, it does not really make sense to add more. The bottleneck in development had shifted to testing, and we were creating a lot of inventory.

However, when I read about the typical progressions of bottlenecks in the DevOps Handbook, I was reassured that our efforts were going in the right direction. It also reminded me of the J-curve of automation mentioned in the 2018 State of DevOps Report, which states that you really need this relentless improvement, refactoring, and innovation to reach a state of excellence.

At this time, I had just finished reading the Project to Product book by Mik Kersten. Looking at our current state, I was especially interested in the throughput part and aimed to measure flow as defined in the book. I aimed to measure flow load, flow time, flow velocity, flow efficiency, and I also wanted to visualize flow distribution.
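As a rough illustration of these five metrics, the following Python sketch computes them from a list of work items. It follows the usual Flow Framework reading (velocity: items completed in a period; load: items in progress; time: start to completion; efficiency: active time divided by total flow time; distribution: share of each item type among completed work). The `FlowItem` fields, in particular a recorded `active_days` value, are assumptions for the example and not necessarily what an issue tracker provides out of the box.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class FlowItem:
    kind: str                 # "feature", "defect", "risk", or "debt"
    started: date             # day work on the item began
    finished: Optional[date]  # day it was completed, None if still open
    active_days: float        # days of actual work, excluding waiting


def flow_metrics(items, period_start, period_end):
    """Compute the five flow metrics for one reporting period."""
    done = [
        i for i in items
        if i.finished is not None and period_start <= i.finished <= period_end
    ]
    in_progress = [
        i for i in items
        if i.started <= period_end
        and (i.finished is None or i.finished > period_end)
    ]
    total_flow_days = sum((i.finished - i.started).days for i in done)
    kinds = Counter(i.kind for i in done)
    return {
        "flow_velocity": len(done),
        "flow_load": len(in_progress),
        "flow_time_avg_days": total_flow_days / len(done) if done else None,
        "flow_efficiency": (
            sum(i.active_days for i in done) / total_flow_days
            if total_flow_days else None
        ),
        "flow_distribution": {k: n / len(done) for k, n in kinds.items()},
    }
```

With two completed items that took 20 and 4 days but received only 5 and 1 days of active work, the flow efficiency comes out at 6 / 24 = 0.25: three quarters of the elapsed time was waiting.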

I also had a look at tracking business value, cost, quality, and team happiness, which we measure with a quarterly survey, and at correlating them with the flow metrics. However, I quickly noticed that it is really hard to get some of this data, especially in an automated fashion.

I created another dashboard that visualizes these metrics. Notice that while it provides some insights into features and defects, we currently do not track risks and technical debt. Some are in the improvement category. Every 60 seconds, we rotate through this flow metrics dashboard and the status dashboard.

After the next deployment, I stood in front of the dashboard in the hallway, and it dawned on me: we were shipping more often, but as we did not deploy from master, but rather patches from a release branch, on average we got slower. We had a fast lane for fixes, which were fixed on master and backported to the release branch. But it still took us too long to ship features which were waiting to be released. So we were looking into cutting our release cycle for major releases from every half year to each quarter or even more often.
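The effect can be shown with a small calculation. The Python sketch below measures how long a finished item waits for the next release that ships it, with one release calendar for fixes (frequent patch releases) and one for features (major releases only). All day numbers and release calendars are made up for illustration, not taken from the project.

```python
from bisect import bisect_left


def days_until_release(done_day, release_days):
    """Days a finished item waits for the next release (None if none is scheduled).

    release_days must be sorted in ascending order.
    """
    idx = bisect_left(release_days, done_day)
    if idx == len(release_days):
        return None
    return release_days[idx] - done_day


def mean_wait(done_days, release_days):
    """Average waiting time of all items that are covered by a release."""
    waits = [days_until_release(d, release_days) for d in done_days]
    waits = [w for w in waits if w is not None]
    return sum(waits) / len(waits) if waits else None


if __name__ == "__main__":
    finished_on = [10, 40, 100]                   # hypothetical completion days
    patch_releases = [30, 60, 90, 120, 150, 180]  # fast lane for fixes
    half_yearly = [180]                           # features ship only here
    quarterly = [90, 180]                         # shorter major release cycle

    print("fixes:", mean_wait(finished_on, patch_releases))
    print("features, half-yearly:", mean_wait(finished_on, half_yearly))
    print("features, quarterly:", mean_wait(finished_on, quarterly))
```

In this toy setup there are six releases in the half year, so the release count looks healthy, yet features finished on the same days as the fixes wait several times longer to reach users. Shortening the major release cycle cuts that waiting time directly, which is the reasoning behind moving from half-yearly to quarterly releases.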

Still, it seemed as if we were always late, with priorities and requirements changing in between these cycles. It felt like we were improving our development process, constantly running but remaining in the same spot, as in the Red Queen's race in Lewis Carroll's Through the Looking-Glass, and What Alice Found There. "Well, in our country," said Alice, still panting a little, "you generally get to somewhere else if you run very fast for a long time, as we have been doing." "A slow sort of country," said the Queen. "Now, here, you see, it takes all the running you can do to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that."

As I later found out, we were trapped in the local optimization and urgency paradox described by Jonathan Smart. Valuable ideas sit 12 months to 18 months in big upfront planning phases with no sense of urgency, no questioning if they might add more value than what has already been locked in the plans for the year. And as soon as they reach the product development team, they are urgent. I had already heard of a similar phenomenon called water-scrum-fall, but this we-are-so-freaking-agile picture from Klaus Leopold really drove it home for me.

By the way, if you are interested in the dashboard, you can find the code on GitHub and a blog post on my website with some more technical details. However, after I implemented the dashboard, I found that there are now several professional and open source solutions that cater for the same or a similar problem. If you have a more complex setup or want to do something more serious, you might want to look into these. If you want to know more, Forrester has a very recently updated report on value stream management solutions that might be of interest.

We found that we had a limited system focus, and we needed buy-in to influence the process up and downstream. Luckily, at the same time, the organization identified focus programs to improve flow, and based on these, started this continuous improvement initiative that Manja talked about. We were able to connect to the program and harness what we learned to drive further change.

For example, we moved to a lightweight quarterly planning cycle, as discussed in Gary Gruver's book, A Practical Approach to Large-Scale Agile Development, to solve changing priorities and the urgency paradox. We now track work in progress more closely. We work in smaller batches, and we doubled our release frequency in 2020, providing monthly patch releases and quarterly minor releases.

We are heavily investing in test automation, and we built a new test track as well as simulation and emulation capabilities to test more of our use cases automatically. Last but not least, we are also discussing reorganizing our teams based on their cognitive capacity, as discussed in Matthew Skelton and Manuel Pais' book, Team Topologies, as the cognitive load of our team was too high.

While we made considerable progress in our journey, challenges still remain. Manja now will tell you more about the road ahead.

Dr. Manja Lohse

Thank you. I guess it is basically always about finding the balance between all these activities that are going on in the team, like the story that Roman just told you, and bringing this to the company level, which is really more of what my job is.

What are the challenges that we are facing in the future? What are the next steps? There are some steps that are quite concrete.

As Roman pointed out with the image of Klaus Leopold, if you optimize locally, you might get faster but not better in a way. What is really on our list of things to do is this cross-product prioritization and planning. We need to have a good way, also across our products, to find out whether we are doing the right things and whether we are doing the most important things first. This is really something that we need to improve.

We need to establish meaningful KPIs, and of course the Flow Framework is a good inspiration here. Also, this is part of our approach: try things out in places, try to find out what works, what makes sense for us in our business context, what really tells us something about performance, and then roll this out to the whole organization because, of course, we also need to compare between products, between departments. But this is really the second step. First, we really need to find the things that are meaningful for us in certain circumstances.

We are conducting value stream analysis, not only for this product but generally for our products, which also helps us to find the problems, identify the bottlenecks, try to determine what the biggest problems are and how we can move forward, but also, again, to show what is going on between our products. Maybe what is more valuable for us as a company and for the customer, of course, and what is not. This is also why we talk a lot about these cross-product topics.

We identified some products, some topics that can only be solved on a company level. For instance, budgeting is one of the best examples. If we want to work in this way that we just described, or in a more DevOps-oriented way, we need to rethink the way we do budgeting. But this is something that cannot be done on a department level, of course, but that we need to address on the whole company level.

We need to scale continuous improvement, of course, because if we have these stories, we need to take them to the company. We need to ensure that other people are learning from it, are adding to it, and that we as a company can move forward. Right now, we are working with the CI team and the so-called continuous improvement agents in the several products. This is the setup that we currently have, and where we try to bring topics to various departments and to scale.

These steps are rather concrete, if not very concrete. There are also some more abstract topics on the next slide, in the sense of really big challenges that remain.

Making our achievements relevant and sustainable for the organization is one of these topics. For example, we are often asked: by how many minutes, hours, and so forth did you improve your build times? And then we can say, well, also as Roman said, we improved them by a factor, or by 90 percent. But then we need to ensure that the next time we do not start in the same place and have to improve by another 90 percent again. We need to make sure that we learn what we did wrong in the first place to make these results relevant. We need to learn from this, and maybe next time even start better than we did in the current situation.

Actually, a lot of continuous improvement and also DevOps for me is about removing impediments that arise from our company structure and from our culture. I am certain you know this: there are certain ways of doing things here, and these ways do not always welcome a new way of working. Of course, because people's positions are at stake, everything needs to be discussed. This is really part of the hard struggles that we have here.

We need to reflect on and learn about our organization, which sometimes is really hard because we are part of this organization, so it is really hard to tell where the problems are. But we need to train ourselves to get better at recognizing them and at solving them.

One more click to what the actual challenge in the end for us still is: transforming into a value-creating product company that is really future-proof, given the situation that we have in the automotive market. We are on our way. We make good progress in getting there, and now we are really interested in your questions and in your feedback. Thank you for your attention.

Closing

Thank you very much. Bye-bye.

Thank you.