A Guiding Map for DevOps

We're getting good at DevOps: maintaining our code in version control, writing good tests, running intelligent CI pipelines, and deploying regularly mean that the gulf between Dev and Ops has never been narrower. But is this enough? Does doing this well and building competencies in this area shine a light on other problems, and does doing DevOps well always help the whole business improve? I'll review where we've got to today, how it relates to the rest of the business, and then explore the link between DevOps and Value Streams. Finally, we'll explore some ideas of how it all fits together and what to do next.


This session is presented by Adaptavist.

Matt Saunders

Head of Internal DevOps, Adaptavist

Transcript

00:00:14

Hello there. My name's Matt Saunders, and I'm with Adaptavist. Thank you for joining us for this presentation, where I'm going to talk about a guiding map for DevOps. I'm the head of DevOps at Adaptavist, but I'll spare you me talking about them for now; I'll talk about them at the end. So what are we going to go through in this session? There's a fairly well-established pattern of how to do DevOps well. We've been in the DevOps era for over a decade now, and there's some pretty good information out there, lots of sources on how to do DevOps well. I'm going to go through some of these because that's a kind of prerequisite for what I want to talk about, the actual bread and butter of the presentation, a bit later on. So we're going to go through DevOps: what doing DevOps well looks like, and how we measure that, so how we measure succeeding at DevOps. Then I'm going to take a slightly different direction and look at how we find the value of what we're contributing when we do DevOps. Then we're going to talk about value streams, Wardley mapping, and mapping the streams. And then right at the end, we'll talk about how to put it all together in one big coherent whole, hopefully.

00:01:24

So let's start by going into what we mean by doing DevOps well. I think it comes down to basically three things. Firstly, having good source code management and continuous integration. The second thing is having good access to environments. And the third thing is getting changes flowing smoothly through your organization. So let's dig into those in a little more detail. Good source code management and continuous integration: this should be a panacea that everyone is aiming for, and we think a lot of organizations have reached this. In my adventures as a consultant with Adaptavist, and also working internally within Adaptavist, we'd like to think we're doing this reasonably well, and we see this done fairly well across many organizations out there. So, the three tenets of good SCM and CI. Firstly, keeping everything in version control systems, because it's never too soon to have a peer review of your data.

00:02:27

And we can tie that sort of stuff back to the sprint planning. So everything's in the version control system. We can work out exactly what we're going to do up front with our sprint planning, then go and do it and see the results in our SCM. That lets us then build continuously. We like to do continuous builds in a DevOps world because smaller, incremental changes are easier to manage. We're getting away from that waterfall environment where lots and lots of changes would go into a single release and, because nobody knew how those changes would interoperate, when the build inevitably broke we wouldn't really know what caused it. So we can find and fix divergences very, very early on by doing continuous integration. The third thing here is about being able to deploy that well, so we can test and check our changes as soon as we possibly can.

00:03:16

That's after they've been written and after they've been integrated, so that we can tighten those feedback loops. If we go back to Gene Kim's Three Ways of DevOps, this is one of the key things: being able to test these things, being able to get that software out there, either in front of users, or in front of a developer who's seeing the results of their work, or a tester. Let's tighten those feedback loops so that we can see if everything went well and, if not, we can adjust and go again. Access to environments is the second tenet here of good DevOps. We want to allow devs access to environments, and by that I mean an environment that looks a bit like, or a lot like, your production environment. Gone are the days where having a production environment alone, and a developer's laptop where changes were tested, was good enough. What we want to be doing here, and what we're seeing increasing consumption of, is having pre-built or on-demand environments that mimic production, maybe not quite on the same scale, but good enough that a dev can use an environment with everything mocked out for the things that he or she is not already using, and dedicated to that cause. We don't want to make our devs queue, either: waiting for environments harms flow.

00:04:33

So again, anti-patterns that we see here are things like a single dev environment with multiple developers waiting in line to test their changes out. Triggering these sorts of things from CI environments is now something that people should be doing. Production-like: well, it's difficult, but it's necessary to make environments production-like. What I mean by that is that often we find that production environments are built with a combination of automation and some other stuff: some hand-coded stuff over here, some things that were clicked on an Amazon Web Services console over there. Those sorts of things make it difficult to replicate production environments. We want to be doing everything as code, because if everything is code, then we can replicate those things. Other issues we find are with scale. Perhaps our production environment has got a hundred million entries in the production database. If we want to be testing our systems out, then we possibly don't need all those 100 million entries. Maybe there's personal information in them as well, so you have to look at data masking too, so that personal information isn't replicated further than it needs to be. So getting those environments to an appropriate size and scale is really, really important.
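To make the point about scale and data masking concrete, here is a minimal sketch in Python of building a small, masked test dataset from production-shaped rows. Everything named in it is an assumption for illustration: the field names in `PERSONAL_FIELDS`, the `MASKING_KEY` placeholder, and the helper functions are hypothetical and do not come from the talk.

```python
import hashlib
import hmac
import random

# Hypothetical column names: adjust to whatever your schema treats as personal data.
PERSONAL_FIELDS = ("name", "email")

# Hypothetical secret: in practice this would come from a secrets store, not source code.
MASKING_KEY = b"replace-me"


def mask_value(value: str) -> str:
    """Replace a personal value with a stable keyed-hash token."""
    digest = hmac.new(MASKING_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"masked-{digest[:12]}"


def build_test_dataset(rows, sample_size, seed=0):
    """Take a small random sample of rows and mask the personal fields."""
    rng = random.Random(seed)  # seeded so the same sample can be rebuilt
    sample = rng.sample(rows, min(sample_size, len(rows)))
    return [
        {
            key: mask_value(str(value)) if key in PERSONAL_FIELDS else value
            for key, value in row.items()
        }
        for row in sample
    ]
```

Because the masking is deterministic, the same input always produces the same token, so relationships between records survive the masking. A real setup would probably lean on a dedicated masking tool and review which fields count as personal; this only shows the shape of the idea.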

00:05:55

So the key thing we want to be getting to with all of these pieces of automation and code is to be able to get our change flow well regulated and well controlled. We want to make it easy to get through any gates that the organization has. We see things like change advisory boards mentioned a lot in regard to deploying things to production, but again, it's 2021 now, and we're figuring out ways of aligning people's concerns so that we don't have to have a checkbox, or lots of people ticking off on a spreadsheet, to say that a release is good to go. We can do that with automated testing: browser-based testing, for example, for web apps, which we can do in an increasingly sophisticated manner these days. So we can test real-world scenarios. You've got a web app: can users log into it?

00:06:47

We can automate those sorts of tests, just as an example. And finally on this point, decoupling comes in here. We often talk about microservices and monoliths, and the key takeaway is that if you reduce the dependencies that individual components have, and if you make them independently deployable, then you can deploy them more easily. There's less friction. There are fewer things to go wrong, fewer interfaces, fewer moving parts, and that all helps us go faster and more reliably. So that's how DevOps happens. Let's talk about how we measure that. How do we actually measure the success of DevOps? Again, this is relatively straightforward in this day and age. We've had the State of DevOps report that has come out of DORA for the last six or seven years, probably, now. It's a well-established means of finding metrics that we can use to measure your success.

00:07:44

And the key thing here is that this report, and the science behind it, is now just evolutionary, not revolutionary. The metrics that the State of DevOps report highlights as being ones that indicate your organization is high-performing are nicely bedded in now. There's not really any need to change those, even though they're constantly revisited. So let's just revise what those are: the measures for DevOps success as defined by DORA and the Google Cloud people. Basically there are four of them, four things to measure. First, deployment frequency, because this shows whether your deployment processes are mature and whether the organization is willing enough to push change through. The more frequently you're deploying, the greater the indicator of a successful DevOps-style organization. Second, lead time for changes. This is all about seeing how long it takes from a change being decided on (a product owner, for example, coming up with some changes that they want to get onto a website or into an application) to those changes getting out there into production.

00:09:02

And the reason that's a key indicator of success is that it shows whether the teams are able to deploy changes without getting held up, and without any of the red tape that we traditionally associate with bureaucratic organizations. The third thing, then, is change failure rate. All these changes that we're now pushing through the system: how many of them fail? We used to "go fast and break things", a phrase attributed to Mark Zuckerberg at Facebook. We're maybe no longer quite that cavalier, but we're still going to have failures. If we're not having any failures, then we're probably going too slowly. But the percentage of failures shouldn't be that great: if we're deploying lots and lots and we have the occasional failure, that's probably all right. This shows whether there's maturity in testing ability, because every time we make a change, we should be testing that change.

00:09:54

The tests should have good enough coverage to reveal whether or not the change is going to be successful, and flag that before it gets anywhere near production. The final metric of DevOps success is the time to restore service. If something has gone wrong, if our aforementioned change has failed, then being able to respond quickly and get that service back online, either by rolling forward to a new software release, or by fixing a bug, or even by rolling back to a previous iteration, gives us an understanding of the organization's ability to swarm on problems and solve them. They're inevitably going to happen, and measuring those, and seeing what an organization does there, is a key metric.
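As a rough illustration of the four measures just described, here is a minimal sketch in Python that computes them from a list of deployment records. The `Deployment` record, its field names, and the choice of medians are assumptions made for this example; they are not defined by DORA or shown in the talk. Lead time is measured from `requested_at`, following the talk's description of a change "being decided".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """One production deployment (hypothetical record shape)."""
    requested_at: datetime               # when the change was decided on
    deployed_at: datetime                # when it reached production
    failed: bool = False                 # did the change cause a failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def four_key_metrics(deployments, window_days):
    """Summarize deployment frequency, lead time, failure rate, and restore time."""
    if not deployments:
        raise ValueError("need at least one deployment to compute metrics")
    total = len(deployments)
    failures = [d for d in deployments if d.failed]
    restore_times = [
        d.restored_at - d.deployed_at for d in failures if d.restored_at is not None
    ]
    return {
        "deployments_per_day": total / window_days,
        "median_lead_time": median(d.deployed_at - d.requested_at for d in deployments),
        "change_failure_rate": len(failures) / total,
        "median_time_to_restore": median(restore_times) if restore_times else None,
    }
```

In practice the records would be pulled from a CI/CD system and an incident tracker rather than built by hand; how a "failure" and a "restore" are defined is a decision each organization has to make for itself.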

00:10:43

So with all those things, again, I'm talking somewhat retrospectively, in that a lot of organizations have been there and done that. You know: yeah, we're cool, we're doing this stuff, no problem. Where we see some problems start to creep in is when you scale up a bit. These sorts of ideas work well in small teams, perhaps the two-pizza teams that we often talk about when we're looking at Team Topologies and scaling teams. What is the ideal size of a team? Well, one that could be fed with two pizzas, or name your junk food of choice. However, modern organizations generally need to scale up. Maybe it's a factor of success. And so the two-pizza team thing doesn't quite apply anymore. So we end up with teams forming, perhaps DevOps teams.

00:11:33

That's a team that you have within your organization now, which traditionalists would say is an anti-pattern, right? There's no such thing as a DevOps team; there's only a team that actually does DevOps as part of everything else that they do. And the final thing I want to mention here is platform teams. We sometimes find that if you have an organization that has many, many things to deploy, and they all look kind of similar and they all run on similar kit, then there are a lot of reasons to go off and make a platform team that can host all those things for the organization. You get the best common practice, and you get economies of scale, out of platform teams. But we can find problems starting to creep in.

00:12:18

And the rest of my talk is largely aimed at organizations where this sort of thing is starting to become a little bit of a problem. And there's no shame in admitting it. You get to a good maturity state in small teams in DevOps, and that's a hard thing to do. What's even harder is to make it scale up, and we'll talk about some of the problems that creep in when you try to do that. You start to find silos coming back. You find isolated departments or teams that are fulfilling specific functions within this whole DevOps remit, because we've now carved up the jobs that are involved in doing DevOps well: maybe doing CI, source code management, running environments. Perhaps those are actually done by a separate team, because we've got lots and lots of them and we don't want individual teams to bear that burden.

00:13:08

You start to get these anti-patterns creeping back in: silos and delays. We often talk about how specialisms are not really the problem; the problem is the interfaces between them. So if you have a highly trained team of devs, a highly trained team of operational people, and a highly trained team of infrastructure people, and they're not working together day in, day out, maybe they're communicating through queues, Jira tickets, service desks, et cetera. With those interfaces, you can lose meaning, you can add delay, you can add friction, and you lose flow. You also find that constraints can sometimes form between silos when working across teams, or in larger teams that don't just fit in one room. Platform teams can often suffer from overgeneralization, where we look at building a common platform to do everything for a large number of development teams, but they're not quite the same, and there are nuances for each one of them that maybe need to be ironed out. And the final thing I want to mention here is that conflicting objectives start to come back in. Again, platform teams, for example, get measured on availability, or the time it takes them to deploy a large environment that solves everyone's needs, whereas the development teams and the product teams are focused on delivering value to the customer. So straight away you have those conflicting objectives, which, if you're not careful, can start to cause problems again.

00:14:46

So all of this can lead to a reduction in flow, so things flowing through the system, and I mean the whole operational system, not just the technical bits, and to increased friction, and more trouble where people's timescales could be radically different across platform teams, DevOps teams, development teams, infrastructure teams, et cetera. So let's move on now and talk about how we find this value. How do we actually find a consolidated view of what value these teams are delivering, and how do we make sure that it all joins up together? So, a few ideas here. We want to look at flow, so flow of ideas. Again, the State of DevOps report talks about the metric of how long it takes to get an idea out into production.

00:15:45

So an idea is created, ideated, developed, tested, deployed, and out into production. That's the flow of our organization, and that's where we should be measured, because that's where we're actually delivering real value. There is intrinsic value, of course, in all the component parts: the awesome cloud platform, for example, the awesome piece of coding, the awesome product design, the comprehensive tests. But overall, it's all about this flow. In an agile organization, we want to get ideas flowing through the organization so people can see them and people can use them. A big surprise we often find is that the scope of these things is bigger than people think. If we're looking internally within our DevOps organization, or within our DevOps team, you can reasonably easily see the scope of that.

00:16:38

Get some code written, get it deployed, run it. But the scope, when you're looking at value streams, can often be a lot broader than that, extending right the way out to product owners, to people in the business, to the objectives that product owners want to get out to their customers, all the way through to getting this stuff deployed. Looking at that in a business context can often be very, very helpful, rather than just limiting it to the piece that we're concerned with. I now want to bring in situational awareness here as well: widening that scope and looking at the context of all the things that we're doing. The continuous integration, the cloud projects, Kubernetes, Lambda functions, whatever it is, it's all meaningful stuff, but it's much more meaningful if you can place it within the situation of the overall business.

00:17:35

So let's talk about mapping those streams and what that means. Basically, to do value stream mapping, we just make a map. Here's an example from IBM, where we look into how customers are actually deriving value from what we're doing, and how the things that we do could affect the value that they derive. So here's an example for a defect, and let's look at some of the things on here. We start on the left, where a customer opens a defect report, which maybe takes five minutes. If we look on the other side, we get a fix deployed to production, which again probably only takes a few minutes. But the time scales (I hesitate to call it a delay) are much larger in the middle: triaging problems, assigning severity, putting them into a queue for someone to do some work and fix it.

00:18:34

These things all take time. And not only do the things themselves take time, but the handoffs between them can take time, and the queues of work can take time. Here, for example, the big takeaway is the value in the yellow oval, which is that there's a week's delay between a problem being triaged and a fix being written. It's an old cliche that we sometimes wonder why on earth it takes so long to fix problems when that bit, just writing the code to fix a problem, takes five minutes. Or, in this example, let's make it putting in tests, making sure the fix is correct, peer review, all those good things: maybe that takes two days. But the key thing here is that delay of a week. So we can see here that it takes about two weeks to get this thing all the way from start to finish.
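The arithmetic behind a value stream map like this one can be sketched in a few lines of Python. The `Step` type and `summarize` helper are hypothetical names for this example, and the figures in `EXAMPLE_STEPS` are illustrative only: they loosely echo the numbers mentioned above (five minutes to open a report, a week's wait before the fix, two days to write it) but the remaining values are assumptions and are not read off the IBM map.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class Step:
    """One box on the map: hands-on time plus the wait before work starts."""
    name: str
    process_time: timedelta
    wait_before: timedelta = timedelta(0)


def summarize(steps):
    """Total up the stream and find where the longest queue sits."""
    process = sum((s.process_time for s in steps), timedelta(0))
    waiting = sum((s.wait_before for s in steps), timedelta(0))
    lead = process + waiting
    longest = max(steps, key=lambda s: s.wait_before)
    return {
        "lead_time": lead,
        "process_time": process,
        "flow_efficiency": process / lead,  # share of lead time spent actually working
        "longest_wait_before": longest.name,
    }


# Illustrative figures only (see the note above).
EXAMPLE_STEPS = [
    Step("open defect report", timedelta(minutes=5)),
    Step("triage and assign severity", timedelta(hours=1), wait_before=timedelta(days=1)),
    Step("write and review fix", timedelta(days=2), wait_before=timedelta(days=7)),
    Step("deploy to production", timedelta(minutes=10), wait_before=timedelta(days=3)),
]

if __name__ == "__main__":
    for key, value in summarize(EXAMPLE_STEPS).items():
        print(f"{key}: {value}")
```

The useful output is less the total than the comparison: once waits and hands-on time sit side by side, the handoff that dominates the lead time is hard to miss.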

00:19:35

Here's another value map. This is from the Lean Enterprise book by Jez Humble et al. I'm not going to go into the whole detail of this, but it shows just how much more detail you can add to a value map in order to find out how long things actually take. You end up focusing in not necessarily on how long it takes developers to write code, or how long it takes the product owner to write up a spec for something, but on the handoffs, the gaps between the silos in your organization. You start to see where these delays are actually coming in by making a map. And what we've found, and here's a quote from Steve Pereira from Visible, is that you can get improvements on this just by creating the map. You just write a map out and then, straight away (again somewhat anecdotally, but there's also some science behind this from Steve), you can get perhaps a 20% improvement just by creating that map and seeing what the consequences are of all those little delays that add up, or the big ones indeed. Once you're aware of them, then you can start working on fixing them.

00:20:46

I want to go a little bit deeper into maps, just for a few minutes, by talking about Wardley maps. Wardley maps are a map for business strategy. They were conceived by Simon Wardley back when he ran a photo processing website and application called Fotango. This is all about situational awareness, and it's about mapping out which things people should be working on, what's commoditized, and therefore shouldn't be worked on because you just buy it in, and working out where we are and where we want to get to. There's a whole load of science behind it, which is absolutely brilliant, and I recommend you go and look it up if you're at all interested in understanding how your own value streams within a technical organization map out to those of the business. In our context, the DevOps context, Wardley maps can help us with the direction and trajectory of what we're doing as a supplier. We're supplying DevOps to the rest of the company, hopefully in a fairly integrated fashion, but understanding where we sit in the global business, and also considering the functions that we perform as a business service, is really, really valuable, because you can look at where you've come from, where you want to get to, and what you need to change to get there, whether that's through developing new stuff or going and buying stuff off the shelf.

00:22:13

A simple example here: would you design and build a CI system when you can go and buy one in from any number of different vendors? You probably shouldn't, unless you're doing something really, really special with CI. A map, and especially a Wardley map, can help you figure those sorts of things out. Here's an example Wardley map, from Learn Wardley Mapping. You can also look at some great talks from an event called Map Camp, which is organized by Simon Wardley and his team and runs in London every couple of years. So yeah, there's a map which shows, basically, the situation of the business compared to where it wants to get to, what it should be buying in, what it should be developing, and what it needs to focus on.

00:23:00

That's great stuff. So there we go. The results of doing a mapping exercise should help you work out where you can add measurable value. It's very easy for platform teams and DevOps teams to do really cool stuff and innovate with new tools. There's lots of technology and tools out there which we all love to use and which we love to make the best of, but where is it actually adding value? I'm not saying we're not adding value, but if we can measure that we're adding value to the rest of the business and helping other people get their stuff done, that's surely a better position to be in. So again, yes, we look at what we can buy in. There are commoditized solutions for many, many of the technical challenges that we solve, or that we need to solve, for the rest of our organizations.

00:23:53

We should buy them in where it doesn't make any sense to build them. Maybe there is a differentiator: if we're building something because we have some particularly different needs, or perhaps we're building something that we ourselves want to sell on, then maybe that's the case for building. But often we want to buy, and mapping these things out can help us buy these things. And just to finish off on this slide: what are we building for others to consume? Let's be absolutely clear on what those things are, and that they aren't just a technical shopping list.
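The build-or-buy reasoning above can be written down as a deliberately crude rule of thumb. The sketch below, in Python, uses the four evolution stages from Wardley mapping (genesis, custom-built, product, commodity); the `build_or_buy` function and its parameters are hypothetical, and a real decision needs the full map and its context rather than a three-line rule.

```python
from enum import Enum


class Evolution(Enum):
    """The four stages along the evolution axis of a Wardley map."""
    GENESIS = 1
    CUSTOM_BUILT = 2
    PRODUCT = 3
    COMMODITY = 4


def build_or_buy(evolution, differentiator=False, resell=False):
    """Crude rule of thumb: buy what has become a product or commodity,
    unless it differentiates us or we intend to sell it on ourselves."""
    if differentiator or resell:
        return "build"
    if evolution in (Evolution.PRODUCT, Evolution.COMMODITY):
        return "buy"
    return "build"  # nothing suitable exists to buy yet
```

For the CI example from the talk, `build_or_buy(Evolution.COMMODITY)` suggests buying, while passing `differentiator=True` models the "really, really special" case where building may still be justified.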

00:24:28

So there we go. How are we going to put it all together? Let's just join all this stuff up. Here are some takeaways, some things that I think we need to be doing, that kind of summarize what I've talked about for the last few minutes. What we're here to do is fundamentally to help people work together. It seems like an obvious thing, but when people's needs and objectives are different, those sorts of things can be skewed. Maybe a developer is incentivized by the number of features that they add to a website or to an application. Maybe some ops people are incentivized by keeping those systems up. We've all heard that classic conundrum of "don't change anything, because it might go down and affect our numbers", but we have to change something in order to put new features out.

00:25:24

Our job in DevOps, or one of them, is to make sure that those sorts of frictions don't come to a head, and that we can always sing from the same hymn sheet. We have to understand where we are, and mapping is key here. Situational awareness: where does our team sit within the business as a whole? Where does it sit relative to other similar departments in other organizations? We need to understand what we're doing, why we're doing it, and how that's different. We often see models in play that we adopt from other organizations, and we can see that blindly adopting those models, whether in agile development, in DevOps, in cloud engineering, et cetera, without our own situational awareness is probably going to lead us to failure or to doing the wrong things.

00:26:18

So once we're aware of where we are, we can then work out where we need to be. Again, it sounds obvious: we're here, we want to be there, and these are the things we need to get from A to B. Those are generally going to be things that we can do, things that we can change: technologies we can implement, processes we can change, in order to get the most out of that flow of value. Because we're focusing on the value, the value that we're delivering to customers. And again, those customers could be internal or they could be external, but the broader you can think at a strategic level, the better that's going to be. So yeah, slim down what we're doing. Let's not go and write an alternative to Terraform. Terraform will do the job.

00:27:06

It will do it brilliantly. Fantastic. If it doesn't, then maybe we can buy in a product that will do that for us. We're not in the business of writing Terraform plans; we're in the business of delivering value to our customers. So don't work on things that don't add real value, or that we can buy in. External touch points are crucial: talking to our customers, whether that's the people who pay us money externally or the people within the organization who we're kind of cross-charging technical and process implementations to, because understanding their needs, and making sure that we're still on top of them, will always see us straight. And above all, let's not forget the key tenets of DevOps: feedback loops, great automation, making the best of our people. Those are the things that we need to keep on doing within this wider context of value streams, in order to help our organizations succeed.

00:28:07

So, as I said earlier, I work for Adaptavist. Here are some of us working happily away, and here are some more of us. There we are back in the pre-COVID days. I remember them; hopefully we'll get back to them soon. We've got a lot of stuff going on around digital transformation. For example, we have a program of helping organizations to develop an agile organization. So what does it take to develop a truly agile organization? Sorry, we're getting a bit sales pitch here, but this is the sort of stuff that we do really, really well. We help organizations become more agile. Decision sprints are something that we work a lot with, through our sister company Brew Digital, who have developed an awesome way of relieving analysis paralysis and bringing teams together so that you can make decisions on things that are maybe dragging on, where maybe we don't have consensus on something and haven't had for a long time.

00:29:03

So that's awesome work we can help you with: decision sprints. And finally, right about now, on the 7th of October (I'm prerecording this talk in September, but I think it's going to go out round about the same time), there's our DevOps value stream management with GitLab webinar, where Jobin from our professional services team is talking about how to do value stream management with GitLab. So I recommend you sign up for that webinar on our website or, if this goes out after the webinar, you'll be able to see a recording of it. I'm expecting great things from that. It's a great tool, and there's a lot of value to be derived from it. And that's about it, really. So my name's Matt Saunders, and I'm from Adaptavist. As I said, we do all this great stuff. We're also an Atlassian Platinum Partner, so we work heavily with Jira, Confluence, those sorts of tools. And we also partner with companies like GitLab, Sonatype, et cetera, to help organizations absolutely get the most out of the tools, processes, and people, those three key DevOps tenets. So that's it. Thank you for listening. Hope you enjoyed it. Goodbye.