A Guiding Map for DevOps

We're getting good at DevOps, with work on maintaining our code in version control, writing good tests, running intelligent CI pipelines, and being able to deploy regularly meaning that the gulf between Dev and Ops has never been narrower. But is this enough? Does doing this well and building competencies in this area shine a light on other problems, and does doing DevOps well always help the whole business improve? I'll review where we've got to today, how it relates to the rest of the business, and then explore the link between DevOps and Value Streams. Finally, we'll explore some ideas of how it all fits together and what to do next.


This session is presented by Adaptavist.

Matt Saunders

Head of Internal DevOps, Adaptavist

Transcript

00:00:00

<silence>

00:00:14

Hello there. My name's Matt Saunders. I'm with Adaptavist. Thank you for joining us on this presentation, where I'm going to talk about our guiding map for DevOps. So I'm the head of DevOps at Adaptavist, but I'll spare you my talking about Adaptavist for now. I'll talk about them at the end. So what are we gonna go through in this session? There is a fairly well established pattern of how to do DevOps well. We've been in the DevOps era for over a decade now, and there's some pretty good information out there, lots of sources on how to do DevOps well. I'm gonna go through some of these because that's the prerequisite for what I wanna talk about, the bread and butter of the presentation, a bit later on. So we're gonna go through DevOps, what doing DevOps well looks like, and how we measure that, so how we measure succeeding at DevOps. Then I'm gonna take a slightly different direction and look at how we find the value of what we're contributing when we do DevOps. Then we're gonna talk about value streams, Wardley mapping, and mapping the streams. And right at the end, I'll talk about how to put it all together in one big coherent whole, hopefully.

00:01:24

So let's start by going into what we mean by doing DevOps well. I think it comes down to basically three things. Firstly, having good source code management and continuous integration. The second thing is having good access to environments. And the third thing is getting changes flowing smoothly through your organization. So let's dig into those in a little bit more detail. So, good source code management and continuous integration. This should be a panacea that everyone is aiming for, and we think a lot of organizations have reached this. In my adventures as a consultant with Adaptavist, and also working internally within Adaptavist, we like to think we're doing this reasonably well, and we see this done fairly well across many organizations out there. So, the three tenets of good SCM and CI. Firstly, keeping everything in version control systems, 'cause it's never too soon to have a peer review of your data.

00:02:26

And we can tie that sort of stuff back to sprint planning. So if everything's in the version control system, we can work out exactly what we're going to do upfront with our sprint planning, and then go and do it and see the results in our SCM. That lets us then build continuously. We like to do continuous builds in a DevOps world because smaller incremental changes are easier to manage. We're getting away from that waterfall environment where lots and lots of changes would go into a single release, and because nobody knew how those changes would interoperate, when the build inevitably broke, we wouldn't really know what caused it. So we can find and fix divergences very, very early on by doing continuous integration. The third thing here is about being able to deploy at will. So we can test and check our changes as soon as we possibly can after they've been written, after they've been integrated, so that we can tighten those feedback loops.

00:03:21

If we go back to Gene Kim's three ways of DevOps, one of the key things here is being able to test these things. It's being able to get that software out there, either in front of users, or in front of a developer who's seeing the results of their work, or a tester. Let's tighten those feedback loops so that we can see if everything went well, and if not, we can adjust and go again. Access to environments is the second tenet here of good DevOps. We want to allow devs access to environments, and by that what I mean is an environment that looks a bit like, or a lot like, your production environments. Gone are the days where having a production environment alone, and a developer's laptop where changes were tested, was good enough.

00:04:06

What we want to be doing here, and what we're seeing increasing consumption of, is having prebuilt or on-demand environments that mimic production, maybe not quite on the same scale, but good enough that a dev can use an environment with everything mocked out for the things that he or she is not already using, and that are dedicated to their cause. We don't want to make our devs queue waiting for environments either; that harms flow. So again, anti-patterns that we see here are things like a single dev environment with multiple developers waiting in line to test their changes out. Triggering these sort of things from CI environments is now something that people should be doing. Production-like: well, it's difficult, but it's necessary to make environments production-like. And what I mean by that is that often we find that production environments are built with a combination of automation and some other stuff: some hand-coded stuff over here, some things that were clicked on in an Amazon Web Services console over there.

00:05:11

Those sort of things make it difficult to replicate production environments. We want to be doing everything as code, because if everything is code, then we can replicate those things. Other issues we find are with scale. Perhaps our production environment has got a hundred million entries in the production database. If we want to be testing our systems out, then we possibly don't need all those 100 million entries. Maybe there's personal information in them as well, so you have to look at data masking too, so that personal information isn't replicated further than it needs to be. So getting those environments to an appropriate size and scale is really, really important.
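As a rough illustration of the sizing and data masking point, here is a minimal Python sketch that takes a small, reproducible sample of production-like rows and masks the personal fields. The field names (`name`, `email`), the helper names, and the unsalted hash are all assumptions made for the example; a real setup would follow your own schema and a properly reviewed anonymization scheme.

```python
import hashlib
import random

# Hypothetical column names; a real schema will differ.
PERSONAL_FIELDS = {"name", "email"}


def mask_value(value: str) -> str:
    """Replace a personal value with a stable token.

    Assumption: an unsalted hash is enough for this illustration.
    Real masking would need a salted or keyed scheme.
    """
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return f"masked-{digest[:12]}"


def mask_row(row: dict) -> dict:
    """Return a copy of the row with personal fields masked."""
    return {
        key: mask_value(str(val)) if key in PERSONAL_FIELDS else val
        for key, val in row.items()
    }


def build_test_dataset(rows: list, sample_size: int, seed: int = 0) -> list:
    """Take a reproducible random sample of rows and mask each one."""
    rng = random.Random(seed)
    sample = rng.sample(rows, min(sample_size, len(rows)))
    return [mask_row(row) for row in sample]


# Stand-in for a much larger production table.
production = [
    {"id": i, "name": f"User {i}", "email": f"user{i}@example.com", "plan": "pro"}
    for i in range(1000)
]
test_data = build_test_dataset(production, sample_size=50)
print(len(test_data), test_data[0])
```

The non-personal columns pass through untouched, so tests that depend on them still behave as they would against production data.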

00:05:55

So a key thing we want to be getting to with all of these pieces of automation and code is to be able to get our change flow well regulated and well controlled. We want to be able to make it easy to get through any gates that the organization has. We see things like change advisory boards mentioned a lot in regards to deploying things to production, but again, it's 2021 now, and we're figuring out ways of allaying people's concerns so that we don't have to have a checkbox, or lots of people ticking off on a spreadsheet that a release is good to go. With automated testing, browser-based testing for example for web apps, which we can do in an increasingly sophisticated manner these days, we can test real-world scenarios. You've got a web app: can users still log into it? We can automate those sort of tests, just as an example. And finally on this point, decoupling comes in here. We often talk about microservices and monoliths, and the key takeaway here is that if you reduce the dependencies that individual components have, and if you make them independently deployable, then you can deploy them more easily. There's less friction, there are fewer things to go wrong, fewer interfaces, fewer moving parts, and that all helps us go faster and more reliably into DevOps heaven.

00:07:21

So let's talk about how we measure that. How do we actually measure the success of DevOps? Again, this is relatively straightforward in this day and age. We've had the State of DevOps report that has come out of DORA for the last six or seven years probably now. It's a well established means of finding metrics that we can use to measure your success. And the key thing here is that this report and the science behind it is now just evolutionary, not revolutionary. The metrics that the State of DevOps report highlights as being ones that indicate that your organization is high performing are nicely bedded in now. There's not really any need to change those, even though they're constantly revisited. So let's just revise what those measures for DevOps success are.

00:08:14

As defined by DORA and the Google Cloud people, there's basically four of them, four things to measure. Deployment frequency, because this shows whether your deployment processes are mature and if the organization is willing enough to push change through. The more frequently you are deploying, the greater the indicator of a successful DevOps-style organization. Lead time for changes: this is all about seeing how long it takes from a change being decided on (a product owner, for example, coming up with some changes that they want to get onto a website or into an application) to those changes getting out there into production. And the reason that's a key indicator of DevOps success is because it shows whether the teams are able to deploy changes without getting held up and without any of the red tape that we traditionally associate with bureaucratic organizations.

00:09:15

The third thing then is the change failure rate. So of all these changes that we're now pushing through the system, how many of them fail? We used to go fast and break things, attributed to Mark Zuckerberg at Facebook. We're maybe no longer quite that cavalier, but we're still gonna have failures. If we're not having any failures, then we're probably going too slowly, but the percentage of failures shouldn't be that great. If we're deploying lots and lots and we have the occasional failure, then that's probably all right. This shows up whether there's a maturity in testing ability, because every time we make a change, we should be testing that change. The tests should have good enough coverage to reveal whether or not the change is gonna be successful, and flag that before it gets anywhere near production.

00:10:04

And the final metric of DevOps success is the time to restore services. So if something has gone wrong, if our aforementioned change has failed, then if we're able to respond quickly and get that service back online, either by rolling forward to a new software release, or by fixing a bug, or even rolling back to a previous iteration, then that gives us an understanding of the organization's ability to swarm on problems and solve them. They're inevitably going to happen, and measuring those and seeing how well an organization does there is a key metric.
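To make the four measures concrete, here is a small Python sketch that computes them from a list of deployment records. The `Deployment` record, its field names, and the sample data are hypothetical; where you start the lead-time clock (at the decision, as described in the talk, or at the commit) is a choice you would make for your own organization.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    started_at: datetime                    # when the change was agreed or committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a failure in production?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def four_key_metrics(deployments: list, period_days: int) -> dict:
    """Summarize the four measures over a reporting period."""
    if not deployments:
        raise ValueError("need at least one deployment")
    lead_times = [d.deployed_at - d.started_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    restore_times = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "deploys_per_day": len(deployments) / period_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "median_time_to_restore": median(restore_times) if restore_times else None,
    }


# Illustrative data only: four deployments in one week, one of which failed.
base = datetime(2021, 9, 1, 9, 0)
deployments = [
    Deployment(base, base + timedelta(hours=4)),
    Deployment(base + timedelta(days=1), base + timedelta(days=1, hours=2)),
    Deployment(
        base + timedelta(days=2),
        base + timedelta(days=2, hours=6),
        failed=True,
        restored_at=base + timedelta(days=2, hours=7),
    ),
    Deployment(base + timedelta(days=3), base + timedelta(days=3, hours=8)),
]
metrics = four_key_metrics(deployments, period_days=7)
for name, value in metrics.items():
    print(name, value)
```

Medians are used here rather than means so that one unusually slow change does not dominate the picture; that is a design choice for the sketch, not something the talk prescribes.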

00:10:43

So all those things, again, I'm talking somewhat retrospectively, in that a lot of organizations have been there and done that. You know, yeah, yeah, we're cool, we're doing this stuff, no problem. Where we see some problems start to creep in is when you scale up a bit. So these sort of ideas work well in small teams, perhaps the two-pizza teams that we often talk about when we're looking at team topologies and scaling teams. What is the ideal size of a team? Well, one that could be fed with two pizzas, or name your junk food of choice. However, modern organizations generally need to scale up; maybe it's a factor of success. And so the two-pizza team thing doesn't quite apply anymore. So we end up with teams forming, perhaps DevOps teams.

00:11:33

That's a team that you have within your organization now, which traditionalists would say is an anti-pattern. There ain't no such thing as a DevOps team; there's only a team that actually does DevOps as part of everything else that they do. And the final thing I wanna mention here is platform teams. So we sometimes find that if you have an organization that has many, many things to deploy, and they all look kind of similar and they all run on similar kit, then there are a lot of reasons to go off and make a platform team that could host all those things for the organization. You get the best common practice, and you get economies of scale out of platform teams. But we can find problems starting to creep in. And the rest of my talk is largely aimed at organizations where this sort of thing is starting to become a little bit of a problem, and there's no shame in admitting it.

00:12:27

You get to a good maturity state in small teams in DevOps, and that's a hard thing to do. What's even harder is to make it scale up. And we'll talk about some of the problems that creep in when you try to do that. You start to find silos coming back. You find isolated departments or teams that are fulfilling specific functions within this whole DevOps remit, because we've now carved up the jobs that are involved in doing DevOps well. Maybe doing CI, source code management, running environments, perhaps they're actually done by a separate team because we've got lots and lots of them, and we don't want individual teams to bear that burden. You start to get these anti-patterns creeping back in. So silos, delays. We often talk about how specialisms are not really the problem.

00:13:20

The problem is the interfaces between them. So if you have a highly trained team of devs, and a highly trained team of operational people, and a highly trained team of infrastructure people, if they're not working together day in, day out, and maybe they're communicating through queues, Jira tickets, service desks, et cetera, then you can lose meaning in those interfaces. You can add delay, you can add friction, and you lose flow. You also find that constraints can sometimes fall between stools when working across teams, or in larger teams that don't just fit in one room. Platform teams can often suffer from over-generalization, where we look at building a common platform to do, well, everything for a large number of development teams, but they're not quite the same. There are nuances for each one of them that maybe need to be ironed out. And the final thing I want to mention here is that conflicting objectives start to come back in again. Platform teams, for example, get measured on availability again, or the time it takes them to deploy a large environment that solves everyone's needs, whereas the development teams and the product teams are focused on delivering value to the customer. So straight away you have those conflicting objectives, which, if you're not careful, can start to cause problems again.

00:14:46

So all of this can lead to a reduction in flow, so things flowing through the system, and I mean the whole operational system, not just the technical bits. It also leads to increased friction, and more trouble where people's timescales can be radically different across platform teams, DevOps teams, development teams, infrastructure teams, et cetera.

00:15:17

Let's move on now, and let's talk about how we find this value. How do we actually find a consolidated view of what the value actually is that these teams are delivering? And how do we make sure that it all joins up together? So, a few ideas here. We want to look at flow, so flow of ideas. Again, the State of DevOps report talks about the metric of how long it takes to get an idea out into production. So an idea created, ideated, developed, tested, deployed and out into production. That's the flow of our organization, and that's where we should be measured, because that's where we're actually delivering real value. There is intrinsic value, of course, in all the component parts: the awesome cloud platform, for example, the awesome piece of coding.

00:16:08

The awesome product design, the comprehensive tests. But overall, it's all about this flow. In an adaptive organization, in an agile organization, we want to get ideas flowing through the organization so people can see them and people can use them. A big surprise we often find is that the scope of these things is bigger than people think. So if we're looking internally within our DevOps organization, or within our DevOps team, you can reasonably easily see the scope of that. It's get some code written, get it deployed, and run it. But scope, when you're looking at value streams, can often be a lot broader than that, extending right the way out to product owners, to people in the business, to the objectives that product owners want to get out to their customers, all the way through to getting this stuff deployed. And looking at that in the business context can often be very, very helpful, rather than just limiting it to the piece that we're concerned with, or think we're concerned with. And I wanna bring in situational awareness here as well. So widening that scope and looking at the context of all the things that we're doing: the continuous integration, the cloud projects, Kubernetes, Lambda functions, whatever it is. It's all meaningful stuff, but it's much more meaningful if you can place it within the situation of the overall business.

00:17:35

So let's talk about mapping those streams and what that means.

00:17:41

So basically, when we do value stream mapping, we just make a map. Here's an example from IBM, where we look into how customers are actually deriving value from what we're doing, and how the things that we do could affect the value that they derive. So here's an example for a defect, and let's have a look at some of these things on here. We go from the left, where a customer opens a defect report, which maybe takes five minutes. And if we look on the other side, we can get a fix deployed out to production, which again probably only takes a few minutes. But, and I hesitate to call it a delay, the timescales are much larger in the middle. So triaging problems, assigning severity, adding them to a queue for someone to do some work and fix it.

00:18:34

These things will take time, and not only do the things themselves take time, but the handoffs between them can take time, and the queues of work can take time. Here, for example, the big takeaway is the value in the yellow oval, which is that there's a week's delay between triage of a problem and a fix being written. It's an old cliche that we sometimes wonder why on earth it takes so long to fix problems when just writing the code to fix a problem only takes five minutes. Or in this example, let's make it include putting in tests, making sure the fix is correct, peer review, all those good things; maybe that takes two days. But yeah, the key thing here is the delay of a week. So we can see here that it takes about two weeks to get this thing all the way from start to finish.
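The arithmetic behind a map like this can be sketched in a few lines of Python. The `Step` record, the step names, and most of the numbers below are assumptions for illustration; only the five-minute report, the week's wait before the fix, the two-day fix, and the few-minute deploy loosely follow the example described above.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    work_hours: float  # elapsed time while someone is actively working on the item
    wait_hours: float  # time the item sits in a queue before this step starts


def summarize(steps: list) -> dict:
    """Total up work and waiting time and find the longest queue."""
    if not steps:
        raise ValueError("need at least one step")
    work = sum(s.work_hours for s in steps)
    wait = sum(s.wait_hours for s in steps)
    total = work + wait
    longest_wait = max(steps, key=lambda s: s.wait_hours)
    return {
        "total_hours": total,
        "work_hours": work,
        "wait_hours": wait,
        # Share of the elapsed time in which work was actually happening.
        "flow_efficiency": work / total if total else 0.0,
        "longest_wait_before": longest_wait.name,
    }


# Illustrative numbers only.
defect_stream = [
    Step("open defect report", work_hours=5 / 60, wait_hours=0),
    Step("triage and assign severity", work_hours=1, wait_hours=48),
    Step("write, test and review fix", work_hours=48, wait_hours=7 * 24),
    Step("deploy to production", work_hours=0.25, wait_hours=24),
]
summary = summarize(defect_stream)
print(summary)
print(f"about {summary['total_hours'] / 24:.0f} days end to end")
```

With these made-up figures, most of the elapsed time is waiting rather than working, and the longest queue sits in front of the fix, which is the pattern the map is meant to expose.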

00:19:34

Here's another value map. This is from the Lean Enterprise book by Jez Humble et al. I'm not gonna go into the whole detail of this, but it shows just how much more detail you can add to a value map in order to find out how long things actually take. And you end up focusing in on not necessarily how long it takes developers to write code, or how long it takes a product owner to write a spec for something, but the handoffs, the gaps between the silos in your organization. You start to see where these delays are actually coming in by making a map. And what we found, and this is a quote from Steve Pereira from Visible, is that you can get improvements on this just by creating the map. You just write a map out and then straight away (again, somewhat anecdotally, but there's also some science behind this from Steve) you can get perhaps a 20% improvement just by creating that map and seeing what the consequences are of all those little delays that add up, or the big ones indeed. And once you're aware of them, you can start working on fixing them.

00:20:46

I want to go a little bit deeper into maps just for a few minutes by talking about Wardley maps. So Wardley maps are a map for business strategy. The idea was conceived by Simon Wardley back when he ran a photo processing website and application called Fotango. This is all about situational awareness, and it's about mapping out which things people should be working on, and what's commoditized, and therefore you shouldn't be working on it; you just buy that in. And it's about working out where we are and where we want to get to. There's a whole load of science behind it, which is absolutely brilliant, and I recommend you go and look that up if you're at all interested in understanding how your own value streams within a technical organization map out to those of the business. In our context, in a DevOps context, Wardley maps could help us with the direction and trajectory of what we're doing as a supplier.

00:21:45

We're supplying DevOps to the rest of the company, hopefully in a fairly integrated fashion. But understanding where we sit in the global business, and also considering the functions that we perform as a business service, is really, really valuable, because you can look at where you've come from, where you want to get to, and what you need to change to get there, whether that's through developing new stuff or going and buying some stuff off the shelf. A simple example here is: would you design and build a CI system when you can go and buy one in from any number of different vendors? You probably shouldn't, unless you're doing something really, really special with CI, of course. And a map, especially a Wardley map, can help you figure those sort of things out.

00:22:31

Here's an example Wardley map from Learn Wardley Mapping. You can also look at some great talks from an event called Map Camp, which is organized by Simon Wardley and his team and runs in London every couple of years. So yeah, there's a map which shows basically the situation of the business compared to where it wants to get to, what they should be buying in, what they should be developing, and what they need to focus on. It's great stuff. So there we go. So the results of doing a mapping exercise should help you work out where you can add measurable value. It's very easy for platform teams, for DevOps teams, to do really cool stuff and innovate with new tools. There's lots of technology and tools out there which we all love to use and which we love to make the best of.

00:23:25

But where is it actually really adding value? I'm not saying we're not adding value, but if we can measure that we're adding value to the rest of the business and helping other people get their stuff done, that's surely a better position to be in. So again, yes, we look at what we can buy in. There are commoditized solutions for many, many of the technical challenges that we solve, or that we need to solve, for the rest of our organizations. We should buy them in where it doesn't make any sense to build them. Maybe there is a differentiator: if we're building something because we have some particularly different needs, or perhaps we're building something that we ourselves want to sell on, then maybe that's the case for building. But often we want to buy, and mapping these things out can help us buy these things. And just to finish off on this slide, what are we building for others to consume? Let's be absolutely clear on what those things are, and that they aren't just a technical shopping list.
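One way to picture the build-or-buy reasoning is a toy Python sketch that places components on the four evolution stages used in Wardley mapping (genesis, custom-built, product, commodity) and suggests an action. The decision rule, the `Component` record, and the placements of the example components are all assumptions for illustration; a real Wardley map also captures the value chain and how components move over time, which this sketch ignores.

```python
from dataclasses import dataclass
from enum import IntEnum


class Evolution(IntEnum):
    GENESIS = 1
    CUSTOM_BUILT = 2
    PRODUCT = 3
    COMMODITY = 4


@dataclass
class Component:
    name: str
    evolution: Evolution
    differentiator: bool = False  # special needs, or something we want to sell on


def suggest(component: Component) -> str:
    """Very rough rule of thumb, not a substitute for an actual map."""
    if component.differentiator:
        return "build"
    if component.evolution >= Evolution.PRODUCT:
        return "buy"
    # Not a differentiator and nothing mature to buy: question whether it is needed.
    return "review"


# Hypothetical placements; where things sit depends on your own situation.
components = [
    Component("CI system", Evolution.PRODUCT),
    Component("infrastructure provisioning tool", Evolution.PRODUCT),
    Component("customer-facing feature", Evolution.CUSTOM_BUILT, differentiator=True),
    Component("in-house deploy scripts", Evolution.CUSTOM_BUILT),
]
for c in components:
    print(f"{c.name}: {suggest(c)}")
```

The differentiator check comes first on purpose: in this sketch, something that sets you apart is worth building even if off-the-shelf options exist.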

00:24:28

So there we go. So how are we gonna put it all together? Let's just join all this sort of stuff up. So here's some takeaways, some things that I think we need to be doing, that are kind of summarized from what I've talked about for the last few minutes. What we're here to do is fundamentally to help people work together. It seems like an obvious thing, but when people's needs and objectives are different, those sort of things can be skewed. Maybe a developer is incentivized by the amount of features that they add to a website or to an application. Maybe some ops people are incentivized by keeping those systems up. We've all heard that classic conundrum of don't change anything because it might go down and affect our numbers, but we have to change something in order to put new features out. Our job in DevOps, or one of them, is to make sure that those sort of frictions don't come to a head, and that we can all sing from the same hymn sheet there.

00:25:35

We have to understand where we are. Mapping is key here, and situational awareness. Where does our team sit within the business as a whole? Where does it sit relative to other similar departments in other organizations? We need to understand what we're doing, why we're doing it, and how that's different to others. We often see models in play that we adopt from other organizations, and we can see that blindly adopting those models, either in agile development or in DevOps, cloud engineering, et cetera, without our own situational awareness is probably going to lead us to failure or to doing the wrong things. So once we're aware of where we are, we can then work out where we need to be. Again, sounds obvious. We are here, we want to be here. These are the things we need to get from A to B.

00:26:29

And those things are generally going to be things that we can do, things that we can change, technologies we can implement, processes we can change in order to get the most out of that flow of value, 'cause we're focusing in on the value that we're delivering to customers. And again, those customers could be internal or could be external, but the broader you can think at a strategic level, the better that's gonna be. So yeah, slim down what we're doing. Let's not go and write an alternative to Terraform. Terraform will do the job. It will do it brilliantly. Fantastic. If it doesn't, then maybe we can buy in a product that will do that for us. We're not in the business of writing Terraform plans. We're in the business of delivering value to our customers.

00:27:22

So don't work on things that don't add real value or that we can buy in. External touch points are crucial: talking to our customers, either the people who pay us money externally, or the people who we're kind of sort of cross-charging technical and process implementations to within the organization, because making sure that we're still on top of their needs will always set us straight. And above all, let's not forget the key tenets of DevOps: the feedback loops, great automation, making the best of our people. Those are the things that we need to keep on doing within this wider context of value streams in order to help our organizations succeed.

00:28:07

So as I said earlier, I work for Adaptavist. Here are some Adaptavists working happily away, and here's some more of them. There we are back in the pre-COVID days. I remember them. Hopefully we'll be back to them soon. We've got a lot of stuff going on around digital transformation. For example, we have a program of helping organizations to develop an agile organization. So what does it take to develop a truly agile organization? Sorry, we're getting a bit sales-pitchy here, but this is the sort of stuff that we do really, really well. We help organizations become more agile. Decision sprints is something that we work a lot with through our sister company, Brew Digital. We've developed an awesome way of relieving analysis paralysis and bringing teams together so that you can make decisions on things that are maybe dragging on, where maybe we don't have consensus on something and haven't had for a long time.

00:29:03

Some awesome work we can help you with in decision sprints. And finally, right about now, on the 7th of October (I'm pre-recording this talk in September, but I think this talk's gonna go out round about the same time) is our DevOps value stream management with GitLab webinar, where Bin from our professional services team is talking about how to do value stream management with GitLab. So I recommend you sign up for that webinar on our website. If this goes out after the webinar, then you'll be able to see a recording of it. I'm expecting great things from that. It's a great talk, and there's a lot of value to be derived from it. And that's about it really. So my name's Matt Saunders, from Adaptavist. As I said, we do all this great stuff. We're also an Atlassian Platinum partner, so we work heavily with Jira, Confluence, those sort of tools. And we also partner with companies like GitLab, Sonatype, et cetera, to help organizations absolutely get the most out of the tools and processes and people, those three key DevOps tenets. So that's it. Thank you for listening. Hope you enjoyed it. Goodbye.