Las Vegas 2020

DevOps 2020 Re:Think!

John has over 35 years of experience, focusing on IT infrastructure and operations. He has helped early startups such as Chef, Enstratius (now Dell), and Docker navigate the "DevOps" movement.


He is one of the original core organizers of DevOpsDays and has been a prominent keynote speaker at various DevOps events throughout the years.


He is also a co-author of The DevOps Handbook along with Gene Kim, Jez Humble, and “the Godfather” of DevOps, Patrick Debois.

John Willis

Senior Director, Global Transformation Office, Red Hat

Transcript

00:00:12

Hi everybody. This is John Willis. The presentation's called DevOps 2020 Re:Think. This was a collaboration with a couple of my coworkers. I'll talk about them, but they are Jabe Bloom, Andrew Clay Shafer, and Kevin Behr, so I just want to make sure there's credit for some of the slides they collaborated with me on. I called it DevOps 2020 to set the stage: over the last 10 years, we've done a really good job in DevOps. So the question is, what are we going to do now? It's the start of the next decade. There are some areas I've been thinking about that I think we need to delve into a little deeper, and the three areas I'm going to focus on are what I call organizational conversations, organizational design, and something that we're calling, in my team.

00:01:04

And I'll explain my team in a minute: the three economies. So this is my team. I started at Red Hat last October, 2019. That's Andrew Clay Shafer; he's my boss. That's Kevin Behr next to him. And that's me, the small guy. And then there's Jabe Bloom, who I've been working with. He's getting a PhD in transition design, so a lot of these ideas that I'm pointing out really come from him, about how design and transition design apply. As Andrew likes to say, we wrote some books: I was a coauthor of The DevOps Handbook and a coauthor of Beyond the Phoenix Project, Kevin coauthored The Phoenix Project, and Andrew coauthored Web Operations and some of Site Reliability Engineering. So here's the deal, right? This is sort of the Pete Cheslock joke.

00:01:53

We come to find out this slide was actually originally created by Patrick Debois; all things lead back to Patrick. But if you think about the last 10 years, we've been this unicorn poop, if you will, on the enterprise. And I mean that in the best possible way, right? Like DevOps Enterprise Summit, all the accomplishments we've made. But it's been a struggle. Starting off, the first conversations were, "I don't think the enterprise can do DevOps." And then it was, apply the security things around DevOps. And, you know, here we are at 2020; I think we've got that pretty much solved, in terms of everybody has the memo. But if we look at digital transformation, by all accounts this conversation around digital transformation has been creeping up for the last couple of years as a modern discussion, though it's been around forever.

00:02:42

You see a lot of stories about failures, different reports, different studies. This one in particular says 70% of all digital transformations fail. So, sort of jokingly, maybe the next 10 years is digital transformation that the unicorn poop is going to hit. And I'm half joking, because the truth of the matter is we do have to have a better, stronger, bigger conversation about what we're doing. And again, most of the people at this conference, you've all been doing that. It's for the people that we need to educate. So Andrew came up with this idea of five elements; that's Andrew Clay Shafer, my boss, my coworker. I'll talk about it in a little bit, but think about the five failures, right? Like leadership, right?

00:03:31

You know, again, the people that we interact with today and at this conference, we tend to be better than this, but in a majority of the organizations out there, leadership is still preventing change, either in the form of governance and risk or just in general. Business and product are building things that don't matter; we really still haven't gotten Eric Ries's memo from The Lean Startup. Development in a lot of cases is still building the wrong things. Architecture is either designing, or not even involved in the decision of the design, so they're building the things wrong. And then in operations, we still have a split mindset on incidents and outages: half in on service management, half in on DevOps or some of the newer thoughts about incident management. So what I wanted to propose is some of the areas that I've been thinking about as this rethink. We're sitting at 2020. We could say, like I said earlier, that we've done a really good job, and we have. We should all collectively, and I'm being serious, pat ourselves on the back. We've done a tremendous job in the industry, improving companies, improving people's lives.

00:04:43

But the question now, at 2020, is what are we going to do for the next decade? Because if we're still talking about GitOps and CI/CD five years from now, then we will probably have failed miserably, and the digital transformation discussion will have overtaken us. So we really need to start thinking about how we improve. If you go back five years, a conversation about continuous delivery was a novel idea; today, it's table stakes. So we don't want these conversations to be it. We want the new things to be table stakes. So the first thing I want to talk about is something I've been doing for the last three or four years. I've called it a lot of things; we're just going to call it organizational conversations. And this is where I've gone into larger organizations.

00:05:27

And really just literally spoken to hundreds and hundreds of people. I usually come in at the CIO level, and usually it's a champion inside the organization who says, "You should talk to this guy, John, he sort of knows what he's doing." Then I get to talk to the CIO, and I convince the CIO to let me just have conversations. And the conversations I want are with the people at the edge, the people who put their fingers on the keyboard, because I'm more interested in that form of discussion than I am in talking to leadership. Not top down; I want to go bottom up. I've done this over the last few years at very large banks and insurance companies. And one of the things I came up with is this quote: you can't Lean, Agile, SAFe, or even DevOps your way out of a bad organizational culture.

00:06:11

So the idea is that we have these ideas like Lean, Agile, and SAFe, and they're great frameworks, great pattern and practice tools for us. But if we don't get to the bottom of how things really work, the truth, and have the real conversations with the people doing the work, these things can actually give us false truths. Over the last few years, I've had this thing that I call the seven deadly sins of DevOps. I won't go into it in detail, but there are patterns that you find when you have these conversations. One of the most interesting is that they all seem to funnel down into what I call security and compliance theater. In other words, your audits are basically nonsense. I've got full presentations on this and on some of the work I've been doing on automated governance and automated cloud governance.

00:07:01

Again, you can just look me up; you'll find that. So I love this story about Abraham Wald, which was actually a story that Sidney Dekker told at one of the DevOps Enterprise Summits. I've seen him give this presentation a few times. During World War II, there was a set of scientists and mathematicians, specifically statisticians, who were looking at how to do the proper repairs of fighter planes that would come back. They'd figure out where the bullet holes are, the weight ratios, and all that. And at one point, Abraham Wald had this aha moment. He said, we've got it wrong. What we're doing is repairing and fixing where the bullet holes are, but those are the planes that are coming back. What we need to do is look at where the bullet holes aren't, because the planes hit there are the ones that aren't coming back.

00:07:51

Right. And it actually was sort of the original definition of survivorship bias. Sidney Dekker says that we don't need to look at the absence of negatives; we need to look for the presence of capacity, of things that go right. So I use this in this whole organizational conversation dialogue. Eliyahu Goldratt wrote The Goal, as most of you know; The Phoenix Project was a modern-day rewrite of The Goal. Fantastic stories, both. About 20 years after he wrote The Goal, he did an audio-only project called Beyond the Goal. In one part of that, he talks about complex systems and complex adaptive systems. What he says is, if you look at these two systems, system A and system B, and ask people in general which one is more complicated, most people would say system B.

00:08:42

But if you ask a physicist, or somebody who really understands complex systems, they're more likely to say system A, because it allows more degrees of freedom. So when I go and have conversations with customers, working for the CIO but literally talking to what I call the edge, the people who are doing the work, they tend to want to give me system B answers. They want to say, "That works," or "My CMDB is fine, John, don't worry about it." And what I really need to do is get beyond that, to the truth. So I like to use this cartoon of the "this is fine" dog. What people are constantly telling me is, don't worry, this is fine. And once you earn their trust, or you create an open, collaborative dialogue, a psychologically safe environment, what you actually wind up getting to is the real conversations, where you're finding the places that are really on fire.

00:09:33

And what's interesting is when you get that psychological safety and trust, people will tell you the most fantastic workarounds and the real fire stories, which really are the system A discussions that I'm looking for. The Equifax breach in 2017 is a classic example of a system A versus system B conversation. For those who know this, it was a library called Struts 2, and in there was a Jakarta parsing module. Simply, if you did a curl against a system that had that library, chances are you could compromise that system with this little command here. When it was all said and done, the CEO said, we know what was wrong: the breach was basically a single person who failed to deploy the patch. That's a system B answer, right?

00:10:21

But when you go in, first off, in 2018 Congress did an oversight report on the breach, and it had tons of complex problems and systems. For one, the chief security officer reported to the chief legal officer. So when the chief security officer was asked, under congressional review, "Why didn't you notify the CEO of the breach?", the answer was, "I didn't think about it." And you think about it: that's because they were reporting to the chief legal officer. The IDS, the intrusion detection systems on the perimeter, had certs that had been expired for 18 months. Et cetera; there were all these things. So again, what I look for is those types of things, to find all those complicated, honest answers and discussions. So the second area that I've been thinking about for 2020 is organizational design.

00:11:14

And a lot of this I get from Jabe Bloom, in terms of transition design, thinking about design when we talk about transformation. If we look at just a simple evolution, everybody knows this: we go from an agricultural economy to an industrial economy to a knowledge economy. So we're in a knowledge economy, but right now, if we talk about Lean and we talk about Toyota and TPS, we're still in this struggle, this conflict: how do we map the things that we know work really well in an industrial economy onto a knowledge economy? The knowledge economy is still sort of art. Now, things like Lean have tried to apply science, but we still have these debates on what really maps properly. So we really need to get over that. We need to actually start applying real science, like operations research, all the things that we could learn from the industrial economy, in a knowledge economy.

00:12:10

And I would say we're still not doing a great job there. I talked earlier about how Andrew had come up with this idea of the five elements. Andrew spent five or six years over at Pivotal, doing really large transformations. When we all came to Red Hat, we had this powwow: again, glass half full, glass half empty, what are the things we've done right, and what are the things we really haven't done a great job on? If you think about the pre-DevOps conversation, it was all about development; it was the Agile Manifesto. The DevOps conversation opened up this sort of balance theory between operations and development. So it was really differentiation versus scale. You could say that's DevOps, and it was an engineering-focused discussion, right.

00:12:58

And we've done a really good job there. What we haven't done a good job on is architecture, enterprise architecture. I talk to large companies, and the DevOps people are screaming, "Please, if you could help us get the enterprise architects on the same page." In a lot of cases, maybe not your case, enterprise architects are still working off the 1990s paradigm of architecture. And product, in most cases, is a mess as well. So it's like Chinese medicine: it's based on balance theory, with leadership in the middle, and we use this canvas to start a discussion. If you've got too much weight here in development, or development and ops, but not in architecture, we use this to start a conversation: what's your balance, what's your balance theory amongst these five elements?

00:13:49

One of the other things, if you go back to Toyota: one of the more successful parts of Toyota, as we talk about Lean as a definition of what the Toyota Production System is, was something called the Toyota supply chain, and something they called the four Vs of learning: variety, variability, velocity, and visibility. So when we talk about that middle area between an industrial economy and a knowledge economy, could we take Andrew's five elements, or what we're calling our Global Transformation Office five elements, and try to map them to the four Vs, to get a better sense of how we can do a knowledge economy based on these pure principles? So what I did is I created a grid here. I'm looking at the motivations and conflicts, which are pretty obvious.

00:14:41

If you look at a developer, a developer wants increased variety: sort of economic balance, more choices. On variability, they don't really want tolerances and lockdown; they want to expand. They of course want increased velocity, but they want decreased visibility. They don't, in general, want GRC, governance, risk, and compliance. They don't want the CAB. They don't want NFRs, if you will. And product, if you remember the grid, is pretty much aligned. Leadership wants everything, right? Increase everything. But if you look at operations and architecture, which were reasonably aligned, at least on our five-element grid, they want to decrease variety. They want to decrease optionality. They want reuse, they want scale. They want to decrease variation; they want to tighten your tolerances. And on velocity, they in general want to decrease the speed.

00:15:36

And I know the DevOps mantra, everybody's gotten the memo, but in general, in larger organizations, a big part of the organization is still trying to decrease speed. But they want to increase visibility. They want more NFRs. They want more operationalization. Again, I think a lot of people are getting better at CABs, but they do want some sort of audit and control. And architecture is the same way. So if we look at that, then we can dive in. I'm going to just talk about variety and variance, or variability, here, and I'll save the other two Vs for some other conversation. With variety, we're talking about optionality. We're talking about balancing market demands and operational efficiency. The Toyota supply chain book is an excellent book if you want to understand the real details of how they competed, how the Volt competed against the Prius.

00:16:26

So for variety, we can look at some systems thinkers. This is Alicia Juarrero. She says, in general, constraints enable freedoms. Basically she says that by curtailing the potential variation in component behavior, and I know this is very techie, context-dependent constraints, in a strange paradox, also create new freedoms. So in general, certain types of governance systems enable freedoms. We need to learn more about systems thinking. We need to learn from the four Vs. We have to understand what Toyota did incredibly well with variety. Another great one is the tragedy of the commons, by Garrett Hardin: self-interested individuals behave contrary to the common good of all users, depleting or spoiling the shared resource. So there's a balance again. To summarize, consumables must be managed to preserve the system. Too many cows consume all the grass, and the field collapses, right.

00:17:24

And we've got Ashby's law. Ultimately, I'm going to tell you that I think all of this has to be balanced in the five elements, and the conclusion is going to be that you really need a platform, but I'll get to that in a little bit. Ashby's law says that for a system to be stable, the number of states of its control mechanism must be greater than or equal to the number of states of the system being controlled. If you think about a platform, a platform does that balancing act between controller and controlled; in a stable system, the controls must be greater than or equal to the controlled systems. Last but not least, Don Reinertsen: the problem with any prioritization decision is that it is a decision to serve one job and delay another one.

00:18:12

So in general, what he's saying, without all the gory details or reading his intense book, is focus on the high-value, high-probability items in your backlog. So the common theme here is economic balance and how to make those trade-off decisions. And we've got tons of literature and science from incredibly smart people to help us. So, in summary: constraints enable freedoms; consumables must be managed to preserve the system; stability, from Ashby's law; and costs, from Don Reinertsen. Quickly, variability, which is variation. I love this quote, from an unknown author: "Misunderstanding variation is the root cause of all knee-jerk reactions, over-control, micromanagement, and tampering." If you go to Deming's writings, he talks about the importance of operational definitions in collecting data. Without them, the data is suspect. Change the definition, and the data changes.
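Going back to the backlog advice above ("focus on the high-value, high-probability items"), here is a minimal sketch of that ranking in Python. It is a simplification for illustration, not Reinertsen's own method; the `BacklogItem` class, the `prioritize` function, and all item names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BacklogItem:
    name: str
    value: float        # estimated payoff if the item succeeds (hypothetical units)
    probability: float  # estimated chance of success, between 0.0 and 1.0

    @property
    def expected_value(self) -> float:
        # Weigh the payoff by how likely we are to actually get it.
        return self.value * self.probability


def prioritize(items):
    """Return items ordered from highest to lowest expected value."""
    return sorted(items, key=lambda item: item.expected_value, reverse=True)


# Hypothetical backlog, purely for illustration.
backlog = [
    BacklogItem("rewrite billing", value=100.0, probability=0.2),
    BacklogItem("patch auth library", value=40.0, probability=0.9),
    BacklogItem("new dashboard", value=30.0, probability=0.5),
]

ranked = [item.name for item in prioritize(backlog)]
print(ranked)
```

The largest item on paper ("rewrite billing") drops to second place once its low probability is taken into account, which is the trade-off the quote is pointing at: serving one job always delays another.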

00:19:14

And when you don't have a written definition, the different opinions of those collecting the data result in muddled data. So here's the bottom line: we quote the hell out of Deming, but we very rarely actually listen to him. I mean, every presentation has this sort of Deming quote, but are we really doing operations research? Are we really applying the science: statistical process control, the System of Profound Knowledge? With Deming and Shewhart, it starts with plan, do, check, act, or plan, do, study, act. And then, just quickly, another place to look at variance, and how variance creates opportunity, is Taguchi and the Taguchi loss function. "Cost is more important than quality, but quality is the best way to reduce cost," right?
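For reference, the Taguchi loss function mentioned here is usually written as a quadratic penalty around a target value. The symbols below are conventional textbook notation and an assumption on my part, not something shown in the talk: $y$ is the measured value, $m$ the target, $\Delta$ the tolerance half-width, and $A$ the cost incurred at the tolerance limit.

```latex
L(y) = k\,(y - m)^2, \qquad k = \frac{A}{\Delta^2}
```

In this form, loss grows continuously as output drifts from the target, rather than jumping from zero to full cost only when a tolerance limit is crossed.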

00:20:07

What he's saying is, find the edges of your variability. It's not how tight your tolerance levels are; it's how far you can stretch them. Where can you get the value? The hidden values are at the corners. And then there's what's called the Red Queen effect. This is basically from Alice in Wonderland. In general, when we talk about sitting here in 2020, if we're running in the same place, we're losing. So in summary: statistical process control, tolerance and Taguchi, and the Red Queen effect. One of the things I want to say here is, think about what all of these things I just talked about have in common: they're math, engineering, statistics. We need to do a better job in the next decade and stop this knee-jerk reaction of, we get a failure, let's hire, or we get a breach and let's hire a hundred new security professionals.

00:21:00

And that's a true story, actually, from a bank, and it's doing finger in the wind. We have this knowledge, and it's been used in industrialization, in power plants. It's a hundred years of engineering sitting in our face that we can actually apply and think about and be better at. In fact, there's a great quote about Shewhart, Walter Shewhart. Deming said in 1980 that it would be 50 years before we actually figure out the real value of what Shewhart was saying. Shewhart actually created the genesis of most of Deming's work: statistical process control, plan, do, study, act. So we're still like 20 years away from Deming's prediction. So the bottom line of what I've said over the last couple of slides, before I get into the next section, is that there's a lot of really good information in industrial engineering and operations research.
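As one concrete instance of the statistical process control discussed above, here is a minimal sketch of a Shewhart-style individuals (XmR) control chart in Python. The data is hypothetical (imagined deployment lead times in minutes) and the function names are my own. The constant 1.128 is the standard d2 factor for moving ranges of size two.

```python
from statistics import mean


def control_limits(baseline):
    """Compute (lower, center, upper) limits for an individuals chart.

    Sigma is estimated from the average moving range divided by d2 = 1.128,
    which is the usual approach for individual measurements.
    """
    center = mean(baseline)
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    sigma = mean(moving_ranges) / 1.128
    return center - 3 * sigma, center, center + 3 * sigma


def out_of_control(points, limits):
    """Return (index, value) pairs that fall outside the control limits."""
    lower, _, upper = limits
    return [(i, x) for i, x in enumerate(points) if x < lower or x > upper]


# Hypothetical baseline: deployment lead times in minutes.
baseline = [12, 11, 13, 12, 14, 12, 11, 13]
limits = control_limits(baseline)

# New observations: only points outside the limits count as signals;
# everything inside is treated as routine variation and left alone.
signals = out_of_control([12, 13, 25, 11], limits)
print(limits, signals)
```

The point of the chart is the one in the "misunderstanding variation" quote: reacting to values inside the limits is tampering, while a point like the 25 here is a signal worth investigating.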

00:21:50

We need to stop just quoting that stuff, and we actually need to start looking at the real science and make breakthroughs. And again, I'm not trying to trivialize the people who have done tremendous work. I'm just saying, in general, in organizations, I had one person tell me, "Oh, I'll never get my management to understand Taguchi." And it's like, well, Toyota got their management to do it, and they decimated the market for 50 years. So I'll end with a couple of things about platforms. This is an idea we've been talking about internally, and Jabe Bloom has been writing about it for a couple of years. He calls it the three economies. Most of our discussions around how we think about infrastructure or scale, or even DevOps conversations, have been bound around this idea of two economies, differentiation and scale. We can call that DevOps, right.

00:22:39

Or infrastructure and development. It's been a bimodal discussion. So the differentiation economy: velocity, novelty, niche, experimentation, incubation, the things that you would expect from differentiation, from development. And then scale, we could say it's the ops or infrastructure side: regulate, reduce, create resilience, reuse, consolidation. We understand those two economies pretty well. I have been fortunate enough to have great conversations with Mark Burgess; Mark wrote the foreword to the SRE book. We were talking about SRE and how Google has built their infrastructure over, say, the last 10 or 15 years. In fact, if you understand the history of Google, they started out with something called Borg. They turned that into a project called Omega. Ultimately we see that as the open source project called Kubernetes.

00:23:33

And what Mark said was that one of the brilliant things Google did was make a non-deterministic infrastructure look deterministic to their developers. Their developers didn't know about the particulars of the virtualization or the storage platform. They just had APIs, interfaces. And in most cases, they weren't even given the ability to know those things. So they just created applications and services through interfaces, bounded by what Jabe would call a scope economy. Now, Google didn't call it a scope economy, but it was this clutch, if you will, in between scale and differentiation. And it's not just a platform; it's an interface, an abstraction that allows the developers to get the best value and the infrastructure to get its value as well. You can almost think of differentiation pushing in and scale pushing in towards the middle, where the scope economy is the thing that will actually create the adoption and the control and all those things.

00:24:36

So if we look at this, on the differentiation side, the scope economy gives us the ability to enable velocity and variability, and from a variety standpoint it allows reuse, commoning, all those things, the tragedy of the commons. On the scale side, it controls velocity and, again, variability. So you get the best of both worlds with scope. And like I said, it really becomes this clutch. If you Google "three economies," you'll find more presentations and more details on this discussion. Jabe has written a couple of pieces, and there are some really good Wardley mapping examples with this as well. So in the end, one of, I think, the most important things when we talk about 2020, and this is self-serving because I work for a company that actually sells a platform, OpenShift, but I do truly believe this is true.

00:25:27

You know, Stephen O'Grady said that developers are the kingmakers. When I sit here in 2020, I say, if you're not thinking about the next decade's platform, how that platform is going to look, how you're going to utilize that platform, and how the strengths of your organization use that platform, you didn't get the memo, and you're probably going to lose. I fundamentally believe, whether I worked for Red Hat or not, and I believed this prior to six months ago, that the new way forward is platforms. So platforms are the new kingmakers. Get the memo. So, platform by design: the whole idea is, if you think about all of what I would call the cloud titans, the early experimenters in scale of infrastructure, your Googles, your Twitters, your Netflix, right?

00:26:17

Even Facebook, they all did this by platform. Now, the question is, what does a platform mean? How do you use it? Don't get lost in the marketing hype, which anybody can give, including my company. What we're talking about here is, think about what Mark Burgess said about the brilliance of Google: they created this abstraction that allowed developers to be completely divorced from the infrastructure. All they were given was a set of APIs, well-documented interfaces, to do basically anything they needed to do. And that's where the enterprises have to get to. Now, it's a long haul to get there, because Google is one application base on one infrastructure, a few applications. Banks have, you know, 20 or 30 lines of business, thousands, maybe 10,000 services.
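The "developers only see interfaces" idea described above can be sketched in a few lines of Python. This is purely illustrative and not how Google or any real platform is built: the `Platform` protocol, the `InMemoryPlatform` stand-in, the `release` function, and the URL format are all hypothetical names invented for this example.

```python
from typing import Protocol


class Platform(Protocol):
    """The only surface developers see: no hosts, hypervisors, or disks."""

    def deploy(self, service: str, image: str) -> str: ...

    def store(self, key: str, data: bytes) -> None: ...


class InMemoryPlatform:
    """Hypothetical stand-in for the infrastructure side.

    A real implementation would schedule workloads and manage storage;
    callers never need to know which implementation sits behind the interface.
    """

    def __init__(self) -> None:
        self.services: dict[str, str] = {}
        self.blobs: dict[str, bytes] = {}

    def deploy(self, service: str, image: str) -> str:
        self.services[service] = image
        return f"https://{service}.internal.example"

    def store(self, key: str, data: bytes) -> None:
        self.blobs[key] = data


def release(platform: Platform, service: str, image: str) -> str:
    """Developer-side code: written only against the Platform interface."""
    platform.store(f"{service}/manifest", image.encode())
    return platform.deploy(service, image)


platform = InMemoryPlatform()
url = release(platform, "payments", "payments:1.4.2")
print(url)
```

The design choice being illustrated is that `release` would keep working unchanged if `InMemoryPlatform` were swapped for a different implementation, which is the separation between the developer side and the infrastructure side that the talk attributes to the interface.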

00:27:15

So it's a much harder ask, but in the end you have to get there. And to get there, stop thinking about the platform as a differentiation economy, as a platform as a service, or self-service, or, even worse, "I've got a container system that manages clusters." Start thinking about what I called a scope economy, and how a platform really becomes a platform as an interface, how it starts enabling the things that you need from the differentiation economy and the scale economy. Those things get collapsed into your scope economy. The platform becomes the enabler, and so you tend to start looking more like Google. Whenever I hear this conversation in a larger organization, I always say, "Hey, calm down. You're not Google. You're a bank, you're in health care."

00:28:10

And you can calm down, because you're reading all this stuff about the way infrastructure is supposed to work. And then I say later in the presentation, by the way, you can be like Google, but you have to understand that Google didn't really create a PaaS. They created a platform as an interface. And as we see things like service mesh, Istio, Envoy, all those things, the platform really becomes this experience. If I look over the last four or five years, some of the smartest people I know, who used to work for software companies, are all taking VP of engineering jobs in large Global 1000 companies, shoe companies and banks. Why? Because these are the people the enterprises know they need to get them through the next 10 years, and it has to be based on a new generation, a new way of thinking.

00:29:04

And I would say it's a scope economy based on a platform as an interface. So my ask for everybody is, I would love to push this conversation about organizational conversations, about some of the things that we've learned from Toyota, some of the things we should be doing better from operations management. Are we applying the right science? Maybe I'm right, maybe I'm wrong, but I don't think we're applying the right science. I think we have a lot of platitudes and a lot of quotes. And then all of this is bound into a discussion about platforms. So my ask is, anybody who wants to have a next-generation conversation about these three, please hit me up. Help me help you drive that conversation. Thank you very much. My name is John Willis, botchagalupe on Twitter, jWillis@redhat.com.