Las Vegas 2020

Deploy more. Sleep better. The Walmart DevOps Journey.

We've been on an exciting journey at Walmart for the past few years as we continue to find better ways to meet the needs of our customer. We've also learned quite a few things and have had some remarkable successes.


Join us as we share what we've learned and the outcomes of a dedicated strategy to improve the world's largest retailer.


Bryan Finster

Value Stream Architect, Walmart DevOps Dojo, Walmart

Transcript

00:00:12

Thanks for coming. My name is Bryan Finster and I lead the DevOps Dojo for a small retailer in Northwest Arkansas. Walmart is the biggest retailer in the world and the biggest company in the world, with 2.2 million associates, over half a trillion dollars in sales, and 11,700 stores spread all over the world. As you can imagine, our delivery platform has to scale to that as well. We have thousands of development teams worldwide with a global customer base. We deploy to the edge, private cloud, and public cloud, with embedded systems, and we have everything from legacy to the latest tech. I want to talk to you today about what we've been doing for the past several years, the journey we've been on to be able to support what we do today. Now, anybody who's in my industry will tell you that Black Friday didn't start around Thanksgiving this year. It started in March, and we were ready for it. But it's been a journey, and I want to talk to you a little bit about that journey, because it hasn't been an overnight thing and it has been work.

00:01:27

So back in 2015, I was still in supply chain and we were focusing on a challenge. Our leadership gave us a challenge: they wanted us to deliver a main warehousing system much more frequently, an order of magnitude faster than it had been delivered. And we had a DevOps Day in early 2015, where Gene Kim, Gary Gruver, and Damon Edwards came to visit, and it was really inspiring. We met with Damon and Gary later, just with our area, and had conversations with them, and it changed a lot of things that I had thought were true. As a group of senior developers, we got together and said, okay, what we need to do is make sure that we're laying out the product domains to allow us to go faster, that we're aligning product teams to product domains, and that we stop doing projects.

00:02:27

We designed an architecture to allow those teams to be completely independent of each other and deploy in any sequence. We put together a platform team whose only job was to build the delivery platform. We were using Jenkins at the time, so they could template pipelines and make it easy for the teams to deliver. And then we started on the pilot teams. We focused very heavily on continuous integration. We thought that if we could understand continuous integration, everything else would kind of fall out; we'd find all the constraints in the flow. And so that's what we did. We started asking ourselves every single day: why can't we get to trunk today? Why can't we deliver today? And nothing was sacrosanct. We just found the things that were stopping that from happening, and then we started working on those things. Sometimes it was how we were developing. Sometimes it was knowledge of testing. And sometimes it was just process, internal and external, that we had to go and clear: working with other teams, doing all sorts of things, building relationships so that we could deliver. And sometimes we broke things, but we always tried to break things small, and then we fixed it and moved forward.

00:03:48

And we had a lot of good takeaways from this. The first was that we really should have done a better job measuring up front. It's really important to have measurable outcomes, and we really didn't know how to measure it. At the time we felt it was faster, and it was. The very first DevOps Enterprise Summit I came to was to learn how to measure. That was my main focus, and I brought back some ideas. We've been expanding on that ever since, but that's very important. The other thing we learned was that if you make the easy way the good way, then teams can kind of flow downstream to success. When the platform team was building templates, they were building templates that really supported the idea that we want to do continuous delivery, that we want continuous integration happening.

00:04:38

And then we pressed that as a way of working. We also learned that it was really effective to grow the platform and the team behaviors using the platform together. We've talked many times since about how important that was. If we'd pushed CI behavior without the tooling, it would just have been frustrating for the teams. If we'd pushed the tooling without the behavior, all we would have done is push garbage to production really fast. But bringing them along together allowed them to grow, with a lot of partnership. And continuous delivery is the catalyst for culture change. When you take this approach, that CD is our goal and all other things are secondary to our ability to improve the flow of delivery, it's exciting. People are doing new things and they're delivering faster, which makes people so much happier.

00:05:44

And that's true. The thing is that teams are happier. We learned how much happier it is to be on a team that's delivering better value sooner and safer. A teammate and I talked about this experience, this journey I just told you about, back in 2017 at this conference. We had a slide that said, "We love development again," and it was true. We became a really tight team. And every team that I've worked with since that's focused on this starts really gelling as a team, and they just love working this way. But we also knew that replicating these outcomes couldn't be done with a cookie cutter, that teams each have their own context. And so when there was a push to do this for the entire enterprise, I was recruited to move up to the platform area to help in this effort.

00:06:46

We decided we needed a deliberate strategy for transformation, and that's really what I want to talk to you about today: what we've been doing on this journey for the last several years. It was establishing clear goals, making sure we had a solid platform that everyone could use, building communities to help each other, gamifying delivery to encourage people passively so we don't have to go and talk to them all individually, and developing the technical coaching skills we need to help where people have questions or struggles. So first, goals. We really needed to define goals. And we had a challenge coming from the CTO, who said, "I want every team to be able to deliver at least once a day." Now that's a big ask, especially for teams that are dealing with old software, but that's a challenge.

00:07:47

Engineers love challenges, and it caused a lot of motion, right? We also wanted to make sure we were doing it at higher quality. It doesn't make any sense to deliver fast if it's garbage. So we really had to dig in on what causes quality and get there. But we really wanted happiness. We want engaged teams and happier developers. Happy developers deliver more secure, more stable applications, more frequently. And if we don't have happy developers, the quality will suffer as well. We also wanted to make sure we had a common context. Gary Gruver wrote a book recently and produced some training recently that we think is really important for setting that context, where everybody is aligned on improving the flow of value, making sure we're getting good, tight feedback loops, and educating everybody on those concepts, so that we're bridging the gap to the larger audience that may not have dug in the way we have.

00:09:01

We also wanted to make sure we have shared values and measures, that we're aligned on metrics and have a common glossary of metrics. One of the things we did was create a glossary of testing terms, because strangely there's no domain language for testing. But we also wanted to create a glossary for metrics so that we were all looking at them the same way, along with training around those to teach everybody how to use those metrics to encourage the right things and not discourage the right things. The next thing was that we really needed to build a delivery platform that was not only easy to use, but also allowed people, like I said, to flow downstream. Now, at the time we had platform areas all over the enterprise, and all of these different areas were paying teams just to support their own bespoke platforms.

00:09:58

So we had a deliberate strategy to consolidate these things. The first thing we did was say, "We will take on support for your existing tools. You don't have to pay for that support anymore. You can still use them while we're building things out, but we will support and pay for them." And then we started building out our platform. As we started replacing the capabilities of those other tools, working with those teams as customers to replace them with tools that were, honestly, easier to use, we started deprecating those old platforms, reducing the number of tools that we had and getting it pared down very much. And we would help teams migrate from their platform to ours. We didn't just say, "Hey, you need to migrate." We had teams dedicated to helping them come onto our platform. As we removed that duplication, we started growing a larger group of people who were using the same platform, and that allowed us to find other holes in the platform that we didn't know existed. It also meant those people could help each other with any sort of struggles they had, instead of coming to us for "how do we do this?" We had training for that, but they'd come to us for the edge cases, and we could say, "I don't know, that's kind of an edge case that I haven't covered before, but I'm pretty sure this team over here has done that." And we started building those connections across the enterprise.

00:11:29

And we made sure that the training emphasized continuous delivery, that the tools emphasized continuous delivery. We were focusing on this irresistible developer experience. Here you've actually got a picture of a tool that one team built: using our tools and the APIs built into our tools, they could safely deploy to production with a button press. And if they didn't have a green pipeline, it just wouldn't deploy, which I thought was really super cool. We got feedback from developers, and this is a literal quote: it's magic. He merges a pull request, the magic happens, it goes to production, and it takes away all the toil. By doing that, the teams were really able to focus on what they're doing instead of how they're going to deliver it. At this point, probably nearly 90% of the enterprise is using this common tool set, which gives us so much ability to inject compliance, inject security, and make them things that you can't forget. They just happen transparently. You don't know.
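
As a rough illustration of the "no green pipeline, no deploy" behavior described above, here is a minimal sketch. It is not the tool shown in the talk: the `PipelineStatus` values, the `press_deploy_button` function, and the service name are all hypothetical, and the real deployment call is left as a comment.

```python
from dataclasses import dataclass
from enum import Enum


class PipelineStatus(Enum):
    """Hypothetical states for the latest pipeline run of a service."""
    GREEN = "green"
    RED = "red"
    RUNNING = "running"


@dataclass
class DeployResult:
    deployed: bool
    reason: str


def press_deploy_button(service: str, status: PipelineStatus) -> DeployResult:
    """Deploy only when the latest pipeline run is green; otherwise refuse."""
    if status is not PipelineStatus.GREEN:
        return DeployResult(False, f"{service}: pipeline is {status.value}, not deploying")
    # Assumption: a real tool would call the platform's deployment API here.
    return DeployResult(True, f"{service}: deployed to production")


if __name__ == "__main__":
    print(press_deploy_button("example-service", PipelineStatus.GREEN).reason)
    print(press_deploy_button("example-service", PipelineStatus.RED).reason)
```

The point of the design is that the gate lives in the tool, so nobody has to remember to check the pipeline before pressing the button.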

00:12:47

The other thing that we thought was really important, as I said, was growing communities. We started with a continuous delivery community. Actually my wife, Dana Finster, started it. It's called Continuous Chai, and there are other talks we've done about that. That community continues to grow and continues to be active. On top of that, we've started testing communities. We've started communities focused on particular languages and frameworks that people are interested in, growing the excitement and having communities come together and share things. We've done internal DevOps Days. Anything we can do to get people talking across the organization, we focus on, and we provide material support. The platform area has bought T-shirts and swag for the Continuous Chai community. We've served lunch for the Continuous Chai community just to get people to come in and collaborate.

00:13:45

We also want the community to be the owners of all of this. We want communities of interest and communities of practice to grow. A really good example of this: I talked earlier about having a glossary for testing, and that was something that I and some other people from QA had started back in 2015, when we were working on CD and trying to help teams learn how to test. But as a broader engineering community, we came together and said, okay, we really need to look at some of these terms, harden them up, and all agree on what they are. It was a painful process. There was constructive disagreement. But now we have a solid glossary that multiple areas can align to and say, "This is what it is." And for anybody else who doesn't have something, we can say, "Well, here's something." Having the community own that means the community is invested. We want the community to drive the standards. We want really smart people to say, "This is what it should be." We shouldn't have ivory tower people saying, "These are the standards." The people doing the work should drive the standards, because they should own the outcomes.

00:15:03

We also worked to gamify the metrics, and I think this was one of my favorite things, playing this game of "how do I adjust the score?" What we did was bring in Hygieia, and then for all of the different things around code change frequency, build metrics, and Sonar metrics, we gave star ratings by repository. Then we created roll-up dashboards so the team could see what their aggregate score was across all their repositories. And we didn't do anything beyond automating the onboarding of these dashboards: when teams built using our tools, we'd send them an email saying, "Here's your dashboard." And then we didn't say anything. Then teams would come to us: "How do I improve my score?" We'd say, "Well, let's look," and then we'd start talking to them.

00:15:55

"Well, if you use trunk-based development and you start integrating code daily, you'll get a five over here. If you stabilize your build, this will get much better. If you deploy more frequently, this will get much better." And then teams would go and try to solve that problem. The early adopters especially would actually start competing with each other over who had the highest score. Then we started doubling down on that. We've built other dashboards that take the scores and use those metrics to point teams to playbooks to help make those metrics better, so they don't have to come talk to us. It's been very effective. The one side effect of gamifying those metrics was that we did have a lot of people coming to us for support, not for the tools, but for how to improve. And so we created a dojo.
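
The star-rating and roll-up idea described above can be sketched roughly as follows. This is a minimal illustration, not Hygieia's or Walmart's actual scoring: the thresholds, the function names, and the repository names are assumptions made up for the example, and only one metric (how often code is integrated) is scored.

```python
from statistics import mean


def stars_for_integration(days_between_merges: float) -> int:
    """Map average days between merges to trunk onto 1-5 stars.

    Thresholds are illustrative assumptions, not the real scoring rules.
    """
    if days_between_merges <= 1:
        return 5  # integrating daily, as in trunk-based development
    if days_between_merges <= 3:
        return 4
    if days_between_merges <= 7:
        return 3
    if days_between_merges <= 14:
        return 2
    return 1


def team_rollup(days_by_repo: dict[str, float]) -> float:
    """Aggregate per-repository stars into one team score (simple average)."""
    stars = [stars_for_integration(days) for days in days_by_repo.values()]
    return round(mean(stars), 1)


if __name__ == "__main__":
    # Hypothetical repositories and their average days between merges.
    team = {"orders-api": 0.5, "inventory-ui": 5, "legacy-batch": 30}
    for repo, days in team.items():
        print(repo, stars_for_integration(days))
    print("team score:", team_rollup(team))
```

A per-repository score with a team-level roll-up makes the conversation concrete: a team can see which repository is dragging the aggregate down and which behavior would raise it.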

00:16:47

This was something that I really admired. I saw Ross Clanton build these, and Capital One talking about dojos, and I kept asking for dojos ever since 2015. Finally, they just told me that it was my responsibility because I kept asking for it. So they had me go and build a dojo. I think there have been a lot of misunderstandings in the broader dojo community around what a dojo is. Number one, it's very contextual to your company. But it needs to not be a center of excellence. It doesn't need to be those people who tell you what to do, and it definitely doesn't need to be the place where you take teams and run them through the grinder. A dojo is there to help teams solve problems, to help them with immersive learning so that they can use their work to learn how to improve their work by doing their work, and to provide really good examples.

00:17:48

We spend a lot of time showing examples of different kinds of testing patterns and different pipeline examples, just little things to help people get started who may need a little bit more help there. We also do a lot of work on evangelism. Since we're very closely aligned to the platform, we'll see some of the requests coming in to the platform where people are struggling to use the tools because they're not doing continuous delivery. They're trying to do a legacy delivery flow using our tools, and it's harder. So we look for these problems and we say, "Hey, can we help you start working towards a continuous delivery flow? We'd love to talk to your team, see what your struggles are, and maybe give some ideas of how to make your lives better so you can sleep better at night." We do a lot of that. At this point, we're the people that people point to, and we don't want to be a constraint. We're working very hard right now to push that out into the community. We really want people who know how to solve that problem out there. We're working right now to train those people and certify that they know how to run an improvement project in an area, that they've shown results, and that they have the right mindset, and to build a community of them. A guild of value stream architects is what we're trying to build right now.

00:19:17

And our leadership has bought in. We have executives who are asking the right questions. They are pushing the right things. They're supporting things. If somebody wants to start a DevOps Day or get some advertising out around continuous delivery, they'll have executives jumping up to try to help push. You've got a lot of partnership from the top and the grassroots going on, and you see teams being recognized for executing well, and you see teams excited about it. And we see right now a big push around looking at those delivery metrics, taking those metrics that we've been pushing so hard for so long and making them global, looking at them during planning meetings, looking at them constantly to see if anybody needs any help, tracking improvement and seeing where you are instead of just "do it harder."

00:20:27

And the platform: I said before how important it is for teams and platform to grow together, and we really work on that partnership. When we first started building out the platform, we had early adopters coming in who really wanted to drive it and wanted to help us improve what we're doing. This is some user feedback: that they can go and extend our platform easily. We've built something that's pretty simple for teams to do simple things on and to do complex things on, and we have an open source mindset where we want contribution. If you want to build something to help us improve, build something and help us improve. Like it says here, they're not just consumers, they're partners.

00:21:21

And they've gotten a lot of benefit from the documentation we put together, the playbooks and examples and things, so they don't have to come talk to us. They can just go to our website, look at our resources, get the help they need, and then come to us when what they need just isn't there. This is from an area in Walmart Canada. This was feedback we got from them: using our platform, working with us, and focusing daily on continuous delivery as a way of working has created better outcomes. They deploy 72 times more frequently than they did before. It costs 93% less to do a deploy. That's staggering. They reduced lead time by 92%. They can make changes really fast. And we saw the results of this recently with, well, I'll just show you because it's so cool. It enables better customer satisfaction. This is literally from Reddit, where, honestly, somebody who isn't necessarily our normal Walmart shopper came to us because everything else was breaking. Xbox released and everybody crumbled but us. We saw all of these things flooding in from all these different media outlets where people were saying, "Well, Walmart's up, Walmart's up." And that's because areas like this have been focusing on: how do we build resiliency? How do we get changes out reliably and consistently? Use the platform, use the process, focus on CD, and stop running fire drills.

00:23:16

It also allowed us to respond to COVID. We were able to rapidly implement touchless checkout, so people didn't have to touch screens. We rolled out express delivery and broadened our home delivery rapidly, so people didn't have to go to the stores. This is a quote from one of our earnings calls: we had massive growth in orders per minute. And if you saw earnings from the last two quarters, we had, I think, over 90% growth in .com last quarter, not quarter to quarter, I'm sorry, for comp. And we stayed up. We were stable, and that's not because of massive heroics. It's because we've been focusing on it.

00:24:09

So, things that we've learned: "why" is so much more important than "what." If you explain to people why they're doing something, instead of telling them what to do, they get really bought in. We've learned that helping gives much better outcomes than directing. If you're just showering people with love and "How can we help you? These are the goals that we're going after, and we want to partner with you and help you get it done," instead of "You must do it this way," people are bought in. Ownership is better than accountability. I've heard many times in my career people say, "We need to hold the teams accountable." But no, you don't, because if you just hold them accountable, it's just demoralizing. If you give teams ownership, where they not only get to make decisions but are responsible for the outcomes of those decisions, and they are invested in the goals of their product and push those things forward, you get much better outcomes. And if you give them clear goals plus that ownership, you get improvement all the time. Make sure the goals are aspirational, let them know they're aspirational, and make it a challenge. Give them the ownership to meet that challenge, and they'll surprise you.

00:25:28

Engineers want to solve problems. You just need to give them the right problems to solve. I always rant about people giving engineers the wrong metric to solve for, like code coverage. They'll solve the problem. Give them the right problems.

00:25:47

So the other thing is, you don't want to grow a big team, a big dojo team, in my opinion. You want to spread it across the entire enterprise, right? Make sure that change is grown as a capability that people are focusing on instead of something you're imposing on them, and that everybody has common principles they're going after, not best practices. I only ever use "best practice" ironically; there are none. Recognize those who try. They don't have to succeed. When someone tries something new, celebrate the fact that they did it. It encourages them and it encourages others. And really, we want to embed improvement into the culture. We want everyone living and breathing "How do I do better?", to be slightly dissatisfied with where they are today. Now, like I said, it's been a challenging year for everybody. Recently, a store associate shared some feedback that they got from customers. It's Leah and Olivia thanking us for keeping the stores open, for making sure they had the ability to stay healthy, and that we were looking out for them. That's why I'm a developer.

00:27:23

So, Gene always asks, "What is it you want to know?" I'm very interested in how you measure the effectiveness of your transformation. There are many things that we could point to, but I'd like to know: how do you know you're doing it well? I'm going to be on Slack, so please come back to me and talk to me about it, because I'm very curious. And thank you very much for your time. I hope this was useful. If you want to reach out to me, I've got a series of articles on Medium about some of the things I really believe in, but you can always reach me on LinkedIn. I'm occasionally on Twitter, and I will absolutely be on Slack during this entire conference. Please hit me up with anything you'd like to talk about. I love this topic. Thank you very much.