Las Vegas 2019

What’s in the Way IS the Way - A Story of an Unlikely Successful DevOps Journey

You may not have the kind of resources you need when you start.

Your people have to spend a lot of time supporting the on-going development and release so they don't have the time to work on automation.

Your architecture may not be as clean and well defined as you would like.

You just started with test automation and your coverage is still very low.

Above all, your customer and delivery partners are doubtful of the new initiative and are not willing to change.


In this talk, I will share the story of a successful DevOps implementation in a traditional organization with many of these constraints. My experience in leading this journey taught me that if you can take a realistic assessment of your situation and plan your implementation carefully, it is entirely possible for a traditional organization with a large and complex computing environment to adopt DevOps practices, in order to deliver software faster with significantly better quality, and at the same time do it in a way that makes people's lives much easier.

Wayne Wang

Senior Manager, DevOps, J.B. Hunt Transport Services, Inc

Transcript

00:00:02

Good morning, and thank you for coming to this session. I'm really excited about sharing my DevOps experience with you. What's in the way is the way. What do I mean? When we are implementing DevOps in large, traditional, complex organizations, your journey is not going to be like this, no matter how much you would want it to be like this. Instead, it's going to be something like this. You're going to meet a lot of challenges and difficulties. At times, those challenges can be daunting and overwhelming, and we can get discouraged very easily. But the challenges and obstacles can also be your creative energy.

00:01:06

We need to take a little different attitude. We need to let go of the expectation that our journey should be smooth and nice. [unclear] We have no control over what we have, and we have no control over what comes to us. We simply focus on figuring out how to deal with the situation as it is. With this, what's in the way is actually pointing to the way of a solution. This is what I mean: what's in the way is the way. In early 2015, I finally got the chance to move from ops to dev. Prior to that, I was the IT operations manager in one of the most challenging IT operations at a global automotive company. My plant was 24 by seven. It made a high-volume, high-value product. If my IT system was shutting down the production line, the company would be losing $45,000

00:02:27

every single minute. The pressure was very high. So you can imagine, when I finally went from ops to dev, I was telling myself my life is going to be so wonderful. I'm going to be like a semi-retiree. Of course, quite quickly I found that's not true. [unclear] My organization was following the traditional waterfall method, with a lot of handoffs and long release cycles. We could only release software every six to nine months, and it took us eight weeks just to set up the lower environment for each release. The release weekend is nonstop: three days, three nights. And once we released the software, it was like this little robot on the floor: it really didn't work.

00:03:34

So we would usually spend six to eight weeks just to mop things up. We called it post-launch [unclear]. I was thinking to myself, hey, there's got to be a better way, right? And so I spent the next one and a half years studying how the best software companies deliver their software. In the process I learned about CI/CD and DevOps, including reading every book from Gene. Fast forward to May 2018: we were able to release software every two months or more frequently. It took us about one week to set up the environment instead of eight weeks. And we cut down the deployment time from three days, three nights to about 10 hours.

00:04:32

If the speed is good, the quality is even better. On this chart, the blue bars represent the total number of defects plus deployment failures [unclear]. The first four releases were traditional manual releases, and the last two used CI/CD pipelines. As you can tell, we cut down the defects and release failures by about 90%. Releasing software faster with better quality is probably the dream of every company that is undergoing digital transformation, which is also why I'm working at J.B. Hunt Transport now, leading the DevOps initiatives. J.B. Hunt is a transportation service company from beautiful Northwest Arkansas. It has the largest domestic 53-foot container fleet in North America. In 2018, we had about 28,000 employees with close to nine billion dollars of revenue. We had 2,000 employees in IT, and we are growing very fast too. Our flagship product is called J.B. Hunt 360; it is digital brokerage software. But the work I'm sharing with you today is actually from my previous employer. Let's just call it a global automotive company. It's 10 to 20 times larger than J.B. Hunt, and it has a global footprint with millions and millions of customers. The sub-organization where my team and I worked to develop the CI/CD capability has 250 developers divided into 26 squads.

00:06:38

They are pretty evenly split between North America and Asia, with a couple of the squads in Europe. We support about 50 applications, most of which are COTS applications with very heavy customization. The computing environment has hundreds of servers across all three operating systems: Windows, Linux, and Solaris. And we support about 30 [unclear] users globally. So it's a fairly complex computing environment that I had. We have a lot of challenges that are unique to us, but the five lessons I share with you today are pretty common across all companies and industries. Let's look at number one. In the beginning, we probably do not have the kind of DevOps expertise we need. Really, there are two ways you can get these people: hire from outside or develop your own. In the last three years, I have conducted at least 50 to 60 interviews for DevOps engineers. I can tell you, hiring from outside is very difficult, time-consuming, and very costly, and the quality is not always guaranteed. So very quickly I realized, you know what? A better choice would be developing our own people.

00:08:13

We all have good developers and [unclear] engineers in our company who are passionate about software delivery. We also have good sysadmins who can code and who like to automate. Those are the best candidates to become DevOps engineers. The challenge is we need to convince our people that they have to automate, or be automated. What I did was tell my team: okay, this is new to us. I don't know how to do it, and you don't know how to do it. Let's work together to figure it out. We have three choices. Option one: let's step up. Step up, and in the next three to six months, let's figure out how to do this thing together. Option two: you may say, okay, that's not for me [unclear]. That's fine. We are a big company; I can help you move on to something else.

00:09:21

Option three: stay where you are. Really, that's not a good option, because if everybody else is stepping up, that means you are slipping downward. And I did have a few people choose to do something else, and for a few people I had to take action to help them find different alternatives. Once we have a team in place, we need to create an environment so they can learn, they can help each other, and they can grow together, including changing the physical working environment, like from the left-hand side to the right-hand side. It's very critical for you to develop your people and also to develop high-performing teams. If I think back, if I had been given a lot of money to hire a lot of people, I don't believe we would be as successful as we are, because in the end, some of the best-performing engineers were actually from our own team: the engineers we developed, not the ones we hired from outside. The next challenge: how do you balance ongoing support with automation?

00:10:50

Many of us are not given a big budget to hire a new team just for the development. Instead, we have to do ongoing support and ongoing releases. Even if you are given a new team just to focus on development, you will find that once your thing is working, you are going to face the same issue, because people love your stuff and they are depending on you. So you have hard choices; you have to prioritize. And we all know making choices is not easy. When my daughter was four years old, we took our family to Disney for vacation. At the end of the first day, my wife told our two kids, each of you can pick one toy from the store. So we spent the next 45 minutes in the store, until the store was closed. My daughter was sobbing. She was crying. She did not pick a single thing.

00:12:01

She loves everything about Disney. So if she has to pick one thing, that means she has to give up everything else. It's a lot of stress. But think about it: if it's difficult for a four-year-old to make hard decisions, I think it's even harder for us grown-ups, for managers or executives, to make hard choices. I want a show of hands: which of you are only working on one thing, one task, and are allowed to do that by your boss? Okay. [unclear]

00:12:49

All right. So we have to make choices, right? A common way of prioritizing is you draw this four-quadrant chart: not important, important, not urgent, urgent. What's going to happen? Everything ends up in one quadrant, the upper right, like this. This way of prioritization does not work. But why is that? If you have been in the corporate world for a little while, you understand that anything asked for by your management has to be important and has to be urgent. So if you are asking them to put something in this quadrant, you are insulting their intelligence, right? Here's my improved version of the quadrant.

00:13:51

Believe it or not, I had a lot more success using this version with my stakeholders. Once you can prioritize, you can focus your resources on the top-priority things. Thinking back, if my team had not had to support so many things at the same time, we probably would not have really spent time to prioritize, and we would not have been able to deliver so quickly. But because of this, we were forced to focus on the few most important, most urgent things.

00:14:37

I'm going to talk about the next two challenges together. Both are very difficult technical challenges in your CI/CD pipelines. Let's look at them. In the very beginning, I was very jealous of those companies who have a microservices architecture. They can make small changes and continuously release those into production with full automation. But I felt like this is what I was dealing with: a legacy architecture, and there is no quick and easy way to modernize it. Although it has to be done, it is out of my team's control. So the question to you as the leader of DevOps: what are you going to do before that happens? What are you going to do in the meantime?

00:15:41

Similarly with test automation. When we start our DevOps journey, we probably do not have good automated tests to begin with. Even if you have some test automation, chances are you will quickly find out that those automated tests are not good enough for a CI/CD pipeline. On this chart is real data from a very successful, large software company, from one of their biggest product lines. As the orange bars on this chart show, in the beginning they already had a lot of test automation, but they found the same thing we found: that test automation was not fast enough, or was too flaky, for a CI/CD pipeline. So they had to spend the next two years redoing their entire test automation suite.

00:16:45

So the question again to you as a leader: what are you going to do in the meantime? One thing my team and I realized very early is that 80% of our customer requirements are actually very simple, cosmetic changes: very easy to automate, very easy to release. You don't even need to shut down the system to apply those changes. But those are the features our customers always want to have yesterday, today, or next week, instead of waiting for six months. So that is a low-risk opportunity for us to construct our CI/CD release trains with two tracks. One is the express train: we can release very quickly, and we run the train more frequently. We could probably deploy 80% of our changes that way, and leave the other 20% to the freight train for the big batch release. This is how it works. Let's assume that in the next three months, or 12 weeks, we have 300 changes. If we have this express train and we release 20 changes per week for 12 weeks, that's 240 changes. That leaves 60 for the freight train, for the batch release. As you can see, by doing it this way, your big batch release becomes a lot smaller.
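The arithmetic in the express train example can be sketched in a few lines of Python. This is only an illustration of the numbers quoted in the talk; the function name `split_release_trains` and its parameters are assumptions made for this sketch, not something the speaker's team is known to have used.

```python
def split_release_trains(total_changes, express_per_week, weeks):
    """Split a backlog between the weekly express train and the batch release.

    Hypothetical helper: the express train carries a fixed number of small
    changes each week, and whatever is left rides the freight train.
    """
    express = min(total_changes, express_per_week * weeks)
    freight = total_changes - express
    return express, freight


# Numbers from the talk: 300 changes in 12 weeks, 20 express changes per week.
express, freight = split_release_trains(total_changes=300, express_per_week=20, weeks=12)
print(express, freight)  # 240 60
```

With these inputs the batch release shrinks from 300 changes to 60, which is the point of running the second, faster track.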

00:18:34

When we started this express release program, we picked one simple thing, style sheets, and there were five changes the first time. It was a success. As we gained more experience, we slowly increased the complexity and the types of changes, and we also increased the total number of changes we could do every week. And we got very good results. In the first six releases, over six weeks, we released 61 changes into production with only one failure, and we fixed that one failure in 30 minutes. So basically, the failure rate was 1.6% and MTTR was 30 minutes.
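As a quick check of those two figures, here is a minimal sketch that computes a change failure rate and a mean time to restore from the counts quoted above. The function names are hypothetical, and the list of repair times holds only the single 30-minute fix mentioned in the talk.

```python
def change_failure_rate(changes_released, failed_changes):
    """Percentage of released changes that failed in production."""
    return 100.0 * failed_changes / changes_released


def mean_time_to_restore(repair_minutes):
    """Average minutes taken to fix a failed change."""
    return sum(repair_minutes) / len(repair_minutes)


# Figures from the talk: 61 changes, 1 failure, fixed in 30 minutes.
rate = change_failure_rate(changes_released=61, failed_changes=1)
mttr = mean_time_to_restore([30])
print(f"{rate:.1f}% failure rate, {mttr:.0f} min to restore")
```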

00:19:30

So as you can see, the legacy architecture did not prevent us from using CI/CD and releasing more frequently. Instead, it taught us a lot of valuable lessons. We picked the easy changes as a learning experience for our team. It taught us the value of small-batch, more frequent releases. We cleaned up the backlog for the dev teams. And best of all, because we were doing it so frequently, we were forced to release during normal business hours, because if we were asking our folks to do this kind of release every weekend, off hours, people were going to quit on us. So it ended up very positive for us. Similarly with the lack of test automation: we did not have a lot of good test automation in place, but we developed a small set of critical integration tests, and we used those tests as guardrails, complemented by manual validation in both the lower environment and in production. I'm sharing this snippet here of IMs between myself and my business partner. As you can tell, as the release was in progress, I was texting my business partner in real time: okay, this job is done, this job is done, this job is done, you can check. As a result, we developed a very close relationship with our business.

00:21:22

Now down to the last lesson I want to share. This is probably the most difficult aspect of any DevOps initiative: people's resistance to change. This is just a fact of life, right? Our work habits, processes, and procedures were the result of many years of work. If we can get by, we don't want to change. I don't like to change. Nobody likes change for no reason. And you know, yesterday there was a talk about a company where the CEO came here and said, yeah, we need to change, everybody. That's a rarity, right? If you have a CEO like that, consider yourself lucky. But for the rest of us, you're going to have to be very patient and take a collaborative approach with what you do. Even very small changes can become a big deal.

00:22:27

Here's an easy example. In April 2018, after four months of using CI/CD pipelines for weekly releases into production, I felt we were ready for the big batch release using CI/CD. This is my end-to-end release plan for that first CI/CD release. One of the decision points was whether we were going to have checkpoint meetings during the release, because traditionally our organization would have one checkpoint meeting every four hours. In the previous weeks, my team had already conducted four rehearsals using the same CI/CD pipeline for the release. Starting from April 16, we were taking 34 hours to deploy, and with each rehearsal we cut down the time. Two days before the production release, we were actually down to nine hours. So I was very confident that if we budgeted 12 hours, we should be able to deliver. Actually, the production release was [unclear], so that was a pleasant surprise. Back to the meeting, where we talked about whether we should have those checkpoint meetings every four hours. Here is the conversation between myself and my business partner. I suggested, you know what, we don't need to have those meetings every four hours. If something goes wrong, we just fix it right away instead of waiting, and you don't have to be up at night. This is the response I got from my business executive. She said, that's condescending.

00:24:21

I know you are not used to our type of organization, and you probably don't know we are so used to getting up at night every four hours, even if it's three days, three nights. I said, okay, what about we do this: let's have the first checkpoint at 10:00 PM on Friday, and instead of doing the next one at 2:00 AM, let's push it to 4:00 AM. Because at 2:00 AM, Jenkins is going to be busy installing software; there's really not much to talk about. She said, okay, let's do it at four. Here's my conversation with my business partner at 4:00 AM Saturday morning. I said, we are done. If you want to call the business testers to start testing, please go ahead and do so. And of course she said, wow, that's five hours ahead of schedule.

00:25:21

But no, I think it's a little bit too early. Why don't you go ahead and send your people home? We are going to stick with the original 8:00 AM time. I can tell you, that was the happiest moment of my DevOps journey. Also, with the CI/CD pipelines, we pretty much eliminated what we called the post-launch release support period, the six to eight weeks that I talked about in the beginning. After the May release, we had a meeting on Monday morning, and the meeting lasted about five minutes because all the issues were already resolved; there weren't really any new issues. For the June 2nd release, we scheduled a meeting on Monday morning and no business people even showed up. So we basically got rid of that. In summary, difficulties and challenges can be the source of your creative energy and your creativity. You need to take a different perspective and adjust your expectations of what your CI/CD journey should look like, what a DevOps initiative should be like. Expect a lot of challenges in the way, but you can also treat the challenges as an opportunity, because what's in the way is actually pointing to the way of a solution. With this, I think you should be able to surprise and delight your customers and deliver value to the business. Above all, your life is going to be much, much better.

00:27:27

Of course, the journey never ends. I know mine has not. I talked about the architecture and the test automation; what we did was just a workaround while waiting for those to be changed. But the hard work is still there: how are you going to modernize your legacy system, and how are you going to accelerate test automation? And of course, in this conference we heard a lot about how to bring the business into agile, so we are not doing this big waterfall type of planning up front. Thank you for your time and your attention. I probably have time for one or two questions. Anyone? Okay. No questions. Thank... oh, go ahead.

00:28:25

You mentioned that you had a lot of COTS applications. What was one of the biggest challenges that you faced trying to bring DevOps into the COTS space?

00:28:37

There are many challenges. A couple of those come to the top of my mind. One is their development cycle versus ours, because they are probably using waterfall and they are going to release to you like every six months or 12 months. So how are you going to synchronize your release cycle with theirs, when you are developing features and releasing them every week into production? Number two is, how are you going to manage the backward and forward compatibility, right? Anybody else? Okay. Thank you so much for attending.