Patterns for Enterprise Success: The DevOps Journey at Nationwide (San Francisco 2014)

Your business depends on software. So it’s critical that enhancements be meaningful and timely – but you are confronted with legacy IT systems, process complexity, and organizational silos that take too long to deliver the software changes needed to support your business. To tackle the delivery of complex enterprise applications, businesses are embracing DevOps, a software delivery approach that focuses on speed and efficiency without sacrificing stability and quality. In this session, join Carmen DeArdo, Director Application Development, Nationwide Insurance and Hayden Lindsey, IBM VP and Distinguished Engineer, to learn about DevOps. Carmen will share how Nationwide has improved software quality by 50 percent and reduced system downtime 70 percent by implementing DevOps processes and tools. Hayden will discuss IBM’s POV on DevOps, including current and future capabilities to drive ROI by integrating new systems of engagement applications with existing systems of records running on System z and Power platforms.

Carmen DeArdo

Director Application Development, Nationwide Insurance

Hayden Lindsey

VP and Distinguished Engineer, IBM

TRANSCRIPT

00:00:08

I'm going to talk about DevOps, and I'm going to focus on DevOps for the systems of record. You saw that little upside-down pyramid or triangle, or what have you, that showed that those systems move more slowly, and maybe there's not too much DevOps happening there. That doesn't have to be the way it is. And the fact of the matter is, if the systems of engagement are moving very fast and the systems of record are not, you have a problem, because the apps span both of these, and that's what runs the business. And so we have to be looking at DevOps for the systems of record as well. And just to motivate a little bit, I like to call these the fun facts. We forget these things sometimes, or maybe you never knew them, but CICS is used tremendously.

00:01:03

This is a transaction processing system that runs on the mainframe, with transactions 18 times more frequent than Google searches. And COBOL is still responsible for the majority of the business transactions in the world, not to mention PL/I and assembler and RPG and things like that. However, the people that are coding these systems, maintaining them, writing new systems, are not getting any younger. And so we need to do something about that as well. The only thing I want to mention here, since you already know what systems of engagement and systems of record are, is that there is this impedance mismatch: there's a need for speed on the front end. If you can't keep up on the back end, you're going to put the business at risk. So we have to do something better in the back end. And here we are dealing with inertia. People have been doing things the same way for decades, not years, decades, and it's time to improve.

00:02:09

And by the way, most of you, I know, are not working on these backend systems and these older, mature programming languages, but you probably work with some people that do. And so it's important. Your job will be a lot easier if they embrace these technologies. Now this is my one from-marketing slide. This is the IBM point of view of DevOps, which, as you can see, is the entire development and delivery life cycle. And we're not trying to just confuse everybody by doing this, but we thought about DevOps, and we said the principles of DevOps, like automating everything you possibly can, are exactly what we've been doing with our collaborative lifecycle management or application lifecycle management for years. And so we said, let's just fold it all under this term. And if that confuses you, I'm sorry. The other thing is, it's obviously not just about speed of delivery. It's about speed at getting feedback, and you guys all know this, but that's why there are a lot of circles and arrows and such on this slide. All right. I condensed a whole lot of slides onto this one. And I'm just going to give a few examples of the things that I see out there, because I do visit with a lot of clients around the world and talk to them about what is possible. But the state of the art, or rather the state of the norm out there, is that people doing these backend systems on the mainframe or on Power systems or what have you are using the most outdated tools you can imagine. How many people have heard of something called ISPF?

00:03:57

All right. There's a few people who are going to understand how awful this is. This is a green-screen editor that is over 30 years old. And I would assert that at least 95% of the mainframe developers are still using it, despite the fact that for more than a decade we have had Eclipse-based modern tools for doing COBOL development or assembler development, which are exactly the same as what you'd use for doing web or mobile or Java, or what have you. So there's a lot of room for improvement, and the same is more or less true for the team tools, you know, source management, defect tracking, that type of thing. Second problem, as already mentioned: these people are not getting any younger. So the challenge, of course, is to bring in new talent to take care of these systems and enhance them when folks decide to retire, which they will. Additionally, especially the team tools, but also the IDEs, are totally different, and they're disconnected from what the distributed web and mobile folks are using. And so you have a lot of time and effort wasted on coordinating across the environments. And then there is a lot of FUD out there.

00:05:11

I don't know who to attribute this to, but there was a lot of FUD saying, "Oh my God, they're not teaching COBOL in university anymore." Well, by the way, when I went to university almost three decades ago, they weren't teaching COBOL. Okay. I learned that very useful commercial language called Pascal. Okay. And then I went to work for IBM one summer and read Kernighan and Ritchie and wrote 3000 lines of C code, because programmers learn languages. And the people coming out of university now, you don't ever hear them say, "Well, I can't learn PHP. I can't learn JavaScript." Well, they can learn COBOL as well. You just pay them. They will learn it. Okay.

00:06:02

So anyway, that's one I like to get on my soapbox about just a little bit. All right. The other thing is manual testing is the norm. And if you can't automate your testing, that is going to be the bottleneck in your software development and delivery life cycle. And there's a lot of cross-platform coordination required and so forth. Now, what can you do in order to move at least in the direction of the unicorns and start down this DevOps path? When you're coming from the place where this backend development is, it's a tremendous challenge. It's a tremendous cultural change, but of course the opportunity is way greater than if you're coming from a more modern place. So the things that you can do: you can have modern IDEs, as I said, that are exactly the same as they're using for the distributed development.

00:07:00

In fact, we have one IDE that has everything from mobile support to COBOL and assembler, all in a single IDE. We have team tools that allow you to unify all of your development people onto one team platform, whether you're dealing with JavaScript or COBOL, independent of where it's going to run in production. You have a tool to integrate, and you let the tool do the coordination: the builds, deciding when to integrate into a build. If you have a defect that requires mobile changes and COBOL changes, it will do that coordination. All right. There's a lot more tools, but that's just an example. As you've heard, if you're trying to do culture change, you're not going to roll it out all at once. So start with a pilot, gain some success and some confidence, and then put together a rollout plan. And if you're talking about a large enterprise, it is going to take several years to roll out to hundreds or thousands of developers across, you know, dozens of teams.

00:08:04

And there's already been mention of having exec sponsorship. When you think about these backend teams, where they've been doing things for decades, inertia is the enemy. You're not going to have the grassroots adoption like you have in the distributed arena. Okay? So you must have the top-level support in order for the tree huggers to start changing. And once you start getting some success, people will get on board. Okay. But nobody wants to be first. People are very risk averse. All right. And there's lots of other things. Clearly you need to automate tests. Then, I think, worry about automating deployment. You can virtualize services or stub things out so you can test earlier. In the mainframe case, we have a solution for you to actually test off the mainframe, so you're not using MIPS. On the mainframe you have to pay for your cycles.

00:09:04

And so if you're going to increase your builds and tests by a factor of 50 or a hundred or a thousand, and you're paying for those cycles, the CFO will not allow it to happen. So move it off the mainframe. Have your own test environment owned by the development community, just like is done in the distributed arena. All right. So I want to give just a couple of case studies, and I'm going to do these very quickly. This one is a little bit eye-popping, but, you know, these are the customer numbers: by adopting agile and DevOps, developer productivity up 1600%. There is no company name on here, so it's hard to verify, but that's what they say. All right. By automating deployment, deployment time goes from up to two days to a few minutes. We all know that this is possible.

00:09:55

That's what automating deployment is all about. And automating impact analysis with a tool versus doing it manually: this client says, you know, 10 to 20 times faster. Again, you believe it, because that's what a tool can do, crunch through stuff. And when you're talking about millions of lines of code, trying to do that manually, you know it's going to be hugely error prone as well. VP Securities, a client I work closely with, did an automated language conversion from an old 4GL to a modern one that we have, with virtually no problems when put in production. And Casa east, they decided to consolidate from three team platforms and SCMs and so forth onto a single one. And they feel like it improves communication and integration and so forth a lot now. And this is my last slide, and we'll ask Carmen to come up in just a second.

00:10:51

And I know I've taken too much time. The one help that I would ask for, and it was actually the IDC speaker talking about it: we have a lot of metrics now. This one, okay, "cuts costs." Is it 1% or is it 75%? So we need to get a little bit more quantified results. And the other thing is to be able to translate things, not just in technical terms like "we deployed in two seconds." What does it mean to the business? So anything you guys can think about around business metrics, that helps convince the CFO and the CEO, not just the CTO or CIO. All right, Carmen. Thank you very much.

00:11:37

Hi. So I don't have Buzz Lightyear. I mean, I could have shown Peyton Manning eating the chicken parm sandwich. Anybody seen that commercial? Feedback for the marketing department. So, Nationwide: all aspects of insurance, pet insurance, financial systems, thirty-five thousand employees, lots of regulation in different industries. We have about 8,000 IT professionals. And we have now 105 agile teams, and that's growing at a rate of 35% a year. So 64% of our development work, new project build, is going through agile teams.

00:12:34

So how did we get there? I joined Nationwide nine years ago. We didn't have anything going as far as agile goes. Hayden, I think there was some reference at one point to an idiot. I don't know, there were unicorns and horses, and I guess I'm the idiot, but yes, go talk to the mainframe teams. If there's anything worse than talking about agile, it's talking to a mainframe team who hasn't hired anybody this century about continuous integration. And then it's like, you know, yes. So how did we get there? I think we started a lot like, I think, you heard about GE. We started with a small area. We did have some believers. We had what I would say dabbled in agile. We had started projects. We hired consultants. We did the project. We did the forming, storming, norming, performing, in whatever order that's supposed to be.

00:13:28

And then we declared victory. Everybody went off, and we had nothing persistent in our environment, because everything was project centric. So we said, look, that's not going to work. We've got to change our model. We've got to build agile teams around our most important assets, and bring the work to the teams, and keep those teams together, and invest in practices, both engineering practices and management practices, around agility. So we started with some true believers that were spread across the organization. We had the cover of a VP, a senior VP, which was very important to get started, because in the beginning, we talk about DevOps being a bad word, well, agile was a bad word, and if anything went wrong, it was those agile guys' fault, right? If, you know, the toilet overflowed: "Were those agile guys in there? What's going on?" Right? It's agile. Everything was agile.

00:14:22

So, you know, you need some time, because things are not going to go smoothly at first. You need some time. You need to get your footing. You need to establish some common practices and tooling. Yes, you need innovation, but you need some kind of standardization if you're going to have some kind of scaled approach. We have 2000 developers, so we're going to have to have some discipline to go along with that innovation. After about a year, we started to get results, right? And you'll see them on one of my later charts: quality was up, productivity was up, everything was good. And so then you have a story to sell, but in that first year, you need a little bit of cover. And it was investing in that concept of an application development center, and having the backing and some of the cover of management, that allowed us to begin.

00:15:13

We did start with systems of engagement. Okay. So we didn't jump right into a mainframe system, but we also were saying, we're going to apply this across all technologies. We're not going to build something that works just for one set. We're going to do this for Java. We're going to do it for .NET. And yes, we're going to do it for mainframe. So we did start very early on to bring mainframe teams into the application development center and apply some of these practices. And that makes you think about, what does continuous integration mean for mainframe? Right? You need a development environment. One thing we learned early on was we were getting a lot of value out of the development environment. The teams owned the environment. They could deploy constantly, multiple times a day. They would run all their tests. Anything that was broken, they would fix at the beginning of the next day.

00:16:08

We developed standards, patterns for provisioning those environments, so that new teams could get them right away. They had Maven, Jenkins, Sonar, security analysis, all those things in that environment, and we needed to do the same thing for our mainframe teams. And there is the ability within the IBM stack to start to bring some of that modernization, not just in the Eclipse IDE, but also in having a virtual mainframe that you can run and treat as a development environment. And it's not quite as slick. I'm not going to say, yes, it's just as slick, with push-button check-in that runs continuous integration, but you can meet the goals of having an environment that the team owns for their hard test cases. And the first time we gave it to them, I used to say, it's almost like bringing a Ferrari to people who don't know how to drive that much.

00:17:04

They look at it: "It's pretty, but I don't know if I want to touch it, and I really don't want to get in it." Right. Because it required a skill set they weren't used to. They weren't used to that system-programming level of really owning it. It's like, "Yes, it's okay, you can IPL the box." "Really? Don't I have to call five infrastructure people and then submit five requests?" "No, look, there it is." "Oh, wow. Call the police." So, you know, people say they want to be empowered, but sometimes you sort of challenge that: "Well, here it is, go run with it." Right. And we've started, and we do have traction now, with our mainframe teams taking advantage of some of those modernization tools. So what was the road to change? As I said, we sort of changed the model. As you've seen before in some of the other presentations, we have cross-functional teams.

00:17:51

So we have teams of 12 to 24 people. We have pair programming, you know, eight to 10 developers, testers. There's a product owner who sits with the team. There's a scrum master or iteration manager who sits with the team. We have an infrastructure delivery lead who is assigned to an area, who can help work through some of the environment issues and other infrastructure issues for that team. So if you look at it, although we weren't thinking about it at the time, we sort of took the DevOps model that Hayden described and we optimized the middle. We optimized the design, develop, test part of it. You have to start somewhere, and that's sort of where our investment was and is, and it's proved very successful from that perspective. So where we are today is, as I said, we have about a hundred or so of these teams, and they're growing at 35% a year, and this is not being mandated.

00:18:48

So we did not come across with a heavy hand and say, everybody's got to do this. We did say, if you're going to do agile, here's the way you do it. We also obviously have a culture of learning, a culture of continuous improvement. So it's not as if we feel we have the answers. We know we don't have the answers, and we continue to get better. But if you're going to do this: we have two organizations. We have the ADC, which is where we sort of had the pilot and where we've grown to scale this, and we continue to do new things there from a continuous improvement perspective. We also have an organization that I'm part of, which helps do the training and the practices, provides the standard tools, and sort of drives some of that roadmap, because you do need to make an investment if you're going to get 2000 people doing this at scale from a development perspective.

00:19:47

So as I said, we sort of optimized the middle, right. But we're not really satisfied with that, because if you look at the end-to-end process, we still have some issues. So from the middle perspective, everything's very good. It shows up in a prioritization. We have epics, we have stories. It shows up on the scrum backlog. We do visual management. It goes to a set of iterations. And then we mark it done, and life is good. But we have some problems. First of all, where do these things in the backlog really come from? Right? We have all these team spaces now, open spaces. We have tooling; we have some of the Rational tooling for the visual management. But it's like, where's the attachment to the actual business portfolio? Because we have a very complex business model.

00:20:39

We have around 20 different business units. So if you look at our main offices, you know, life and auto and homeowners and pet insurance and financials and retirement plans, right? They all have their own portfolio leaders and plans and everything they have to do. So that planning component is kind of separated. It's not visible. We sort of have this hidden factory going on, because the work gets planned, it's in spreadsheets, and then, through a lot of manual flurry of activity, it may show up in a backlog, and then it goes very smoothly through these iterations. But then where does it go? Well, we don't deploy. I mean, I liked the distinction made earlier between delivery and deployment. We don't have to deploy at the end of that iteration. It's still got to go through some managed environments, but as far as our visibility goes, it's sort of like it dropped off the face of the earth.

00:21:36

So we need to expand some of the agility and lean principles that we're using across that life cycle. So some of the things that we're currently working on are things like, how do we start with the beginning: the projects, the work requests. You know, we use Clarity for a lot of our initial planning work that starts it. We need visibility to that as soon as possible. A lot of the work we do impacts multiple applications. There's a lot of dependencies. You need to have visibility to that, because if you don't, then what's going to erode speed is going to be all those dependencies. We heard about that before. The other thing is trust. If you can't see what's going on and you have a bunch of manual activities, you're going to slow down, because you're really not sure what's around that next curve. So we're starting to work to build out the end-to-end model, which we have prototyped right now in the lab, where you start to see the release planning.

00:22:38

These portfolios are becoming visible. They're attached to the teams' backlogs. So you can start to do some flow leveling. I mean, DevOps and continuous delivery are about continuous flow. So you need to be able to see where you have opportunities to do work. If I don't know when I have opportunities to do work, I'm going to miss those. We're very good at delaying things. We're not good at looking for early opportunities to do work. Sometimes we have teams sitting around starved for work. Well, that's a form of waste. They're sitting around. They need work to do. And yet we don't have enough agility and visibility in our front-end processes to actually get that work to them, so they can continue to go through this design, develop, test model that we've created. And then on the other side is the deployment side.

00:23:26

We still have a lot of governance and a lot of, how shall I put it nicely? I mean, we have this thing called four to one, right? Which essentially guarantees we're going to delay every delivery by four weeks. And actually it's a good thing, though, because we had so much chaos around delivery, and you have to start somewhere. So we started with some standardization around scope lock, code freeze, scope freeze, task freeze. I'm sure this resonates with some of you out there, and that's okay as a stopping point. But if we're still doing four to one in five years, we've failed; we haven't gotten any better. Right. I was kind of joking with a release manager, and I said, "Well, if I take the software and I put it in the corner and come back a week later, did it get better?"

00:24:18

I don't think so. Time does not make software better. Right? It's the fact that we're doing all these manual activities to guarantee the quality of the software that causes us to go slow. It's like, again, driving a car. If you don't trust the road, if the weather's bad, you're going to slow down. Well, we were slowing down because we couldn't trust the quality of what we were producing. And one of the reasons we couldn't trust it was because we couldn't see the information, right. We have different information in different tools. In our quality system, we have information about defects. In our security system, we have information about security scans, right? Our deployment process was completely disparate from our release process, right? It was like another manual activity. Nobody knew what the path to production for these systems was. Unless you talk to the tech lead or the system engineer, you don't know how many environments they have and what they're going through. Do they have an IT? Do they have an ST1? Are they doing PT or are they not doing PT?

00:25:21

Where's their UAT? You don't know these things. And so because of that, it was a bunch of manual activity. And again, we didn't really have a model of the system. So what we're piloting now is that complete end-to-end model, through some integration capabilities. IBM is a great partner, but we have a lot of different tools: we have CA tools, we have other tools. So we're leveraging things like Tasktop to synchronize data across all that. So when a tester puts a defect in Quality Center, it shows up in Rational Team Concert as a work item to work the next day, right? If there are requirements getting done in RC, they show up in Quality Center so that you can write test cases against those requirements. So there's a lot of capability now to integrate tools, and to integrate practices first, because tools can only serve practices. If you don't have some common practices, it's going to be hard to implement common tooling. But you can integrate some of those tools in a way that lets you see the work being visible and also drive continuous delivery. So that's sort of the next hurdle: to take what we've learned in the agile development space and apply it end to end.
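The defect-to-work-item synchronization described above can be sketched roughly as follows. This is only an illustration of the idea, not how Tasktop, Quality Center, or Rational Team Concert actually expose it: the `Defect` and `WorkItemStore` classes, the `sync_defects` function, and the sample defects are all hypothetical stand-ins held in memory.

```python
from dataclasses import dataclass, field


@dataclass
class Defect:
    """A defect as recorded in a (hypothetical) test-management tool."""
    defect_id: str
    summary: str
    severity: str


@dataclass
class WorkItemStore:
    """Hypothetical stand-in for the development team's work-item tracker."""
    items: dict = field(default_factory=dict)  # keyed by source defect id

    def upsert_from_defect(self, defect: Defect) -> str:
        """Create or update the work item mirroring a defect; report which."""
        action = "updated" if defect.defect_id in self.items else "created"
        self.items[defect.defect_id] = {
            "title": f"[Defect {defect.defect_id}] {defect.summary}",
            "severity": defect.severity,
        }
        return action


def sync_defects(defects, store):
    """One-way sync: each defect maps to exactly one work item."""
    return {d.defect_id: store.upsert_from_defect(d) for d in defects}


# Illustrative data only: a tester logs one defect, then raises its
# severity and logs a second defect before the next sync run.
store = WorkItemStore()
first = sync_defects([Defect("D-1", "Quote page times out", "high")], store)
second = sync_defects(
    [
        Defect("D-1", "Quote page times out", "critical"),
        Defect("D-2", "Wrong premium rounding", "medium"),
    ],
    store,
)
print(first)
print(second)
```

Keying work items by the source defect's identifier is what keeps repeated sync runs from creating duplicates, which is the property that lets the team trust what shows up in its backlog.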

00:26:48

So here's some of the results I talked about: critical defects. You know, we're getting to the point where we find very few defects in system test. We're able to reduce those intervals. Productivity, on-time delivery. Our system availability is higher. So, you know, the kind of things that you would expect when you're starting to apply some of these practices, and you're finding things earlier in the process, when they're less expensive to fix, and you have a lot of automated testing that you can rely on so that you can embrace change.

00:27:29

So, to the Yodas out there, what could we use help with? I think if you've listened to me, you know I need a lot of help. So, this whole planning thing, right. We need to get better at agile planning, right? We still have a yearly planning cycle. So even though we've sort of changed everything behind the curtains, we've tried to keep the business sort of out of it as much as possible. But at some point the business has got to be more engaged. Yes, we have a product owner who sits on the line and helps drive the team. But the actual people in the business, at senior levels, the people who are planning the work, have to get off this yearly cycle and start to be more agile, and also break things down into smaller chunks that we can feed through the factory.

00:28:19

You know, product ownership for shared applications is a problem, right? I mean, the model sounds nice when you have somebody there who's the product owner and can prioritize. But if you have five different business units all using this application, and all thinking their work is the most important, how do you deal with that when you're trying to schedule the work in the next set of iterations and releases? You know, the whole fear of silos, you've heard it, right. How do you overcome the fear of leaving the comfortable silos? One of my favorites is ITIL. We have a very strong ITIL presence. I've had some passionate conversations about how people want to optimize the stack: service, problem, change. And when they talk about release, I say, "No, no. Release can't go there. Release is part of delivery, continuous delivery." And that's when you get called an idiot again.

00:29:07

So yeah. How do you deal with that? Right? I've seen some talks saying, well, ITIL and this, they can all work together. But I'm looking for suggestions on how to make some of those things work, because you have to sort of pick what you're going to optimize, and there can be some clashes there. And then again, executives, right? Executives like things planned. They like certainty. And this is creating more of an adaptive mindset, right? It's saying, for the next three months, we know what we're doing; beyond three months from now, I don't know. It depends what we find out. Well, that's not going to get you many points working with executives. Right. They would rather see the next 18 months planned. That's sort of counter to where we're trying to go here. And then metrics.

00:29:56

And I appreciate some of the metrics that have been shown, because I'm going to use them. I think one of the things we're trying to sell is, I really liked that S&P metric. I mean, finding that, I mean, we could have all been on vacation. But anyway, those metrics are important, because we don't really even have a metric right now around delivery speed. You know, I like the saying "time is the new currency." I mean, I was in a meeting with executives, well, one executive, and I said, "You know, this is going to help us get faster at delivery." And they said, "Well, do we want to get faster?" This is being taped, like on the streaming site. I could be fired after this, but I'm

00:30:46

Shouldn't have signed that waiver. I don't know. I think we want to get faster. So, I mean, you know, you can be very successful. We didn't hit rock bottom. Right? We had a talk about how, well, you hit rock bottom. Hitting rock bottom, that's bad. By the way, that's good, because then people are ready to change. If you do heroic work to keep from reaching rock bottom, and you hide all these things, people will say, "Well, it's not really a problem. What do we really need to do?" So those are the kinds of things I need help with. And anyway, thank you very much.