Amsterdam 2023

Go Faster, Break Less: DevOps Transformation at HSBC

An interactive account of HSBC Technology’s DevSecOps transformation programme featuring David Keane, Global Head of DevSecOps Transformation.


This talk will provide rich insight into transformation from concept through to operational delivery, revealing the real and ‘in the field’ impact for employees and HSBC customers.

David Keane

Global Head of DevSecOps Transformation, HSBC

Transcript

00:00:00

<silence>

00:00:13

The first talk of the conference comes from David Keane. He is global head of DevOps transformation at HSBC, a global bank that operates in 70 countries with 236,000 employees, of which 50,000 are technologists. I'm so delighted that David is presenting for a variety of reasons. Certainly one of them is the massive scale at which David has influenced the technology practices across his organization, which I suspect will astound you as much as it has astounded me. Another reason is that David attended this conference in 2015, 2017, when he saw Jon Smart present when he was the head of ways of working at Barclays. And David thought, I want to be able to get to the point where we can share our story too. So I'm so delighted that six years later, David is here to share his story with us. So here's David.

00:01:07

Thank you,

00:01:13

My memory, actually, of that discussion that Jon gave was this sense of relief. You know, my God, we're not alone here. There was somebody else experiencing what it's like to attempt to do a transformation in a big, old, traditional kind of bank. And it was a relief to understand that there's other people with the same struggles. So I work for HSBC. I'm sure many of you know HSBC, the brand. If you don't: big, old, traditional bank. It's about 160 years old, present in 70 or so countries. I remember when I joined about 12 years ago, I was lucky enough to be invited to an open forum like this with the newly minted CEO of HSBC at the time.

00:02:00

Stuart Gulliver was giving a talk, and there's one line from that that really has stuck with me down through the years. He said he's often asked: HSBC, the Hongkong and Shanghai Banking Corporation, headquartered in London, is it a Chinese bank or is it an English bank? His answer was, well, it's neither. It's a Scottish bank. And with apologies to any of my Scottish friends out there, his reasons were: one, we're tight with money. I can confirm that from personal experience. Two, we're deeply conservative. But three, we endure. And so he saw those as characteristics of HSBC, and I think it was probably the best introduction to the culture of the place that anyone could get. I hope it gives you some idea of what it's like trying to do a transformation.

00:02:58

Some of the challenges that you're going to come across: as Gene was saying in the introduction, it's a large bank. You know, I think there's 50 countries in the world that have a smaller population than HSBC has staff numbers. We are regulated in all the countries, so that's 70 regulators. We have three different business lines: we have an investment bank, we have a commercial bank, and we have a retail and wealth bank. So all these things lead to great complexity and maybe don't lend themselves easily to a transformation of this sort. So my own DevOps journey started in 2014, by a printer. You remember those things, when we used to go to meetings and you had to print everything out before you went? We don't do that so much anymore. But I was based in the UK at this time, and I was running a large part of IT operations for the investment bank, as well as doing some transformation activities.

00:03:59

And the printer, as it happened, was next to a colleague of mine, Peter Towns, who was leading the charge, really, on all things agile, agile ways of working, within the FX department. And just as I got to the printer, our CIO rocked up, Richard Herbert, and he kind of collared myself and Peter, and he said, hey guys, what do you think about this DevOps thing that Gene Kim has just invented? Fortunately, Peter was more clued in than me, and he was able to give some fairly convincing answers. Anyway, as Richard walked away, he turned back and he looked at us and he said, Peter, you're dev, and David, you're ops. Go do this DevOps thing for me, right? And so with that very careful, you know, career planning.

00:04:43

A few weeks later, I found myself leading the DevOps transformation for the investment bank, and then a number of years later, for the wider bank. Now, the reasons that we're trying to do it, there's three really. One is competitive advantage. We have to be able to deliver functionality to our users, to our customers, at pace. You might even say it's survival. We need to be able to respond to cyber threats and other events much more rapidly than we used to do in the past. We also know from the State of DevOps Report from 2014 that firms with high-performing IT organizations are twice as likely to exceed their profitability, their market share, and their productivity goals. So it's good for your business, right?

00:05:32

And also, finally, most importantly, we wanted to make it a better place to work. So how do we go about that? Now, this is not a 13-year transformation plan. That would be a bit brave. So don't take this back to your management, to your board of directors, and say, I have a plan for 2036. This is more a look back at the main events over the last 10 or so years. So where did we start? A small group of innovators. They taught us the art of the possible, right? With all the headwinds that you have in HSBC, they showed us what you could do. We landed on the strategy, somewhat inspired by Accelerate, but it was based on and focused on the speed of delivery.

00:06:16

And over the time, we built numerous capabilities, but I would say two that really stand out. We agreed a small set of metrics, and we automated them so that we could know where we were heading, and there was no extra toil involved for the teams in order to understand that. And we also went after some of our biggest blockers. Now, in a highly regulated environment, you have 70 regulators in different countries, 40 million customers. It's not surprising that IT controls are a big deal for us. So going after that as a blocker was one of the other big things that we did. So, where did we start? Well, FX Evolve is a foreign exchange platform for institutional clients. So it's not like me coming here, getting my hundred pounds sterling turned into 120 euros.

00:07:00

This is multi-billion dollar trades that it handles. It needs to be available 365 days a year, 24 hours a day. So it's an incredibly important application for the firm. We had a bunch of really highly skilled engineers that joined to work in this team, sort of through 2012, 2013, and they'd been used to working in a very agile way. So I think the Scottish bank was a little bit of a shock to their culture. And they found a platform that wasn't that stable and a very strained relationship with the business, a very traditional business-IT relationship, a poor one. And then disaster struck. The application, the system, went down, and it was out for 48 hours, for two days. This was unheard of.

00:07:53

I think if it had been down for two hours or half a day or something like that, they would've just fired somebody. They'd have fired lots of people, but they'd have carried on, right? But because it was down for so long, they had to do, you know, desperate times, desperate measures. They sat down and they talked, and they decided, the business and IT, that they couldn't continue to operate in the way that they had in the past. Something had to change. So from an IT perspective, they started doing a bunch of things, but they included things like looking at their structure. They were very traditionally structured, lots of silos. They had a dev team, they had a support team, they had a testing team. They had a bunch of BAs and the rest of it. And over time, but pretty quickly, they decided to get rid of all of that.

00:08:33

So everybody had to carry the pager. Everybody, right? Everybody did deployments. BAs learned how to do deployments, management as well. And if you didn't like that, you were thanked for your service, but encouraged to go somewhere else, right? From a business perspective, the role of a business product owner really started to become much more pronounced in that group. So the culture changed in that group. Quite a big change in the culture, and noticeable. And they decided amongst them that one thing they needed to focus on was to release more frequently into production. They used to do it at that time maybe once a month, right? And they challenged themselves to do it once every two weeks, then once a week, right?

00:09:18

If you met that team now, they release into production a hundred times or more a day, right? And they do it 365 days a year if they want to. They typically don't do it at the weekends. And if you were to walk into a meeting with IT and the business, you would really struggle to tell which is which, right? The empathy levels between the two teams are such that it would be really, really hard to tell. Now, for me, I was a bit nervous about this. I have to admit, the idea was we needed to go faster, and that was going to improve production. I don't know what the worst day of my career was, but I'm pretty sure it was a Monday. For somebody who's worked in operations for a large part of their career, you're familiar with all these complex releases that have taken months to prepare and have been greatly tested, going into production on a Friday, and then you discover on a Monday, well, maybe it wasn't so great.

00:10:10

So there was this very direct correlation that I'd seen and understood all of my career: more change was more failure. The idea that more change was going to give us less failure wasn't something that came naturally to me, or was native to me. That's why this graph is so important, right? 'Cause for the first time it showed and demonstrated to me, and to many others, that if you do all of the small things right, if you do small, de-risked changes, if you automate all the things that you need to, if you've proper product ownership in place, you can not only break this correlation between change and incident, you can send it into reverse, right? And that's what this group did. So our mantra became: go faster and break less. Now, we realized, as we tried to broaden this across the organization, that we needed to have a small number of metrics that everybody could agree on, right?

00:11:03

So we landed on the DORA metrics, very, very happily. Going faster and breaking less: so the number of changes and the number of incidents were the two most important ones to us. We took changes one step further. We came up with this very beautiful acronym, P D p D, P d d D Y, which I'm sure you've all guessed is production deployments per notional 10-person team per year. And that allowed us a rudimentary way to measure maturity, essentially, from any part of the organization. And we use that to this day. Incidents, I think everyone understands. So you could look at a team's release frequency and its incidents and judge them against each other. More recently, we've added lead time to deploy, which is quite helpful, and change failure rate, more useful,

00:11:50

I would suggest, for spotting anomalies than some of the others. As important as agreeing what these were, and that's not simple in a place with as many opinions as silos, was to make the data points available really easily to everybody. So today we've got 5,000 pods in HSBC. 50,000 people, say roughly 5,000 pods. For every pod, we can look at these numbers, right? And everybody can see it entirely transparently. So from a pod all the way up to enterprise, we can tell how people are going. So we needed to move on to the investment bank. So our next task, if you like, was how do you grow from that small idea of one team to a whole department?
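As a rough illustration of the metrics described above, here is a minimal sketch in Python of how deployments could be normalised to a notional 10-person team and rolled up from pod to a larger group. The talk does not describe HSBC's actual tooling or data model, so the `PodYear` fields, the function names, and the pod figures are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Iterable

NOTIONAL_TEAM_SIZE = 10  # the "notional 10-person team" mentioned in the talk


@dataclass
class PodYear:
    """One pod's figures for one year. Field names are illustrative assumptions."""
    name: str
    headcount: int
    production_deployments: int
    failed_deployments: int
    incidents: int


def deployments_per_notional_team(headcount: int, deployments: int) -> float:
    """Scale yearly deployments to what a 10-person team would have shipped."""
    if headcount <= 0:
        raise ValueError("headcount must be positive")
    return deployments / headcount * NOTIONAL_TEAM_SIZE


def change_failure_rate(deployments: int, failed: int) -> float:
    """Fraction of production deployments that led to a failure."""
    return failed / deployments if deployments else 0.0


def roll_up(pods: Iterable[PodYear]) -> dict:
    """Aggregate pods into one set of numbers for a department or enterprise."""
    pods = list(pods)
    headcount = sum(p.headcount for p in pods)
    deployments = sum(p.production_deployments for p in pods)
    failed = sum(p.failed_deployments for p in pods)
    return {
        "deployments_per_notional_team": deployments_per_notional_team(headcount, deployments),
        "change_failure_rate": change_failure_rate(deployments, failed),
        "incidents": sum(p.incidents for p in pods),
    }


# Made-up example data, not real HSBC figures.
pods = [
    PodYear("pod-a", headcount=8, production_deployments=400, failed_deployments=20, incidents=3),
    PodYear("pod-b", headcount=12, production_deployments=240, failed_deployments=12, incidents=5),
]
for p in pods:
    print(p.name, deployments_per_notional_team(p.headcount, p.production_deployments))
print(roll_up(pods))
```

Normalising by headcount is what lets an 8-person pod and a 12-person pod be compared on the same scale, and summing before dividing keeps the roll-up weighted by team size rather than averaging ratios.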

00:12:35

So that's one application to a thousand applications, a hundred people to 6,000 people. Leadership really was a game changer. So we knew our mantra. We knew that go faster, break less was the idea to sell to people. We'd proven that, but we got lots of pushback. You know, there was lots of excuses. It's hard. I've got legacy systems. My business don't understand. It's going to rain tomorrow. Whatever it was, right? There was lots of excuses. So we sharpened down on the go faster, break less, and we made it double and halve. You need to double your releases and halve your number of incidents, each year and every year. We got that message out. Now, when we told people that, there was a few caveats that came with it. They're kind of important. If you're in charge of, like, a service line, a department, you might have 50 systems.

00:13:22

You might have a hundred systems. We didn't dictate that you needed to double them all. If you needed to quadruple one and flatline another, that was perfectly fine with us. You needed the outcome for your entire organization of double and halve. Really importantly, also, it was multi-year. First year, people are used to fads. You know, management say you have to do this this year, but next year they'll have forgotten, they've moved on. We're in year six of this in the investment bank. So people realizing this wasn't a fad, that it was going to stay, was a very important thing. And finally, it's an OKR and not a target. And that's really important, because you want people to be ambitious. Doubling is ambitious. Going up by 10% doesn't change the dial. Anybody can change by 10% by working a little bit harder or cutting something out. If you double, and you do that every year, you have to tackle the hard architectural issues or whatever else it is.
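The "double and halve" rule, measured for the whole service line rather than per system, can be sketched in a few lines of Python. This is only an illustration under assumptions: the system names and numbers are invented, the function names are hypothetical, and the previous year's totals are assumed to be non-zero.

```python
from typing import Dict, Tuple

# system name -> (production releases, incidents) for one year; made-up figures
Year = Dict[str, Tuple[int, int]]


def totals(year: Year) -> Tuple[int, int]:
    """Sum releases and incidents across every system in the service line."""
    releases = sum(r for r, _ in year.values())
    incidents = sum(i for _, i in year.values())
    return releases, incidents


def double_and_halve_progress(previous: Year, current: Year) -> Tuple[float, float]:
    """Return (release ratio, incident ratio) for the whole service line.

    The ambition is a release ratio of at least 2.0 and an incident ratio of
    at most 0.5. Assumes the previous year's totals are non-zero.
    """
    prev_rel, prev_inc = totals(previous)
    curr_rel, curr_inc = totals(current)
    return curr_rel / prev_rel, curr_inc / prev_inc


def compound(baseline: float, yearly_factor: float, years: int) -> float:
    """Project a figure forward when it changes by the same factor each year."""
    return baseline * yearly_factor ** years


last_year = {"payments": (10, 8), "ledger": (10, 4)}
this_year = {"payments": (40, 3), "ledger": (10, 3)}  # one quadrupled, one flat

release_ratio, incident_ratio = double_and_halve_progress(last_year, this_year)
print(release_ratio, incident_ratio)

# Six years of doubling versus six years of 10% growth, from a baseline of 100.
print(compound(100, 2, 6), round(compound(100, 1.1, 6), 1))
```

Returning ratios rather than a pass or fail flag fits the OKR framing in the talk: a service line that reaches 1.8 times its releases has still moved a long way. The `compound` comparison shows why 10% a year "doesn't change the dial" next to doubling.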

00:14:12

So having that ambition was really, really key. This is a quote from one of our business leaders in the investment bank. They're big supporters, such big supporters, they think they invented this now, and we like to let them believe that <laugh>. But there's a story, I think, that maybe illustrates this a little bit better. Giles and some of his team were visiting a client, a shipping client, in Asia, I think in Thailand. And just as they were leaving, one of the guys says to them, you know what, I'd love if I could cut and paste that thing and stick it into my spreadsheet over here. That would really, really help me. And he didn't expect the answer. And the answer was, yeah, we'll do that, and you'll have it tomorrow.

00:14:50

And we did. We delivered it the next day. And that's a really small thing. But we compete on two things, really, with all of our clients. One's our price point, our fees, and the other one is our client relationship. That client wrote back to us the next day. He says, I'm literally, I'm stunned. You know, I expected I'd ask for this, and you'd say, well, we've added it to the backlog, and if lots of other people like it, we'll do it next year in version 2020 x. But to turn it around the next day, I'm stunned. You know? So look, we don't have to work that hard with that client anymore. He's stickier. He's gonna do more business with us. And so we've delighted our customer. It's a really simple story, but I think it tells, you know, the difference that this can make for our business.

00:15:29

So having done the IB, we have to get on to do the entire bank. So now we're moving to eight and a half thousand applications and 50,000 people, right? So we know our mantras. We know our go faster, break less. We know our double and halve. We went back to the nine different CIOs across the bank, all across their own departments. And we had the same thing. You know, oh, we can't do it here because the business don't like it. Or, we're not the investment bank, the business are more risk averse. And the weather, and all the usual things, right? And 2022 was the first time we got all of the CIOs to agree to a target, not an OKR, that they would increase release frequency by 10%, right?

00:16:12

We pushed all the other things out, you know, the metrics and the other enablers that we'd given them. And they surprised themselves. They hit 40%, right? And guess what? In 2023, double and halve is the mantra for everybody. But if you're trying to change an organization this size, you have to change the culture. And so you have to appeal to the 50,000 people that we have as engineers. And so in the program that I'm running, we really treated the engineers as our customers, right? So putting feedback loops in with our customers was key. And we listened to what their biggest bugbears were. And for us, it was the IT controls. So we went after that big time, right?

00:16:56

We looked at the intersection of where you had the highest number of releases and the simplest control story. And we found this journey, which was the journey that we were trying to push them towards anyway: the simplest software release, that only had to do four controls in order to get out into production. We automated the hell outta that. We simplified the hell outta that. But we treated the controls as products. That was the really, really key differentiator. And I happened to be the control owner for two of them, so it was a little bit easier for me to eat my own dog food. But we treated the engineers as customers, 'cause they were the people consuming these processes, and they'd never been listened to before. They'd always been treated as the enemy, somebody to be suspicious of.

00:17:34

And so that was a real game changer. And I think it's helped shift the dial for us. In 2022, because of the work that we did, we were able to eliminate 35,000 days of toil from the engineers' experience that year, and seven and a half centuries of wait time. In HSBC, it's big numbers. Seven and a half centuries of wait time removed. That's just something that pleases any engineer. So I think we're winning the hearts and minds of that community too. And in March, we had a spotlight, an internal TV event, where we invited people to a talk about our new DevSecOps strategy. Now, maybe a fairly dry topic, but 11,000 people turned up to it, right? Now, it's voluntary. You had to register for it, and you had to take the time to either listen live or listen to it later on.

00:18:31

And I can only think that they were doing that because there was something in it for them, right? What was the next thing that was going to save them time and money? So our group CIO, you know, he recognizes this change. And again, referring back to the State of DevOps Report from 2014, that showed that there's a close correlation between organizations that went faster and those that were more profitable. So we've proved that at a small scale for Evolve. We've proved that at a medium scale, if you like, for the investment bank. And what we're doing now is trying to prove it for the enterprise. So that's where we've gotten to. We've learned a lot along the way. I don't have time to list all the things that we didn't do so well.

00:19:11

What would I say that did work well? Transform from within. Get outside advice, get help, absolutely, but your teams must know they have to do this for themselves. Have a simple set of metrics, a simple message: double and halve. If you have to communicate with large numbers of people across multiple different countries, you can't get to them all. You've got to tell them that. Leadership is key. The experience with moving to product from project is important. The IT controls thing for us was a huge game changer. What not to do: don't tell smart people what to do. Agree with them what the outcomes should be, and then get out of the way. Beware fake news, right? I put that in there because too often people celebrate a success story, but engineers will know it's not true.

00:20:06

Or a manager will say, we're done. You know, we're a hundred percent agile, or 82% DevOps, or whatever it is. And people see that for what it is. So try and avoid those things. Last thing I'll say is, I think I'm going to do a Q&A session at midday in ballroom four. If anybody has any further questions, I'm happy to take them. I have a few questions of my own, be warned. So I'd love some help around how you do tech-for-tech funding, and also, you know, moving from correlation to causation for ROI in a heterogeneous environment. That would be something that I would personally like some help on. Gene, thank you all very much.

00:20:49

Alright.