We are currently transforming how work gets done, more specifically how we deliver software in a consistent, secure, reliable and efficient manner. This experience report focuses on a large customer-facing value stream servicing over 80 million accounts. It is made up of 25 Agile teams with approximately 150+ geographically distributed engineers. We were previously using a vendor product running on a mainframe and manually releasing core capabilities to production on a monthly cadence. Today, we have transitioned to a reactive microservices architecture in the cloud that has enabled us to achieve far greater throughput. With close to 60 pipelines frequently releasing in small increments (~4300 releases over the last 12 months), we have minimized risk by maturing our engineering practices and increasing our adherence to automated control gates. This talk is centered around the close collaboration between Business, Engineering and Product to understand user needs, as well as the alignment on shared roadmaps. The goal is to partner on achieving business AND tech outcomes: delighting our customers while also being well managed. Was this easy? We are here to share our learnings, the challenges we faced, and how the team overcame them to achieve the goals set in place!
Biswanath (Bose) Basu
Senior Business Director, Anti-Money Laundering - Machine Learning and Fraud, Capital One
Director, Technology Engineering, Capital One
Director, Product Management Delivery Experience, Capital One
I met Dr. Tapabrata Pal in 2012 and was immediately in awe of what he was trying to do at Capital One. He has presented about their amazing DevOps journey here at DevOps Enterprise almost every year of the conference. Topo has presented solo. He's presented with a director of IT governance and, amazingly, with one of their internal counsel who helps support their open source efforts. And this year he's helped make possible one of the most amazing experience reports yet. This experience report is from the credit card division, one of the largest business units inside Capital One. Their story is told by a trio: their business leader, their engineering leader, and a product owner of an enterprise shared service who supports them. Bose is currently senior business director of anti-money laundering for machine learning and fraud. His previous role was as the business and product owner responsible for building the credit card servicing platform that this experience report is about. Rakesh Goyal is senior director of technology engineering, who was brought in as Bose's technology counterpart, and Jennifer Hansen is director of product management for delivery experience, whose charter is to help developers across the entire enterprise be productive and secure. Their platform, offered as a shared service, supports over 50,000 builds per day. Here are Bose, Rakesh, and Jennifer.
I hope everyone's enjoying this awesome DevOps Enterprise Summit. I wanted to start by thanking Gene and the program team for bringing this community together yet again to successfully learn and grow from each other's experiences. We are here today to share just what can be achieved when business and engineering teams come together, unite, and collaborate to improve the overall customer experience. I'll start by introducing myself. My name is Jennifer Hansen. I lead the product teams for internal delivery platforms. I am passionate about empowering our engineers in building great products for our customers. Over to you, Bose.
Thank you so much, Jennifer. My name is Biswanath Basu. I go by Bose. I'm a senior business director at Capital One and currently lead the fraud strategy in the anti-money laundering area, using a lot of machine learning and AI techniques in the service of improving our anti-money laundering practices. I was also the business and product manager leading the effort that we will be talking about, which we did in US Card. So that's a little bit about me. I'll hand it over to Rakesh to introduce himself.
Hey folks, I am Rakesh Goyal, and I currently lead our customer identity platform. Today I'll share with you my experience leading delivery for the significant three-year transformation of the card servicing platform and, more importantly, how I influenced my initially skeptical business partner, Bose, in this journey.
So for those of you who are not as familiar with Capital One, I thought I would share a few fun facts about who we are. We turned 25 years old this year: still a baby in the financial services industry, but known for our compelling ways and our determination to disrupt and innovate. We have millions of accounts, over 90 million of them, but for us every single customer counts, and we are passionate about the experience. We have close to 49,000 employees, and over 10,000 of these are engineers. We've made it to the Fortune 500 list for 20 years, which is amazing. We are known among the top 10 DevOps leaders because of all the passion that we have around the entire transformation journey. And finally, I think we are one of the largest digital banks today. So as I move to the canyon, one of the things I wanted to call out is that it's an amazing experience when you try to cross a canyon, but it doesn't happen overnight.
It requires a lot of planning, preparation, and core fundamental elements to get to the other side. So within Capital One, we've been on an Agile and DevOps transformation, and this has been underway for the last seven-plus years. Some of the things that are notable in our transformation, and that were also impactful during this particular journey, include moving from waterfall to Agile: fundamental changes transforming how we work. In addition, we had a lot of outsourced vendor software, and we made the commitment that we needed to in-source. We needed flexibility. We needed agility. And we love open source. We're a company where open source first is a mindset that exists all the way from the top right down to every engineer on the team. In addition to that, we had a lot of monolithic architectures, and we needed to move to microservices. And then finally, we had really optimized our data centers, but we realized that we could never quite match the value proposition of being in the cloud.
And I'm extremely proud to say that today we are all in on the cloud: no more data centers at Capital One. So as we think about this journey, I want to call these out because they are foundational elements that drive transformations within business units as well. The most important passion that we have retained is that of doing the right thing for our customer. I wanted to play a short audio clip for you on the value the transformation can bring to your business: when you have customer information at your fingertips, you can improve the service experience for a customer who is looking for your support. So go ahead and listen to quite an amazing story during this current time.
Capital One, this is Sarah. Who am I speaking with, and what can I help you with today?
Um, I'm calling because, um, I just tried to use my card, um, so that I can put gas in my car. And well, I was just at a gas station and I want to use my cards. So I have, um, what they called that overdraft protection plus. So if I go ahead and overdrafted or, or, you know, I have it's the next day grace period, but it doesn't let me do it.
Sure. No, I completely understand your frustration, everything. I can't do anything about the overdraft line of credit option that you have, but what I can do is I see there was four fees that you received in March. I can go ahead and refund those. That'll give you a little bit more available balance so that you can get gas.
Oh, wow. That's that? Wow. Um, I, I mean, I, I didn't call for oh, wow. Okay.
It's the least I can do. I just feel bad that the overdraft line of credit option that you have for the next-day grace isn't helping you when you need it to.
Hey, no problem at all. Yeah. Yeah, no problem.
It's great to have you,
Speaking of customers, let me transition you over to Bose. He will tell you all about the journey from a business leader's perspective.
Thank you so much, Jennifer. Before I start, let me give a little bit of context on the business problem we were facing. The problem at its very core is that of an aging customer servicing platform. To give you a sense of the scale, this platform is servicing tens of millions of Capital One credit card customers and generating hundreds of millions of dollars in annual bottom-line value to the business. So it is a really critical platform, both in terms of the value it provides to the customer and the shareholder or economic value it provides to the business. The challenge was that this was an aging platform, and it had the classic problems of aging that one experiences on any platform. In fact, as I grow older, I'm experiencing some of those myself. But if you go to the next slide, you can see some of the problems that we were facing with an aging platform.
We had problems with not meeting customer needs. Customer needs have evolved over time, so we were not meeting them. We had a bunch of batch processes that were slow and inefficient. Those are things that we really don't expect in the current day and age, especially given where we are. And last but not least, the primary horsepower that Capital One runs on is data. We want to use the power of data to make strategic, intelligent, real-time decisions, and the platform that we had was not really allowing us to do that; it simply was not constructed that way. So when you think of the objective here, not only are we solving a technology and cyber risk problem with an aging platform, there is also a lot of business value, in terms of tens of millions of dollars in NPV, as we strive to solve these problems for the business. As we started this work, what were some of the principles that we agreed to? The first one, as you can see, is around working backwards from the customer need.
Customer needs are absolutely quintessential. We exist to serve our customers, and our customers might be varied customers, but one of the core principles we worked on was making sure that we are meeting customer needs, meeting them where they are. And quite frankly, we wanted to get an A-plus in terms of where we wanted to go. The second objective that we had in mind was to make sure that as we do this, we are iteratively delivering value, maximizing learnings, and minimizing risk. We knew this was going to be a long project, at least a couple of years. So we wanted to make sure that as this project went on, we were able to deliver value to the business as well, both in terms of customer experience and in terms of the actual economic impact on the business.
And the last one that I would call out here is a bias that we all face when we've been used to working in an older system for a long time, which I call the anchoring bias. What we were very, very conscious of right from day one was to make sure that we were not building a faster and stronger horse. What we were really trying to do here was solve a problem, and that is, in this analogy, to move from point A to point B in the fastest and most efficient way possible. So with these guiding principles, let me tell you a little bit about how we actually went about doing it. The first thing we did was essentially look at the set of customers that we had within our portfolio and divide them into different segments and groups of customers based on (a) what their needs are and (b) what functionality they need
in order to be serviced. As we identified them, we graded them on the sequence in which we would deploy, or rather test, as we went on. This was by no stretch a big bang; it was a purely risk-based approach. What we did, essentially, was take a thin slice, just like the picture of the cake shows you. We would try that thin slice and see what works and what doesn't. There's a key point here as we talk about thin slices: as much as we were looking for an MVP, we were not looking for the lowest common denominator. We were looking for the minimum viable experience that we would give to our customers, not just any small product that we could come up with. Once we tested that piece out and it worked, the next thing we would do is essentially scale it up.
And that's what you see on the next slide. This is an actual slide for the first few months of delivery. In fact, in the second month, our first release was just converting two accounts. And trust me, we had tons of learnings as we were doing it. It's almost like sending a man to the moon, where you don't know what's going to happen going from the old system to the new one, and making sure those people are safe. So there were a lot of great learnings in this very carefully calibrated agile approach, which was giving us the learnings and minimizing risk while also giving us a lot of value. I'll finish my part about the business problem with what I mentioned earlier, the customer-back thinking.
A lot of it is about strategically thinking about who our customers are. Our customers are not just the credit card holders. They are our regulators. They are the business analysts within the company who are working on it. They are our customer servicing agents. There's a ton of people who are actually the users of the system, and we used very heavy human-centered design to ensure that we were actually meeting their needs and not just replicating what was there in the old system. So with that said, now that I've given you the business context, I'll hand it over to Rakesh, who will lead you through some of the technology work that went in the service of this.
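The thin-slice, risk-based rollout Bose describes (segment the customers, start with a very small cohort, then scale up once it works) can be sketched roughly as follows. This is only an illustration: the `Segment` fields, the risk score, the starting slice of two accounts, and the tenfold growth factor are assumptions made for the example, not details from the talk.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Segment:
    """Hypothetical customer segment; fields are illustrative only."""
    name: str
    risk: int       # assumed score: lower means safer to migrate first
    accounts: int


def plan_waves(segment: Segment, first_slice: int = 2, growth: int = 10) -> List[int]:
    """Start with a thin slice, then grow each wave until the segment is converted."""
    waves: List[int] = []
    remaining, size = segment.accounts, first_slice
    while remaining > 0:
        batch = min(size, remaining)
        waves.append(batch)
        remaining -= batch
        size *= growth
    return waves


def rollout_plan(segments: List[Segment]) -> List[Tuple[str, List[int]]]:
    """Order segments from lowest to highest risk and plan waves for each."""
    ordered = sorted(segments, key=lambda s: s.risk)
    return [(s.name, plan_waves(s)) for s in ordered]


segments = [
    Segment("segment-b", risk=3, accounts=5000),
    Segment("segment-a", risk=1, accounts=222),
]
plan = rollout_plan(segments)
```

In practice each wave would be followed by a validation step before the next one starts; that step is omitted here to keep the sketch short.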
Thank you, Bose. Wow. So when I started in this role and got introduced to the goals that we just saw from Bose, I was truly struck by the transformation plans. I had led various large-scale transformation initiatives. However, in this case, we had to migrate from a system frozen a few decades ago in terms of technology. What we had was a mainframe-based vendor product that had been bandaged to the point where the workaround systems and operational teams were as large as the product itself. You see this old beat-up car: that's what we had been used to driving and maintaining for years and years. So what we needed was a modern system to deliver on the business promise. And yes, we had to run everything in the cloud. So you see the shiny car, and you may have heard various common debates about building a Chevy versus a Cadillac, or a Prius versus a Tesla.
Regardless, we needed a new core to host these tens of millions of accounts, serving all the business segments that Bose just shared. It was going to be a multi-year journey with a large number of engineering teams. Our business partners wanted reassurance while we changed the engines of these cars: we could not in any way jeopardize this heavily regulated, billion-dollar business. So where did we get started? Well, early in the cycle, we recognized the need to invest in and evolve the tools in our toolbox. To borrow from Lincoln, we needed to sharpen our saws before we took on this journey, and we decided that investment in tooling was really required. We also had to invest in reskilling engineers and provide them the appropriate tooling to be agile in this transformation journey. When we shared the decision with business partners, of course, there were skeptics about investing in DevOps, as no business features would be delivered for the first short time period. As I share this journey and its outcomes, you're going to see the benefits of such an approach. So, to have a feature-rich platform meeting the needs of our customers, we settled on building an API-driven, microservices-based architecture.
The goal was sustaining and building incrementally, as Bose suggested, and expanding into various business strategies. You can think about this as having a fleet of smart cars built for specific workloads rather than one futuristic car. To normalize the developer experience across various teams, we spent a couple of months investing in DevOps. We took a pragmatic approach, leveraging proven enterprise tools that would work for us as opposed to getting distracted by shiny technologies; there are always many options. Standardizing helped us keep the engineers' skills transferable. We could react faster in situations where engineers needed to contribute to other teams or move from one team to another. We got laser-focused on building our CI/CD pipeline, as it would provide a tremendous lift. It would empower the teams, improve cycle time, and, importantly, reduce risk by enabling small, fast, and frequent releases.
We also had to address regulatory and compliance controls, in that we would block releases as part of the pipeline when certain controls were not met. Once built, the pipeline became an integral part of the project operating model. As we started to build and release incrementally, it enabled the teams to focus on product features, while the pipeline tools were just a utility to leverage rather than something requiring investment from each individual team. This core infrastructure allowed us to scale the number of teams working on the project. We had teams distributed geographically at four of our people centers, and at the height of this effort, we had 25 teams working and contributing simultaneously. Just imagine the scale: using this approach of agile, incremental delivery and microservices, we were able to scale out, with all teams delivering at the same time at their own pace.
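A pipeline that blocks a release when controls are not met, as Rakesh describes, could look something like the minimal sketch below. The `Build` fields and the three gates are hypothetical examples chosen for illustration; the talk does not list the actual controls or how they were implemented.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Build:
    """Hypothetical facts a pipeline might know about a release candidate."""
    tests_passed: bool
    critical_vulnerabilities: int
    change_record: Optional[str]  # assumed ID linking the release to an approved change


# Each gate is a (name, predicate) pair. This list is an assumption for the example.
GATES: List[Tuple[str, Callable[[Build], bool]]] = [
    ("automated tests pass", lambda b: b.tests_passed),
    ("no critical vulnerabilities", lambda b: b.critical_vulnerabilities == 0),
    ("change record attached", lambda b: bool(b.change_record)),
]


def failed_gates(build: Build) -> List[str]:
    """Return the names of all gates the build does not satisfy."""
    return [name for name, check in GATES if not check(build)]


def can_release(build: Build) -> bool:
    """A release proceeds only when every gate passes."""
    return not failed_gates(build)


good = Build(tests_passed=True, critical_vulnerabilities=0, change_record="CHG-001")
bad = Build(tests_passed=True, critical_vulnerabilities=2, change_record=None)
```

Returning the list of failed gates, rather than a bare yes or no, gives the team immediate feedback on what to fix before the next attempt.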
A big part of such a transformation is also how you organize on the people front. We followed the two-pizza principle, mixing subject matter experts with legacy skill sets and those skilled in target-state technologies. This also helped with reskilling the engineers with legacy skills and keeping them motivated. To bring out the contrast, think about the past: with a vendor product, you have to deal with collecting requirements and lengthy product development cycles, followed by testing and release cycles. It was very formal, with handoffs amongst the various stakeholders. To overcome its shortcomings, we had to build costly workarounds. Let me share with you how we operated on this project. While we scaled across geographies, we took an approach where we co-located the product owners and engineers to make sure they were able to work together closely.
This allowed us to stay agile and release frequently with a rapid feedback cycle. I wanted to share one unique situation where all of these came together remarkably well: the customer-first approach, the tooling, the co-location, and the agile principles. Our servicing agents were facing some difficulties with a release. The product owner would collect that input overnight, get the team to work on it, and accept the release, and the agents would see the updates the following day. This was awesome compared to what we saw with the vendor product, when we couldn't even make a change to the product by ourselves. But it wasn't good enough for us. The team decided to take a different approach: the entire team, the product owner and the engineers, decided to travel to the call center. They worked side by side with the agents and made several releases at the site itself, receiving continuous feedback as they addressed all the concerns. I can say it was truly liberating to make changes and releases at will. Just think about how you can make a change and release it all in the same window. For all of this, we relied on Jennifer's enterprise delivery experience team and her support in enabling this rapid release cycle.
Thanks, Rakesh. So I have to tell you, growing up, I enjoyed success at many a track and field event, but the one that made me most nervous was the relay. Like any other team event, it requires strategy, teamwork, and practice, but success is so critical when you do that baton handoff. In a similar manner, you have to translate all the verbal and non-verbal needs flawlessly on the day. Despite all the awesome work Bose and Rakesh had done, we became the holdup, in the spirit of shared services and doing the right thing. I know a lot of you out there work in, live, and breathe shared services each day. We have this passion to empower our engineers to meet the business needs and to get speed to market. However, at the same time, we have this need to be well managed and to demonstrate it to our board, our regulators, our risk partners, and our shareholders.
So this is always a challenge. What we had done within shared services was design a meticulous, well-maintained, streamlined process for how to review risks, categorize them, and escalate them all the way up for the right level of review and oversight. As you can imagine, this wasn't quite what was needed. The automation and thoughtfulness that the teams had put in were now running into manual checkpoints. We needed these to avoid non-compliance. Segregation of duties is a critical component, especially when you think about financial systems, and we had this understanding of how you meet the SOX controls: you do need this segregation in roles to occur. But then we realized that this was not the partnership that we needed it to be. Yes, these pre-release controls were essential. Governance was essential. But what was happening was that this well-oiled assembly line was running into manual checks. We were doing those pre-deployment, pre-release checks, identifying issues at the last minute, and sending them back into the assembly line.
Now, if you had a good day, you could get cleared instantly, and maybe in the next one to three days you could get on that road and get moving. But remember those inspections that sent you back? At times, these could result in days and weeks of delay. So this was not a fun process and not the optimal partnership we wanted. What we needed to do was think about the entire partnership end to end, come together, and understand what it is that we wanted to achieve collaboratively. We wanted to promote growth and innovation. We knew we wanted to deliver high-quality working software, but faster. And we also knew that if we were to be an organization that could attract and retain talent, we had to foster that culture of innovation. I think Rich Alita says it best. We take so much time to recruit great people, and we were actually holding them back in some ways from the opportunity to partner collectively and be great.
So what we needed to do was build trust. However, how do you build trust? I have to steal from Jez and David's book. It's an awesome quote: how can you automate something that isn't repeatable? And just because we put all our checks in manually, it doesn't mean that we are error-free. An issue could come up because something is done differently each time. What we needed was continuous delivery. We needed to figure out how to simplify and standardize our patterns. We needed to reimagine the entire experience. We wanted to get to an automated release. We needed to mitigate risk. So how could we build in some preventive checks? Security, vulnerability remediation: we needed to get ahead of these. These were critical enterprise objectives that we couldn't compromise on, but we also wanted to improve the engineering experience. We wanted to increase productivity.
We wanted to truly believe and trust in the journey, and to do that, we had to empower our engineers. The more we shifted left, the more they understood the value, and the more they appreciated building, fixing, and incorporating the changes that we needed and being able to move faster. So these were some of the thought processes that were underway in the shared services area. So how were we going to do that? By now we had thousands of teams. We had multiple lines of business. We wanted to give them the autonomy and the flexibility that they needed. So we needed a software delivery clean room. Yes, very much building upon the clean room analogy, we needed to ensure that the things that were important to us, things that made a difference in terms of quality, speed, vulnerability remediation, and reliability, were built into the pipeline.
These are just some examples of the checks that we have within our software delivery clean room. It allows you to go ahead and think about the control differently. Yes, we had a SOX control. Yes, we needed segregation. How could we achieve this in a more automated manner? These became the combined discussions that we started to have as we built out the clean room. The good thing is we knew what the target state looked like. We wanted to get to the no-fear release for the business. We knew that we needed a push-button deployment: the ability to deploy at will to production, as Rakesh called out, based on business demand. But we needed those monitors. They needed to be integrated into the pipeline. They needed to be recording and monitoring compliance, quality, and security, all the elements that we needed assurance on.
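One way to picture an automated segregation-of-duties check that also records evidence, in the spirit of what Jennifer describes, is the sketch below. The rule used here (at least one approver who is not the author of the change) and the shape of the evidence record are assumptions for illustration; the talk does not specify how the actual control was automated.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Union


@dataclass(frozen=True)
class Change:
    """Hypothetical change record; field names are illustrative."""
    author: str
    approvers: FrozenSet[str]


def independent_approvers(change: Change) -> List[str]:
    """Approvers other than the author, sorted for a stable audit record."""
    return sorted(change.approvers - {change.author})


def check_segregation(change: Change) -> Dict[str, Union[str, bool, List[str]]]:
    """Evaluate the control and return a record the pipeline could store as evidence."""
    reviewers = independent_approvers(change)
    return {
        "control": "segregation-of-duties",
        "passed": bool(reviewers),
        "evidence": reviewers,
    }


self_approved = check_segregation(Change("alice", frozenset({"alice"})))
peer_approved = check_segregation(Change("alice", frozenset({"alice", "bob"})))
```

Because the check produces a record every time it runs, the same step that enforces the control also supplies the audit trail, which is what removes the need for a manual pre-release review.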
Once we were able to achieve this, I'm happy to say we could offer this capability back to the team, which was a really exciting part of the journey. So coming back, what were the key outcomes as a result of this partnership? We definitely saw improvements in terms of experiences. Bose called out the human-centered design thinking that was at the core of his entire strategy. Rakesh was focused on the cloud-based, real-time analytics and servicing platform to help the business move faster. We ended up with operational efficiency, and today we have a scalable, resilient platform and more sustainable risk management. In addition to that, we've been able to empower our engineers. Our auditors are delighted that we have thought through compliance and been able to automate it in real time. The testimonials I really want to share are the ones from the call center agent.
The agent is excited about focusing now on servicing and meeting customer needs. The business analyst who had been waiting two days for data can now work in real time and support our customer base. The back office agent has no more of those Excel sheets and notes to sift through. And finally, our leadership is equally excited about this tech transformation journey because it's serving our customers better. Now, I know we always ask: to be a high performer, do you have to compromise between throughput and stability? Well, this is where we got to: we went from a once-a-month release to 240 times the deployment frequency. I'm going to use some of the DORA benchmarks on elite performers to give you a comparison of how we are doing. Our delivery lead time has improved significantly, both for feature and infrastructure releases. Mean recovery time is equally important; we're getting faster and improving that. And what about change failure rate? It's a lot lower than where we were. So you can see that both throughput and stability are possible. Now, there are some key learnings that I want to come back to, and I would love for Rakesh and Bose to come in so we can talk about what we thought and what we did. So over to you, Bose.
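The four measures Jennifer refers to (deployment frequency, lead time, mean time to restore, and change failure rate) can be computed from deployment records along the lines of this sketch. The `Deployment` record and the sample data are invented for the example and are not figures from the talk.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean, median
from typing import List, Optional


@dataclass(frozen=True)
class Deployment:
    """Hypothetical deployment record; timestamps are illustrative."""
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None


def hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def deployment_frequency(deploys: List[Deployment], days: int) -> float:
    """Average deployments per day over the observation window."""
    return len(deploys) / days


def median_lead_time_hours(deploys: List[Deployment]) -> float:
    """Median time from commit to production."""
    return median(hours(d.deployed_at - d.committed_at) for d in deploys)


def change_failure_rate(deploys: List[Deployment]) -> float:
    """Share of deployments that caused a failure in production."""
    return sum(d.failed for d in deploys) / len(deploys)


def mean_time_to_restore_hours(deploys: List[Deployment]) -> float:
    """Mean time from a failed deployment to restored service."""
    return mean(hours(d.restored_at - d.deployed_at)
                for d in deploys if d.failed and d.restored_at)


t0 = datetime(2020, 1, 1)
day, hr = timedelta(days=1), timedelta(hours=1)
sample = [
    Deployment(t0, t0 + 2 * hr),
    Deployment(t0 + day, t0 + day + 4 * hr, failed=True,
               restored_at=t0 + day + 5 * hr),
    Deployment(t0 + 2 * day, t0 + 2 * day + 6 * hr),
    Deployment(t0 + 3 * day, t0 + 3 * day + 8 * hr),
]
```

Tracking the two throughput measures alongside the two stability measures is what makes it possible to show that one has not been traded for the other.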
It all starts with having a vision and the paradigm shift that we wanted to make. A very common trap that business leaders fall into is thinking of the constraint first and then the objective. We wanted to think big; we were looking for an A-plus. So the big learning here is that we have to think big and we have to think bold. When we started, we had a ton of constraints. We had never done this before, but that wouldn't stop us from thinking. So if there's one key learning that we got, it was around the ability to think as big and bold as we possibly could. And of course, iterative delivery is something that I've mentioned; one cannot undermine the value of that. There is tons and tons of value in terms of maximizing our learnings, delivering the NPV or business value that the business is looking for, and minimizing the technology risk.
I don't think we would have gotten there without the tremendous partnership. But most important of all, I think what really struck me was that entire culture of acceptance and empowerment. We took a shift and started to go towards blameless postmortems: let's try to understand what we need to do to improve, and iterate through it. So I think that was huge for me.
I think this is an important one, and it creates the right dimension for the problem, in the sense that the change is about transformation, not just in terms of technology, which is extremely challenging, but also in terms of the human element. Think of the agents that Jennifer talked about: you have thousands of agents who've been working for years with an old system. How do you get them comfortable using the new system? How do you get business analysts comfortable using new systems that in some ways are completely different from the way we've worked in the past? So in terms of key learnings, something that we as a team would love to hear from the group is the experiences people have had not just with the technology transformation, but also with transformation in terms of winning people's minds and hearts. I think that's an equally important challenge as we go through this technology transformation, both within the company and, I would say, in the industry overall.
With any transformation and new technologies, people need to learn new ways. People have habits, right? And we really need to figure out how we can help people start thinking of change as being for the better. Any help would be appreciated there.
And I think the most amazing thing is you can't undermine the need to transform, but at the same time, you have to understand the human change curve. It's equally important for your business, and having looked at it multiple times, the change management and communication strategy is a critical part of your transformation. So I guess we are really open. We hope you've enjoyed the session. In the spirit of Halloween, which will be here soon, we don't want you to live with fearful releases. We want you to feel empowered and your engineers to build trust. Think about how you too can get to a no-fear release, push-button deployment model. And finally, we are extremely excited to share the transformation, but we would love to hear about your journey and the things that you would have done differently. So you've got our names; go ahead and reach out and let us know what you think.
Absolutely. Thank you.
It's a pleasure talking to all of you.