Fear to Hope - How HCSC Became Nimble Through Experimentation During Peak Demand

Does seasonal demand impact your ability to implement change? From insurance to retail to entertainment, peak demand can limit an organization's capacity to take advantage of one of the best times to run and learn from experimentation. In this talk, attendees will hear how one of the largest insurance companies in North America evolved to conduct experiments during their open enrollment season.


Focused on business results, HCSC teams addressed missing feedback loops in their delivery engine to be more responsive to insights gained from customer experiences. They addressed this feedback with fast experiments for delivering features, which made it easier and more compelling for customers to sign up for HCSC’s plans. From frozen to flowing, HCSC’s strategy set teams up to be nimble during peak season.


This session is presented by Tasktop.

Tashfeen Mahmood

Senior Manager, DevOps, HCSC

Dominica DeGrandis

Principal Flow Advisor, Tasktop

Transcript

00:00:12

Hello, my name is Tashfeen Mahmood, and I'm the senior manager of DevOps engineering at HCSC.

00:00:19

Hi everyone, I'm Dominica DeGrandis, principal flow advisor at Tasktop.

00:00:26

Today we will tell you the story of how we at HCSC evolved our mindset about implementing production change during peak demand season. For us, that's our open enrollment period. We will discuss how we went from a fear of making change in production to being hopeful about the learning that we will get from it, how we went from a place where changes were frozen to one where business value started flowing to production, and how we went from being distanced from our business to having shared business goals and objectives.

00:01:11

So here's an overview of HCSC by the numbers. HCSC is Health Care Service Corporation, but you may know us as Blue Cross Blue Shield, and we operate in five different states: Illinois, New Mexico, Oklahoma, Texas, and Montana. It's funny, because a lot of times when I tell people that I work at HCSC, it doesn't get a lot of reaction. But then when I tell them that it's Blue Cross Blue Shield, I get a lot of different expressions on their faces. So we're a large company. We have about 24,000 employees and we serve 17 million members. So that's 17 million souls for whom we are the healthcare insurer. We're very proud of the fact that we have been one of the world's most ethical companies for five straight years now. That's us. We're a large, conservative healthcare company, and that's going to play into some of these slides coming up next.

00:02:25

All right. So I'm going to tell you the story of our IT journey. I will talk about these milestone moments in our history in more detail in subsequent slides. At a glance, I'll tell you about our big IT transformation that happened in 2016. Then we will focus on the open enrollment period, which constitutes our period of peak demand. This is like Black Friday for retail, in that our systems get stressed more during this time. In this regard, we will discuss our change management practices and talk about the various types of production freezes. Then I'll tell you about how zero downtime deployment completely changed our perspective about learning potential, how it led us to start to measure product metrics, and how it shifted our mindset to experimentation.

00:03:32

Before 2016, we did not focus much on DevOps practices. Deployments were often done by an enterprise team. We would email them our deployment files, or sometimes put them into a drop folder, and they would pick them up from there and do the deployments manually in a lot of cases. So things were not really great, and our leadership, especially our CIO, saw the problem with that and wanted to do an IT transformation. So we set about doing a product and DevOps transformation. In the first stage, new roles were defined and a new product-based organizational structure was created. These products were defined as collections of applications, and the roles were aligned to those products. In the second stage, agile practices were created, modern DevOps tools were selected, and automated pipelines were created. Then we trained the practitioners who had been aligned to those product teams on those agile and DevOps practices and concepts. Once that had been done, these practitioners were given the ability to start running sprints. They then worked in a scrum model and started observing agile ceremonies. And that's pretty much where we are today. We have continuously improved since then. We continue to create more automation and do more process updates.

00:05:34

As great as that transformation was, it still left some challenges. The first challenge was that our transformation was centrally driven. It was top-down, and our practitioners felt that the change was done to them rather than for them. This made it really hard for them to own this new model. The second was that this was only an IT transformation; the business was not really involved in it. So our work from the business still came in through projects. We in IT would then convert it into products, do the work, and then collect it up in projects and deliver that work. This meant that we were still pretty far from our business. The third challenge was that our organization was complex and highly matrixed. As you can see on the right-hand side, we have product management teams that sit in product lines and technical teams that are in resource pools. Together they build a product team, and that makes for a really complex and interdependent organizational model. The teams that work within this construct then create really complex and interdependent releases, which fed into the change management mindset, which was the fourth challenge. The change management mindset was one of avoiding change: because the releases were so complex and large, we did not want to take on that pain too much.

00:07:23

This was especially true in our period of peak demand, which again is open enrollment. Our enterprise goals were to go for stability, and especially around open enrollment they were very focused on production reliability, with goals like "production is job one" and "make sure production is stable." However, this focus was not really countered by any goals that prioritized learning or fast delivery of business value. Therefore, even up until 2018, we used to implement an enterprise-level production freeze during the open enrollment period. This meant that no change was allowed unless absolutely necessary. In 2019, our change management policy evolved a little bit. We allowed only necessary changes to production; however, we added quite a bit of rigor. Questions like "Why can't the change wait until after open enrollment?" were asked with every release. So to make a change in production, you had to claim that it was really absolutely necessary and that, without that change, [unclear]. So yeah, we had production freeze, production slush, anything to kind of discourage change.

00:09:04

So the difference between the freeze and this slush, was it subtle? Did it evolve, or was there one sort of incident that occurred that enabled this slush to be allowed?

00:09:20

Yeah, so that's a good point, Dominica. I think a lot of the reason behind the slush was that we kept looking at our change management analytics, and we kept finding that, no matter what, the change volume was still there. People wanted to, or needed to, make these production changes. So the production freeze really was not working, as hard as we were trying. That's why we started going to a model of "let's just add more rigor; let's just try to get those changes done that we really, absolutely need to get done."

00:10:09

So in 2020 we did something which I call a backward story, and this is what I mean by that. A lot of DevOps transformation stories begin with "the business had to move faster" or "delivering faster was needed to compete in the marketplace," et cetera. Not us. Especially during open enrollment, our business was sold on the idea of production stability by not making changes to production. In the past, production changes were scheduled on Friday night. That way, if something went wrong, we would have the whole weekend to recover from it. In 2020, however, my DevOps engineering team worked with our retail team to start open enrollment by making zero downtime deployments. And this was not a trivial change either. This was the change that started open enrollment, and we did it on a Wednesday afternoon, which raised eyebrows. This meant that we could use a blue-green deployment environment as a fallback: if something goes wrong during the release, we do not take the whole weekend to recover; we just back out to the blue environment. The risk of making changes in production was now significantly lower. Our business saw the potential and now wanted to deliver faster. So we actually went from building a capability, to our business seeing the potential of that capability, to the business now having the thirst for speed.
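
The blue-green fallback described here can be sketched roughly as follows. This is only an illustration under stated assumptions: `BlueGreenRouter`, `release`, and `back_out` are hypothetical names invented for the example, and nothing here reflects HCSC's actual tooling. The idea is that a release goes to the idle environment while the live one keeps serving customers, so backing out is a traffic switch rather than a weekend of recovery.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BlueGreenRouter:
    """Hypothetical router that sends live traffic to one of two environments."""

    live: str = "blue"  # environment currently serving customers

    @property
    def idle(self) -> str:
        """The environment that is not receiving live traffic."""
        return "green" if self.live == "blue" else "blue"

    def release(
        self,
        deploy: Callable[[str], None],
        healthy: Callable[[str], bool],
    ) -> bool:
        """Deploy to the idle environment and cut over only if it is healthy.

        Returns True if traffic was switched, False if it stayed where it was.
        """
        target = self.idle
        deploy(target)  # the live environment is untouched, so no downtime
        if not healthy(target):
            return False  # fallback: traffic never left the old environment
        self.live = target  # cut over to the new release
        return True

    def back_out(self) -> None:
        """Switch traffic back to the previous environment after a bad release."""
        self.live = self.idle
```

With this shape, a failed health check leaves customers on the old environment, and a problem found after cut-over is handled by calling `back_out()`.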

00:12:02

Yeah. And there's also a bit more trust there too, since you were able to show them that you could do this zero downtime release, and they caught on: "Oh yeah, maybe we can get a bit of change out there and accept that wee bit more risk."

00:12:20

Yeah. Nothing like building trust by actually showing that you can do it, right? You can talk all you want about doing something, but when you make a change, and when you start something as consequential as open enrollment with a zero downtime release, that really gets attention. All right. So earlier this year, in 2021, we started looking at our product metrics, and this is a snapshot of the metrics of one of our very typical products. You can see here that during the year-end freeze, delivery slows down. As you can see in the rectangular box in the top right graph, as soon as open enrollment starts, sure enough, the velocity does go down. We are delivering less to production. But the rectangular box at the bottom shows that the work in progress continues to go up.

00:13:35

Right? So what was happening to us was that we were not making any changes in production, but we were still working on those changes. Our work in progress was continuing to go up, but it just was not going into production. The other thing that was very interesting, and I call your attention to the oval on the top right-hand graph, is that frequently there's an urgent change that we needed to make in production that just could not wait until after open enrollment. We can see this is still in the open enrollment period. Suddenly there was a spike, and we needed to deliver some features. And this was a very typical pattern. This is something that happened with many of our products: within open enrollment, after the initial dip in velocity, we then went back up and started delivering more.
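
The two measures on this slide, velocity and work in progress, can be computed from nothing more than start and finish dates. The sketch below is illustrative only: the `WorkItem` type and both function names are assumptions made for this example, not the product that produced the graphs in the talk.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass
class WorkItem:
    """Hypothetical work item with the two dates needed for flow metrics."""

    started: date
    finished: Optional[date] = None  # None means the item is still in progress


def flow_velocity(items: Iterable[WorkItem], start: date, end: date) -> int:
    """Count items delivered to production within [start, end], inclusive."""
    return sum(
        1
        for item in items
        if item.finished is not None and start <= item.finished <= end
    )


def work_in_progress(items: Iterable[WorkItem], on: date) -> int:
    """Count items started on or before `on` that are not yet finished that day."""
    return sum(
        1
        for item in items
        if item.started <= on and (item.finished is None or item.finished > on)
    )
```

A freeze shows up in these numbers exactly as described: `flow_velocity` for the frozen window drops toward zero while `work_in_progress` keeps climbing, because items are still being started but not finished.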

00:14:43

So the learning here was that we are invariably going to need to deliver more before 1/1, right before the new year. That really means that we have a higher need for agility, even during the year-end freeze, even during our period of peak demand. But the thing that was prohibiting us from being agile was the work in progress that was continuing to pile up. The more work in progress we had, the harder it was for us to make these changes fast, and the harder it was for us to be agile during the period when we really needed to be. The other part is that we had neglected our technical debt, because we were focused on delivering features just before open enrollment started, since there was going to be a freeze. And because so much technical debt had accumulated, we really did not have the agility to deliver as fast as we wanted to for that spike that was coming. We weren't really prepared very well for it.

00:16:13

Yeah. It's also interesting to note that WIP is increasing and there's a slowdown in delivery, and the throughput starts to tank, velocity starts to tank. But then there's all kinds of problems, right? There's war rooms happening, and people not having capacity to attend what we would typically call training sessions, or to fix technical debt. But the forces were there and the pressure was on, and so the teams really worked hard to try and get that delivered. But then we can see the impacts went way past the open enrollment period, past year end, and into the following year. We can see, for the beginning of 2021, the impact that it had on flow time as it increased.

00:17:12

Yeah, yeah. These times, right, Dominica? These are the times when we were supposed to be meeting with somebody and they'd be like, "Sorry, I'm too busy right now, because I have all this work to do." And this is what was happening. Suddenly things were spiking up. So this was a very interesting time for sure.

00:17:38

And also sometimes there's a last-minute "Oh, we've got to cancel, I can't make it, I'm in a war room." New priorities that pop up.

00:17:50

Right. Right. Yeah, that's exactly right. We had some formalized interviews with our practitioners that confirmed this issue. The kind of things they were saying was "everything's a priority." Others were complaining that there's too much work being done all at the same time. And we had already seen this with the metrics. We could see that the work in progress was high. We could also see that there was a lot of technical debt: the closer we got to open enrollment, the more we were neglecting our debt, and now it was coming back to bite us. So here's what we did. We started identifying some business goals, such as improvement of customer retention rates or reduction of customer acquisition costs. And the idea behind that was that we wanted to be closer to our business.

00:18:55

So let's go to the next slide. Yeah. So we wanted to be closer to our business, and we tied the flow metrics, like work in progress, to these business results. Now that we knew what business goals we were going for, we tied our IT metrics to those. Our hypothesis was that the reason product teams think that everything is a priority is that they are not focused on the business results. Therefore we were trying to close that gap between the business and the IT team by making those business results become shared goals. Now each member of the team is behind improved customer retention rates and lower customer acquisition costs. Only the features tied to customer retention and acquisition costs are a priority. We set up experiments that will impact these business results while lowering our WIP and tech debt. This way it was easier for us to become nimbler and respond to any emergency needs during our peak season. This sharpens our focus on fast delivery and feedback. When you learn that a slight tweak can improve the goal that you as a group believe in, then you want to quickly make that tweak so you can positively impact that goal. So this was very powerful for us.

00:20:32

And so how were the insights gained and communicated between your IT teams and business folks when you started doing these experiments, with the flow metrics and the data to show the progress that was being made? Was that something that was occurring on a monthly basis, or at a quarterly review? What did that look like, just real briefly?

00:21:02

Yeah. So there are what we call the product line retros, or the product line retrospectives, and that's where we look at the business results that are being tracked. And really, what we were trying to go for was not really business results at this point. We were trying to show our business that we were prepared for meeting those business results. This is work that we're doing to prepare ourselves, so that there's less work in progress and we can build the agility up during open enrollment. So that's really the concept that we're going for within our team.

00:22:08

And by building agility, are you talking about making sure that the teams have more capacity to do this type of work instead of being so overloaded?

00:22:20

Exactly right. The less work is in progress, the more capacity we have to quickly make those changes that inevitably come up during open enrollment. And the less tech debt there is, the greater our ability to make those changes in production as fast as possible.

00:22:49

And that translates into flow distribution, with technical debt showing up on the very right bar there.

00:22:57

Yeah, thank you for calling that out. So this is the flow distribution map here. Green is features, red is bugs, and purple is tech debt. As you can see, this product was not really fixing, or addressing, a lot of their technical debt until recently, until we called out this need here and started to talk about how technical debt needed to be reduced in order to deliver faster. So this is something that we are now trying to focus on, reducing technical debt so that during open enrollment we can work on that.
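
A flow distribution like the one on this slide is simply the share of completed work that falls into each category. The sketch below assumes each completed item carries a type label; the labels `"feature"`, `"defect"`, and `"debt"` and the function name are assumptions for illustration, mirroring the three colors mentioned above rather than any particular tool's schema.

```python
from collections import Counter
from typing import Dict, Iterable


def flow_distribution(item_types: Iterable[str]) -> Dict[str, float]:
    """Return the fraction of completed items in each category.

    An empty input yields an empty distribution rather than dividing by zero.
    """
    counts = Counter(item_types)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kind: count / total for kind, count in counts.items()}
```

For example, a period with six features, three defects, and one debt item gives a debt share of 0.1, the kind of thin bar that would prompt a team to allocate more capacity to technical debt.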

00:23:49

So this slide is really talking about how we have learned that experimentation, especially in the period of peak demand, is paramount. We learned that allocating capacity to reduce technical debt is really the way to go, because the more we do that, and the more process improvement we do and visualize with our flow distribution metrics, the more nimble we can be in addressing the needs of our peak demand season. Our year-end processing starts in October, and our open enrollment is going to start in early November. So we have our fingers crossed, and hopefully we're going to learn a lot from our experiments.

00:24:53

I'm really hopeful too. You want to make these daily improvements and improve the process for making daily improvements. That's one of the five ideals of The Unicorn Project. It's the third ideal, right? Make time, allocate capacity for continuous improvement, for daily improvements. And so we'll be able to see here the impact that it has on your business objectives. All right. Thank you.