Dynamic Decision Making at HMRC - Moving to Data as a Product During a Pandemic

In March 2020, the UK’s tax department HMRC was faced with a complex problem: how to improve the accuracy, reliability, and truth of information about customer activities, in order to counter fraud AND adapt to the speeds that the situation required, ensuring that every claim was reviewed in 72 hrs?

As the UK government announced statutory sick pay, a job retention scheme, and self-employment income support, the Customer Insights team at HMRC began to transform the Customer Insights Platform (CIP).

CIP is an internal platform built on top of AWS, which provides highly available transaction monitoring and auditing to internal customers across HMRC. This session tells the story of how the Customer Insights team transformed CIP from data storage to data product capabilities and built an intelligent risking service to bake automated risking decisions into customer activities.

Millions of claims were submitted by taxpayers, and every single one was automatically assessed for fraud detection, supporting the work that HMRC fraud investigators were doing in very difficult circumstances. This was a revolution in the speed of data analysis within HMRC. Meanwhile, the thinking of the Customer Insights team was transformed, from data storage to data as a product at scale.

This session will include details of CIP architecture, the delivery mindset of the Customer Insights team, and the working practices that delivered such successful outcomes that the Customer Insights team received a personal note of gratitude from the Prime Minister, and CIP became a multi-award-winning platform at the 2021 UK IT awards.


Andrew Letherby

Service Owner of the Customer Insight Platform, HMRC


Caitlin Smith

Delivery Lead, Equal Experts



Thank you, Christina and Robbie. So some of my favorite presentations from DevOps Enterprise Summit have come from UK HMRC, Her Majesty's revenue collection department. In 2016, Anthony Callard and Lindsey Brewer talked about how they modernized one of the largest IT estates in the UK, modernizing the tax filing system so that a single parent could finish their filing on their phone while on the bus ride home from work. Last year, Ben Conrad and Matt Hyat presented on how HMRC were able to distribute hundreds of billions of pounds to UK citizens and businesses, an unprecedented financial support package that would eventually see around 25% of the entire UK workforce being supported by public money. And they heroically built the technology to do so in four weeks, under conditions of incredible pressure and uncertainty. They joked that they went from the most despised government agency to the most beloved, as citizens were able to get the money they needed when most of the economy was starting to shut down.


But they had also mentioned something about a parallel effort to theirs that was working to prevent fraud from happening in this process, because they knew from the beginning that these COVID economic interventions would be obvious targets for fraud and that customers could make mistakes. And so this is that story, from Andrew Letherby, service owner of the Customer Insight Platform at HMRC, and Caitlin Smith, delivery lead from Equal Experts. They describe how they were able to use data that they had and capabilities they had built to detect, and even prevent, certain types of fraud long before any payments were made, to ensure that economic aid went to those who were entitled to it. They were able to do this by working closely with their leadership teams and their government ministers. And instead of policy being created and modified in a way that took years, they worked hand in hand, making policy changes in minutes, enabling true decentralized execution. They will share the story of how they did this and how this will likely change how technology works with policy makers for years to come. Here's Andrew and Caitlin.


Hi, my name is Andy Letherby. I'm the service owner for the Customer Insight Platform in HMRC, and today it's gonna be myself and Kate Smith. If you'd like to say hello, Kate.


Hi, I'm Caitlin Smith. I'm an agile delivery consultant with Equal Experts. I've been working with Andy for the last couple of years on the Customer Insight Platform.


Yeah, thanks Kate. And today, what we're gonna be talking to you about is some of the developments we've been doing around dynamic decision making in HMRC: what the pandemic taught us, and some of the experience we had in the pandemic around how we innovated our platform and our products to really help out in the delivery of those services. So who are HMRC, and who is the Customer Insight Platform? HMRC is the UK's tax and revenue service. So we collect things like income tax, so self-assessment, VAT and corporation tax for customers, but actually HMRC is much bigger than that. We are a very large government organization in the UK. We handle significant public services. So alongside tax, we also manage the customs declaration service, handling import and export of goods into the UK.


We manage the government banking service, which delivers financial services to government organizations, helping to simplify payments and repayments across government. And the Customer Insight Platform, my area, we sit alongside all of those different products and services in HMRC, looking at how customers, taxpayers, traders and companies interact with HMRC's digital services: the information they give us, the devices they use, how they interact, when, and what information they share with us. And we gather all of that data together, and we use that data in a range of different decision making across the organization.


So our old approach, what did we do? HMRC is a fairly traditional organization; we've been around for a while. And our analysts in the organization tended to work like lots of analysts do. They'd like to gather data together, collate it, match it, make some decisions or model it, and then do whatever their specific role or task was. This is a really manual process. It's grown up over many, many years, and they liked to maintain that process 'cause it gave them control and confidence in what they were doing, but it had some issues with it. And the biggest issue is that it's not scalable. It's not sustainable, particularly when you dial the volume of transactions up. So as business gets busier and you've got the same number of analysts available, it becomes really challenging for them to process all of that information manually for the volume of work that's coming through the door.


From a Customer Insight Platform perspective, we saw ourselves really as fulfilling one of those roles: collecting data in persistent event streams and storing that data in large data storage. We were quite distant from people making policy decisions and making decisions about services. And we tended to be a bit of a thorn in people's sides, 'cause we'd say, you know, you've gotta make sure you log information, you've gotta provide us data so that we've got that available for different use cases. What we didn't do was share our views on that data. Our structure was really quite rigid. It was established around fairly traditional models of a project asking us to deliver a particular view on data and a capability. We would then build that thing and then make that available.


So it favored predictability, predictable structures, over flexibility, but all on that baseline view of: we collect as much data as we can about these things so that we've got a bit of flexibility in how we can respond. And then the pandemic happened. And we all know about that. That was a really interesting challenge for everyone. All of a sudden, as organizations, we were told: you can't come into the office anymore. You've got to stay at home, you've got to protect yourselves and other people. And if you can work from home, you should. But if you can't work from home, then really you shouldn't work for a while. And we saw large sectors of business being asked to stay at home and not to trade. Obviously that had a dramatic impact on individuals and businesses. So the ask for HMRC was: hey HMRC, you've got really good flexible platforms.


Can you help with the pandemic? Can you build services to get financial help to our citizens and businesses in the UK as fast as possible? And that was really critical, because that was about making sure that people had money to feed their families at a time when we didn't want them to go to work and earn money to feed their families, because that was a greater risk. So it was a really difficult time. We knew that if we were gonna do this, we were gonna be offering up substantial amounts of money that were gonna be available in support packages. And we knew, because there's big money involved, that that would be attractive to organized crime. So we knew we had to put some control measures in place.


At the same time, those changing work patterns and those changing conditions actually meant that, as an organization, we had our own challenges, just the same as every other business in the UK. Our working pattern moved from predominantly face-to-face working, you know, particularly in the digital space, where we were used to standing around whiteboards and talking to each other. All of a sudden we were being asked to work from home, work online. So there were real challenges around how we started to do those things, not least of which were really simple things like the organizational VPN, which was structured so that we could support typically about a third of our employees working remotely. All of a sudden we wanted a hundred percent of our employees working remotely. And equally, working practices like how we collaborate on whiteboards and things like that all of a sudden changed dramatically. So what did we do?


So first of all, we took a step back. We reassessed our role and our offerings, and thought about what our analysts actually needed. We knew that they were gonna have a lot of data coming in. So we knew they were unlikely to be able to do a lot of the analysis they needed to do manually. We knew they needed to have the data available to them readily and quickly. And we also knew that if we were gonna be successful, we'd need to overlay our view on that data, our insights.


We knew that we had to prepare for uncertainty. We designed already for flexibility, and we had some flexibility in our services. They were adaptable and they were scalable and reusable, but we needed to change those, because the one certainty we had with the pandemic was that nobody knew what was coming next. So when we looked at what services we were gonna build and how we were gonna build them, we deliberately built them to be very modular, very portable, so that they could be reused across multiple services and have repeated value. So we wouldn't have to keep rebuilding for a new thing. We knew that we needed an infrastructure that could support those changes, which meant changing the size of our infrastructure, making it much bigger, scaling it to support the potential volume of customers who we were gonna need to support with those services, and putting in place a much bigger team.


So changing the size of the team from a relatively small team to more than double its size in really short time periods. So we knew we would have to do some of these things. And I guess one of the most important things to recognize was that our solution wasn't new. We'd been looking at data models, at insights and how we develop them for the business, for a number of years, alongside what we were doing in delivering the specific needs of the users we had within the organization. We knew there was mileage in insights. We knew that in the future, decision making based on the analysis of data was gonna be valuable. So we'd already been testing and investing time and effort into building up those types of capabilities, and all of a sudden what the pandemic provided us with was a catalyst to really industrialize them and provide them at scale and speed to our internal analysts, because it was the solution. They now needed that ability to see automated decisions and collation of data into meaningful insight, into meaningful information that they could then use either in the filtering of transactions, so they could say this is good, bad, or we need to review it, or when they're then performing that more detailed review of the things that did need review, so that they could really dive into that detail.


We also understood that if we were gonna be successful, we had to focus on well-understood problems. We chose to focus on looking at specific attributes and specific objects that were used by the business. And actually we chose to do things that were really transferable, again because we knew they needed to be reused. So we looked at things like addresses and bank accounts, where we had good sources of information. We could readily check that information for validity. We had good sources of intelligence, both from within HMRC and from wider communities and third parties, around how specific accounts or addresses had been used, or are being used, within the taxpayer community or in the outside world. And what that allowed us to do was focus on those familiar concepts of: we check a thing, we check it against our own intelligence and third-party intelligence, we decide if it's good, bad, or indifferent, and then we decide what the next steps to take are. And those are really familiar concepts to the business, which meant that when we introduced them, the business had a really good view of how they could use them, and they already understood the concepts behind them.
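
The "check a thing, check it against intelligence, decide good, bad, or indifferent" pattern described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not HMRC's implementation: the `IntelligenceSource` class, the key format, the simple format check, and the rule that adverse intelligence outweighs favourable intelligence are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    GOOD = "good"
    BAD = "bad"
    REVIEW = "review"  # the "indifferent" case: nothing known, an analyst decides


@dataclass(frozen=True)
class IntelligenceSource:
    """Hypothetical intelligence feed: account keys known to be bad or good."""
    known_bad: frozenset = frozenset()
    known_good: frozenset = frozenset()


def is_well_formed(sort_code: str, account_number: str) -> bool:
    """Illustrative validity check: 6-digit sort code, 8-digit account number."""
    digits = sort_code.replace("-", "")
    return (len(digits) == 6 and digits.isdigit()
            and len(account_number) == 8 and account_number.isdigit())


def assess_bank_account(sort_code: str, account_number: str,
                        internal: IntelligenceSource,
                        third_party: IntelligenceSource) -> Verdict:
    # Step 1: check the thing itself for validity (assumed rule: invalid is bad).
    if not is_well_formed(sort_code, account_number):
        return Verdict.BAD
    key = f"{sort_code.replace('-', '')}:{account_number}"
    sources = (internal, third_party)
    # Step 2: adverse intelligence from either source wins (an assumption).
    if any(key in source.known_bad for source in sources):
        return Verdict.BAD
    # Step 3: otherwise, a known-good history counts in the claim's favour.
    if any(key in source.known_good for source in sources):
        return Verdict.GOOD
    # Step 4: nothing known either way, so leave the decision to a person.
    return Verdict.REVIEW
```

Because the function only depends on the attribute and the intelligence sources, not on the service the attribute arrived through, the same check could in principle be reused across schemes, which is the transferability the speakers emphasise.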


So our solution, what did we actually do? One of the biggest developments we did was the delivery of our insight service. Prior to the pandemic, it was really a bit unthinkable to say we would provide insights, we would provide that collated view, and actually we might make some decisions and triage activity up front. Because of the scope, scale and volume of the pandemic, we started looking at whether we could assess claims up front, as taxpayers were making them. At that point in time, for customers that were low risk, it meant that we could push those through, and those would go through an accelerated process. For ones that looked particularly risky, we could pay more attention to them.
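
As a rough illustration of that up-front triage, the sketch below routes each claim by a risk score and builds a review queue with the riskiest claims first. It is only a sketch: the 0-to-1 score, the `low` and `high` thresholds, the three route names, and the function names are assumptions for the example and do not come from the talk.

```python
from enum import Enum


class Route(Enum):
    ACCELERATED = "accelerated"          # low risk: push straight through
    STANDARD = "standard"                # nothing notable either way
    PRIORITY_REVIEW = "priority_review"  # looks risky: analysts look first


def triage(risk_score: float, low: float = 0.2, high: float = 0.7) -> Route:
    """Route a claim by risk score; the thresholds are hypothetical settings."""
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0 and 1")
    if not low < high:
        raise ValueError("low threshold must be below high threshold")
    if risk_score < low:
        return Route.ACCELERATED
    if risk_score >= high:
        return Route.PRIORITY_REVIEW
    return Route.STANDARD


def review_queue(scores: dict, low: float = 0.2, high: float = 0.7) -> list:
    """Return the claim ids needing priority review, riskiest first."""
    flagged = [claim_id for claim_id, score in scores.items()
               if triage(score, low, high) is Route.PRIORITY_REVIEW]
    return sorted(flagged, key=lambda claim_id: scores[claim_id], reverse=True)
```

Keeping the thresholds as parameters rather than constants means the people who own the risk decision can tighten or loosen them without any change to the routing logic.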


So that prioritized <inaudible> effort became really quite important to us. New third-party data became available to us that helped us to have a really refined view of those risks, helped us inform our judgements, and helped us detect, with more fidelity, good or bad things. Ultimately, it was about using those ideas of human-based decision making, but actually moving them into a world where we could make those decisions in a more automated way, whilst letting our people within the analyst community maintain some of that control, so they could have a view. And what I'm gonna do now is hand off to Kate, who's gonna talk a little bit more about our solution and how we approached it.


Thanks, Andy. Yes, I'm gonna tell you a little bit about how we approached the design element. So from a data capture and a data pipeline perspective, we already had a very robust pipeline. For all the digital services on the digital platform, we collect those events, we process them, and we surface those in various tools, or by various means, to analysts across the organization. So we looked at building on that design. For each new COVID scheme that came on, we reused that design pattern, which made it very easy to collect that data, with some testing, some processing and some surfacing. So that was fairly straightforward. Where the more technical challenges came was around the modeling and the insights, and also the integration with the third-party data sources.
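
The reusable collect, process, surface pattern can be pictured as one small pipeline class that is configured per scheme rather than rebuilt each time. The Python below is a hypothetical sketch, not a description of CIP's actual architecture: the `SchemePipeline` class, its field names, and the in-memory lists standing in for event streams and analyst tools are all assumptions for illustration.

```python
from typing import Callable, Iterable, Optional


class SchemePipeline:
    """One collect -> process -> surface pipeline, configured per scheme."""

    def __init__(self, scheme: str, required_fields: tuple,
                 enrich: Optional[Callable[[dict], dict]] = None):
        self.scheme = scheme
        self.required_fields = required_fields
        # Scheme-specific processing step; defaults to passing events through.
        self.enrich = enrich or (lambda event: event)
        self.surfaced: list = []  # stands in for the analyst-facing store
        self.rejected: list = []  # malformed events are kept, never dropped

    def run(self, raw_events: Iterable[dict]) -> int:
        """Process a batch of events and return how many were surfaced."""
        accepted = 0
        for event in raw_events:
            missing = [f for f in self.required_fields if f not in event]
            if missing:
                self.rejected.append({"event": event, "missing": missing})
                continue
            self.surfaced.append({**self.enrich(event), "scheme": self.scheme})
            accepted += 1
        return accepted

    def search(self, **criteria) -> list:
        """Return surfaced events whose fields match every given criterion."""
        return [event for event in self.surfaced
                if all(event.get(k) == v for k, v in criteria.items())]
```

Under these assumptions, onboarding a new scheme is a matter of constructing another `SchemePipeline` with its own required fields and enrichment step, while the collection, rejection handling and search behaviour stay the same.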


So from the data pipeline perspective, the team members who already knew that pipeline were already there to help. But we needed to scale when we looked at building out the insights and those additional third-party attribute services, and we had to scale in the pandemic. So we were already helping people to move from co-located to remote, so there were those challenges, but then we also needed to onboard new members. So we looked at reaching out to people who had already worked on the digital platform and had that experience. And we found a lot of people really wanted to come back and help. They'd had such a great experience at HMRC, so they were actually really delighted to come back at that time. So we helped that kind of growth by reaching out to people. And, you know, with the design patterns that we built, not only for the data pipeline but also for the insights, we had reusability in mind.


So as you can see, it was eight weeks to do the initial build. But then later, that design could be applied to different schemes and done within approximately two weeks. Next slide, Andy. So we recognized that building up the right team was crucial to being a success in this situation. For us, it was about recognizing that people had a huge amount of pressure at home, but also were aware of the importance of building out this service. So we set out that it was important that we looked after people, to make sure that they weren't doing overtime. We discouraged that. Overtime was only done, I think, about once or twice. So it was an exception rather than the norm.


And we made sure that we took care of people. We brought in a burnout specialist to help support the team. We had social events. We knew that some people were at home on their own during the pandemic, and their work colleagues were a lifeline. We also knew that people had families and were trying to do homeschooling. So, you know, you take a moment; if a child comes in, we say hello. It was very much that we needed to look after the team. It needed to be sustainable, 'cause this wasn't something that was gonna be a couple of weeks and then it was over. We knew it was gonna go on longer than that. So it was really important that we took care of the team. And next slide.


Okay, so the scale of what we achieved. I've talked about our data pipeline, and that's been built for all the digital services. So as you can imagine, that can cope with the SA peak, the self-assessment peak that comes in once a year for HMRC, and other peaks throughout the year. So it was already robust and it was already scalable. But you can see here some of the scale that we experienced. So 3,960 claims per minute: the system had to cope with that. And we had to have mechanisms to make sure that the traffic was going through, that data was being processed, and that we were not losing any data as we hit those peaks. So for the first time, a hundred percent of the claims were being assessed up front, and then passed down to the analysts to be processed further downstream.


And that was a first for the organization. And 22% of the claims that were submitted for statutory sick pay were flagged up front as well, helping with that kind of prioritization and helping to avoid that manual process. And the scope and reach: 10 times the amount of usage that we had normally. So the users that were already using the service were using it even more than they normally would, because that data was being collected. It was easy for them. They knew the tools, and they were able to access the data. So for example, we had 2.3 million searches, and we also had 150% more users added onto the service. So you could imagine, in normal circumstances, onboarding users needs time and it needs focus. We had to do that remotely, with new people who were being onboarded into HMRC or moving around in the organization. So that was done in a collaborative way, working with subject matter experts across the organization and supporting them through that.


So where are we today? Where has all of this taken us? Now it's about packaging it into data products. So remember how we said that pre-pandemic, our analysts were doing all the hard work of connecting the data and making sense of it? We are now doing some of that hard work for customers and our analysts, by packaging business objects into digital services. So what does that mean? We look at data objects such as an address or a bank account, and we pull that data out, away from the digital service or the journey it was going on. So we look at that address in isolation, and we're able to then make the connections across our data set, and we provide the analysts that view, or access to that data, around those data objects. And so we have seen since the pandemic that there are more and more people across the organization, or services, that have requested access to the data.
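
Pulling a data object out of the journey it arrived on, and then making connections across the data set, amounts to indexing events by the object instead of by the service. A minimal Python sketch of that idea follows; the event fields (`service`, `customer_id`, `address`), the crude address normalisation, and the function names are hypothetical and chosen only to illustrate the shape of such a view.

```python
from collections import defaultdict


def normalise_address(raw: str) -> str:
    """Crude normalisation for illustration: uppercase, no commas, single spaces."""
    return " ".join(raw.upper().replace(",", " ").split())


def index_by_address(events: list) -> dict:
    """Group (service, customer_id) pairs under the address they used."""
    index = defaultdict(list)
    for event in events:
        key = normalise_address(event["address"])
        index[key].append((event["service"], event["customer_id"]))
    return dict(index)


def connected_services(index: dict, address: str) -> set:
    """Return every service on which the given address has been seen."""
    return {service for service, _ in index.get(normalise_address(address), [])}
```

With an index like this, an analyst's question changes from "what happened on this service?" to "where else has this address appeared?", which is the address-in-isolation view described above.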


And we also know that we're having requests coming in from other government departments as well. So for us it's about how we go on that journey to give access to the data in a safe and secure way. So what's next for us? As we've said, for now we're excited to be starting to explore the data mesh architecture framework. This is to take us to the next level of sustainable data production. And so that's about scaling our data products, giving us the flexibility to produce data products at scale when we need them, or stop them when we don't need them. The data mesh takes the threads of what we've learned over the last few years, such as: we are now seeing data as our commodity, where previously the tools or the pipeline was our commodity, or our product, but we now know data is the product; decentralizing the ownership and the governance to enable the organization to get wider value faster; and improving that data facility. We hope to come back next year and tell you how it went on this journey, and we hope that it's gonna be successful.


But we'll tell you about those learnings. So that's where we're going next. If there was one thing that I've valued or learned through this journey, for me, as I said, it's about building those teams. Without the people and the teams, we wouldn't have been able to build these services across HMRC. So that's what I really value.


Yeah. Thanks Kate. So in summary, what have we learned? We thought really deeply about what we needed to do and what our role was within the pandemic. And our role was really clear. It was about helping HMRC to deliver critical services, but at the same time to ensure that, because they were likely to be attractive to criminals, we could see that happening, we could prevent it from happening, we could detect it happening, and if it did happen, we had the information available to pursue people who did take money they weren't entitled to. We recognized that by looking at data as a commodity, rather than as just a flat source of information that people wanted to access, we could produce better value for our customers, our analysts. And by understanding how we could deliver that, it started to get us into this place of a mental model where we started to look at data objects, items and attributes as commodities in their own right, and then overlaying that idea of insights: what does the data tell us? So that our analysts didn't need to do the heavy lifting. Those analytical models were pre-built and presented to them as outcomes, with the associated information, so they could see how we'd reached decisions.


It was a fairly unique time. There's no doubt about it. For HMRC, that meant that the whole of HMRC had a single focus for a period of time. And that single focus was around delivery of these critical services. That meant we had a unique line of communication direct from delivery teams at the front end all the way through to policy makers in central government, who were making those decisions and pulling the levers, setting the levels of risk appetite very clearly within the services, so that we could balance the delivery of services versus the management of risk. And those really clear lines of communication meant that we could have those simple decision-making and detailed decision-making conversations directly. And that's something that we're trying to maintain post-pandemic within HMRC, really fostering that idea that the people who have the policy goals should be directly involved in the delivery alongside the delivery teams, so that we can foster that community structure.


Our ability to change and scale was absolutely underpinned by our infrastructure. You heard from Ben last year about the amazing things that our multi-channel digital tax platform has been able to achieve through having flexible, cloud-based, microservice, software-defined infrastructure. But there were also some of the less obvious things. For example, we decided we needed bring your own device, so developers could develop alongside HMRC's secure systems without having to have full access. That wasn't, at the time, a decision that was related to disaster recovery. But actually it turned out to be incredibly valuable when the worst happened, because it meant that our developers could access development services without putting strain on the rest of HMRC. So we could adapt. And that adaptability became essential in being able to respond to the disruption that the pandemic caused.


And that disruption, I guess it's important to recognize, wasn't just changes in work practices. It was new threats, different technologies that we needed to adopt rapidly, and ways in which we could work with them. So this is titled "I'm looking for help". I guess what we're trying to get over here is, do you know what, HMRC is trying to be collaborative about how we do development. We're really interested in knowledge exchange with others working with similar data challenges. We're interested in understanding how other organizations approach some of these problems of how you convert data science and data management into innovative solutions, and how you integrate those in large organizations. And I guess what's linked to that is models for how to fund innovation that help to decentralize it and empower agile delivery to innovate across live services and into new services. So the bottom line is, we'd really love to talk. I'd like to thank you for listening, and I hope you found this has been informative and useful. I know Kate would like to thank you too. Kate?


Okay. Thanks everyone. Thanks for listening to our story. And we do hope to come back next year and tell you about how we've got on with the data mesh.


And with that, I'll say thank you very much for your attention. Hope you found it useful, and goodbye.