DevOps and Internal Audit: A Great Partnership (Part 2) - Nationwide Insurance | Europe 2021

DevOps and Internal Audit: A Great Partnership (Part 2)

At the 2020 DevOps Enterprise Summits, Nationwide’s Office of Internal Audit delivered valuable insights on topics such as how to pass an audit in a DevOps environment and how risk is mitigated in a continuous delivery model. Over the past year, Nationwide Internal Audit has continued to explore the implementation of DevOps practices and their impact on key controls. Partnering with Nationwide’s development and operations communities, Internal Audit is back with insights based on experience. Building upon the concepts presented in 2020, Nationwide will bring attendees along on their journey, moving from theory to practice.

By attending this session, attendees will get a refresher of how to think differently about risks and controls in a DevOps environment. They’ll also learn how increased reliance on automation (e.g., automated testing) has impacted risk mitigation and control assessments at Nationwide.

Clarissa Lucas

IT Audit Director, Nationwide Insurance

Rusty Lewis

IT Audit Specialist, Nationwide Insurance

Ethan Culp

NTEC Sr. Associate, Nationwide Insurance

Full transcript

The complete talk, organized by section.

Rusty Lewis

Welcome to DevOps and Internal Audit, a Great Partnership, Part 2. Clarissa, Ethan, and I are incredibly excited to be back at the DevOps Enterprise Summit this year to provide an update on our DevOps journey from an internal audit perspective. But first, let's introduce ourselves.

My name is Rusty Lewis, and I'm an IT staff auditor here at Nationwide, working with Clarissa and Ethan in our internal audit department. I joined Nationwide about three years ago, but prior to that, I worked for PricewaterhouseCoopers in their IT audit practice.

Now, when I'm not in the weeds of my internal audit work, or learning about IT audit risks and controls, I enjoy spending time with my wife and our dog, Scooby. I also enjoy recording a podcast with my brother-in-law, where we talk about our passion for video games, technology, and pop culture. So with that, I'll go ahead and toss it over to Ethan to give an introduction to himself.

Ethan Culp

Hi. I'm Ethan Culp, and I just joined our IT audit team last year as part of the Nationwide Technology Early Career Rotation program, or NTEC for short. Previously, I worked as an ETL and database developer at Nationwide for five years before joining the program, and then spent most of 2020 on our internal attack and penetration team in information risk management.

In my personal time, I'm known as Dada to a beautiful one-year-old little girl who takes up most of that free time. Apart from the new parenting life, I enjoy building and fixing computers, Raspberry Pi projects, sci-fi shows, and astrophotography.

Clarissa Lucas

My name is Clarissa Lucas, and I'm an IT audit director in Nationwide's Office of Internal Audit. My primary focus is leading the organization's technology audit function and providing support to our business audit teams. My professional experience has been primarily in risk management and internal audit in both the insurance and financial services industries.

During my tenure in internal audit, I've had the pleasure of auditing enterprise risk management, finance, investments, and now technology. Prior to rejoining internal audit on the IT team a couple years ago, I spent some time in enterprise risk management building out the organization's model risk management policy and compliance program.

Although I do find it difficult sometimes to turn off my auditor brain, outside of work, I enjoy spending time with my family, watching Star Wars with my son, and working out. Now I'll turn it back to Rusty to briefly introduce you to where we work.

Rusty Lewis

Thanks, Clarissa.

As the three of us have briefly stated, Ethan, Clarissa, and I work at Nationwide Insurance, which is headquartered in Columbus, Ohio. Nationwide exists to protect people, businesses, and futures with extraordinary care, and we accomplish this through a range of our services and products to protect our members through all of life's moments.

Our financial services business offers life insurance, mutual funds, and retirement plans. Our property and casualty business includes commercial lines such as agribusiness and standard commercial insurance, as well as personal lines like auto and homeowners insurance. We also offer more unique and specialized products like pet insurance and travel insurance. Ultimately, we're here for our members and the things that matter the most to them.

Speaking of which, just a few fun facts: Nationwide is actually number one in 457 retirement plans, and the number one pet insurer and writer of farms and ranches in the country. Extraordinary care doesn't end with our members. It also extends to our associates, partners, and communities. Nationwide has been named a Fortune 100 Best Company to Work For and Best Workplace for Diversity. Nationwide has also been named one of the 100 best places to work in IT by Computerworld.

Now that you know a little bit more about who we are and what we're about, let's talk about our objectives for today's session. At last year's DevOps Enterprise Summit, Clarissa and I gave you a talk on how to pass an audit in a DevOps environment, and how risk is mitigated in a continuous delivery model.

Over the past year, Nationwide's Office of Internal Audit has continued to explore the implementation of DevOps practices and its impact on key controls. Partnering with Nationwide's development and operations communities, we're back with insights based on experience. Building upon the concepts presented last year, we will bring you along our journey, moving from theory to practice.

By the end of this session, you will have gotten a refresher on how to think differently about risks and controls in a DevOps environment. You'll learn how increased reliance on automation has impacted risk mitigation and control assessments at Nationwide. You'll also discover how we've adopted certain Agile and DevOps practices into the audit process itself.

Now, to understand how DevOps and internal audit are truly a great partnership, you first need to understand our role in risk management and how that impacts you and your role. To do that, let's introduce the three lines in an organization. The first line owns the risk and executes controls to manage those risks. An example would be the development teams who are writing code and performing testing on code and reviewing and approving changes.

The second line is responsible for policy creation, defining risk tolerance, and monitoring adherence to those policies. An example would be an information risk management function. The third line is us, internal audit.

Internal audit provides assurance to the audit committee of the board, as well as to senior management, through independent assessments of risks and controls. We accomplish this through seeking to understand what could prevent the organization from achieving its objectives, considering a multitude of risks. We then evaluate the actions management is taking to mitigate those risks within established tolerances.

To tie this all together, we conduct integrated audits to evaluate business risk and controls in systems and applications that support those business processes. Some examples of IT controls evaluated might be change management, user access and interface integrity, and adherence to the information security policy.

Now, part of understanding who we are is also understanding who we are not. We are not IT risk management. We're not compliance. We're not the external auditors or regulatory body. We're not the development teams or operations. We don't tell management how to manage a particular risk, but we do partner with the first and second lines as well as external parties as we provide assurance on the organization's ability to meet its objectives.

Shifting to DevOps is a transformation in both thought and practice and requires partnership and collaboration between the three lines of defense. For example, if your policies require segregation of duties between developing code and promoting to production, the three lines need to work together to understand whether this works with a DevOps model. If it doesn't, and the organization truly wants to move to DevOps, the policy may need to be revised to allow for a different method of controlling the risk that is currently mitigated by segregation of duties.

Nationwide's internal audit team is comprised of about 70 auditors with a wide variety of experiences, degrees, and certifications. We focus much of our professional development on increasing our business acumen and better understanding the areas we audit. For Clarissa, Ethan, and me, it's technology. Our technology audit team provides assurance over the technology supporting key business processes at Nationwide. With that, I'll pass it over to Ethan to provide a bit of a refresher on some of the controls and risks we briefly discussed to this point.

Ethan Culp

Rusty just talked a little bit about risks. So what can truly go wrong? Let's use the example of change management. The primary risk associated with change processes is compromised system or data availability, integrity, or confidentiality, which can have an adverse impact on business operations. Having a strong control environment in place helps you prevent this from happening by preventing the items listed on the slide, including unauthorized changes introduced into production and the introduction of security vulnerabilities or defects.

Internal audit performs audits to evaluate controls and determine whether these items are well-managed or not. Next, we're going to walk you through how internal audit currently evaluates these risks and how that may change as we adopt DevOps practices.

Currently, our testing focuses on these three controls: approvals, segregation of duties, and metrics monitoring. The objective of the approval and authorizations control is to ensure that all changes related to an application are being logged, prioritized, authorized, tested, audited, and approved prior to the release into production.

Depending on the total number of released changes during our scope period, we would typically select a sample of between one and 30 changes for detailed testing, and then request documentation evidencing approvals for the sampled changes at various stages of the change management process.

Our second control, segregation of duties, focuses on verifying that appropriate separation of duties exists within the change management process to prevent individuals from writing, testing, and promoting their own code into production without independent checkpoints.

Lastly, our third control, metrics reporting, focuses on reporting and monitoring activities to determine the effects of changes on business operations. With new tools, analytics, and upskilling, we're beginning to be able to test a full population of changes for greater accuracy. This enables continuous auditing in some parts of the business and enables a partnership with our business units to mitigate risk as part of the development life cycle.
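To make the idea of testing a full population concrete, here is a minimal sketch that runs the approval and segregation-of-duties checks over every change record instead of a sample. The record fields (`developer`, `approver`, `promoter`), the change IDs, and the exception wording are hypothetical and chosen only for illustration; the talk does not describe the actual data or tooling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeRecord:
    """Hypothetical shape of one change pulled from a change repository."""
    change_id: str
    developer: str
    approver: Optional[str]  # None means no approval was recorded
    promoter: str            # the person or service that moved the change to production


def find_exceptions(changes):
    """Return a (change_id, reason) pair for every change that needs follow-up."""
    exceptions = []
    for change in changes:
        if change.approver is None:
            exceptions.append((change.change_id, "no approval recorded"))
        elif change.approver == change.developer:
            exceptions.append((change.change_id, "self-approved"))
        if change.promoter == change.developer:
            exceptions.append((change.change_id, "developer promoted own code"))
    return exceptions


# The whole population is checked, not a sample of it.
population = [
    ChangeRecord("CHG-1", "alice", "bob", "pipeline-bot"),
    ChangeRecord("CHG-2", "carol", None, "pipeline-bot"),
    ChangeRecord("CHG-3", "dave", "dave", "dave"),
]

exceptions = find_exceptions(population)
for change_id, reason in exceptions:
    print(f"{change_id}: {reason}")
```

Each flagged item is only a potential exception; an auditor would still research it before drawing a conclusion.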

Now let's take a closer look at the second control, segregation of duties.

Rusty Lewis

Thanks, Ethan. I think the natural next question is: why are duties typically required to be separated? Let's look at an example.

In the old ways of thinking, or traditionally, a developer isn't allowed to promote their own code to production by themselves. They would need to have someone else from, say, operations or quality assurance promote it for them. And then the question becomes: why? Why do we need these actions to be separated from one another?

Well, it's to make sure that the code isn't malicious or fraudulent in nature, or to make sure the code doesn't create disruption to systems or data issues when migrated into production. And so what does the person promoting the code do to ensure that these things don't happen? They review the code, and this is going to help in detecting and fixing defects early in the process, and it helps to maintain a level of consistency in design and implementation of the code. The review also builds uniformity and shared understanding, which makes team members more interchangeable when one of them is unavailable.

Let's say the organization's risk appetite allows for detective controls, or controls that focus on finding and fixing after an event or incident has already occurred, rather than preventative controls, or controls that focus on checking before an event or incident occurs. What sort of actions or controls could mitigate that same risk?

There are a number of examples here on the slide, but I want to focus on automated reviews or tests. What if we wrote a script for an automated test that would review the developer's code and promote it to production once it's been tested, reviewed, and approved? The automated test would then enable developers to have code reviewed more efficiently, more quickly, and on demand.

Now the duties are separated, not between two humans, but between a human or developer and a digital worker, or bot, or code. We also have a mechanism for providing immediate feedback and learning to the developers and for promoting code without extended wait times.

The example we just explored is only one example of how controls may look and feel different in a DevOps world. This slide shows additional risks and control considerations specific to the DevOps environment based on industry best practices. These may be different from the controls currently in place in a non-DevOps operating model.

In the middle column, you see risks. These are things that we're all worried about, not just as internal auditors. They represent what could go wrong. Controls in the right-hand column are the activities performed by management and assessed by internal audit to determine whether those risks are managed within established tolerances.

I won't read each of these bullets. I do want to walk through an example of how a control mitigates the specific risk and how we would evaluate that control in an audit. I'll use the example of risk of implementing untested or unauthorized code, which could lead to increased operational failures, poor transaction processing, and unreliability of the system.

A control that could be used to help manage this risk is automated production deployment. Here, the code is automatically moved to production as long as it meets predefined requirements, such as positive test results, completion of a peer review or authorization, or completion of some other control or suite of controls that mitigates the risk that the code will not perform as expected or introduce vulnerabilities when moved to production.

The automated deployment control will reject code that does not meet these predefined requirements, thus reducing the risk of untested or unauthorized code deployed into production.
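The gate described above can be sketched as a small function that checks each predefined requirement and refuses to promote anything that misses one. This is only an illustration: the three requirements, the `ReleaseCandidate` fields, and the commit and ticket values are assumptions, not the configuration of any real deployment tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleaseCandidate:
    """Hypothetical facts a pipeline knows about a change before deployment."""
    commit: str
    tests_passed: bool
    peer_reviewed: bool
    change_ticket: str  # an empty string means the change was never logged


def gate_failures(candidate):
    """List every predefined requirement the candidate does not meet."""
    failures = []
    if not candidate.tests_passed:
        failures.append("automated tests did not pass")
    if not candidate.peer_reviewed:
        failures.append("peer review not completed")
    if not candidate.change_ticket:
        failures.append("change not logged")
    return failures


def deploy(candidate):
    """Promote the candidate only when no requirement has failed."""
    failures = gate_failures(candidate)
    if failures:
        return f"REJECTED {candidate.commit}: " + "; ".join(failures)
    return f"DEPLOYED {candidate.commit}"


good = ReleaseCandidate("abc123", tests_passed=True, peer_reviewed=True, change_ticket="CHG-42")
bad = ReleaseCandidate("def456", tests_passed=False, peer_reviewed=True, change_ticket="")

print(deploy(good))
print(deploy(bad))
```

Keeping the requirement checks in one function separate from the promotion step means the rules an auditor needs to read sit in a single place.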

How would we, as internal auditors, test this? Since this is an automated control, we would check the configurations of the control to determine whether it is designed and configured to mitigate the risk. That's to say, we would review the configurations of the deployment mechanism to see whether it's set up to move code into production only if the code meets the requirements, and to reject code that does not meet those requirements.

We'd also test, through observation or review of evidence after the control has already run, one instance where the code meets requirements to see that it was promoted as expected, and one instance where the code did not meet requirements to see that it was appropriately rejected as expected.

This may sound a little bit different from the audit experience you're currently used to, where auditors request pages and pages of documentation for a sample of 30 or more transactions, such as code deployments or changes. On the next few slides, we're going to highlight some of the automation work one of our build teams is doing here at Nationwide, and then tie that back to how we, as internal audit, would begin to test and assess those processes. Ethan, I'll go ahead and pass it to you.

Ethan Culp

Thanks, Rusty. As part of our enterprise effort to make our IT organization more agile and achieve the benefits of DevOps, we want to highlight a build team that is leading the way in the DevOps mindset. Our Partners and Beyond team is responsible for generating insurance quotes for customers so that they can compare rates and pick Nationwide as their provider.

Their journey is still in progress, and it begins with a ground-up replacement of a big monolithic service that was enabled by an on-prem ETL service and scheduled on a mainframe. Instead, they are designing and building an entirely new application using cloud-based managed ETL services and streaming data to themselves in real time to offer the most accurate rates.

One of their goals is to make sure that their main branch is always releasable to production. This is an entirely new process and skill set that the team is developing. As most of you are aware, it's kind of hard to learn a new language and a new set of tools. The team's mantra is to lead with humility, and all of this is only possible if the team has the room to play around and manage themselves.

Remember how we talked about segregation of duties? A deployment pipeline that was developed by the team goes through a series of checks and gates before getting deployed. Their manager liked the pipeline and approved it once so that he is able to take a hands-off approach and the team has the ability to work without waiting for every gate to get approved.

That leads us to the first of these checks and balances. Automated testing is a staple of a DevOps team, and the Partners and Beyond team are building out their suite. Their pipeline has multiple dependencies, but it can only advance if the automated testing generates a green light.

Testing can look different on many different teams and in many different organizations, but one thing that automated testing enables across the board is polyskilling. Partners and Beyond does this through code reviews and a whole-team effort on documentation. Developers and quality engineers alike pore over each other's code and verify the testing output is accurate, and they can learn different roles so the whole team operates as a full-stack unit.

Automated testing creates one source of truth, so that when someone needs to audit a story card, the result is objective and not open to interpretation. It was best explained to me that your test should be simple. If you're expecting X over here, you should be testing that your process outputs X. If that doesn't happen, then the next step of their pipeline doesn't kick off.
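The "expect X, check that the process outputs X" idea can be sketched with Python's built-in `unittest` module. The function under test, `monthly_premium_cents`, is a made-up stand-in for a team's real process, and `run_gate` is a hypothetical helper showing how a pipeline step could refuse to start the next step unless the suite is green.

```python
import unittest


def monthly_premium_cents(annual_premium_cents):
    """Hypothetical process under test: split an annual premium into
    12 monthly payments, rounding up to the next whole cent."""
    return -(-annual_premium_cents // 12)  # ceiling division on integers


class MonthlyPremiumTest(unittest.TestCase):
    def test_even_split(self):
        # Expecting X (10,000 cents), so assert that the process outputs X.
        self.assertEqual(monthly_premium_cents(120_000), 10_000)

    def test_rounds_up(self):
        self.assertEqual(monthly_premium_cents(100_000), 8_334)


def run_gate():
    """Run the suite and report whether the next pipeline step may start."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MonthlyPremiumTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()


if run_gate():
    print("green: the next pipeline step may start")
else:
    print("red: the pipeline stops here")
```

Because the assertion compares an exact expected value with the actual output, the pass or fail result leaves nothing to interpretation, which is what makes it usable as audit evidence.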

The results and benefits aren't half bad either. Developers spend less time waiting to test their code and are able to do more releases in a given time. Automated testing and a maturing DevOps mindset mean that iteration velocity goes up and lead time goes down.

Security and mitigating risks are important, but we can't let either of those mean that nothing gets done. Our controls can't be so stringent that we are losing market share because we aren't making improvements to our products. An example of a risk for this team would be the introduction of code that has unexpected outputs or doesn't have sanitized inputs, which may lead to a business disruption or unscheduled downtime.

That build team could mitigate that risk by using an automated test to check if code matches the requirements. The result of that test is going to be objective and available for the whole team to see immediately.

And now for the so what: how does internal audit look at automated testing? By running this test and requiring that the code pass the test prior to moving to production, management is mitigating the risk of breaking production by introducing something that doesn't meet those standard expectations.

Let's look at a few examples of how internal audit might approach this control. First, we would inspect or test the code to evaluate the design of the test: is the test set up to do what it's expected to do? Then we would inspect evidence of code that should pass the test: does the test pass the code as expected? Lastly, we would inspect evidence of code that should fail the test: does it fail as expected?

Next, Rusty is going to walk through the automated deployment that a build team here at Nationwide has recently implemented, as well as internal audit's test approach when addressing the risks associated with automated deployments. Rusty?

Rusty Lewis

Thanks, Ethan. Earlier, Ethan touched on the deployment pipeline that Partners and Beyond created. Having a predictable process helps track where things are in the life cycle and creates an easy-to-audit log of events. This team built their lower environments to reflect their production environment, so when automated testing occurs, the code behaves exactly as it would in production.

Every commit and piece of code that gets created is linked back to a story card in Jira, where we can see that it passed unit testing and was signed off by another team member. After the automated testing is complete, another human manually reviews the testing output and confirms that the Jira card log reflects that the card went through the standard steps and that the code matches what the requirements set out.

What's unique about all of this is that the Partners and Beyond team are able to tell their tools that when a commit to main branch occurs, they also want to kick off a host of processes. Each of these events will get logged in their tool. Any errors will trigger email alerts automatically sent to them, and the errors will also automatically get displayed within a process-tree dashboard for direct visibility into what went wrong.
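As a rough illustration of that flow, the sketch below runs a list of steps when a commit lands on main, records every event, and raises an alert and a dashboard status on the first error. The step names, the in-memory `event_log`, `alerts`, and `dashboard` structures, and the simulated failure are all hypothetical; the talk does not name the team's tooling or how it is configured.

```python
event_log = []  # stands in for the tool's event log
alerts = []     # stands in for automatically sent email alerts
dashboard = {}  # step name -> latest status, as a dashboard might show it


def on_commit_to_main(commit, steps):
    """Run each step in order and stop at the first failure."""
    for name, step in steps:
        try:
            step(commit)
        except Exception as error:
            event_log.append(f"{commit} {name} FAILED: {error}")
            alerts.append(f"[{commit}] step '{name}' failed: {error}")
            dashboard[name] = "failed"
            return False
        event_log.append(f"{commit} {name} OK")
        dashboard[name] = "ok"
    return True


def run_unit_tests(commit):
    """Pretend the unit tests pass."""


def build_image(commit):
    """Simulate an error so the alerting path is exercised."""
    raise RuntimeError("base image not found")


def deploy_to_test(commit):
    """Never reached in this run because the build step fails first."""


steps = [
    ("unit-tests", run_unit_tests),
    ("build", build_image),
    ("deploy-test", deploy_to_test),
]

succeeded = on_commit_to_main("abc123", steps)
print("pipeline succeeded:", succeeded)
print("dashboard:", dashboard)
print("alerts:", alerts)
```

The event log produced this way is the kind of record that lets an auditor trace what happened to a given commit without asking the team to reconstruct it.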

Ultimately, the use of automated deployments and the other tools we've briefly discussed leads to faster deployments. That in turn means we get to accomplish our goals of more releases and we're more competitive in our market.

Deployments can go one of two ways: either really smoothly, or by unexpectedly breaking something, causing a business disruption and potentially introducing a security concern. With automated deployments, the aim is, of course, to be more of the former. With an open-source versioning tool and container management platform, the Partners and Beyond team have been able to make their deployments predictable, making sure that automated tests have the green light and that the change is thoroughly documented and logged.

Similar to what Ethan touched on earlier as it relates to automated testing, for automated deployments, we need to go back to the question of: now how do we as internal audit adapt our change management testing approach? This automated control ensures that testing has been completed with a passing result and that the change has been documented in the enterprise change repository.

Let's look at a few specific examples of how internal audit might approach this control in an audit. First, we would inspect the deployment code to evaluate design, trying to answer the question: is the control set up to do what it's expected to do? Second, we inspect the evidence of code that should get deployed, so a sample of one, answering the question: did the code deploy as expected? Lastly, we want to inspect evidence of code that should not get deployed, another sample of one, thereby hopefully validating that the code was rejected from deployment as expected.
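The two "sample of one" tests can be sketched against a deployment log. Everything here is assumed for illustration: the log layout, the two requirements (a passing test result and a logged change, taken from the control description above), and the helper names.

```python
# Hypothetical deployment log: what each change looked like and what the control did.
deployment_log = [
    {"commit": "a1", "tests_passed": True, "change_logged": True, "outcome": "deployed"},
    {"commit": "b2", "tests_passed": False, "change_logged": True, "outcome": "rejected"},
    {"commit": "c3", "tests_passed": True, "change_logged": False, "outcome": "rejected"},
]


def expected_outcome(entry):
    """What the control should have done, given its stated requirements."""
    meets_requirements = entry["tests_passed"] and entry["change_logged"]
    return "deployed" if meets_requirements else "rejected"


def sample_of_one(log, outcome):
    """Pick the first entry whose expected outcome matches, or None if there is none."""
    for entry in log:
        if expected_outcome(entry) == outcome:
            return entry
    return None


def control_operated(entry):
    """True when the recorded outcome matches the expected outcome."""
    return entry["outcome"] == expected_outcome(entry)


should_deploy = sample_of_one(deployment_log, "deployed")
should_reject = sample_of_one(deployment_log, "rejected")

print(should_deploy["commit"], "deployed as expected:", control_operated(should_deploy))
print(should_reject["commit"], "rejected as expected:", control_operated(should_reject))

# With a complete log, the same check extends to the full population.
print("all entries consistent:", all(control_operated(e) for e in deployment_log))
```

A sample of one is defensible here only because the control is automated and its configuration has been reviewed; the same logic applied by hand would need a larger sample.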

I hope this overview of risks and controls and practical examples highlighting one of our build teams was insightful, not only the automation tools that teams are leveraging here at Nationwide, but also how internal audit is beginning to adapt our testing approach for a DevOps environment. Next, Clarissa is going to walk through how our internal audit office is beginning to adopt more agile practices with our work. Clarissa?

Clarissa Lucas

Thanks, Rusty. As Rusty mentioned, our final learning objective for the day is to help you discover how Nationwide's internal audit organization is adopting Agile and DevOps practices.

Late last year and into the first few months of this year, we tested specific financial reporting controls in partnership with our organization's external auditors. This was a really high-profile project with a lot of attention from our key stakeholders, and it had a really tight timeline within which testing needed to be completed.

This tight timeline drove the need for us to think about our work differently. Because we were performing testing on behalf of another party, continuous feedback and the ability to pivot quickly were also imperative. We quickly realized we had a strong candidate for implementing agile practices in our audit work.

The first concept we incorporated was having a self-organized team that avoided multitasking. This team was composed of dedicated resources that were focused primarily on this engagement. This differs from our normal operating model, where an auditor can have anywhere from four to six engagements going on at the same time.

The team's priority was clear, so we spent less time reevaluating priorities than we might have had there been more engagements on each person's plate. The project manager determined what testing was to be done and the test procedures to perform, while the team determined how best to accomplish that testing. This provided flexibility and agility with our audit resources. As someone finished one task, they could quickly jump onto another task without having to go through the process of being formally assigned that item by the project manager.

The next concept we incorporated was continuous delivery in short sprints. Testing was divided into four buckets, with a number of controls to be completed in each bucket, and each bucket lasted approximately one month.

The Agile Manifesto includes a principle around businesspeople and developers working together daily throughout a project. We modified that principle to reflect working together frequently with stakeholders throughout the audit. It's not a pure DevOps approach, but definitely headed in that direction.

This was accomplished through iterative meetings with both internal and external stakeholders. The internal stakeholders were provided an update at least weekly of the status and any potential issues that we identified. We also met with the external stakeholders twice each week, sent out detailed meeting notes to reiterate key decisions reached, and sought feedback and confirmation from those stakeholders after each meeting.

Continuous feedback is a topic that's discussed frequently in The DevOps Handbook, The Unicorn Project, and The Phoenix Project, and it was another concept we borrowed during this project. The audit team held daily stand-ups that supported the flexibility of the self-organized team and reinforced the urgency and priority of the audit, also enabling the team to seek and address feedback on a continuous basis. These stand-ups also enabled the team to shift available resources to tasks that needed attention, share knowledge amongst the team, and change testing procedures on the fly to adapt to our stakeholder needs.

A blameless retrospective review was conducted at the end of the third bucket to reflect on how to become more effective in that final bucket. Those opportunities that we identified were incorporated into the process during that fourth bucket.

During the retrospective, or postmortem, that we performed, our team identified an opportunity to increase efficiency in subsequent buckets of work. Some of the control tests we performed were quite straightforward, but they took a lot of time to complete. We were manually filtering data in Excel files and executing a series of VLOOKUPs. So we asked ourselves, "Could we somehow automate these tests?" The answer was yes. Our team leveraged the R programming language to automate testing of these controls, and it worked.

By leveraging R, we eliminated the need to do manual filtering and VLOOKUPs in Excel. The automated tests identify potential exceptions for our team to research, and it saves our team about 15 hours each quarter when we test these controls.
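The team's script was written in R and is not shown in the talk. Purely as an illustration of the same technique, the Python sketch below replaces a manual filter and VLOOKUP with a keyed lookup between two files and flags potential exceptions. The control being tested (accounts with system access whose owner is not an active employee), the column names, and the inline sample data are all invented for the example.

```python
import csv
import io

# Hypothetical extracts; in practice these would be files exported from source systems.
ACCESS_CSV = """user_id,system
u100,ledger
u200,ledger
u300,ledger
"""

HR_CSV = """user_id,status
u100,active
u200,terminated
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def potential_exceptions(access_rows, hr_rows):
    """Flag access rows whose user is not an active employee."""
    status_by_user = {row["user_id"]: row["status"] for row in hr_rows}
    flagged = []
    for row in access_rows:
        status = status_by_user.get(row["user_id"])  # this is the VLOOKUP step
        if status != "active":
            flagged.append((row["user_id"], row["system"], status or "not in HR file"))
    return flagged


flagged = potential_exceptions(load_rows(ACCESS_CSV), load_rows(HR_CSV))
for user_id, system, reason in flagged:
    print(f"{user_id} has access to {system}: {reason}")
```

As in the talk, the output is a list of potential exceptions for a person to research, not a final audit conclusion.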

The test script is stored in GitHub, which provides visibility to everyone in internal audit and provides version control as well. The script was reviewed by other members of the audit team, those with deeper familiarity and experience with R, to confirm that it would produce the intended results. Essentially, they audited our automated test. Yes, we auditors get audited as well.

Building upon what we learned from our first Agile auditing pilot, we will be leveraging some of the items that worked successfully, as well as adding additional DevOps and Agile practices in a second pilot. One of these is prioritizing customer needs.

Nationwide's identity and access management team uses a 30/60/90-day process for managing workflow, where the tasks are broken down into work that can be delivered in, you guessed it, 30, 60, and 90 days. When we told our clients that we would be doing an audit in their space this year, they asked if we could work together to fit our audit into their process workflow. As such, our first DevOps practice implemented in this engagement was to prioritize our customer's needs.

We're going to do this by aligning our audit work with our client's workflow process and decomposing the audit work into a number of sprints, during which control tests will be conducted in a concentrated effort. Our client will commit to fulfilling data requests, conducting control walkthroughs with us, and working through testing results in those 30 days. We'll commit to completing the test work and working through potential issues with the clients during that same 30-day bucket.

Another concept to incorporate in this audit is to increase our focus on continuous delivery through sprints. Each sprint will include key portions of the process that we're looking at, and determination of what portions of the process go into which sprint is done directly in collaboration with our clients, which is another DevOps practice: fostering a collaborative environment.

During each of those sprints, we will identify the controls to test and the test procedures to perform, request the data and documentation that we need to do the tests, finish our testing, and communicate results to our clients. Aligned with the DevOps principle that focuses on automation, we'll continue to explore ways to automate our control tests. Finally, we'll hold a retrospective review at the end of each sprint, collectively with our clients, to identify improvement opportunities to be implemented in subsequent sprints.

At this point, you're pretty familiar with internal audit's role in the organization and our objective, which is to provide assurance to key stakeholders through evaluations of risks and the controls mitigating those risks. You're also well-versed on the mindset shift needed to focus on the risk and how best to address that risk, rather than just looking at a checklist of controls.

You've seen how Nationwide has embraced key DevOps concepts, both in the technology organization and in the way internal audit is responding to those modified processes. We've responded not only by adapting our control assessment practices to better fit the world of DevOps, but we've also taken a page out of your book, quite literally, and have incorporated Agile and DevOps practices into the way we do our audits.

As we did in last year's presentation, let's bring it all together by taking one more look at how DevOps can change your experience during an audit, and why DevOps and audit are still truly a great partnership.

When you have automated controls, we can look at a smaller sample for our testing because the automation removes that human error element, so you spend less time being audited. When you rely on automated controls with audit trails, we can provide assurance to you through full population testing, which reduces the sampling risk. Finally, flexible and agile audit approaches align better with your own operating model and workflow. This reduces the need for you to multitask and switch from your daily work to audit work by incorporating audit into your daily work.

Just when you thought your relationship with your auditors couldn't get any better, right?

On behalf of Rusty, Ethan, and myself, I want to say thank you. Thank you to everyone in the audience today for joining us, learning with us, and asking questions along the way. Thank you for coming along with us on our DevOps journey, and thank you to Gene Kim and the DevOps Enterprise Summit selection committee for giving us a platform to share our passion with others. Our contact information is shown here. Please feel free to reach out and connect with us on LinkedIn and keep this conversation going. Thanks again, and enjoy the rest of the summit.