DevOps and Internal Audit: A Great Partnership (Part 2)

At the 2020 DevOps Enterprise Summits, Nationwide’s Office of Internal Audit delivered valuable insights on topics such as how to pass an audit in a DevOps environment and how risk is mitigated in a continuous delivery model. Over the past year, Nationwide Internal Audit has continued to explore the implementation of DevOps practices and their impact on key controls. Partnering with Nationwide’s development and operations communities, Internal Audit is back with insights based on experience. Building upon the concepts presented in 2020, Nationwide will bring attendees along on its journey, moving from theory to practice.


By attending this session, attendees will get a refresher of how to think differently about risks and controls in a DevOps environment. They’ll also learn how increased reliance on automation (e.g., automated testing) has impacted risk mitigation and control assessments at Nationwide.

Clarissa Lucas

IT Audit Director, Nationwide Insurance

Rusty Lewis

IT Audit Specialist, Nationwide Insurance

Ethan Culp

NTEC Sr. Associate, Nationwide Insurance

Transcript

00:00:13

Welcome to DevOps and Internal Audit: A Great Partnership, Part Two. Clarissa, Ethan, and I are incredibly excited to be back at the DevOps Enterprise Summit this year to provide an update on our DevOps journey from an internal audit perspective. But first, let's introduce ourselves. My name is Rusty Lewis, and I'm an IT staff auditor here at Nationwide, working with Clarissa and Ethan in our internal audit department. I joined Nationwide about three years ago, but prior to that, I worked for PricewaterhouseCoopers in their, uh, IT audit practice. Now, when I'm not in the weeds of my internal audit work, uh, or learning about IT audit risks and controls, I enjoy spending time with my wife and our dog, Scooby. I also enjoy recording a podcast with my brother-in-law, where we talk about our passion for video games, technology, and pop culture. So with that, I'll go ahead and toss it over to Ethan to give an introduction to himself.

00:01:03

Hi, I'm Ethan Culp, and I just joined our IT audit team last year as part of the Nationwide Technology Early Career rotation program, or NTEC for short. Previously, I worked as an ETL and database developer at Nationwide for five years before joining the program, and then spent most of 2020 on our internal attack and penetration team in information risk management. In my personal time, I'm known as Dada to a beautiful one-year-old little girl who takes up most of that free time. And apart from the new parenting life, I enjoy building and fixing computers, uh, Raspberry Pi projects, sci-fi shows, and astrophotography.

00:01:45

My name is Clarissa Lucas, and I'm an IT audit director in Nationwide's Office of Internal Audit. My primary focus is leading the organization's technology audit function and providing support to our business audit teams. My professional experience has been primarily in risk management and internal audit in both the insurance and financial services industries. During my tenure in internal audit, I've had the pleasure of auditing enterprise risk management, finance, investments, and now technology. Prior to rejoining internal audit on the IT team a couple of years ago, I spent some time in enterprise risk management, building out the organization's model risk management policy and compliance program. Although I do find it difficult sometimes to turn off my auditor brain outside of work, I enjoy spending time with my family, watching Star Wars with my son, and working out. Now I'll turn it back to Rusty to briefly introduce you to where we work.

00:02:35

Thanks, Clarissa. As the three of us have briefly stated, Ethan, Clarissa, and I work at Nationwide Insurance, which is headquartered in Columbus, Ohio. Now, Nationwide exists to protect people, businesses, and futures with extraordinary care. And we accomplish this through a range of services and products to protect our members through all of life's moments. Our financial services business offers life insurance, mutual funds, and retirement plans. Our property and casualty business includes commercial lines, such as agribusiness and standard commercial insurance, as well as personal lines like auto and homeowners insurance. We also offer more unique and specialized products like pet insurance and travel insurance. Ultimately, we're here for our members and the things that matter the most to them.

00:03:30

Speaking of which, just a few fun facts: Nationwide is actually number one in 457 retirement plans and the number one pet insurer and writer of farms and ranches in the country. Now, extraordinary care doesn't end with our members. It also extends to our associates, partners, and communities. Nationwide has been named a Fortune 100 Best Company to Work For and a Best Workplace for Diversity. Nationwide has also been named one of the 100 best places to work in IT by Computerworld. Now that you know a little bit more about who we are and what we're about, let's talk about our objectives for today's session.

00:04:09

At last year's DevOps Enterprise Summit, Clarissa and I gave a talk on how to pass an audit in a DevOps environment and how risk is mitigated in a continuous delivery model. Over the past year, Nationwide's Office of Internal Audit has continued to explore the implementation of DevOps practices and its impact on key controls. Partnering with Nationwide's development and operations communities, we're back with insights based on experience. Building upon the concepts presented last year, we will bring you along on our journey, moving from theory to practice. By the end of this session, you will have gotten a refresher on how to think differently about risks and controls in a DevOps environment. You'll learn how increased reliance on automation has impacted risk mitigation and control assessments at Nationwide. You'll also discover how we've adopted certain agile and DevOps practices into the audit process itself and understand how DevOps and internal audit are truly a great partnership.

00:05:07

You first need to understand our role in risk management and how that impacts you and your role. To do that, let's introduce the three lines in an organization. The first line owns the risk and executes controls to manage those risks. An example would be the development teams who are writing code, performing testing on code, and reviewing and approving changes. The second line is responsible for policy creation, defining risk tolerance, and monitoring adherence to those policies. An example would be an information risk management function. The third line is us, internal audit. Now, internal audit provides assurance to the audit committee of the board, as well as to senior management, through independent assessments of risks and controls. We accomplish this through seeking to understand what could prevent the organization from achieving its objectives, considering a multitude of risks. We then evaluate the actions

00:06:02

management is taking to mitigate those risks within established tolerances. To tie this all together, we conduct integrated audits to evaluate business risks and controls and the systems and applications that support those business processes. Some examples of IT controls evaluated might be change management, user access, interface integrity, and adherence to the information security policy. Now, part of understanding who we are is also understanding who we are not. We are not IT risk management. We're not compliance. We're not the external auditors or a regulatory body. We're not the development teams or operations. We don't tell management how to manage a particular risk, but we do partner with the first and second lines, as well as external parties, as we provide assurance on the organization's ability to meet its objectives. Shifting to DevOps is a transformation in both thought and practice and requires partnership and collaboration between the three lines of defense.

00:07:04

For example, if your policies require segregation of duties between developing code and promoting to production, the three lines need to work together to understand whether this works with a DevOps model. If it doesn't, and the organization truly wants to move to DevOps, the policy may need to be revised to allow for a different method of controlling the risk that is currently mitigated by segregation of duties. Nationwide's internal audit team is comprised of about 70 auditors with a wide variety of experiences, degrees, and certifications. We focus much of our professional development on increasing our business acumen and better understanding the areas we audit. For Clarissa, Ethan, and I, it's technology. Our technology audit team provides assurance over the technology supporting key business processes at Nationwide. So with that, I'll pass it over to Ethan to provide a bit of a refresher on some of the controls and risks we briefly discussed at this point.

00:07:57

So Rusty just talked a little bit about risks, so what can truly go wrong? Using the example of change management, well, the primary risk associated with change processes is compromised system or data availability, integrity, or confidentiality, uh, which can have an adverse impact on business operations. So having a strong control environment, uh, in place helps you prevent this from happening by preventing the items listed on the slide, including unauthorized changes introduced into production and introduction of security vulnerabilities or defects. So internal audit performs audits to evaluate controls and determine whether these items are well managed or not. So next, we're going to walk you through how internal audit currently evaluates these risks and how that may change as we adopt DevOps practices. So currently our testing focuses on these three controls: approvals, segregation of duties, and metrics. The objective of the approval and authorizations control is to ensure that all changes related to an application are being logged, prioritized, authorized, tested, audited, and approved prior to the release into production. Depending on the total number of released changes during our scope period, we would typically select a sample of between one and 30 changes for detailed testing and then request documentation evidencing approvals for the sampled changes at various stages during the change management process.

00:09:29

Our second control, segregation of duties, focuses on verifying that appropriate separation of duties exists within the change management process to prevent individuals from writing, testing, and promoting their own code into production without independent checkpoints. And lastly, our third control, metrics reporting, focuses on reporting and monitoring activities to determine the effects of changes on business operations. So with new tools, analytics, and upskilling, we're beginning to be able to test the full population of changes for greater accuracy. So this enables continuous auditing in some parts of the business and, uh, enables a partnership with our business units to mitigate risk as part of the development life cycle. So now let's take a closer look at the second control, segregation of duties.
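The shift from sampling to full-population testing that Ethan describes can be sketched in a few lines of Python. This is a minimal illustration, not Nationwide's tooling: the `ChangeRecord` fields and the `find_sod_exceptions` helper are hypothetical names, and the rules (every change needs an approver who is not the author, and the author may not promote their own change) are assumptions drawn from the segregation of duties control as described in the talk.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ChangeRecord:
    """One row from a change log (hypothetical field names)."""
    change_id: str
    author: str
    approver: Optional[str]  # None when no approval was recorded
    promoter: str            # person or bot that moved the change to production


def find_sod_exceptions(changes: List[ChangeRecord]) -> List[Tuple[str, List[str]]]:
    """Check every change, not a sample, and return (change_id, reasons) pairs."""
    exceptions = []
    for change in changes:
        reasons = []
        if not change.approver:
            reasons.append("missing approval")
        elif change.approver == change.author:
            reasons.append("self-approved")
        if change.promoter == change.author:
            reasons.append("self-promoted")
        if reasons:
            exceptions.append((change.change_id, reasons))
    return exceptions
```

Because the check runs over the whole population, the output is a list of exceptions for an auditor to research rather than a conclusion extrapolated from 30 sampled changes.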

00:10:20

Thanks, Ethan. So I think the natural next question is, why are duties typically required to be separated? Let's look at an example. In the old ways of thinking, or traditionally, a developer isn't allowed to promote their own code to production by themselves. They would need to have someone else from, say, operations or quality assurance promote it for them. And then the question becomes, why? Why do we need these actions to be separated from one another? Well, it's to make sure that the code isn't of a malicious or fraudulent nature, or to make sure the code doesn't create disruption to systems or data issues when migrated into production. And so what does the person promoting the code do to ensure that these things don't happen? They review the code, and this is going to help in detecting and fixing defects early in the process. And it helps to maintain a level of consistency in design and implementation of the code. Their review also allows for uniformity in understanding, which will help the interchangeability of team members in the instance of non-availability of any other team member.

00:11:25

So let's say the organization's risk appetite allows for detective controls, or controls that focus on finding and fixing after an event or incident has already occurred, rather than preventative controls, or controls that focus on checking before an event or incident occurs. What sort of actions or controls could mitigate that same risk? Now, there are a number of examples here on the slide, but I want to focus on automated reviews or tests. So what if we wrote a script for an automated test that would review the developer's code and promote it to production once it's been tested, reviewed, and approved? The automated test will then enable developers to more efficiently and quickly review code on demand. Now the duties are separated, not between two humans, but between a human, or developer, and a digital worker, or bot or code. We also have a mechanism for providing immediate feedback and learning to the developers and for promoting code without extended wait times.

00:12:26

So the example we just explored is only one example of how controls may look and feel different in a DevOps world. This slide here shows additional risks and control considerations specific to the DevOps environment based on industry best practices. And these may be different from the controls currently in place in a non-DevOps operating model. Now, in the middle column, you see risks. These are things that we're all worried about, not just as internal auditors. They represent what could go wrong. Controls, in the right-hand column, are the activities performed by management and assessed by internal audit to determine whether those risks are managed within established tolerances. Now, I won't read each of these bullets, but I do want to walk through an example of how a control mitigates a specific risk and how we would evaluate that control in an audit. I'll use the example of the risk of implementing untested or unauthorized code, which could lead to increased operational failures, poor transaction processing, and unreliability of the system. A control that could be used to help manage this risk is automated production deployment.

00:13:30

Here, the code is automatically moved to production as long as it meets predefined requirements, such as positive test results, completion of a peer review or authorization, or completion of some other control or suite of controls that mitigates the risk that the code will not perform as expected or will introduce vulnerabilities when moved to production. The automated deployment control will reject code that does not meet these predefined requirements, thus reducing the risk of untested or unauthorized code being deployed into production. Now, how would we as internal auditors test this? Since this is an automated control, we would check the configurations of the control to determine whether it is designed and configured to mitigate the risk. That's to say, we would review the configurations of the deployment mechanism to see whether it's set up to move code into production only if it meets the requirements, and is set up to reject code that does not meet those requirements.
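As a rough illustration of the automated deployment control described here, the gate can be thought of as a function that promotes a release candidate only when every predefined requirement is met and otherwise rejects it with reasons. The sketch below is an assumption-laden example, not the configuration of any real pipeline: `ReleaseCandidate`, `evaluate_gate`, and the three requirements (passing tests, completed peer review, a documented change ticket) are hypothetical names chosen to mirror the requirements mentioned in the talk.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ReleaseCandidate:
    """What the gate knows about a change (hypothetical fields)."""
    commit_id: str
    tests_passed: bool
    peer_reviewed: bool
    change_ticket: str  # empty string when the change was never documented


def evaluate_gate(candidate: ReleaseCandidate) -> Tuple[bool, List[str]]:
    """Return (deploy?, reasons for rejection). Deploy only if all checks pass."""
    failures = []
    if not candidate.tests_passed:
        failures.append("automated tests did not pass")
    if not candidate.peer_reviewed:
        failures.append("peer review not completed")
    if not candidate.change_ticket:
        failures.append("change not documented")
    return (not failures, failures)
```

An auditor's design review would read logic like this to confirm that nothing reaches production unless every requirement holds; the operating test then looks at one candidate that was promoted and one that was rejected.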

00:14:23

We'd also test, through observation or review of evidence after the control has already run, one instance where the code meets requirements to see that it was promoted as expected, and one instance where the code did not meet requirements to see that it was appropriately rejected as expected. Now, this may sound a little bit different from the audit experience you're currently used to, where auditors request pages and pages of documentation for a sample of 30 or more transactions, such as code or changes. On the next few slides, we're going to highlight some of the automation work one of our build teams is doing here at Nationwide, and then tie that back to how we as internal audit would begin to test and assess those processes. Ethan, I'll go ahead and pass it to you.

00:15:10

Thanks, Rusty. So as part of our enterprise effort to make our IT organization more agile and achieve the benefits of DevOps, we want to highlight a build team that is leading the way in the DevOps mindset. Our Partners and Beyond team is responsible for generating insurance quotes for customers so that they can compare rates and pick Nationwide as their provider. Their journey is still happening and is beginning with a ground-up replacement of a big monolithic service that was enabled by an on-prem ETL service and scheduled on a mainframe. So instead they are designing and building an entirely new application using cloud-based managed ETL services and streaming data to themselves in real time to offer the most accurate rates. One of their goals is to make sure that their main branch is always releasable to production. So this is an entirely new process and skillset that the team is developing.

00:16:02

As most of you are aware, it's kind of hard to learn a new language and a new set of tools. So the team's mantra is to lead with humility, and all of this is only possible if the team has the room to play around and manage themselves. So remember how we talked about segregation of duties? Well, a deployment pipeline that was developed by the team goes through a series of checks and gates before getting deployed. Their manager liked the pipeline and approved it once, so that he is able to take a hands-off approach, and the team has the ability to work without waiting for every gate to get approved.

00:16:40

That leads us to the first of these checks and balances. So automated testing is a staple of a DevOps team, and the Partners and Beyond team is, uh, building out their suite. Their pipeline has multiple dependencies, um, but it can only advance if the automated testing generates a green light. So testing can look different on many different teams and in many different organizations. But one thing that automated testing enables across the board is upskilling. Now, Partners and Beyond does this through code reviews and a whole-team effort on documentation. Developers and quality engineers alike pore over each other's code and verify the testing output is accurate, and they can learn different roles so that the whole team operates as a full-stack unit. Automated testing creates one source of truth, so that when someone needs to audit a story card, the result is objective and not open to interpretation.

00:17:35

It was best explained to me that your tests should be simple. So if you're expecting X over here, you should be testing that your process outputs X. If that doesn't happen, then the next step of their pipeline doesn't kick off. And the results and benefits aren't half bad either. Developers spend less time waiting to test their code, and we're able to do more releases in a given time. Automated testing and a maturing DevOps mindset mean that iteration velocity goes up and lead time goes down. So security and mitigating risks are important, but we can't let both of those mean that nothing gets done. Our controls can't be so stringent that we're losing market share because we aren't making improvements to our products. A sample of what a risk would look like for this team would be the introduction of code that has unexpected outputs or doesn't have sanitized inputs, which may lead to a business disruption or unscheduled downtime. That build team could mitigate that risk by using an automated test, uh, to check if code matches the requirements.
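The "if you're expecting X, test that your process outputs X" idea can be made concrete with a very small Python example. Everything here is hypothetical and chosen only for illustration: `parse_vehicle_year` stands in for any piece of quoting logic that must sanitize its input, and `run_checks` stands in for the green light a pipeline would wait on before starting its next step.

```python
def parse_vehicle_year(raw: str) -> int:
    """Return a four-digit model year, rejecting anything that is not one."""
    text = raw.strip()
    if not (text.isascii() and text.isdigit() and len(text) == 4):
        raise ValueError(f"not a four-digit year: {raw!r}")
    year = int(text)
    if not 1900 <= year <= 2100:
        raise ValueError(f"year out of range: {year}")
    return year


def run_checks() -> bool:
    """Green light (True) only if the expected output and rejections both hold."""
    # Expecting X: a clean input must produce exactly the expected value.
    if parse_vehicle_year(" 2019 ") != 2019:
        return False
    # Unsanitized or nonsensical input must be rejected, never passed through.
    for bad in ("20x9", "", "2019; DROP TABLE quotes", "1492"):
        try:
            parse_vehicle_year(bad)
        except ValueError:
            continue
        return False
    return True
```

The result is objective in the sense the talk describes: the check either returns a green light or it does not, and the same answer is visible to the whole team.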

00:18:47

The result of that test is going to be objective and available for the whole team to see immediately. And now for the so what: how does internal audit look at automated testing? By running this test and requiring that the code pass the test prior to moving to production, management is mitigating the risk of breaking production by introducing something that doesn't meet those standard expectations. So let's look at a few examples of how internal audit might approach this control. So first, we would inspect the test code to evaluate the design of the test: is the test set up to do what it is expected to do? And then we would inspect the evidence of the code, uh, that should pass the test. So does the code pass the test as expected? And then lastly, we would inspect the code that should fail the test. So does it fail as expected? And next, Rusty is going to walk through the automated deployment that a build team here at Nationwide has recently implemented, as well as internal audit's test approach when addressing the risks associated with automated deployments. Rusty?

00:19:57

Thanks, Ethan. So earlier, he touched on the deployment pipeline that Partners and Beyond created. Having a predictable process helps track where things are in the life cycle and creates an easy-to-audit log of events. This team built their lower environments to reflect their production environment, so when automated testing occurs, it will behave or mirror exactly what's in production. Every commit and piece of code that gets created will get linked back to a story card in Jira, where we can see that it passed unit testing and was signed off by another team member. After the automated testing is complete, another human will have to manually review the testing output and attest that the Jira card log reflects that the card went through the intended steps and that the code matches what was set up by the requirements. What's unique about all of this is that the Partners and Beyond team is able to tell their tools, when a commit to the main branch occurs, to also kick off a whole host of processes. Each of these events will get logged in their tool. Any errors will trigger email alerts automatically sent to them, and the errors will also automatically get displayed within a process tree dashboard for direct visibility into what went wrong. Ultimately, with the use of deployments and other tools we've briefly discussed, it leads to faster deployments. That in turn means we get to accomplish our goals of more releases, and we're more competitive in our market.

00:21:17

Now, deployments can go one of two ways: either really smooth, or unexpectedly breaking something that now has caused a business disruption and potentially introduced a security concern. With automated deployments, the aim is, of course, to be more of the former. With an open source versioning tool and container management platform, the Partners and Beyond team has been able to make their deployments predictable, making sure that automated tests have the green light and the change is thoroughly documented and logged. Similar to what Ethan touched on earlier as it relates to automated testing, for automated deployments we need to go back to the question of how do we, as internal audit, adapt our change management testing approach? This automated control ensures that testing has been completed with a passing result and that the change has been documented in the enterprise change repository. So let's look at a few specific examples of how internal audit might approach this control in an audit.

00:22:16

First, we inspect the deployment code to evaluate design, trying to answer the question: is the test set up to do what it is expected to do? Second, we inspect the evidence of code that should get deployed. So a sample of one, answering the question: did the code deploy as expected? And lastly, we want to inspect evidence of code that should not get deployed. Another sample of one, thereby hopefully validating that the code was rejected from deployment as expected. Now, I hope this overview of risks and controls and the practical examples here highlighting one of our build teams was insightful, not only into the automation tools that teams are leveraging here at Nationwide, but also into how internal audit is beginning to adapt our testing approach for a DevOps environment. Next, Clarissa is going to walk through how our internal audit office is beginning to adopt more agile practices with our work. Clarissa?

00:23:11

Thanks, Rusty. So as Rusty mentioned, our final learning objective for the day is to help you discover how Nationwide's internal audit organization is adopting agile and DevOps practices. Late last year and in the first few months of this year, we tested specific financial reporting controls in partnership with our organization's external auditors. This was a really high-profile project with a lot of attention from our key stakeholders, and we had a really tight timeline within which testing needed to be completed. This tight timeline drove the need for us to think about our work differently. And because we were performing testing on behalf of another party, continuous feedback and the ability to pivot quickly were also imperative. We quickly realized we had a strong candidate to implement agile practices into our audit work. The first concept we incorporated was having a self-organized team that avoided multitasking.

00:24:03

This team was composed of dedicated resources that were focused primarily on this engagement. Now, this differs from our normal operating model, where an auditor can have anywhere from four to six engagements going on at the same time. The team's priority was clear, so we spent less time reevaluating priorities than we might have had there been more engagements on each person's plate. The project manager determined what testing was to be done and the test procedures to perform, while the team determined how best to accomplish that testing. This provided flexibility and agility with our audit resources. As someone finished one task, they could quickly jump onto another task without having to go through the process of being formally assigned that item by the project manager. The next concept we incorporated was continuous delivery in short sprints. Testing was divided into four buckets with a number of controls to be completed in each bucket,

00:24:53

and each bucket lasted approximately one month. The Agile Manifesto includes a principle around business people and developers working together daily throughout a project. We modified that principle to reflect working together frequently with stakeholders throughout the audit. So it's not a pure DevOps approach, but definitely headed in that direction. This was accomplished through iterative meetings with both internal and external stakeholders. The internal stakeholders were provided an update at least weekly of the status and any potential issues that we identified. We also met with the external stakeholders twice each week, sent out detailed meeting notes to reiterate key decisions reached, and sought feedback and confirmation from those stakeholders after each meeting. Continuous feedback is a topic that's discussed frequently in The DevOps Handbook, The Unicorn Project, and The Phoenix Project. And it was another concept we borrowed during this project. The audit team held daily stand-ups that supported the flexibility of the self-organized team and reinforced the urgency and priority of the audit,

00:25:56

also enabling the team to seek and address feedback on a continuous basis. These stand-ups also enabled the team to shift available resources to tasks that needed attention, share knowledge amongst the team, and change testing procedures on the fly to adapt to our stakeholder needs. A blameless retrospective review was conducted at the end of the third bucket to reflect on how to become more effective in that final bucket. Those opportunities that we identified were incorporated into the process starting in that fourth bucket. During the retrospective, or post-mortem, that we performed, our team identified an opportunity to increase efficiency in subsequent buckets of work. Some of the control tests we performed were quite straightforward, but they took a lot of time to complete. We were manually filtering data in Excel files and executing a series of VLOOKUPs. So we asked ourselves, could we somehow automate these tests?

00:26:51

The answer was yes. Our team leveraged the R programming language to automate testing of these controls, and it worked. By leveraging R, we eliminated the need to do manual filtering and VLOOKUPs in Excel. The automated tests identify potential exceptions for our team to research, and it saves our team about 15 hours each quarter when we test these controls. The test script is stored in GitHub, which provides visibility to everyone in internal audit and provides version control as well. The script was reviewed by other members of the audit team, those with deeper familiarity and experience with R, to confirm that it would produce the intended results. Essentially, they audited our automated test. Yes, we auditors get audited as well.
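The team's script was written in R and is not shown in the talk. Purely to illustrate the general idea of replacing manual filtering and VLOOKUPs with a scripted lookup, here is a small sketch in Python instead. The file layout, the `txn_id` column, and the rule that every transaction needs a matching approval record are all assumptions made for the example, not details of the actual control tests.

```python
import csv
import io
from typing import Dict, List


def load_rows(csv_text: str) -> List[Dict[str, str]]:
    """Parse CSV text (for example, an export saved from Excel) into dict rows."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def find_exceptions(
    transactions: List[Dict[str, str]],
    approvals: List[Dict[str, str]],
    key: str = "txn_id",
) -> List[Dict[str, str]]:
    """Return transactions with no matching approval: the scripted VLOOKUP."""
    approved_keys = {row[key] for row in approvals}
    return [row for row in transactions if row[key] not in approved_keys]
```

The script only surfaces potential exceptions; as in the talk, a person still has to research each one before concluding that a control failed.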

00:27:34

Building upon what we learned from our first agile auditing pilot, we will be leveraging some of the items that worked successfully, as well as adding in additional DevOps and agile practices in a second pilot, one of which is prioritizing customer needs. Nationwide's identity and access management team uses a 30, 60, 90 day process for managing workflow, where the tasks are broken down into work that can be delivered in, you guessed it, 30, 60, and 90 days. When we told, uh, our clients that we would be doing an audit in their space this year, they asked if we could work together to fit our audit into their process workflow. As such, our first DevOps practice implemented in this engagement was to prioritize our customer's needs. We're going to do this by aligning our audit work with our client's workflow process and decomposing the audit work into a number of sprints during which control tests will be conducted in a concentrated effort.

00:28:27

Our client will commit to fulfilling data requests, conducting control walkthroughs with us, and working through testing results in those 30 days. And we'll commit to completing the test work and working through potential issues with the clients during that same 30-day bucket. Another concept to incorporate in this audit is to increase our focus on continuous delivery through sprints. Each sprint will include key portions of the process that we're looking at, and determination of what portions of the process go into which sprint is done directly in collaboration with our clients, which is another DevOps practice: fostering a collaborative environment. During each of those sprints, we will identify the controls to test and the test procedures to perform, request data and documentation that we need to do the tests, finish our testing, and communicate results to our clients. Aligned with the DevOps principle that focuses on automation, we'll continue to explore ways to automate our control tests. And finally, we'll hold a retrospective review at the end of each sprint, collectively with our clients, to identify improvement opportunities to be implemented in subsequent sprints.

00:29:33

At this point, you're pretty familiar with internal audit's role in the organization and our objective, which is to provide assurance to key stakeholders through evaluations of risks and the controls mitigating those risks. You're also well versed on the mindset shift needed to focus on the risk and how best to address that risk, rather than just looking at a checklist of controls. You've seen how Nationwide has embraced key DevOps concepts, both in the technology organization and in the way internal audit is responding to those modified processes. We've responded not only by adapting our control assessment practices to better fit the world of DevOps, but we've also taken a page out of your book, quite literally, and have incorporated agile and DevOps practices into the way we do our audits. As we did in last year's presentation, let's bring it all together by taking one more look at how DevOps can change your experience during an audit and why DevOps and audit are still truly a great partnership.

00:30:28

When you have automated controls, we can look at a smaller sample for our testing because the automation removes that human error element. So you spend less time being audited. When you rely on automated controls with audit trails, we can provide assurance to you through full population testing, which reduces the sampling risk. Finally, flexible and agile audit approaches align better with your own operating model and workflow. This reduces the need for you to multitask and switch from your daily work to audit work by incorporating audit into your daily work. Just when you thought your relationship with your auditors couldn't get any better, right? On behalf of Rusty, Ethan, and myself, I want to say thank you. Thank you to everyone in the audience today for joining us, learning with us, and asking questions along the way. Thank you for coming along with us on our DevOps journey, and thank you to Gene Kim and the DevOps Enterprise Summit selection committee for giving us a platform to share our passion with others. Our contact information is shown here. Please feel free to reach out and connect with us on LinkedIn and keep this conversation going. Thanks again, and enjoy the rest of the summit.