In this session, Chet Burgess, Principal Engineer at Cisco, will chronicle how the Developer Experience team for the Cloud & Compute Business Unit helps to improve security for their containerized applications. As Cisco increasingly used containers for both SaaS and on-premise software, they needed to adapt the processes and tooling used to secure their products. The Developer Experience team leveraged automation to streamline security checks and remediation within existing DevOps processes, making the software development teams they supported more efficient.

In this session, Chet Burgess, Principal Engineer at Cisco, will discuss:
- The role of a Developer Experience team in making developers more productive
- Why the Software Bill of Materials was a critical foundation for security
- How the team automated security checks from development through to ship/production
- How they complied with internal mandates for OSS compliance

This session is appropriate for all skill levels and will appeal to those looking to spearhead new DevSecOps initiatives within their organization. Attendees will take away proven strategies for improving the software development process with actionable recommendations on how to implement them.
Principal Engineer, Cisco
Hello and welcome. My name is Chet Burgess. I'm the principal engineer in the developer experience team at Cisco Systems, as part of the Cloud and Compute business unit. Today I'm going to talk to you a little bit about how the developer experience team helps to embed security and automation to make it easier for our developers to do their job. First, let me talk a little bit about what the developer experience team is and what it does. Our goal is to enable our engineers to deliver greater product value by focusing on reducing the friction that can be caused by things like tooling, process and procedures, and compliance, which we're going to spend some of this presentation talking about. We do this by applying an engineering and product-level approach to what we do. By this I mean we view ourselves as delivering a product to the other engineering teams, and we treat them very much as our customers.
We listen to their feedback and we work with them in partnership to make it easier for them to deliver their products. Here are just a couple of the things that the developer experience team does or supports. We run a number of engineering labs for our product development teams. This is everything from the physical racking and maintenance of the servers to doing the networking and supporting the operating systems or the virtualization layer. We also help support their CI and CD pipelines. This can involve things like running Jenkins and running very large build farms. We also support various developer services. This can be anything and everything from artifact repositories to supporting software or endpoint scanning solutions as part of security. And finally, we continually develop and release automation to help enable all of these activities and, in general, make our developers' lives easier.
Now, the developer experience team is a very small, dedicated team of engineers, and we come from a very wide background based on the things I just told you that we do. We have everyone from data center specialists to networking specialists, DevOps specialists, and compliance people, as well as project managers. Each member of the team tends to have a very wide background and a lot of experience doing a couple of different things, so that we can easily switch between the types of tasks that we're asked to do. And finally, we support the delivery of various types of product offerings. Some of the product teams we work with ship very traditional-looking software, where it's something that's put up for download and the customer downloads it and installs it in their data centers. We also support teams that deliver software as a service solutions, and we even support teams that deliver hardware products.
So earlier I talked a little bit about compliance, and let's dig into that a little bit more. Broadly, compliance can mean a lot of different things, but I like to boil it down to three primary obligations when we're talking about compliance. Number one: ship only the software you need. This one's pretty simple. Don't include software that you don't need in your product, and as an addition to that, make sure that the software you do need is configured appropriately to only enable and do the things you need it to do. Number two: keep your software up to date. Again, this is fairly simple. If you ship a component, you need to make sure that you have a way of shipping updates to that component, so that if there's a bug or a security issue with that component, you can quickly deliver an update for your customer.
And then finally, one of the compliance obligations that I want to dig into a little bit more is what's called open source software compliance. So let's dig into that topic a little bit. Open source software is something that we're all familiar with in DevOps and across all of IT. Most of us use open source software all the time and every day, oftentimes without even realizing the components that we're using are open source, or even fully understanding what open source for that component actually means. Open source software exists within a legal framework that creates obligations for both the developers and the distributors. What this means is every piece of open source software you use comes with its own license, and those licenses have requirements that you have to meet as a developer or a software distributor.
Now, not all licenses are as open or as free as you might think. There's been a recent trend in both what are called source-available licenses and what I like to call open-ish licenses. An example of these open-ish licenses would be what's been happening between, say, Elasticsearch and AWS of late. Additionally, licensing can differ based upon how you intend to use the software. What I mean by this is that some licenses allow you to do something with a SaaS that you couldn't do with, say, a traditional piece of software that is just linking against that component, or vice versa. Something you can do when your software just wants to link against an open source component may not be permissible if you're then delivering that as a service. Finally, I want to call out that the Linux Foundation has a great site set up to really help people begin to understand what open source software compliance is all about.
They have sections set aside for developers, distributors, and legal specialists, and it's a great resource if you haven't checked it out or you have additional questions about open source software compliance. So Cisco faced this problem a few years back, where we said to ourselves: we have all these new container-based products and container-based platforms, so how do we meet these compliance obligations for containers? And we set out to try and answer that question for ourselves. One of the early things we decided was that the SBOM, or what we call the software bill of materials, was going to be key to being able to meet all of those compliance obligations. After all, if we don't know what is inside of the containers we're shipping, how do we know if we're only shipping what we need? How do we know if it's up to date? And how do we know if we're complying with the licenses those components have? Additionally, one of the keys to having a good software bill of materials is that it has to be a complete inventory of all components you ship.
Now, what do I mean by all components? Components can be broken down into two categories. There are the first-order components. This is what your product directly uses. Many developers or engineers know how to answer this question, either off the top of their head, or they have a configuration file that they store in their source code control system that basically defines the dependencies of their software. In the case of something like Python, this is typically something like a requirements.txt file that has a list of the Python modules that the product you're shipping needs. Now, what most of these solutions do not include is what we call the second-order or greater components. This is the list of components that those first-order components need in order to function. Or, as I like to call it, it's turtles all the way down.
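To make the first-order versus second-order distinction concrete, here is a minimal Python sketch. It assumes a dependency graph is already available as a dictionary; the package names (`product`, `libfoobar`, `libbar`, and so on) are made up for illustration, and this is not how any particular SBOM tool works internally.

```python
from collections import deque

# Hypothetical dependency graph: component -> components it directly requires.
DEPENDS_ON = {
    "product": ["libfoobar", "libqux"],
    "libfoobar": ["libbar", "libbaz"],
    "libbar": ["libshim"],
    "libqux": ["libbaz"],
}


def full_inventory(root, graph):
    """Return (first_order, second_order_or_greater) component sets for root."""
    first_order = set(graph.get(root, []))
    seen = set(first_order)
    queue = deque(first_order)
    # Breadth-first walk: keep following dependencies of dependencies
    # until nothing new turns up.
    while queue:
        component = queue.popleft()
        for dep in graph.get(component, []):
            if dep not in seen and dep != root:
                seen.add(dep)
                queue.append(dep)
    return first_order, seen - first_order


first, deeper = full_inventory("product", DEPENDS_ON)
print("first order:", sorted(first))
print("second order or greater:", sorted(deeper))
```

A complete inventory in the sense described here is the union of both sets, not just the first one.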
The best example of this is, let's say you want to go install libfoobar into your favorite Debian operating system. So what do you do? You SSH in, you run apt-get install libfoobar, and hit enter. And then it says, oh, I want to install these 12 components. Well, what happened there? You only asked for one component. What are the other 11? Those are the second-order dependencies. Those are the things that libfoobar needs to do its job. An accurate SBOM has to encompass all of these components, because all of them could have security issues and have their own unique license obligations. So what were the compliance challenges that we faced specific to container-based products when we got started several years ago? Well, first and foremost, lack of container support in the existing tooling that we had.
Obviously we've been shipping software and other products for decades. So we knew how to inventory those types of products, and we knew how to do compliance on them: figure out what the licenses were and keep the software up to date. But the container was a new type of artifact for us. It was a new distribution method, and our existing tooling didn't really have support for it. Additionally, the container ecosystem in and of itself provided a couple of challenges for us. Oftentimes what happens is someone just posts a container to their favorite registry, like say Docker Hub, and then a bunch of people download it, but they don't really know what's inside of that container. So of course there was unknown content in there. Also, we find that many of the publicly posted containers out there have a lot of unnecessary software installed in them.
A lot of people come from a virtualization background or even a hardware background where they think, oh, I need to install my operating system, and then after I've installed my operating system, I need to put in the software components that I need for my application. Containers can work fundamentally differently, and they enable us to dramatically reduce the footprint of the software that we ship inside those containers, thus reducing the surface area for attacks. And finally, a lot of containers are built and published once, and then the developer moves on. Or they continually release updates to their software, but they keep using the same exact base month after month, year after year, resulting in some of the base components getting out of date.
So what were our requirements as we approached this problem? Well, first and foremost, as I said, we decided that the software bill of materials, the SBOM, was going to be key for us. So we wanted to have really good SBOM reporting. We wanted a tool to be able to generate an SBOM that contained all the components, the first- and second-level ones, then do vulnerability analysis on that SBOM, and be able to do some amount of open source software license analysis on that SBOM. We additionally needed to be able to scan and catalog all of our releases. We needed to have retention based on the type of artifact we were scanning. So for instance, if it was a dev build or a pre-release or a beta build, we only need to keep that for a couple of weeks usually, maybe a few months, versus a full release that we wanted to keep in perpetuity.
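The retention requirement above can be sketched as a small lookup table. This is only an illustration: the artifact type names and the exact windows are assumptions, since the talk gives ranges ("a couple of weeks", "maybe a few months") rather than precise values.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention windows per artifact type; None means keep forever.
RETENTION = {
    "dev": timedelta(weeks=2),
    "pre-release": timedelta(weeks=12),
    "beta": timedelta(weeks=12),
    "release": None,
}


def is_expired(artifact_type, scanned_at, now):
    """Return True if a scan record is older than its retention window."""
    window = RETENTION[artifact_type]
    if window is None:
        return False  # full releases are kept in perpetuity
    return now - scanned_at > window


now = datetime(2021, 10, 1, tzinfo=timezone.utc)
print(is_expired("dev", now - timedelta(weeks=3), now))
print(is_expired("release", now - timedelta(days=5 * 365), now))
```

Keeping the policy in one table makes it easy to adjust a window for one artifact type without touching the cleanup logic.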
Additionally, we needed to make sure that we could support the full life cycle of the artifact. Again, development artifacts don't live very long, but a release may be released and supported for quite some time. So we needed to make sure that as long as that release was available for download and fully supported, we were continually monitoring it for vulnerabilities, ensuring that we knew what the issues were, and able to report those to customers and release updates. And finally, we needed our solution to integrate with the existing tooling inside Cisco that was already designed to do that vulnerability management and license compliance. So what was our approach to solving this problem? Well, one of the things we decided early on is that we wanted to leverage existing tooling for SBOM and vulnerability data. There's a huge industry out there for analyzing your software products of different types, including containers, generating a software bill of materials, and then doing vulnerability analysis on top of that by pulling data from all the various vendor feeds, NVD, GitHub, and other sources. We wanted to survey the market and find a leading tool that would be able to do this for us, that we could then integrate on top of.
And then we said we would write the middleware to integrate with the existing Cisco tooling. It made more sense to us to do that ourselves than to try and find a vendor that could both provide a really great tool for SBOM and vulnerability data and then do custom integration for us. So we looked at a bunch of different products years ago, and we finally decided to choose a suite of products that comes from a company called Anchore. They provide our core container scanning solution. Now, Anchore has a number of offerings that they make available, and we use most of them. Anchore Enterprise, which is the first service we got started with, is a persistent service that can monitor images that are stored in Docker registries, generate a software bill of materials for them, and do continual, ongoing vulnerability analysis.
More recently, there are two new tools that have been released, Syft and Grype. Syft is a tool that's designed to generate a software bill of materials from a number of different artifacts. It primarily supports doing that from a local image, a saved image in the form of something like a tar file, or a registry, but it can also analyze things like software distributions that are unpacked on disk. Additionally, there's the Grype tool. Where Syft is about generating software bills of materials, Grype is about taking those software bills of materials and then giving you vulnerability analysis for what's in that content. And then finally, Anchore makes available a number of different integrations for Anchore Enterprise, Syft, and Grype to allow you to easily integrate with things like GitHub, Jenkins, and the other various components that exist within your CI and CD pipelines.
The Anchore solution is able to inventory and monitor a number of different types of packages to generate that SBOM and tell us about CVEs. This includes the standard OS packages, so things like Ubuntu, Debian, CentOS, and Alpine. It knows how to interrogate the package managers there and figure out what's installed. But it is also able to analyze the file system within those images and report back on other Python components, Java components, Node packages, and Ruby gems that it discovers outside of the package manager, include those in your software bill of materials, and report the vulnerabilities on them. Additionally, Anchore Enterprise supports a form of what's called self-discovery, where you can basically write a hints file that you include in your container and provide an additional list of things that you want Anchore to include in your bill of materials and do vulnerability analysis for. This enables you to package up some of your dependencies in different ways that may not be discoverable or easy for Anchore to figure out. Finally, Anchore Enterprise supports a very rich policy engine in which you can write different rules to enforce different types of container best practices.
So why did we choose Anchore? Several years ago when we surveyed the space, there were several players, and there are a lot more now. We put several of them through a series of evaluations, and the reasons we chose Anchore are the following. First and foremost, it had a very clean REST API that was very easy to work with. This was critical to us because one of the key things we had decided was that we were going to write the middleware integrations with the existing internal Cisco tooling. So we needed a service that could serve up the SBOMs and the vulnerability data to us in a very clean and easy-to-consume way, and our preference of course is a REST API. Additionally, when we compared the various solutions, Anchore produced a very accurate software bill of materials and had a very good accuracy rate on the vulnerability data and the vulnerabilities it was reporting.
Additionally, as I previously mentioned, Anchore has a number of integrations that were available out of the box. That enabled us to quickly start integrating parts of it into our ecosystem as we evolved. They have a great support system. We've had tremendous success opening support tickets with them. And finally, they truly wanted to form a partnership with us. They wanted to understand what our needs were and how they could evolve the product moving forward to help us meet our goals. That's something that we value very highly, since that's the approach we try and use internally with our engineering teams and our products. I'm happy to say it's been a great partnership with Anchore. They've really listened to us. The product has continued to evolve in a great direction, and they've added a number of features for us that have added tremendous value to what we do.
So what did we do with Anchore then? We ended up actually creating a couple of different implementations. The first thing we created is a service that we called Helios. This is an in-house RESTful service that we use for cataloging and reporting on releases. What happens is, when a Git tag gets pushed into the various repositories, that triggers our pre-release or release workflow. Depending upon the format of the tag, we determine: is this a pre-release or is this a release? Then, based upon that workflow, we can take different actions. What happens is we take that release and all the images and the containers that make up that release, and we send that into Helios. Now, what Helios does is pull those images out of the registries, send those images over to Anchore, and wait for Anchore to do analysis of the SBOM.
It can then pull that SBOM. The other thing Anchore does for us is that it continually runs that vulnerability analysis on those images, and Helios gets a continual feed from Anchore of new vulnerabilities. Helios also allows us to do the state management for the releases. Helios understands things like RC versus released versus EOL, and understands how to communicate that to the other Cisco internal tools to indicate what the status of a given release is. As I said, Helios synchronizes with all these internal Cisco tools for us, so that it can accurately reflect our SBOM as well as help us do the license compliance in our internal systems. Additionally, because it has all that vulnerability data, it helps us with vulnerability management. And of course it provides real-time reporting for both the software bill of materials and the current vulnerabilities.
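The tag-driven workflow and the release states mentioned above (RC, released, EOL) could be modeled along the following lines. This is a sketch, not Helios itself: the tag convention (`v1.2.3` for a release, `v1.2.3-rc1` for a pre-release) and the allowed state transitions are assumptions made for illustration, since the talk does not spell them out.

```python
import re
from enum import Enum


class ReleaseState(Enum):
    RC = "rc"
    RELEASED = "released"
    EOL = "eol"


# Assumed tag convention: v1.2.3 is a release, v1.2.3-rc1 is a pre-release.
_TAG = re.compile(r"^v(\d+)\.(\d+)\.(\d+)(?:-rc(\d+))?$")


def classify_tag(tag):
    """Map a Git tag to an initial release state, or None if it is not a release tag."""
    match = _TAG.match(tag)
    if match is None:
        return None
    if match.group(4) is not None:
        return ReleaseState.RC
    return ReleaseState.RELEASED


# Assumed lifecycle: a release candidate can be promoted or retired,
# a release can only be retired, and EOL is final.
ALLOWED = {
    ReleaseState.RC: {ReleaseState.RELEASED, ReleaseState.EOL},
    ReleaseState.RELEASED: {ReleaseState.EOL},
    ReleaseState.EOL: set(),
}


def can_transition(current, new):
    """Return True if moving from current to new is a valid lifecycle step."""
    return new in ALLOWED[current]


print(classify_tag("v1.2.3-rc1"))
print(classify_tag("v1.2.3"))
print(can_transition(ReleaseState.EOL, ReleaseState.RELEASED))
```

Tags that do not match the convention return `None`, so a pipeline built on this would simply skip them rather than guess.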
The next thing we wrote was a piece of software that we call Minerva. This is an in-house service that is designed to do trending analysis for vulnerabilities only. Obviously there's a lot of focus on making sure that software is secure, and there's a huge industry and a lot of people out there trying to find vulnerabilities and exploits all the time. What this service does is enable us to do real-time and historical reporting on the vulnerabilities we have in various releases and components. It can chart what a release looks like over time, as far as how many vulnerabilities it had at release and how many it has a week later, two weeks later, three weeks later. It can also chart individual products as well as individual components, release over release. This helps us get a feel for how different engineering teams are doing as far as improving their security posture.
What Minerva does is pull all of the software bill of materials and all of the vulnerability data from Helios, as well as vulnerability data from various other Cisco internal tools. It can then crunch all of that and determine the vulnerabilities that exist in our product and how we want to address them. It integrates with the other Cisco internal tooling for vulnerability reporting, and it automates the engineering ticket creation for us for remediation of these vulnerabilities. So after Minerva has done its analysis and determined that a vulnerability is a valid vulnerability that would actually impact our software, it can reach out to Jira, create the ticket in the appropriate Jira queue, and set all the right attributes to indicate which image and component has the issue, what the CVEs and vulnerabilities present are, which fixed versions, if any, are available and need to be installed, and the time to remediation based upon the severity of the issue identified.
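As a rough illustration of the ticket attributes just listed, the sketch below builds a ticket payload as a plain dictionary. It does not call Jira, and it is not Minerva's code. The severity bands follow the published CVSS v3 qualitative rating scale; the remediation windows, the field names, and the sample finding (including the placeholder CVE ID) are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


def severity_from_cvss(score):
    """Map a CVSS v3 base score to its qualitative severity rating."""
    if score >= 9.0:
        return "critical"
    if score >= 7.0:
        return "high"
    if score >= 4.0:
        return "medium"
    if score > 0.0:
        return "low"
    return "none"


# Hypothetical time-to-remediation policy, in days, per severity.
REMEDIATION_DAYS = {"critical": 7, "high": 30, "medium": 90, "low": 180}


@dataclass
class Finding:
    image: str
    package: str
    cve_id: str
    cvss: float
    fixed_version: Optional[str]


def build_ticket(finding, found_on):
    """Assemble the fields a remediation ticket would carry."""
    severity = severity_from_cvss(finding.cvss)
    days = REMEDIATION_DAYS.get(severity)
    due = None if days is None else (found_on + timedelta(days=days)).isoformat()
    return {
        "summary": f"{finding.cve_id} in {finding.package} ({finding.image})",
        "severity": severity,
        "fixed_version": finding.fixed_version or "none available",
        "due": due,
    }


# Made-up finding; the CVE ID is a placeholder, not a real advisory.
example = Finding("example/app:1.0", "libfoobar", "CVE-2099-0001", 9.8, "2.4.1")
print(build_ticket(example, date(2021, 10, 1)))
```

Deriving the due date from severity in one place keeps the time-to-remediation policy consistent across every ticket that gets created.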
Finally, we've started shifting left, I guess, by integrating pull request based scanning. In this case, we're using a combination of tooling provided by Anchore and some integration we wrote ourselves. There's a Jenkins plugin provided by Anchore that enables Jenkins to do direct scanning against an Anchore Enterprise instance. We then wrote an in-house wrapper that will parse those results that are available in the Jenkins build and post them as a comment back to the GitHub PR. Today this is a non-voting, informational-only vulnerability report. Now, as a test, I cooked up an image that I knew had an issue in it and sent that through a fictitious PR to get it scanned.
What you see here at the bottom is what the actual comment in our GitHub Enterprise instance would look like if any image had a particular issue. Here, it tells you that this image had a vulnerability that's classified as high or greater. It tells you the name of the image. In this case, I made up a fictitious OpenJDK for OpenShift image and installed a few things in it. And you can see what the CVSS score is, as well as the CVE ID and the package that has that vulnerability. So what's next for us? Obviously, we don't think we're done yet. There's room to improve. On the Helios side, we want to be able to start ingesting software bills of materials that are produced directly out of Syft.
As I mentioned earlier, Syft has a number of different scanning modes. One of them is that you can ask it to scan a directory on disk. So rather than having to have something packaged as a container, we can start scanning other types of artifacts that just exist in a file system. We even think this will enable us to start doing limited scanning of full VM-based images. So we want to get to the point where Helios can take either an existing Docker image or a pre-generated software bill of materials from Syft. We also want to enable better SaaS support. Anchore itself has what's called a Kubernetes runtime inventory plugin. This is a neat feature where you install a little agent on your kubelet worker nodes, and it continually monitors the images that are present and running as pods in your Kubernetes cluster and reports this back to Anchore.
We want to start enabling this for some of our SaaS offerings and have that reported back through Helios so that we can integrate it with other Cisco tools. Finally, we expect that there's going to be a lot of evolution as it relates to software bills of materials. Earlier this year, the Biden administration here in the United States issued the Executive Order on Improving the Nation's Cybersecurity. It is a fairly comprehensive document about cybersecurity as a whole, but one of the key elements in there talks about the need and future requirements for vendors to provide accurate and up-to-date software bills of materials for all solutions delivered to the federal government. Additionally, there are a number of competing standards that try to standardize what a software bill of materials should look like. The two that have the most support are SPDX and CycloneDX.
We expect that there'll be a lot of work done in the next year or two in this space, as we and the industry figure out exactly what the executive order means and how, as an industry, we're going to meet it. And we expect that there will probably be a standardized software bill of materials, possibly around the SPDX or CycloneDX standards. So what are our next steps for Minerva, which is our vulnerability management system? Well, one of the big things we want to do is start moving more heavily into risk-based assessments. A lot of our assessment, scoring, and severity today is based purely on the CVSS v3 scores and the different elements that go into making up that score. We want to broaden that and look at integrating additional tooling that provides a risk-based approach to evaluation, one that goes a little bit deeper than just the CVSS score.
Additionally, we're looking forward to incorporating a bunch of new data that's available to us out of the latest major release of Anchore, Anchore 3, including things like vendor disposition. Vendor disposition is where some vendors, such as Debian or Red Hat, will look at a CVE, evaluate it, and say: yes, it's valid, but it's of such low severity or low risk that we are choosing not to fix it. They will flag that as a won't-fix. So we want to start incorporating the concept of won't-fix, along with a risk-based assessment, into our workflow so that we can more accurately determine whether this is a vulnerability that we should and can take action on. Finally, the PR-based scanning is probably where we have the most work that we want to do. We want to move to using Syft predominantly to generate a local software bill of materials that we then send to our Anchore Enterprise instance for vulnerability analysis. By moving to Syft, as I said earlier, this should allow us to expand to support non-container-based artifacts with this workflow.
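The vendor disposition idea described above, combining a won't-fix flag with a risk-based score to decide whether a finding is one we should and can act on, might look like this in outline. The field names (`vendor_wont_fix`, `risk_score`), the threshold, and the sample findings are assumptions for illustration, not Anchore's data model and not Minerva's actual logic.

```python
def should_act(finding, risk_threshold=7.0):
    """Decide whether a finding warrants a remediation ticket.

    A finding the vendor has marked won't-fix has no update to pick up,
    so it is skipped here; everything else is judged on its risk score.
    """
    if finding["vendor_wont_fix"]:
        return False
    return finding["risk_score"] >= risk_threshold


# Made-up findings; the IDs are placeholders, not real advisories.
findings = [
    {"id": "CVE-2099-0001", "risk_score": 9.1, "vendor_wont_fix": False},
    {"id": "CVE-2099-0002", "risk_score": 8.0, "vendor_wont_fix": True},
    {"id": "CVE-2099-0003", "risk_score": 3.2, "vendor_wont_fix": False},
]

actionable = [f["id"] for f in findings if should_act(f)]
print(actionable)
```

In practice a skipped won't-fix finding would likely still be recorded for reporting; this sketch only covers the decision about opening a ticket.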
We want to make our PR scanning a voting gate job once we incorporate some of that risk management. And finally, we want to figure out a way to add our license compliance check to that PR-based scanning. Today, all of our license compliance is done following a set of rules that exist in other tools internal to Cisco. We want to work on shifting that left into the PR scanning process to give us an early view of any potential license issues we might have. So thank you. I hope this has been somewhat informative, I hope you've enjoyed this presentation, and I hope you enjoy the rest of the week and the wonderful conference. Have a great day. Cheers.