What does log4j teach us about the software supply chain?

Dr. Stephen Magill was the CEO and co-founder of MuseDev, and is now VP of Product Innovation at Sonatype. He has spent his career developing tools to help developers identify errors, gauge code quality, and detect security issues. Stephen is a world-recognized expert on program analysis and has led multiple large-scale research initiatives including DARPA projects on privacy, security, and code quality. He also served as research lead for the 2020 and 2021 State of the Software Supply Chain reports. Dr. Magill earned his Ph.D. in CS from Carnegie Mellon University, and his BS from the University of Tulsa. He is a member of the University of Tulsa Industry Advisory Board and has served on numerous program committees and funding panels.

Stephen Magill

Vice President, Product Innovation, Sonatype

Transcript

00:00:14

Thank you, Mick. And by the way, Steve Spear and I will be talking more on this topic at the end of day three. Okay. The next speaker is Dr. Stephen Magill, whose primary research area was static code analysis. He was the founder of MuseDev, which was acquired by Sonatype, and is now their VP of Product Innovation. Stephen and I have a common love for functional programming languages like Haskell and Clojure. And we got to work together on a variety of projects, including the State of the Software Supply Chain report, where, thanks to Sonatype, we got to explore the dependency update behavior in the Maven ecosystem. Maven is to Java as npm is to JavaScript, gems are to Ruby, and so forth. It was so fun because we got to explore how people used dependencies, how they migrated from one version to another, and what open source projects did when critical vulnerabilities were published against them: how quickly they fixed them, and how those updates propagated through the dependency chain. It was such a fascinating project, and I learned so much about the software ecosystem that we depend on every day. So Dr. Magill will be talking about the software supply chain with a very specific focus on Log4j, something that I suspect was unfortunately very relevant to all of us late last year. Here's Stephen.

00:01:32

Thank you, Gene. It was super exciting work, and it was exciting because we had access to some really incredible data: all this data about Maven Central and the usage of components in the Java ecosystem. We were able to ask some really interesting questions about how open source projects manage their supply chains. And then it led to some interesting advice and best practices around what you can do to be more secure in your use of open source. So that was all really exciting and involved a lot of cool experiments and analysis. And then at the end of 2021, we had this amazing natural experiment of Log4Shell. This was a zero-day vulnerability in the Java ecosystem. It affected the Log4j component, which is a super widely used logging library.

00:02:24

And it really provided this amazing stress test of our ability to manage the software supply chain. I use the words "stress test" quite literally, because it was a very stressful event. It landed on a Friday before a weekend, very close to the holidays, affected many, many components across almost every organization, and led to this huge scramble to deal with it and get the patches out there quickly. We'll hear more about what that was actually like on the ground from Paul. Paul's from Morgan Stanley and is gonna talk about their experience with Log4j. But what I'm gonna talk about here is the high level: what did we learn from this experiment, and was it consistent with what we had found so far in the software supply chain research?

00:03:14

And so I wanna start with the question: was Log4j, this huge event, like plate tectonics or quantum mechanics, where it changes how we have to think about things, and now we have to throw away our previous view of the world and how it works? Or is it more like the CERN Large Hadron Collider, the particle accelerator that's doing these advanced physics experiments but largely validating the hypotheses that we had about how the world works? What we found is that we're in Large Hadron Collider territory. We've seen confirmation of the sorts of effects, the sorts of trends, the sorts of things that are effective that we saw in the software supply chain research. We've seen those manifest in the community's response to Log4j.

00:04:01

And so I'm gonna talk about four different concepts from the reports and how we saw them reflected in the Log4j response. The first is this concept of exemplars and laggards. This has been there from the very beginning, the work that Gene and I did in 2019, where we discovered that there's not just one general approach to supply chain management. There are really a lot of individual approaches, and different teams have different focus areas. Some are very focused on attending to their dependencies and keeping them up to date; others, not so much. And so we really see the population break down into these clusters. There's a cluster of exemplars that update very quickly, and then there are laggards that are not attending to their supply chain at the same level.

00:04:44

And that's displayed on this graph here. What this is showing is what percentage of the population updates their dependencies within a certain period of time, so within 20 days of a new version being released, or within 30 days. Over at the lower left are those exemplars. They're updating in tens of days when new versions come out. And then you can see the laggards at the top that are taking months or even years to update their dependencies in some cases. And these are large groups; each one is 20 to 30% of the population. We see the same thing in the Log4j response. I'm gonna come back to this graph several times, so let me explain it real quick. What you're seeing here is a record of downloads of various versions of Log4j from Maven Central.

00:05:24

Sonatype hosts Maven Central, and because of that, we have visibility into what people are downloading: what components are in demand, which ones aren't, which ones are being pulled down, and how that shifts over time. And so we can use that to analyze what is happening out there in the community in terms of shifting away from certain versions and over to other versions. It's a bit like monitoring COVID by looking at wastewater treatment plants, right? Minus the sewage. I like Maven Central. It's great. So what we can see here is that very, very quickly, there is a contingent of projects that upgraded. The red versions here are the vulnerable versions. And you can see what happened within one to two days. The very first column here is the first day after the disclosure of the vulnerability.

00:06:14

40 to 50% of projects had moved on. They had updated and adopted secure versions of Log4j. But then it sort of plateaued. There was this much longer timeframe where the laggards are slowly updating. You see slow progress and then another plateau. So it's a very real-world manifestation of what we had seen about having exemplar and laggard cohorts. Concept two is all about staying secure by staying up to date. The idea here is that, yeah, you can just pay attention to big events like Log4j and update when you hear that you have to, because there's this critical patch that you need. Or you can just stay up to date as a matter of practice.

00:07:05

Right. And what we find is that those exemplars, those teams that are best at updating and staying secure, do so by adopting a culture of just keeping dependencies up to date. So staying up to date leads to the security. And we see that again here in the Log4j adoption. Another thing to know about the Log4j vulnerability is that, because it was a zero day, the security community had not seen it until it was publicly disclosed. We got to see the vulnerability research process play out in real time. The Log4j maintainers immediately got a patch together to fix the initial vulnerability that was disclosed. But then as the community took a deeper look, they discovered, oh, there are some additional issues that need to be fixed as well.

00:07:52

In this case, under these conditions, with this configuration, under these sorts of deployments, there are additional issues. And so there was a series of patches. There were actually five patches that came out in less than a month following that initial disclosure. And we were really interested to see: are people gonna adopt these subsequent patches? Because the first one was the highest-criticality one that affected everyone, and the subsequent ones were more niche. You needed certain conditions for them to apply to you. So would people do this calculus of, am I actually vulnerable or not? And what we found is that they didn't. They just adopted these new versions. You can see each of these colors: the non-red colors are various patch levels of Log4j, and you can see those get pretty consistently adopted.

00:08:38

There are certainly some people who only adopted the first, but it's down there, very low. Most people have moved on at this point to the latest, which is 2.17.1. And we've seen that in the software supply chain work as well. This is a graph from last year's report showing Spring Framework and how people migrate to various versions of Spring Framework. There's a lot of stuff in here; you can go back and re-watch and pause if you wanna read all the callouts. The main things here are in orange. The thing to know about this graph is that time is marching downward. Each row is a week of updates, and if someone decided to update their version of Spring Framework that week, then there's a little square here showing what version they moved to.

00:09:23

And the new versions that come out, those are on the right of the graph. There's a new column each time a new version is released. And so that right part of the graph that's circled there is really the cutting edge. Those are the new versions that are coming out. The fact that those squares are all very dark indicates that there's a lot of activity there; a lot of people are moving to those versions. And so that's essentially this cohort that is staying up to date, at or very close to the edge of what's current. And that means you get a certain amount of security from that. Log4j was a bit odd in that it was a zero day.

00:10:02

It was not known by the security community before it was announced. Usually vulnerabilities get discovered by, say, the white-hat security research community. They work with the maintainers privately, not in public, to get it patched and to get that patch out there, and then the vulnerability is disclosed later. In those cases, if you are at the front there, you will already be secure when that vulnerability is disclosed. And so there's essentially this vulnerability buffer. There are the people who are current with respect to their dependencies, and then there are these old versions that have known vulnerabilities that you don't want to be on. What's not depicted here, but you can imagine it over time, is this wall of red. The red is vulnerable versions.

00:10:47

It sort of marches to the right, because new things get discovered in prior versions. Those eventually get patched, and new versions come out. And so if you're close to where the red is, you're much more likely to need to react to a disclosure. Keeping this buffer helps you be proactive versus reactive. Concept three is that transitive dependencies matter. One thing that's difficult about something like Log4j is this: if you're using Log4j directly, you can update your version of Log4j, no big deal. But if you're using a package that itself uses Log4j, you're dependent on that project to adopt the update. You have to wait for them to update their dependencies so that you can update your version of that package and be secure.
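To make the transitive dependency point concrete, here is a minimal Python sketch that lists every path by which an application ends up depending on a given component. The dependency graph and all package names other than `log4j-core` are invented for illustration; a real project would get this graph from its build tool rather than a hand-written dictionary.

```python
from collections import deque

# Hypothetical dependency graph: package -> its direct dependencies.
# Every name except "log4j-core" is made up for this example.
DEPENDENCIES = {
    "my-app": ["web-framework", "report-lib"],
    "web-framework": ["http-client", "log4j-core"],
    "report-lib": ["pdf-writer"],
    "http-client": [],
    "pdf-writer": ["log4j-core"],
    "log4j-core": [],
}


def paths_to(graph, root, target):
    """Return every dependency path from root to target, shortest first."""
    found = []
    queue = deque([[root]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            found.append(path)
            continue
        for dep in graph.get(node, []):
            if dep not in path:  # guard against dependency cycles
                queue.append(path + [dep])
    return found


for path in paths_to(DEPENDENCIES, "my-app", "log4j-core"):
    print(" -> ".join(path))
```

In this toy graph the application never declares `log4j-core` itself, yet it reaches it through two intermediate packages, so both of those packages have to ship an update before the application can pick up a fixed version.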

00:11:34

What we saw with Log4j was that the community really worked quickly to update all the various uses of it. These are commits to GitHub that mention the CVE, the Log4j CVE. You can see a bunch of them happening on December 9th, which is when it was first announced. And we also found in last year's research that the community in general is getting much better at remediating these issues and at keeping up to date with respect to their dependencies. What you see here is a by-year graph, sort of a histogram, of how quickly various projects update their dependencies. To the left is better. And you can see each curve, as the years march on, getting a little bit higher and a little bit farther to the left.

00:12:20

And what that shows is, first of all, that the amount of open source is increasing, because the height of the curve here is proportional to the number of projects doing releases. But also the update speed is improving. It's moving to the left; it's taking less time for these projects to push updates when their dependencies get updated. So that's great. That means the capacity is there to deal with vulnerabilities in transitive dependencies. Concept four is that some dependencies just never get upgraded. <laugh> I wanna ask a series of questions and think about this. How long did it take to get to 90% remediation? I have the answer here already: infinity. Or, well, it's not done yet. Maybe we'll get there, but <laugh> the clock is still ticking.

00:13:02

Right. All right. And this is for Log4j. So what about 80%? Same story. What about 70%? Okay, finally we get an answer. <laugh> To get to 70% of downloads being non-vulnerable versions took 52 days. You can see that here on the graph. There was a spike on January 31st, where 70-plus percent of downloads were non-vulnerable. There was some slipping after that. This can change based on who's automating what and on various changes in CI processes. But it reaches this steady state of 35% of downloads being vulnerable, and that hasn't been changing for a couple of months now. So you can imagine these projects probably are just gonna stay vulnerable.
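The "how long to reach N% remediation" question can be sketched as a small calculation over daily download counts. The Python below is an illustration only: the rows are invented to mimic the shape described in the talk (a fast early jump, a plateau, a peak, then slipping back) and are not Sonatype's data, and the function names are hypothetical.

```python
# Invented sample rows: (days since disclosure, vulnerable downloads,
# non-vulnerable downloads). These are NOT real Maven Central figures.
DAILY_DOWNLOADS = [
    (1, 900, 100),
    (2, 550, 450),
    (10, 500, 500),
    (30, 400, 600),
    (52, 290, 710),
    (60, 350, 650),
]


def safe_share(vulnerable, non_vulnerable):
    """Fraction of downloads that fetched a non-vulnerable version."""
    total = vulnerable + non_vulnerable
    return non_vulnerable / total if total else 0.0


def days_to_threshold(rows, threshold):
    """First day on which the safe share reaches threshold, or None if never."""
    for day, vulnerable, non_vulnerable in rows:
        if safe_share(vulnerable, non_vulnerable) >= threshold:
            return day
    return None


for target in (0.9, 0.8, 0.7):
    print(f"{target:.0%}: {days_to_threshold(DAILY_DOWNLOADS, target)}")
```

A result of `None` corresponds to the "infinity, or not done yet" answer: the threshold has not been reached within the observed window. Note that the function reports the first day a threshold is crossed, so it does not capture later slipping below that level.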

00:13:55

Right. And actually we've seen that in the supply chain research as well, more broadly. When we showed this cohort staying up to date, doing really great, making good choices when they update their dependencies, that was for the dependencies they were updating. There were actually 75% of dependencies that were never upgraded. That's disappointing, and it's something to be aware of. Make sure that you don't have dependencies that are languishing like this. Make sure everything is getting attention when it comes to this practice of keeping things up to date. Now the takeaways: what do you learn from this, and what do you apply at your organization?

00:14:40

The first is this stay-secure-by-staying-up-to-date principle. Just having this practice of keeping dependencies up to date goes a long way toward keeping you secure. And as open source update performance gets better (that graph we saw of open source getting better and better at dealing with this transitive dependency problem), that becomes more and more effective, because if you're keeping your dependencies up to date and they're keeping theirs up to date, then you're protected in your direct and your transitive dependencies. And then make sure you're updating all your dependencies. Make sure you're not one of these teams that has a majority of their dependencies that just never get love, <laugh> never get updated. Do all these and you too can be an exemplar. <laugh> That's where you wanna be: modeling what it means to have really good software supply chain practices.

00:15:38

And then I have some additional guidance when it comes to zero days. All of those things I just mentioned will keep you generally healthy and work for the majority of responsibly disclosed vulnerabilities. Zero days have an extra thing to consider, which is that when they're announced, you have to be reactive. I showed that graph where, if you have this vulnerability buffer, you can be proactive and plan your work in terms of updating. That doesn't happen with something like Log4j. It's an immediate fire drill. So how do you manage that fire drill? What helps with getting a good outcome? The first is inventory. If you have a full software bill of materials for all your applications, that helps you answer the first question that comes up in this remediation step: where am I using this?
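As a rough illustration of how an inventory answers "where am I using this?", the Python sketch below queries a simplified, hypothetical inventory for applications that still carry an older `log4j-core`. The application names and data layout are invented; real software bills of materials use formats such as CycloneDX or SPDX and carry far more detail. The version check is also a simplification: it treats 2.17.1 (the latest release named in the talk) as the target and ignores pre-release qualifiers and any separately maintained release lines.

```python
# Hypothetical, simplified inventory: application -> [(component, version)].
INVENTORY = {
    "billing-service": [("log4j-core", "2.14.1"), ("jackson-databind", "2.13.1")],
    "search-api": [("log4j-core", "2.17.1")],
    "batch-jobs": [("logback-classic", "1.2.10")],
}

# Assumption for this sketch: anything below this version needs attention.
PATCHED = (2, 17, 1)


def parse_version(text):
    """Turn a plain 'major.minor.patch' string into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def apps_needing_update(inventory, component, patched):
    """Map each application using an older version of component to that version."""
    hits = {}
    for app, components in inventory.items():
        for name, version in components:
            if name == component and parse_version(version) < patched:
                hits[app] = version
    return hits


print(apps_needing_update(INVENTORY, "log4j-core", PATCHED))
```

With an inventory like this in place, the first question in the fire drill becomes a lookup rather than a survey of every team.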

00:16:29

First of all, are we even using Log4j? <laugh> I think there were a lot of organizations where that's the first question: what logging libraries do we use? And then if we do use it, where? What applications, what teams need to be aware of this? And then there's the ability to centrally monitor consumption. You can use something like an artifact repository, a cache, a proxy to pull in your dependencies, so that you have awareness of what's coming into your organization. That gives you a lot of visibility about where you are and about how the remediation is progressing. And then to actually do the remediation, there's continuous monitoring and remediation guidance. The more that software supply chain guidance is integrated into development workflows...

00:17:14

...the more it can happen just as a matter of practice, and just automatically. You can let the development team know: go check your report and follow the remediation guidance. You don't have to get involved individually with producing that guidance for each team. And then a big part of pushing those fixes out into production is having mature DevOps practices, using CI/CD, and having that ability to fix an application and then deploy it very quickly to remediate that vulnerability. In the reports (this was last year's, and we also did it the year before), we did a survey of practices to see where organizations are with respect to these things. By and large, these were areas of maturity that organizations reported they felt good about. Certainly there's a range.

00:18:02

But they were clearly areas that were being prioritized. So I think the industry is moving in the right direction in terms of appreciating the importance of these things. It's just a matter of making sure that they are rolled out consistently and that they are getting the attention they need. If you can get all these practices in place, then the next time, when the next Log4j comes out, it can be less of a fire drill. Maybe it's still a stress test for the software supply chain in general, but it's less of a stressful situation for you and your employees. Thank you.