DevOps during COVID-19

COVID-19 is not just a centennial public health and economic crisis. It is also a unique sudden experiment in digital transformation. The way we work will be different forever.


What previously appeared to be niche differences in DevOps performance now have significant impact.


Sam reviews three months of research since the quarantine and identifies factors that differentiate teams that survive from those that thrive.

Sam Guckenheimer

Product Owner, Azure DevOps, Microsoft

Transcript

00:00:08

Thank you for coming back. Yesterday in the plenary session, I chatted with Gene about what we've learned so far around DevOps, COVID, and the future of work. My goal in this session is to dive more deeply into the data. So I'll spend some time on the research data. I'll spend more on what we know about stress during the COVID pandemic. I'll isolate what we know about working from home from COVID by looking at pre-COVID work-from-home research, and then I'll look at recent studies that compare pre-COVID, pre-pandemic, to during the lockdown. Then I'll talk about what we observe on working effectively, what we observe on applying DevOps more effectively, and what we should expect going forward into the new normal. So first, let's start with what the data indicate. And since we're meeting virtually in London, data are plural, and they indicate that developers are doing more. Is this a surprise to many of us? This is a surprise to me. I expected that when we went into lockdown from COVID, we would see a drop in developer productivity. We don't find it. I'm going to share two data sets with you. They're completely independent and they're both big. So this one is from Microsoft internal.

00:01:58

It's data out of the One Engineering System, which is built on Azure DevOps, where we can measure levels of activity across all Microsoft engineering. And we find that if we look at pull requests year over year, they're up in 2020 relative to 2019. And if we look at this black line here, we see that when we closed the offices and made everyone go home, mandatorily, across the company at the beginning of March, there was no visible change in pull request volume. We tried to drill further by looking at the data week over week, in other words, comparing Monday to Monday, Tuesday to Tuesday, and so forth. Each of these colored bands is a different week, with magenta being March 8th.

00:02:59

By then everyone was fully working from home. The offices were completely shut. So what did we see? We saw that people were still working quite a bit. In fact, they were working longer days. It appears that engineers were starting their day earlier and finishing later. We didn't see the usual midday dip of folks going to lunch as they do when they're in the office and the cafes are open for certain hours. We didn't see the strong mid-afternoon breaks that we typically would see in prior weeks. So it looked like activity was there, with longer days, and there is some concern about the implication of those longer days. Now let's flip to the GitHub dataset. This is data from github.com. Here we compared 2020 again to 2019, looking at the amount of time for each contributor between the first push of the day and the last push of the day.

00:04:23

So in other words, if you make your first push of the day at 10 o'clock in the morning and your last push at five o'clock in the afternoon, that's a seven-hour day. If you start at nine and go to six, that's a nine-hour day. We compared the length of those days from 2020 to 2019. We saw that they're longer in 2020, that they go up in March, and that they appear to be staying longer as we go further. This data was put together by my friend and colleague Nicole Forsgren with her associates and published recently in the Octoverse report.
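The day-length measure described above (time between a contributor's first and last push on a given day) can be sketched in a few lines. This is only an illustrative sketch, not the method used in the GitHub analysis: the function name `push_day_lengths`, the `(contributor, timestamp)` input shape, and the sample data are all assumptions made for the example.

```python
from datetime import datetime


def push_day_lengths(pushes):
    """Map (contributor, date) to hours between first and last push that day.

    `pushes` is an iterable of (contributor, datetime) pairs. This input
    shape is hypothetical, chosen only for illustration.
    """
    first_last = {}
    for contributor, ts in pushes:
        key = (contributor, ts.date())
        first, last = first_last.get(key, (ts, ts))
        first_last[key] = (min(first, ts), max(last, ts))
    return {
        key: (last - first).total_seconds() / 3600
        for key, (first, last) in first_last.items()
    }


# Made-up sample data mirroring the two examples in the talk.
pushes = [
    ("alice", datetime(2020, 3, 9, 10, 0)),   # first push at 10:00
    ("alice", datetime(2020, 3, 9, 13, 30)),
    ("alice", datetime(2020, 3, 9, 17, 0)),   # last push at 17:00
    ("bob", datetime(2020, 3, 9, 9, 0)),      # first push at 09:00
    ("bob", datetime(2020, 3, 9, 18, 0)),     # last push at 18:00
]
lengths = push_day_lengths(pushes)
```

Note that a contributor with a single push in a day gets a length of zero under this definition, so the measure is a lower bound on time worked rather than a timesheet.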

00:05:13

We also see a big difference if we look at the volume of code pushes, 2020 versus 2019. Here, what you see at the beginning of the year, January and February, is a whole bunch of random noise in terms of how much code is getting pushed. But then as we go into March and people are going into lockdown, we suddenly see a big rise in code pushes, as more work is getting delivered during the lockdown period. Now, if we compare pull requests, the requests for changes, year over year, 2020 to 2019, we see that 2020 is a little higher than 2019. We don't see any particular difference before or after lockdown. However, if we then look at the cycle time that it takes for those pull requests to get approved and that code to get merged, we suddenly see in March a big drop in the wait time for pull request approval. In other words, cycle time is improving, and we see this in two cohorts. We see it for the enterprise cloud accounts, and we see it for the other paid accounts, the smaller team accounts on GitHub.
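For teams that want to track the same signal on their own repositories, pull request cycle time can be approximated as the wait between a pull request being opened and being merged, summarized per month. The sketch below is an assumption-laden illustration, not the analysis behind the talk: the function name `median_merge_wait_hours`, the `(opened, merged)` tuple format, and the sample values are hypothetical.

```python
from datetime import datetime
from statistics import median


def median_merge_wait_hours(pull_requests):
    """Median hours from opened to merged, keyed by (year, month) opened.

    `pull_requests` is an iterable of (opened, merged) datetimes, where
    `merged` is None for pull requests that are still open. That shape is
    an assumption for this example.
    """
    waits = {}
    for opened, merged in pull_requests:
        if merged is None:
            continue  # unmerged pull requests have no cycle time yet
        key = (opened.year, opened.month)
        hours = (merged - opened).total_seconds() / 3600
        waits.setdefault(key, []).append(hours)
    return {key: median(hours) for key, hours in waits.items()}


# Made-up sample: slower merges in February, faster in March.
prs = [
    (datetime(2020, 2, 3, 9), datetime(2020, 2, 4, 9)),     # 24 hours
    (datetime(2020, 2, 10, 9), datetime(2020, 2, 10, 21)),  # 12 hours
    (datetime(2020, 3, 16, 9), datetime(2020, 3, 16, 15)),  # 6 hours
    (datetime(2020, 3, 17, 9), datetime(2020, 3, 17, 11)),  # 2 hours
    (datetime(2020, 3, 18, 9), None),                       # still open
]
monthly = median_merge_wait_hours(prs)
```

The median is used here rather than the mean so that a handful of long-lived pull requests does not dominate the monthly figure.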

00:06:56

We also see that open source is climbing radically after lockdown. The number of new open source projects being created is going way up. And we also saw, though I didn't include the graph, that there are a number of open source projects where activity is going up considerably. So all of these are indicators of people doing more during the lockdown period, doing more with less. How is that possible? We all know that we're stressed out during this pandemic. We know that it's the worst health crisis in a century. We know that we're going into the worst economic crisis in 80 years. We've seen in most of our countries a level of civil protest for social justice and racial justice, to control immoral policing activities, to control systemic problems. And we've seen the counter-protests under false flags that have frequently turned these into violent confrontations. That's happening globally, in places like London. And we know that even without all of that, just from the pandemic, we would see a lasting mental health crisis. We saw that in the Lancet review article that looked at previous quarantines from epidemics: that mental health crisis will be here for years afterwards.

00:09:02

We also know that we're in an industry where burnout is a second, less talked about pandemic. Dr. Christina Maslach, at the last summit, went through all of the ill effects of burnout on our lives and how much of a toll those burnout symptoms take. Burnout is getting worse now, compounded with the other stress. Harvard Business Review last month summarized this well, calling it war room fatigue that we're all feeling. So why aren't we seeing a broader problem than we are? Well, I wanted to step outside the pandemic and look at what we know about work from home independent of the pandemic. And indeed, there's a fantastic study, a large-sample randomized controlled trial of work from home. This was done by James Liang, who's founder and chairman of Ctrip, a public company with 40,000 employees, China's largest travel agency. And he did this in conjunction with Professor Nicholas Bloom from Stanford.

00:10:40

And a few associates as coauthors. It's a beautifully documented, publicly available paper. Liang was wondering, to support the company's ongoing growth, should they keep investing in all of this expensive office space? And they listened to workers who were saying, we'd really rather be at home, we don't like our commutes. So he said, who wants to volunteer to work from home? And he got more than a thousand volunteers and then used an independent random variable, their birthday, to separate them into two cohorts. So if you had a birthday where the date was odd, 1, 3, 5, and so forth, you would be in the control group. If you had a birthday that was even, 2, 4, 6, 8, and so forth, you would be in the treatment group, and the treatment group would get to work from home. Now, remember, these are all volunteers who said they wanted to work from home. And the ones who worked from home had good working conditions. They had an ergonomic workspace, they had decent internet, they had decent equipment. They had freedom from distraction at home. What Ctrip found was, over the nine months of the trial, a 13% improvement in output. Of the 13%, 3.5% came from

00:12:36

the at-home employees taking more calls, and 9.5% of it came from their improved punctuality. The bus didn't break down on the commute. They didn't have to deal with something on the way to work or run an errand and get delayed. So the overall impact was pretty astonishing. So at the end of the trial... ah, before I get there, there was one other interesting thing about working from home. Attrition dropped by 50%. Employees who were working from home were half as likely to leave the company. Now, there's also one other effect, which is hard to discern. They were less likely to get promoted. This could be because they declined promotion, saying, hey, I like working from home, and if I take that offer, I need to go back to the office. It could be because they were overlooked. It's probably a combination of those. It is a red flag for how you work effectively as an organization in the new hybrid world. After the nine-month trial and its success, Ctrip opened the option to work from home to all employees. Now, some interesting things happened. There were three groups:

00:14:21

The treatment group, the red diamonds here on top: those are the ones who did work from home during the trial period. Several of them said, we want to go back to the office, it feels too isolated. So they exercised that choice. Of the control group, these blue pluses, most of them said, yeah, we want to work from home now, we missed out before. And then of the ones who had not volunteered to work from home, the green ones, a large number of them said, yeah, we see what everyone else is doing, and we'd like to do that too. So now that's an option for everyone at Ctrip. Profits for the company went up $2,000 per employee by allowing work from home. Ctrip rolled this out as an option to everyone in the company, and it became a great recruiting tool to get more of the self-motivated employees who wanted to stick with the company. Now, I realize this is a travel agency. It's not extraordinarily highly skilled work, but it is an illustration that the work-from-home model can serve us well. So that was, of course,

00:15:58

eight years ago, before any of us had heard of SARS-CoV-2 or COVID or anything like that. Let's talk about a 2020 study that compares the period just before COVID lockdown with the period just after, in other words, February of this year to April of this year. In this study, we look at Microsoft employees, individual contributors and managers, and we are doing primarily a diary study, where employees are self-reporting every day. We find that work from home and COVID are pulling in opposite directions. Working from home, remote work, increases the focus time available to employees, and this was clear pre-lockdown. COVID introduces stress. It's decreasing focus time. We're all stressed out. Both of these effects, we find, are larger on managers than they are on individual contributors. Meetings are also pulling in opposite directions. Working from home decreases the need for scheduled meetings. COVID in practice has increased the scheduled meetings on calendars. This is visible for ICs. We don't see a statistically significant effect on managers in this particular study. Collaboration appears to become more difficult. Remote work makes ICs feel collaboratively isolated. On the one hand, they get increased focus time, more control over the workweek, and reduced scheduled meeting time, but they do have a harder time getting together to collaborate on the next innovative thing.

00:18:29

For managers, the collaboration with others isn't reduced in the same way, but the cost they pay is that they have noticeably longer work weeks. Now, in looking at all sorts of qualitative data about working from home, there's a tremendous amount of noise from individual experience. One of the things that stands out is that the basics we know about effective work in an office environment apply to effective work in a home environment as well. Again, Professor Nick Bloom shared some great pictures of home working conditions. This is one of his grad students, who had to reserve time in the clothes closet and sit on the shoe rack, hunched over a laptop, literally on her lap, in order to participate in an online session. I look at this and I see chronic back pain. I see chronic stress. I see horrible ergonomic conditions and horrible problems paying attention. On the other hand, we do know that there are thrivers who have dedicated workspaces at home where they can focus. They have good internet bandwidth and low latency. I know some of my colleagues in India will say, I'm not going to do video from home, because the latency here is pretty bad at home. In the US, 25% of our population does not have broadband. And this is a problem we need to address

00:20:27

socially. The thrivers have ergonomic furniture, good chairs that support their backs, standing desks like I'm using now, and multiple high-resolution screens so that you can participate in the video on one screen and the chat on another. We set up spaces that are free from interruption.

00:20:53

We observe schedule rituals so that we know this is family time, this is work time. Our colleagues know this is our work time. We take care of ourselves. We get our exercise. We eat well. We sleep well. We schedule breaks. We take time off. This is another problem. We've seen a huge drop in vacation reporting since the lockdown. This is something where we need to model that you still take vacation, even if you can't travel. And the thrivers use one-on-ones for social connection. They invest in maintaining their social capital. They invest in their connection. They stay connected to humans. Now, over time, this is going to be a problem as our worlds change, but we need to use what the technology gives us in order to keep our human connection. It's also clear that online meetings are a learned competence. The strugglers, well, they're multitasking, not really paying attention. They're in distracting spaces. The meetings are too long and are taking too much time. People stress over eye contact. So if I'm here looking at my screen, you notice I'm no longer looking at you. I'm reading what's down there. And when I'm talking to you that way, it really is less pleasant than if I'm actually looking at you, like this.

00:22:38

Strugglers are typically coping with high-latency networks or poor bandwidth, and there's not enough preparation for those meetings. In contrast, the thrivers, well, they're very deliberate about turn-taking in meetings. Well-run online meetings have very clear turn-taking, whether they're moderated by someone explicitly for that or whether the group self-moderates. People use gestures. In Microsoft Teams or in Zoom, whatever, they raise their hands, or they raise their hands in front of the camera, so that they can join in. While one person's presenting with video,

00:23:28

we use chat for side conversations, just like you do side conversations in the meeting room. And when we're not in the room, we use IM for quick response. In the meeting room, the virtual one, we do check in explicitly, because you can't read the room. I can't see all of you who are listening to me right now, but in an interactive meeting I can check in, and I can pause and see if you're with me. We also are intentional about the breaks. So we set up our calendars in Outlook or whatever so that a default meeting is 25 minutes of 30, with a scheduled break, or a longer meeting is 50 minutes of 60, with 10 minutes in between. And everyone knows that. We use conferencing for intentional social connection. We check in: how are you doing?

00:24:34

How's the family doing? Is everyone healthy? Is everyone feeling okay? What's going on? We make sure everyone's got good equipment, typically two or more monitors, decent internet, decent furniture, and decent lights. And we position the camera and mic intentionally so that you can focus on me as the speaker and not be distracted by all the stuff that's going on around. Now, on the defensive side, we also make sure that we have good security hygiene. The World Health Organization has reported a fivefold increase in cyber attacks. And that's typical. The attackers assume that you're distracted. We are all distracted. Every public system is under stress.

00:25:37

Phishing attacks have gone through the roof. We need to train our people and train our machine learning algorithms to recognize phishing better and not get sucked in. Impersonations are becoming common, including identity theft. Someone filed an unemployment claim using my name and Social Security number, which was stolen some years ago from one or another unsecured database. They tried to have the unemployment payments go into some bank account of theirs, and it was a claim that originated offshore. That's part of a common scam. And then ransomware is on the rise, particularly for small businesses. And I'm going to let my dog out

00:26:35

just so that you see I'm actually in my office, and my 15-year-old

00:26:42

dog actually wants to go outside. So make sure your security hygiene is good, that you're working with your cloud provider for that, and that you're using reasonable threat intelligence to do it. Now, I'm going to spend a few moments on DevOps and how you use DevOps for anti-fragility. So the first thing is, rip off the red tape of all those procedures. As recently as February, we would hear from every regulated customer, hey, you know, the FDA doesn't let us do that, FFIEC doesn't let us do that, NIST says no, blah, blah, blah. Starting in March, it was, well, how quickly can we use the cloud to help? A great example is telemedicine. I get some of my care at the University of Washington Medical Center, and virtual visits were unheard of in February. Now they're the norm. Learn from the open source projects.

00:27:52

This is Visual Studio Code. It's the most contributed-to open source project on GitHub. It's now the world's most popular IDE, and it works by having a very rigid pipeline with continuous automated testing, so that green means green and red means red, and only 100% green builds make it to release. Successful product teams like VS Code optimize for asynchronous workflows. They connect with their community continually through issues and discussions. They focus on the outcomes they want to achieve and don't spend time on outputs. For people who are running services, and I hope you've seen the talks from Erica and Scott here from CSG, they practice mature incident response. If you have tech debt, get clean, get rid of it, and then use automation to stay clean. Use the automation to shift quality left, like we saw with VS Code, and to shift right, like we see with CSG. Think about any manual approvals in your process. Are they necessary? How many can we get rid of, all the way from idea to deploy to data? Ship to learn, frequently. The shorter you make your delivery cycle, the more validated learning you build up, and the shorter the cycle, the faster that value compounds, like compound interest. And remember, automation doesn't care where you work. It doesn't care about the pandemic. Show your users the status of your service. And if you have an incident that affects customers, be clear about it, so that you and your customers can build up a relationship of trust.
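The rule that only 100% green builds make it to release can be expressed as a very small automated gate. The following is a minimal sketch under stated assumptions, not the VS Code team's actual pipeline: the function name `can_release` and the test-name-to-result dictionary are hypothetical.

```python
def can_release(test_results):
    """Return True only when every recorded test passed.

    `test_results` maps a test name to True (green) or False (red).
    An empty result set is treated as not releasable, on the assumption
    that a build with no test evidence should not count as green.
    """
    return bool(test_results) and all(test_results.values())


# Made-up build results for illustration.
green_build = {"unit": True, "integration": True, "smoke": True}
flaky_build = {"unit": True, "integration": False, "smoke": True}

release_green = can_release(green_build)
release_flaky = can_release(flaky_build)
```

The point of a gate like this is that there is no "mostly green" state: a single red test blocks the release, which is what makes the signal trustworthy enough to replace manual approvals.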

00:30:18

Finally, let's think about the new normal. The Bank of England did a study where, to quote William Gibson, the future appears to be here, just not evenly distributed. They looked across industries and saw that working from home is indeed becoming a pattern. And as I cited in yesterday's session, the MIT report found that the strongest positive financial effects were in non-tech companies. Do it. Make working from home healthy. Make sure people can arrange family care for children or for elderly parents. Make sure they've got good broadband access. Make sure they have ergonomic furniture, and don't burden them with unnecessary ceremonies. And if you do plan a soft return to a hybrid environment, learn from the people who've done it in a healthy way. Atul Gawande published a great paper in The New Yorker about health care workers at hospitals using masks, hand hygiene, physical distancing, and daily screening in order to make it possible for healthcare workers to work safely.

00:31:46

And then they let the data inform the decisions of what they do. So embrace the new normal. Automate to get clean and stay clean. Don't let people do the work your machines should do. Do let people have days to renew their social capital. Do allow certain days to be dedicated to focus. If you go back to a hybrid model, do work as though everyone's remote, so you don't create two classes of citizens. Do provide employee choice. Ctrip showed that value. Measure how it's working continuously, and recognize that in this new cadence, we need to inspect and adapt at least quarterly. Thank you.