Tales From the DevOps Loop: 4 Teams Approach to 1 Common DevOps Framework (US 2021)

Denver has been working in IT since 1998 and has a Master of Information Technology and an MBA in global technology management. He is currently enrolled in an Executive Doctorate in Business Administration program at Crummer Graduate School of Business, Rollins College in Winter Park, FL; his dissertation topic includes DevOps. Denver has been teaching at various colleges and universities since 2001 as an Adjunct Professor of IT, and currently teaches with Purdue University Global. He is the Director of DevSecOps for Thread Research, a leading provider of Distributed Clinical Trials for Pharmaceutical and Medical Devices. Denver has held many roles in the IT industry, mostly in operations and support. He started his journey into DevOps in 2017 when his team read The Phoenix Project as a team book club. He has since led successful efforts at 3 companies to adopt the DevOps mindset and move them into a DevOps model and framework.

Breakout session, Las Vegas, 2021
Denver Martin

Director of DevSecOps, Thread Research

TRANSCRIPT

00:00:14

Hi, and welcome to Tales From the DevOps Loop. We're looking at four teams and one framework. Before we get started, I'd like to introduce myself. My name is Denver Martin. I am currently the Director of DevSecOps at Thread Research. Thread Research is a clinical trial company. We specialize in doing distributed clinical trials for pharmaceutical and medical devices. We have participants in studies that happen all around the world, so we are continually improving, looking to make DevSecOps the best practice to follow for this type of work. Additionally, I am an adjunct professor with Purdue University Global. I've taught IT courses there since 2005. It is Purdue's online institution for delivering high-quality education to adult learners, as well as traditional college students who prefer to do online or distance education. I've also started work on my Doctorate of Business Administration.

00:01:21

It is an executive DBA program from the Crummer Graduate School of Business, part of Rollins College in Winter Park, Florida. My dissertation topic will include information surrounding DevOps, the implementation of DevOps, and the use of disruptive technologies like cloud technologies, SRE teams, and containers with Docker and Kubernetes. It is a great program if you've got an idea of how you'd like to add to and continue to improve a business across your organization, or across business in almost any industry and landscape. Having said that, I am coming to you live tonight from one of the classrooms that we have available for students and instructors to use. So let's get started and look at one of the implementations that I've been part of. I've been part of several implementations of DevOps at several companies.

00:02:22

This is not the implementation I'm currently working on. This is a previous one at one of my previous employers, and it successfully shows how a movement into DevOps can be done. I'm not saying that this is the way to do DevOps at every institution. I've actually implemented it in different ways at each of the institutions that I've worked at, based on their company culture and the individuals that we had assigned within the teams and in various roles. You do need to do an assessment, if you will, as to what you think would work in terms of adoption. So with that, let's look at this particular implementation and how DevOps was successfully rolled out within an organization. I'll go ahead and show you some of the information I have.

00:03:23

So what we'll do is start off with a little bit of background in general, then we'll talk about the leaders and teams, and then we'll come around to the ways of working. These are the ways in which we had to change how we worked so that success could happen amongst the four different teams. Ultimately they all ended up being part of the same enterprise framework for the way that we implemented DevOps. The four teams started off on almost independent journeys, if you will. We had three operations teams and one development team. That's not to say that the development team was just one group of developers. The development team actually included all of the development groups that were inside of that area.

00:04:24

So that actually became a pretty large team. Development started their own journey of going through DevOps, ironically without operations involved. However, operations did quickly catch up, and we did end up being able to share the common tools and move forward together. But it started off very much like a silo, with each of the groups kind of doing their own thing. So let's talk a little bit about operations. At that time, operations meant working long hours, being on call, having people very burnt out, running from one fire to the next, and really no relief in sight. We had the good luck of an executive member, a senior director who eventually became a VP, who had the idea of moving all the production systems, all the development systems, and all the operations systems from an on-prem, self-hosted cloud into the AWS cloud. That gave us an easier time and better standards of quality, and relieved us from the day-to-day operations of having to maintain the hardware of our own cloud fleet.

00:05:47

So this was a very wise move on our part. You could go to other cloud platforms if you felt that you had enough of a skill set in them. That could include Google Cloud Platform, Oracle Cloud Infrastructure, or Microsoft's Azure product, if that is the one that you like. We settled on AWS as our cloud platform of choice. So what we'll do now is step through this and talk a little bit more about the individual teams. The slides do have slightly different information. I didn't want to read those verbatim to you. It's definitely good information if you want to share it, but actually listening and observing is how you'll take away a little bit more.

00:06:41

So if we look at leaders and teams, team one is how I identify them. They were an operations-based team. They predominantly worked on the network infrastructure. Their team size was nine members: seven of them in the US and two of them in India. They did provide worldwide coverage. Prior to DevOps, they were spending 80% of their time firefighting, and that's in both India and US hours, and even after hours. Only 20% of their time was available for them to work on 30-plus projects. So they had a lot of work that was piling up on them and very little time to do it, because the break-fix work took up a lot of their hours. The on-call rotation was pretty much treated like a hot potato.

00:07:28

Nobody wanted to take it, and when you got off of it, you were very relieved and very happy. As for the skills on this team, they had moved to the AWS cloud from a traditional physical data center, so they had really good skills. They understood their craft really well. They understood networking very well. They understood how networking impacted other teams and other groups. So there was a good foundation in that regard. They were responsible for the work centers. The work centers that they had in place were building out the environments, connecting to remote clients, and doing basic operations. So if something broke along the path or at the connection end, they would go through and work on that. The prior leader, before the new leader joined, had a very command-and-control type of view. They handed out individual task assignments.

00:08:20

They pretty much siloed each of the resources. The resources did not work together as much as you would think. They each had their own projects or their own deliverables that they were working on, and they hardly ever talked together. There were never any team meetings. It was a very heavy, leader-laden type of organization. When the DevOps leader joined and subsequently replaced the prior manager in that role, they started to delegate workflow. They figured out what the workflow was for each of the different work centers, then figured out a way to delegate authority, moving it closer to the individual team members, and then encouraged the team members to work together as a team as they worked through the different components and pieces. They also looked at setting up ways of doing continuous learning, which included a book club to introduce them to the DevOps mindset. They were already in the cloud.

00:09:22

They had some advantages that they weren't able to take advantage of, because they were still working in the older IT shop mentality of, you know, if it's broken, fix it, and then whenever you've got time, you work on other projects. That was kind of a tough way to look at it. So there was a transition plan put in place to have them adopt more of a DevOps mindset. I fully believe that while DevOps has tools, and it has principles and a philosophy, more importantly there's a mindset for people who have worked in DevOps and continue to work in DevOps.

00:10:00

Now, team two is still an operations-based team. They are more or less platform infrastructure, so they would deal with S3, which is the storage within the cloud. They also would work on the EC2 instances, OS-level patching, and all the things that are more platform-centric. RDS databases would roughly fall under their purview as well, in that they made sure the instances that were supporting RDS were doing well. They may not have been database administrators, but they were able to make sure the foundations and the functionality were there. Again, they helped move to the AWS cloud from the traditional physical data center, so they had much of the same skill set as team one did. Their prior leader, again, was command and control and task assignment, siloing all the resources. It was the way of working, if you will, at this company originally. They actually had to transition through two DevOps leaders.

00:10:59

The first one did introduce them to tools like Terraform and pushed automation in the form of scripts. They increased the project workload to give every engineer their own little pet project. They assigned tasks after reviewing every ticket with the team. So they reviewed the queue together, and then he ultimately assigned out the work. While it was a step in the right direction for DevOps, it wasn't truly embracing where you would expect your leadership to be at that level. So there was a need for another DevOps leader to be put in place, and this DevOps leader was actually promoted from team one. They had a really good grasp of the idea, they had previous management experience, and they saw all the advantages that were happening in team one during the revolution, if you will.

00:11:52

And so they took a lot of the same ways of thinking and working into team two from team one. Again, it was a really interesting piece, and having come from team one, it created an immediate feedback loop between the two teams, which actually had a lot of work that was interdependent. Originally, team two was at 85 to 90% break-fix firefighting and only 10 to 15% planned project work. As you can imagine, they were in more of a firefight than the network team, our team one, was. So they definitely had a need and a want to move to a DevOps model. Let's take a break from the operations teams and look at the development team. Team three is our DevOps team, made up of development. This was a full-on DevOps shop.

00:12:51

They had multiple streams, and they were doing agile project management for their development efforts, but for major releases. Instead of happening iteratively, all releases were happening as large, big-bang releases of the software that the company was producing. So it's kind of interesting that they were doing DevOps on the actual development of the code, but then the code itself only got major releases. What you would have is basically code being released two to three times a year. If you were lucky you might get four in a year, depending on whether they were doing the points poker correctly and whether they had enough to make it a major release. That was instead of following the iterative approach that you typically would think of, you know, getting feature requests out sooner, faster, quicker.

00:13:42

Those just seemed to go a little bit slower in that regard. Now, they did have a mastery of the DevOps tools, repos, and source control of code. So if you're looking for people who could execute DevOps really well in terms of tools and technology, this was your team. These were the people who had that down, as far as the development work goes. But there weren't any real feedback loops from operations. After they developed the code, they released it, often telling operations and the customers at the same time that the code was ready. So operations had no time to prepare their scripts for install or for upgrade. There was no mechanism saying, hey, we're working on this particular piece, nor any feedback from operations letting them know that something was broken. So it was definitely a troublesome way to have DevOps implemented, in only half of your organization.

00:14:36

Again, in this one, we did have a DevOps leader there who pretty much saw that the tools belonged to his group, and that he would let us on the operations side use those on occasion. As you can probably tell, I was part of the operations group within this movement. We often had to have special tickets opened and special firewall rules opened up between the organizations to allow us to have access to the repos, to be able to do source control, or even to access Jira and Jenkins as part of our tool stack for creating automation and doing operations. So it was quite a contentious undertaking. Basically, they were the DevOps shop and we were the interlopers. Eventually that leader was replaced with another DevOps leader.

00:15:31

In this case, they partnered with us. They actually worked with team one and team two, and eventually team four, which we'll talk about in a second. We invited them to the BPMs, the blameless postmortems, to talk about things that we recognized in operations. They actually took the feedback back and looked for ways to improve the stability of the product and applications, based on what we were seeing in operations. They looked at how we were doing our deployments. They actually started working on code to have one common pipeline between operations and development. On top of that, believe it or not, development was actually on-prem. They weren't doing any of their development work within the cloud, and all of the operations team was running in the cloud. So that did create some interesting struggles as we built out a common pipeline.

00:16:26

And then if we look at team four, team four is an operations-based team again, but this is your application and product support. These are the people who actually worked with our clients. They were on calls to help clients who were running into issues or having trouble using the product, and if there were stability issues within the AWS environment, they would be able to help get teams one and two involved if needed. Eventually they did work with team three on getting more information about how the customers actually use the product and what issues they saw, to give a better understanding of what feature requests should come about based on actual customer feedback. They knew the product inside out, but many of these people had no experience with the rest of the technology stack. So they didn't necessarily understand how the cloud worked really well.

00:17:18

They didn't know the tools that were there. They didn't understand some of the automation they could put in place while helping support clients. And they were usually the ones who were working tirelessly to do a deployment or an upgrade of the new release, yet they were not necessarily aware of what was going to be in that release, what things would break, or how to get those things resolved. So getting feedback into development, being part of that development cycle, and being able to influence what features got implemented and how they got implemented became very important. In this case, we had a leader who was command and control and task assignment, focused on the quick fix and not really the long-term fix. They believed that as long as the problem is not happening, that's good. Just move to the next thing and work on the next problem.

00:18:07

When we put a DevOps leader in place within that group, they focused on finding and fixing the root cause, and on setting up a book club so that they could work on getting the DevOps mindset within their team. They set up training goals to learn new ways of working, and they developed a team-internal BPM. So even for the issues that didn't count as a true outage, if they recognized something as a team that didn't quite work right, they created their own BPMs to work on within their team. They were definitely taking that to heart to understand how to improve. And then they also started working with development and giving them direct feedback, as well as through the BPM process or the P1 review process that was implemented later on.

00:18:52

So, I've kind of alluded to it. One of the major influences here is that they did have executive sponsorship for hiring managers who understood DevOps, not just in terms of the tools, but also in the philosophy or the mindset of the way of working. They understood that they had to change the culture. They had to change the way the teams worked, and this had to be something that wasn't just, you know, in transition or a fleeting thought. It needed to be something that would take root and could survive people moving in and out of different roles throughout the organization. So there was some really good recruiting done to find those right leaders, and then to have each leader bring their team up to understanding what DevOps is. Team one actually created a network summit, which they started holding each year.

00:19:41

They would have a network summit and bring everybody together, even the folks from India, to work for a week, do a retrospective of the year, work on major planning for the coming year, and figure out what worked and what didn't work. We purposely brought them into a small conference room where they had to actually talk to each other. I don't know that we would do that now in the COVID timeframe, but we would definitely get them onto a Zoom or MS Teams call together, schedule a lot of time together, and give them some white space to be able to chit-chat and work together in that regard. So, definitely some good stuff. During that first summit, they talked about the four types of work, the three ways of DevOps, and then how to prioritize.

00:20:29

That was a very interesting process, as we worked on what is the best way to decide which projects you work on first. The team originally decided to work on all the big, major business projects, and quickly learned that it was more important to learn how to eliminate unplanned work, as it is the true time thief. From there, they learned how important maintenance was, and then, looking at your own work center, how to make it more efficient so it doesn't become the constraint. All of that was really good learning, and they gained it through the book club, where they reviewed The Phoenix Project and The Unicorn Project. They looked at Goldratt's The Goal, Beyond the Goal, and It's Not Luck, and eventually did some of the Dekker series, looking at Just Culture and Drift into Failure. So they definitely were a well-rounded group.

00:21:29

Again with team one, over the course of the next six months, they set up best practices. They started creating standards. They started deploying the standards. They looked at tools and automation. There was a CMDB that was always two years away, so they created their own database to use for automations. They started to move from unplanned break-fix work to more project and planned work. They had a huge win in that they developed a network router script that could run and populate knowledge automatically. That reduced their average troubleshooting time from 90 minutes to five seconds, just for gathering information from all the different sources and having it in a standard format to be able to start troubleshooting quickly. Then they started working on automations to synthesize that information and to write automatic scripts that could self-deploy and self-heal based on the readings that they had within the automation database.
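
[Editor's illustration] The talk does not show the router script itself, so what follows is only a minimal sketch of the general idea described above: query several information sources in parallel and merge the answers into one standard-format record. Every name here (`RouterSnapshot`, the three collector functions, `gather`, `needs_attention`, and the router name in the usage note) is hypothetical, and the collectors return canned values instead of contacting real devices.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class RouterSnapshot:
    """One standard-format record per router, ready for troubleshooting."""
    router: str
    facts: dict = field(default_factory=dict)   # source name -> collected value
    errors: dict = field(default_factory=dict)  # source name -> failure message


# Hypothetical collectors: each stands in for one information source.
# A real script would query devices or monitoring systems here.
def interface_status(router):
    return {"eth0": "up", "eth1": "down"}


def route_count(router):
    return 42


def last_config_change(router):
    raise TimeoutError("config archive unreachable")


COLLECTORS = {
    "interfaces": interface_status,
    "route_count": route_count,
    "last_config_change": last_config_change,
}


def gather(router, collectors=COLLECTORS):
    """Query every source in parallel and merge the answers into one record."""
    snapshot = RouterSnapshot(router=router)
    with ThreadPoolExecutor(max_workers=len(collectors)) as pool:
        futures = {name: pool.submit(fn, router) for name, fn in collectors.items()}
        for name, future in futures.items():
            try:
                snapshot.facts[name] = future.result(timeout=5)
            except Exception as exc:
                # A failing source is recorded, not fatal: partial data still helps.
                snapshot.errors[name] = str(exc)
    return snapshot


def needs_attention(snapshot):
    """Return the interfaces not reported as up, as a simple starting point."""
    interfaces = snapshot.facts.get("interfaces", {})
    return sorted(name for name, state in interfaces.items() if state != "up")
```

Calling `gather("edge-router-1")` returns a single record whose `facts` hold the answers from the sources that responded and whose `errors` name the ones that failed. A self-healing step like the one mentioned in the talk could then act on a check such as `needs_attention`, although how the team actually did that is not described.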

00:22:23

So again, they were truly looking at the ways of working there. This is also where we saw the first emergence of a Kanban board on the operations side. They developed a way of working that was different from before, which had been individual tasks that were never seen together, didn't have swim lanes, and left people unsure where they were in the process. So this was a really good way of doing that. And then on-call went from being a hot potato to being something they enjoyed doing, finding new ways of working. So that was a very interesting change in atmosphere. Now, team two was a little bit larger. They were moving forward on more projects than team one, but they were really focused on tools and projects, not on the why of DevOps or the how of DevOps.

00:23:10

The team had solid adoption. They borrowed the development tools initially and then shared those with team one, the network team. They were great with automation and scripting, but they were not seeing it as true orchestration or self-healing. They had a lot of manual intervention to kick those scripts off. And again, they saw on-call as a hot potato. The manager became frustrated and eventually left the role, and then we put another DevOps leader in place. The new manager was promoted from team one and had prior platform experience. He looked at the projects and the WIP, and he actually deprioritized a lot of the work that didn't hit the core requirements in terms of the company goals or the team goals. He also brought a new level of automation, having them look at it as a service catalog and think about pipelines that would take work away from team one and team two and turn it into one single push-button flow. So there were very good efforts there.

00:24:17

They also implemented a book club where they did continuous learning, including The Phoenix Project. Team three owned the DevOps tools. They controlled the access. They had firewall rules to keep people away from them, only allowed you to have access on an as-needed basis, and would have that access removed whenever they thought they should. They also didn't let anybody know when they were scheduling maintenance on the tools. So if operations needed the tools for supporting, rebuilding, or doing something with a client, they sometimes weren't available during key times. So it was kind of tough, if we look at the way that the three teams at this point were working together. They were definitely their own independent DevOps group that didn't really include ops. Then there was a change of management.

00:25:09

They started attending the blameless postmortems, as noted before, and they helped pay down some of the technical debt. Then the new releases started to become more stable and easier to deploy. They started working on a common pipeline, with dev and ops working together on that, and then actually moved their development environment to the cloud, which is where operations already had all of its environments set up. They were able to take advantage of all the pre-work that had been done and get environments stood up, very much like the operations ones, that the development teams could work on. So once we got the teams working together, it ended up being a really good way to move fast towards having operations fully included within DevOps.

00:25:56

And then if we move to team four, this is the application support team. Again, they knew their product really well, but they needed help getting into the modern tools and understanding how things work. They treated the application stack more like pets than cattle. So we had a new leader come in, and they started a book club to start understanding the difference in the way that teams one and two were working, and then tried to get them involved with team three. With that, the new manager drew on the transformation from team one and team two, and the application team started to be included in BPM calls. They relaunched the book club, and they also started to work around the five ideals of The Unicorn Project.

00:26:48

And then they worked to develop a safety culture. This team was fearful of being fired every day. What we needed was for them to feel secure in their role, and to let them know that we needed their feedback and needed them to be able to move things forward and do really well. So this team moved to 60% scheduled maintenance and 40% firefighting in one month. That was a major transition, and they had the support of team one, team two, and team three to make that happen. So overall, our customers are our winners when it comes to moving to DevOps. Our second winners are all our employees, who have a greater sense of work-life balance. Generally speaking, I've been told many times that after a team has moved to DevOps, they feel like they are now in therapy together, and that they can talk things out.

00:27:41

They can work on things together, and it is generally a safe, happy place to be able to share the experiences that they're having at work. So hopefully, if you're moving forward with a DevOps opportunity, you can do that. This is just four tales, or one tale of four teams moving into one common framework. I'm willing to talk with you more at some of the birds of a feather or lean coffee gatherings. If you see me in any of those places, reach out. I'd be glad to talk more about these tales and others. Thank you, and have a good day.