DevOps at Target (San Francisco 2014)

Even for traditional enterprises, the name of the game is now Business Agility and getting ideas to market faster than the other guy. This places significant challenges on traditional models for delivering IT, especially in a very large IT organization such as Target Technology Services (TTS). At Target, there is increasing momentum on pursuing a DevOps culture and mindset as an approach to solve for these challenges. In this talk, we will share how Target has been approaching this journey. We’ll start with the early days, where Heather was an early innovator pushing for many of these concepts within her organization, leading an effort to build an API Platform and APIs across all business domains in 2012. The Enterprise Services team knew they needed to embrace DevOps to enable speed, scalability, and resiliency to keep up with the rate of change in the digital retail space. The team has been a change agent and early adopter for re-thinking how Target does technology work, from cultural aspects to technology toolchains as well as delivery methodologies. In 2013 Target’s automation focus was significantly increased to meet the business case for speed, quality and cost of IT service delivery. This caused many groups across TTS to start looking at different ways to increase their automation maturity. Ross formed an Infrastructure Automation team to focus on building a deeper Automation / Infrastructure as Code competency within Ops, began testing and learning new approaches, and advocated a DevOps culture and mindset. This also started to increase the conversations around DevOps, but it was clear that there were some common misconceptions around what DevOps is that were becoming a barrier to its adoption at Target. It was at this point that Ross and Heather decided to come together as a Dev leader and an Ops leader passionate about driving this change at Target.
We’re now at the most exciting part of our journey: building a common understanding of DevOps, connecting and engaging passionate team members, and empowering people to share their learnings across the organization. We’ll discuss approaches we’ve followed to collaborate across traditional organizational boundaries and build momentum around this movement at Target. While we’re still on this journey, the future is extremely bright.

Heather Mickman

Senior Group Manager, Target

Ross Clanton

Senior Group Manager, Target

TRANSCRIPT

00:00:08

So I'm Heather Mickman, uh, I lead Target's API and integration team at Target. And I'm here this morning with Ross Clanton, uh, an ops leader in our infrastructure services organization. So we're literally thrilled to be here today to talk about this journey, um, that Target is on with DevOps. Um, so I am going to tell the story of my dev team in the early days of our journey. Uh, Ross is then going to talk about the fantastic work that he's driving on the ops side, and then about our combined efforts, uh, for what we're doing to drive DevOps at Target across our enterprise. Uh, but first I'll tell you a little bit about who we are, what we do, and some of the complexities that we face. So Target's first store opened in 1962 in Roseville, Minnesota. Um, and we now have more than 1,900 stores in two countries.

00:00:54

Uh, we have headquarters locations across the world. Primarily we're located in Minneapolis, Minnesota, but also Bangalore, London, even Cairo. Uh, we have 40 distribution centers and are the second largest importer in the U.S. Now over the last 50 years, Target has also added new businesses, including banking, pharmacy, and healthcare. Now, over the years, um, I'm sure you can imagine that we've added a lot of, we've added a lot of technology to keep up with our growing business. So I'm sure you've seen Target's, um, in-store technologies, right? Like our price checkers, um, uh, cash registers, the handheld devices that team members will use to restock shelves, um, the new gift registry iPads that are rolling out, um, our guest Wi-Fi that we have so you can easily use the Cartwheel app on your mobile phone. But there's a lot of technology behind the scenes as well, from the applications that run our distribution centers to ensure we're getting the right inventory to the right stores at the right time.

00:01:52

Um, our, our .com site, our mobile apps, the servers in our data centers, the, the phones in our guest contact center, right? It takes a lot of technology to run a modern retailing company. Now, over the years, of course, um, complexity in our technology footprint has expanded. The number of applications has increased, and that means new technologies, new tools. Of course, our infrastructure has grown. That also means that we have a lot of technical debt that's grown over the years. So to keep up with that complexity and to keep up with that technical debt, we've added process and organizational structure, and we have silos, and we have a lot of silos. We have silos within our silos because we really want those silos to be as efficient as they can be.

00:02:35

Um, and so, so we've got this technical complexity, technical debt. We have organizational complexity with lots of silos, and of course the world has changed a lot in the last 50 years since Target first opened that store, right? How we communicate, how we dress, how we listen to music, watch TV, that's all changed, and how we shop, that's changed dramatically in the last 10 years. Uh, so I'm a mom, two kids, and I think about myself and how differently I shop now than I did even six months ago. And that's because my expectations continue to evolve, because Target guests have changed how, where, and when we want to shop, right? Technology is in the hands, I mean, in all of our hands, and it's not enough for us to be a big box retailer anymore. It's about connecting with our guests anywhere, anytime, anyhow.

00:03:25

So retail has been completely disrupted, and it's being disrupted at an increasing pace, um, as innovation and changing technology just continues to explode. So Target needs technology to remain a leader in retail. Retail has broken away from brick and mortar. We need technology to be flexible, scalable, and have the ability to quickly test and learn new strategies, whether that's for our guests, within our stores, or even within our supply chain. So all of those complexities, the organizational complexity, the technical debt complexity, the evolving, changing marketplace, that was, um, that was what was behind a key technology strategy that we kicked off in 2012 around APIs. Uh, so we had to start creating APIs to expose core assets so that Target could create these cool digital guest experiences quickly and also, uh, to simplify our internal architecture. Uh, so those legacy technologies that, um, that I've talked about, those have grown over the years with a lot of point-to-point integrations, and we have a very tightly coupled architecture.

00:04:25

Um, so that means that quarterly release cycles are the norm. Um, and I think you know, probably as well as me, when you're doing big releases just a handful of times a year, there's a lot of dependencies in the ecosystem, and that always adds a lot of risk, um, and typically a lot of instability as well. So it was important for us to focus on building out some APIs to decrease that complexity as well. Um, so initially we focused on building out APIs that you would imagine would be important for a retailer, uh, like a products API or store locations, promos, pricing. And we've built and exposed a lot of those APIs from many different backend providing systems that were, that ran across the different, uh, or a lot of different platforms. And these APIs now are enabling internal applications, as well as our mobile applications and a lot of capabilities on our .com site.

00:05:13

So we can quickly test and learn with different guest experiences internally, and then also when we're working with partners like Pinterest. That makes a lot of sense, right? Like, probably no surprise to anybody here. It's important to have APIs to be able to move quickly and innovate. But what was more interest... I shouldn't say more interesting. And the reason I'm standing here today, um, was, is because, yeah, it was great. It's been great to lead the strategy for Target, for the, uh, one of the world's largest retailers, but just as challenging was the fact that we were talking about how we were going to deliver those APIs. And that was a fundamental change to how we were doing work, um, in a large enterprise IT organization. Uh, so we started talking about things that were going against the norm, like agile, not waterfall, smaller, more frequent releases, automating all of the things, continuous integration, continuous delivery.

00:06:06

And of course, DevOps. These were entirely new concepts for a large enterprise IT shop and very different than how we had been doing our work. Now in the early days of this DevOps journey, I actually had to stop using the term DevOps. Um, it had become a loaded term with a lot of misinformation and misconception. Um, and as I would have conversations with my peers and other senior leaders in the organization, I would literally see them shut down as soon as I would say DevOps. Some laughter indicates that others have had the same experience, probably not unusual. Um, so right. It was going to be hard. Being a change agent is really hard, uh, but I knew that we could do it. So I focused on three things for my team in the early days, and that was talent, culture, technology, so that we could deliver results.

00:06:52

There was a lot I needed to do as a leader for an incredible team to pave the path forward so that we could achieve these results. And the only way forward was I had to figure out how to start removing roadblocks and being a constraint buster for my team, so that we could demonstrate by doing and showing results. So this is my awesome team. They build cool stuff. They like to innovate and try new things. We started as a pretty small team, um, in the early days with some really great people like Dan Cundiff, Greg Larson. Now the fastest way to demotivate the, uh, the team, well, probably all of us, and to stop progress in its tracks is bureaucracy. Nothing is worse than running into a roadblock that stands between you and building stuff. And so what my job is, is to empower my team to get stuff done and to not be paper pushers. I wanted to minimize grunt work, and we've made some really great progress.

00:07:43

Uh, one of my highlights from last year was when my team presented me a lifetime achievement award for dismantling, um, an entrenched process. That was just one of those processes that everyone was doing because that's what we had always done. Like, well, but why are we? So I started asking questions. Why are we doing this? Why are we adding hours and hours and hours of work, um, in order to just bring new technologies in or to try and innovate with, um, different ways of doing work? And as I started asking those questions, just like, why, and the answers would come back, because that's what we've always done. We have to go through that process, right? No. And very quickly everyone kind of started scratching their heads and getting on board. And you're like, yeah, you're right. We don't need that process. So sometimes it's as easy as just asking those questions, um, to, to really, to remove inefficiencies that we have within our system.

00:08:33

So my focus is always on results, not process adherence, and identifying those types of roadblocks and working with, uh, folks across the IT organization to drive those efficiencies. Um, so to make that work on my team, I created a safety zone, because I needed my team to feel comfortable with innovating, failing, learning, and moving on. I needed them to trust that everybody on the team had their back and that everybody was empowered. So I provided cover so that my team could make awesome happen and trust that I could remove roadblocks for them. So that meant that it was important for me, uh, to build trust with, uh, to build trust with my peers and leaders across the organization. And I needed them to trust me as well. That was really hard. Um, has anybody ever been called a hippie that wants to live with the unicorns and rainbows?

00:09:21

So I, I have a few times, and that was, I mean, it was just a result of the fact, like, we were doing things differently and the rest of the organization just didn't understand what that was in the early days. Um, so I needed to prove that it wasn't just a bunch of hocus pocus, fairy dust magic. So I demonstrated results with data. I took the emotion out of the equation and, and used metrics. So initially what I did was, um, I ran parallel development efforts to capture cost and quality metrics for a more traditional delivery model versus the changes that I was advocating for. And the results showed faster delivery, higher quality, with overall less cost. Awesome. Right? Like, so now I wasn't just preaching and talking theory. I had actual data and metrics to demonstrate to the rest of the organization what it was that we were doing.

00:10:14

So to do that, we needed the right tools in order to get our work done. Uh, so we have full stack ownership for our APIs and our API platform and our tooling. Our tooling does matter. Um, so we brought in a number of new technologies to create the automation tool chain for Target. So that included OpenStack, GitHub, Jenkins, and Chef. We built transparency into our development process. Uh, code reviews are dead. So we use pull requests, and our ops team loves this because they can actually see what's happening. We're not throwing code over the wall, and they can provide us feedback. And they're part of the development process. And also when something breaks, we know it and we can fix it really fast. Um, we also use HipChat, um, on the team, which has been really awesome because the team not only can chat throughout the day, the team including our operations and support teams, but also we can stream in alerts and events and we can, in real time, see what's happening across our ecosystem.

00:11:10

Now, of course, as we brought these tools in, we worked with a number of partners across the organization, including, uh, including Ross, Jeff Einhorn, and Elizabeth Carpenter, all of whom are here today. It's been really great to see the adoption of these tools and some of these new processes start to expand across our organization over the last year. So this is my brag slide. Um, so we now have more than 30 APIs in production, uh, with monthly volumes of more than 1.5 billion. Um, Ross and I gave this talk back in the July timeframe, and that number was, uh, I think it was just a half a billion at that point. So in the last three months, we've more than tripled the volume that we're seeing through these platforms and APIs. And one of the, the metric that I like the most on this slide is, in fact, the stability that we have across our platform and our APIs. We have less than 10 incidents per month, and that number hasn't changed as the volumes have gone up. We also do more than 80 deployments a week. We have 250 commits on an average week. Um, and so there's a lot of change as well that's being pushed through the system, but it's small changes. And so that means more stability for us, because we can, we know when a change is going in, and if something goes wrong, we can fix it quickly.

00:12:29

I also want to call out, just from a business case perspective, um, how successful this program has been. Uh, we have an IRR of more than 80% with a really significant NPV as well. Now, these metrics and some of those other that I had... So some of these metrics here and some of the others I just mentioned, I didn't have those three years ago. Right? I didn't have transparency into my development process. It wasn't easy for me to see how many times we were doing deployments or how frequently different developers were committing code. Now I can, and it's really powerful. So, um, it's been a really fantastic three-year journey, um, on this, you know, on this dev team that I've been leading, but what's more exciting is the change that we're driving across the organization. So Target's a technology company, and we are committed to making DevOps a reality. And Ross is going to tell us more about how we're doing that.

00:13:22

So those results are extremely impressive. Um, especially if you saw the cultural and organizational hurdles and barriers that Heather and her team had to go through to make this happen. While Heather was driving awesomeness in her own space, I was kind of going through my own DevOps journey. Um, I was moving, uh, into an ops role after starting my career in ops, spending time leading in engineering, security, went to enterprise architecture, on the dev side. And now I was coming kind of full circle. And, uh, needless to say, I'd gained enormous empathy around the challenges that these different functions faced within the organization. And over that time, I definitely broadened my perspective on the systemic issues that were actually impacting our service delivery. It wasn't any one function. It was a problem with our organizational system, and that problem was misaligned incentives, too many manual handoffs, too many, too much manual process, too many handoffs, many silos, and not enough accountability across the organization. I think that's a problem that many of you are probably familiar with today. Um, and it's very difficult to get to good results when you're in an environment like that. So at this point I was clear on the problem, and I was left thinking, there must be a better way than this.

00:14:41

Um, now's the time for the token Phoenix Project slide. I'm guessing you guys will see 20, 30 versions of this slide over the next few days. Most DevOps presentations tend to have one in there. Um, and I think this is really, really important to me. When I was coming back into ops, I reached out to a, a trusted thought leader in infrastructure at Target, Matt Walburn, and asked him, what should I do to get back up to speed? I've been out for a while. Um, he just recommended one thing: read The Phoenix Project. And that book was so eye-opening to me because I kind of felt like I had lived the role of all the main characters in that leadership journey I'd been in. So it really, really resonated with me. And it helped me start to see the importance of DevOps more as a cultural movement and really how it can connect people across the organization. Um, and it, it got me excited and passionate to learn more. Uh, in fact, I was, felt so strongly about this book that later on, I actually, after I did move into ops, I bought about 22 copies of the book, assigned it out to my management team underneath me, key partners, key team members that had a role in driving this agenda. And it was like a homework assignment. I even had an offsite on it where we, like, virtually recreated some scenes in the book.

00:15:54

Um, and I got passionate. And when I get passionate, I want to learn, I want to understand, invest deeply to understand things. So I researched, I learned, I talked to experts internally, externally. Um, I learned about the innovations that were happening in the industry. I started looking at the unicorns and looking at what Netflix and Facebook and Google and everyone was doing. And I, one important thing I learned was how to start seeing and treating failure differently. Uh, it's not something to avoid, which is how I was brought up to believe, um, but something that's really key to innovation within an organization. And we as an organization needed to learn to think differently. And that's hard. Changing minds is a very difficult thing to do in a change agenda. Um, so at this point I worked a lot with, with some close partners, many of whom are in the room today, like Jeff and, uh, Elizabeth, uh, to not only mind the gap, but recognize that it existed in the first place.

00:16:50

And that was hard. I mean, these were different concepts, and some things that I had to do at this stage: like Heather, I did not use the word, I tried not to overuse the word DevOps. I used, um, outcome-based language, like, we're driving leaner, faster, higher quality service delivery. There's no DevOps in that phrase. Um, people don't want to hear about unicorns and rainbows when they don't, they aren't part of this, this community and this culture, and they don't understand what that means yet. Once they become part of the community, they do like talking about that stuff. And, um, one story actually at this stage in the game: uh, I went to Velocity last year, and it was shortly after I moved back into ops, and I brought a bunch of my managers with me. I didn't bring any engineers, which is kind of weird.

00:17:33

That's an engineer conference. Um, but my goal was to expose them to this thinking. And I wanted them to see the passion and the magic that happens, and what happens when you have highly empowered, highly enabled engineers that feel passionate about doing things. And it paid off in spades. Those leaders came back, um, played key roles in helping drive this change agenda, um, once we got back from that conference. And now we're sending engineers to these conferences, obviously. Um, I also made a business case to, um, build my own rockstar team. And the goal was really to accelerate building our competency in service management, infrastructure as code, and performance engineering across the organization. Um, and I really wanted to extend the tooling that our architecture partners, both Heather and Jeff Einhorn from infrastructure architecture, had brought into the organization. And my goal was not to make this team yet another silo.

00:18:26

We have plenty of silos, um, but really to leverage them to enable and empower others to learn how to do these things and to make awesome happen in their own spaces. And we do that through a number of different ways that I'll talk about here in a second. So like Heather, I had to create a culture where everyone could level up, and that meant doing something scary. That meant experimenting, testing, failing, and ultimately succeeding. Uh, I've learned that this is a key to establishing a learning culture within an organization. And ultimately that's how you enable continuous improvement. And to do this, I had to fully empower this team, and I'm really proud of that team because they embraced these principles even before they were as culturally acceptable as they are today at Target. And if that wasn't enough, we also started doing Lego ops.

00:19:12

Uh, we embraced the new tool chain. We started looking at our infrastructure as building blocks and started building reusable components, primitives of infrastructure code that can be stitched together in ways that give us flexibility in how we do service delivery. And with this flexibility, we're starting to test and learn on some different service delivery models. And our goal is really to find the best models that will go the furthest in delighting our customers. One way we've done this is, um, I've been a big proponent of introducing agile approaches to our infrastructure and operations organizations. Um, so, you know, we had our teams doing, you know, scrum, sprints, Kanban, different techniques, more modern, agile-based techniques to get our work done. And we even took that a step further and started doing what we call flash builds. And really what was happening is demand for the team's time was, was ramping up, and we needed more partner folks involved.

00:20:06

And it's hard to get a partner team, an infrastructure team that isn't oriented around agile, to give up one of their people to be in, you know, one-week sprints or two-week sprints. So a flash build is an eight-hour day filled with two sprints, two four-hour sprints, filled with sprint planning, the development work that you do, retrospectives, demos. Highly engaging, high-speed, high-energy process. Got to have a really disciplined scrum master facilitator to lead that type of an event. Um, and th the outcomes or the benefits of that is it allowed us to optimize our time, because we had everyone in the room working together on these problems. It allowed us to bring in those partner SMEs, because they could free up a day at a time. You know, maybe we do a flash build once a week or something. That's different for every engagement that we do. Um, and it's more fun. The engineers and the folks in the room are having a blast doing this work, and we get faster results. In fact, our very first flash build that we did, um, we got done in an eight-hour day what took many, many weeks following our traditional delivery processes.

00:21:09

Um, so where are we at in terms of enterprise results? Um, the below metrics here tell a powerful story of how collaboration, new tools, infrastructure as code, and a DevOps mentality are bringing Target team members together to achieve great results. In doing so, they're utilizing social coding in GitHub, um, creating over 270 repositories to date with ever-expanding test coverage. Whenever anyone, um, updates any of those builds, uh, it's going through automated testing to ensure that we, you know, we've, we're building a quality and consistent, uh, product. And one thing that's really powerful here is, uh, or even more significant, is the momentum that we're getting. If you go back a year, um, there was really just a couple teams doing this stuff. It was basically Heather's team and our .com organization. Um, and, and there was some POC work in our infrastructure architecture organization. Uh, in July when Heather and I gave this presentation, we had, uh, 11, uh, teams and about 71 team members that were all contributing to infrastructure as code at Target. Three months later, that's gone from 11 teams, 71 team members to, uh, 30 teams and 134 team members.

00:22:24

So you can see the momentum and the growth that we're starting to get across our organization and how we're doing this.

00:22:32

I'm going to shift gears and talk about how we started to drive DevOps within Target. And, um, what, what we recognized at this point is we're not, we weren't the only ones that were looking to change things. Um, we've already mentioned some of the other partners here today as well. Many groups were starting to focus on automation across our technology organization. Some, like our .com organization, were even exploring DevOps. And at this point we really worked hard to establish partnerships with a lot of these stakeholders, um, because the vision was really to enable and empower our technologists to automate across Target and to drive more end-to-end improved service delivery. And the time was really right at this point to expand the conversation on DevOps. Uh, so Heather and I talked about this, and we decided we're going to do our own internal DevOps Days.

00:23:19

Um, we decided we were going to champion, you know, calling it what it is. We actually started calling it DevOps. Uh, and that was a hard step. We had to overcome, um, some of the challenges that Heather mentioned earlier with, like, negative connotations on what that meant. Um, but we also recognized that, that DevOps has a significant meaning to folks in the community. And we've got a big, strong, growing internal community. We have about 400 members of our internal Target DevOps community now. And it has a lot of meaning to them. You need, we needed to call it what it is. It had resonated with them and it had meaning for them. So we started doing that. Um, and we, we held our DevOps Days. Uh, we, our moniker there was connect, share, and learn. That's really our goal. So our goal is to connect people across the organization that were interested in this topic, give them a forum where they can start to come together to share experiences through demonstrations, Ignite presentations, open spaces, where people that are actually working on these types of things, these different ways of doing things, both in our development organizations and our infrastructure organizations, test, wherever, have them come and start sharing what they're doing, and ultimately to learn. We wanted to get people together so that they can learn from our internal experts and a number of external experts that have come in and presented for us as well.

00:24:30

Um, Rob Cummings from Nordstrom was, was actually our first kickoff speaker. And that was awesome. We've had a number of other folks. I won't go through naming who they all are today. Um, being DevOps folks, we capture data on these events. So we've held three DevOps Days. We do them quarterly. We started in February. Uh, we've had three events. We've had 50 presenters across those events, and we've had over 650 attendees over those three. Uh, in fact, one event we coordinated globally and had a DevOps Days in our Bangalore, India offices on the same day that we were having one in our, uh, U.S. Minneapolis offices. And, um, of course we get survey feedback: uh, eight out of 10 on the awesome scale. And a hundred percent of the attendees have said, keep doing these.

00:25:18

And so we have our annual one, our one year, uh, our one-year celebration of our DevOps Days. We'll be coming up here in February. And in doing this, we were able to start building a community. Um, and as a community, we started questioning whether the way that we were doing things was the way they should be done. Um, and we didn't view, uh, questions and resistance to this way of working as a negative. But the important thing here is we have silos as an organization. We're always going to have silos. We have a massive IT organization at Target. Silos by themselves create boundaries, and those boundaries create barriers to communication. We needed to break through that. And what, what we really did with building this community is allowed our stakeholders to start boundary-spanning and start connecting together. One way we, uh, yeah, and so we did this by challenging the norm, uh, questioning the way that things were.

00:26:16

And we didn't view resistance to, to, um, our new way of doing things as a threat or a negative. We actually viewed it as an opportunity to, um, start doing things differently and to pull people closer and, and be inclusive. One way we did this was, um, by publishing our own Flipboard magazine, Make Awesome Happen. Uh, anyone here today, by the way, can subscribe to this. If you, uh, if you have Flipboard and you want to scan the QR code, uh, this'll be in the presentation that goes out, uh, later as well. But we built this magazine about five months ago. It's curating content from around the internet, and it's open for anyone who wants to subscribe to it. And we have 208 readers today, 109 articles, almost 4,000 page flips. And our goal was to set a new course, to do things differently, and to enable and empower our technologists to be free.

00:27:10

Um, that's really why we hired them in the first place: to be free, to innovate, to start using our outside voice. Um, whether that's us here presenting today, we finally have an external-facing tech blog where we're encouraging, you know, these participants that are members to communicate externally, whether it's Target sponsoring DevOps Days, uh, Minneapolis, we're doing a number of things to get far more active in the external community. Um, oops, I went backwards. And our goal is to really start something new. It can be small, it just needs a place to grow. And we started this, it largely started as a bottoms-up movement that has very much been fertilized tops-down now, uh, and ultimately to create a culture of awesomeness that everyone wants to catch, a culture that's focused on getting stuff done and progress over perfection. And if I were to leave you with one final takeaway, it would be to take the leap.

00:28:06

Um, it's working for us, and being a change agent in an organization can be very, very difficult. It takes perseverance, passion, resilience. Most importantly, it takes, uh, partners that are either willing to take that jump with you, or they're standing on the other shore encouraging you along. Uh, we've got a lot of work left ahead of us to continue moving this journey at Target, whether it's figuring out how to do continuous delivery across the enterprise, testing and learning different models of getting stuff done, and continuing to create and share our story. But even though we're still early in our DevOps journey, change is happening and we're having a lot of fun. And with that, I'll be the, we're the first ones to share our slide of where we could use help from, from this community. Uh, and so here's what we'd like help on: um, accelerating our DevOps transformation, aligning incentives, and shifting to a blameless culture. I think that's a very hard cultural shift to do in a very large enterprise context. And that's the journey we're on right now. So we welcome any advice that you all have there. Thank you.