Las Vegas 2020

Chasing the Unicorns at T-Mobile

Twelve-hour outage bridges, worn-out headphones, 90% unplanned work, and 25TB of randomly corrupted file systems were normal business for T-Mobile developer platforms. When the foundation where software delivery happens is the bottleneck, throughput remains buried under a large pile of debt. Ripe for improvement, T-Mobile has begun to embrace DevOps principles including transparency, telemetry, post-mortems, and continuous experimentation to spark a turnaround of historic proportions.


Listen as Chris Hill, Senior Manager of Developer Platforms, walks through a journey capitalizing on T-Mobile culture and desire to create experiences customers love. The culture, otherwise known as "Team Magenta," led to an appetite for change and now has teams achieving up to 30x throughput gains and decreased deployment pain.

Chris Hill

Senior Manager, Developer Platforms, T-Mobile

Transcript

00:00:00

Um,

00:00:14

Good morning, Las Vegas DOES. My name is Chris Hill, I'm from T-Mobile, and I run developer platforms. I'd like to talk to you all today about chasing unicorns. A unicorn, in my book, is an enterprise that has mastered software delivery at scale. Now here at T-Mobile, we are actively improving every single day so that we can eventually get to that benchmark of what it means to be a unicorn and deliver value at scale to our customers. I'm part of a group called Product and Technology, and my space primarily has been in the developer platform arena. I'm highly passionate about this area, and I'm really excited to talk to everyone about creating a developer experience that we can all be proud of within a very large enterprise. This is what I'm going to talk about today. First, I'm going to talk a little bit about why it makes sense to invest in developer experience. Then I'll talk about transformation fatigue: what it means to have multiple conflicting transformations all done at the same time, and how we may actually be diluting the impact of the underlying spirit of the change we're trying to make. I'll also go into how, in the last two years, we've transitioned out of the chaos domain from a developer platform perspective, and then I'll go over the lessons learned.

00:02:05

Now I'm going to start us off in an area that I think a lot of us don't really like to talk about, and that's onboarding. This is where I feel most developers first lose their initial spurts of motivation. What I'm referring to is inheriting a software project. I have inherited many software projects throughout my career, and every one of them feels like I stepped into the middle of an IKEA build cycle where all the parts are missing, there are no instructions, there's no support line, and all the screws are stripped. And I have pressure that I should come out with my first feature next week. By stumbling into a lot of different software projects that then became my own, I felt firsthand the frustration of trying to get up to speed, to be effective and contribute value to existing software projects. I've sometimes spent up to a week or two weeks just looking for documentation on how to actually get access to the code or get access to the environments. And sometimes I'm left with a series of tickets that I have to raise. As we all know, in a corporation tickets have SLAs. Tickets usually have an approval workflow. And if I'm really lucky, within that approval workflow every one of those approvers is actually in the office during that SLA window.

00:04:01

And if the first thing I do in terms of value contribution to the business is to go around and disrupt every one of the approvers and say, hey, would you mind doing this for me, I may not be contributing as much value, and I may not have as much motivation in the longer term. Now, one thing that I've noticed is that if you start at a 10 out of 10 in terms of motivation and your first experience is something similar to what I've just described, before you even look at the code you may already be at a three. I honestly don't understand why most developers don't run for the hills. I think secretly, every time, I thought, no, it's definitely going to get better.

00:04:56

Then, by the time that I actually got access to the code, I realized that it's just worse than I thought. Not only is it worse because I don't understand any of this code that I didn't write, but I also don't fully understand the value stream and all of the fragmented tools that I have to use just to get to the end fulfillment. I hear that you're asking me for a feature. I don't really know how to get a feature into your hands, but let me try and piece this value stream toolset together to figure out how I get you what you want. And if the change muscle wasn't something that had been flexed very often, I've got an uphill battle. This can be extremely frustrating and extremely demotivating.

00:05:56

And honestly, at T-Mobile, customer experience is in our blood. Developer experience has just recently come into our blood. Ultimately, when we think about a T-Mobile customer, we think about a subscriber. And we think about that subscriber telling their friends that they should subscribe, or keeping their subscription longer than they originally intended, driving higher revenues for T-Mobile. Better customer experience equates to a better subscriber experience and long-term growth for T-Mobile. Well, we also see the same behavior from a developer experience, and the end results are higher throughput, more innovation, more creativity, higher retention. And the more investment you make to keep that motivation up from the beginning, the happier your devs will be. So in order to reconcile that, I've asked myself the question, why does this make sense? Why does something like a developer experience equate to results that I can reconcile? And there are a couple of ways that I think I've been able to capture this in my head. One is that there's less cognitive load from context switches, so it essentially isn't challenging for me to get that value delivery and have that personal fulfillment that I've delivered something that makes my customers happy.

00:07:36

I also feel that there's less wait time within the value stream and fewer people that have to get involved with a change that I need to make. We've done calculations to show that if we take all of our CI/CD jobs that are done on a daily basis, and we save one second on average across all of them, it's like we just hired a brand new full-time employee. Just recently, we've been able to shave four or five seconds off of every single standard CI/CD job. This means essentially we just hired a small team.
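The arithmetic behind that comparison can be sketched as follows. This is a minimal illustration, not T-Mobile's actual calculation: the 8-hour workday and the resulting daily job count are assumptions, since the talk does not state either figure, and the function name `fte_equivalent` is hypothetical. Under those assumptions, one second saved per job equals one full-time employee when roughly 28,800 jobs run per day.

```python
# Assumption (not from the talk): one full-time employee works 8 hours a day.
SECONDS_PER_WORKDAY = 8 * 60 * 60  # 28,800 seconds


def fte_equivalent(jobs_per_day: int, seconds_saved_per_job: float) -> float:
    """Workdays of time recovered per day, expressed in full-time employees."""
    return jobs_per_day * seconds_saved_per_job / SECONDS_PER_WORKDAY


# Assumed daily job volume at which 1 second saved per job equals one workday.
break_even_jobs = SECONDS_PER_WORKDAY  # 28,800 jobs per day

for saved in (1, 4, 5):
    fte = fte_equivalent(break_even_jobs, saved)
    print(f"{saved}s saved per job -> {fte:.1f} FTE")
```

At that assumed volume, shaving four or five seconds off every job recovers the equivalent of four or five full-time employees, which matches the "small team" framing.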

00:08:26

But what I also think is important here about the experience is: are you empowering rather than impeding? And if you're empowering, is it leading to the results that you care about? That could be more creativity, it could be higher quality, it could be faster. One big assumption that I make in this reconciliation of experience is about the way that you conduct your business or the way your value stream operates: do you have the confidence of your customers? And ultimately, I think out of the pool of every customer, we've all experienced this idea where change and transformation really equate to this fear of loss. Well, I've been told this before. I've been told it's going to get better, the last time we moved. How is this time going to be different? What is it that you can say to me this time that will change the result? What we already have works. We've already made it work for us. And if you're not truly invested in earning that confidence, especially for your shared services, are you doing the best thing for your company? Is the service actually adequate for where your business needs to go? I've been with T-Mobile for about two years driving this initiative. And I've always asked myself, if we had figured out that the developer experience was valuable a long time ago, would we be further along?

00:10:12

And I've come to the conclusion that it's a lot more complicated than that. There's no really good answer to that. But what I've discovered is that you can't just take a unicorn's playbook and become the unicorn overnight. And this always fascinated me: that groups of thousands of people could all know the ideal and right way to do things, and that the big tax on getting to the ideal isn't just knowing what is right and that what you're doing is wrong. It's how to actually get there, and the journey, and the legacy debt, and everything holding you back from making that change.

00:11:02

Now, I drew a little picture here. Don't make fun of me. I don't really have good journaling, but in this picture, I show a lifecycle of where I've seen transformations within these organizations end up. At the top, I see the new and shiny. Everyone loves this new and shiny on the top, and everyone's excited about using something. Then it gets adopted or it fails fast. It doesn't scale. It's the wrong timing. There are too many other things going on. It dies there. If it does get adoption, then typically the next step is either centralization or governance, right? Is this going to be feasible in our working environment? How do we productionize? How do we make this accommodate all of our use cases? Now, during governance, we may drive the relevancy of our service or platform or product or change completely into this unusable state. And sometimes it never dies. In fact, it just adds on to the pile of debt of existing services and software and products that we started last year and the year before that, and it just becomes ongoing debt, almost like a house of cards.

00:12:24

Or it becomes so constrained that our customers start to resent us and start to detract, and essentially know that, well, we could do better. This is no longer relevant for us. And then we start the new shiny cycle over again. Ultimately, when we start this new shiny cycle, we probably haven't deprecated the previous thing. And I see this all the time from a change perspective. Now, change is much more than just following this graph. There are a lot of things that are included with change in an enterprise. You've got people to convince. You've got funding to earn, and that's hard to earn. You have legacy systems to make sure that they're running and satisfying your existing customers. You have to integrate with these legacy systems. You have to take anti-patterns head-on and go, this is an anti-pattern, this behavior needs to be broken. Sometimes you have architectures to rip apart, firewall changes to make, policies to challenge, cultures to evolve. Your culture may not even be in a position where it can immediately move to what you're trying to transform to. And you also have unplanned work to compete with. If the department doing your change or transformation has 90% of its work completely unplanned, you don't have room to have a thought on how it could be better.

00:14:05

You know, when I first joined T-Mobile roughly two years ago, I took on a department that was currently in that chaos sort of realm. In fact, we had 10-hour, 12-hour bridges every single day. And I remember them because I went through multiple pairs of headphones, because all of them would start to fray, almost like I had a haircut every day, with bits of those little black pads.

00:14:34

And ultimately, I really channeled John Allspaw, ex-CTO of Etsy, when he said incidents are unplanned investments. I took every 10-hour and 12-hour incident, and I really understood that not only can this help us figure out what we're doing wrong with legacy, but it helps us figure out what to do in the future. And usually the communication to our customers could come out in the form of: I really apologize for this experience. This isn't ideal, but this is what we're doing today to make it better in the future. I understand that I don't have a silver bullet for you, but what I do have is how we're incrementally changing so that we can serve you better.

00:15:33

And transformation, I feel, is intended to be fruitful for all, but a lot of times it's painful for some, and it's usually uncomfortable for most. You find comfort in the ways that you were working before, because it's known. Now, this is my fourth industry doing digital transformation. I started in semiconductors. Then I went to retail. Then I went to automotive, and now I'm in telecom. And I keep thinking, one day maybe it's going to get easier. Maybe I'm really naive. It never is easier. And who am I really kidding? Every enterprise has its fair share of legacy debt that is holding it back from getting to that ideal unicorn status, but most can usually acknowledge that it could be much better.

00:16:34

It's almost as if you're at the bottom of a hole, trying to claw your way out, and the bottom is just this legacy quicksand that continues to pull you down. I can't tell you how many times I get faced with the decision of, do we invest in legacy or do we invest in new? Do we have to invest in legacy, or can we invest in new? Now, honestly, there's hope for getting out of that hole. Here are some things that worked for us. We turned the unplanned into planned. One of the ways we did that is we took every incident into consideration and got a full return on how we should conduct ourselves as a business, so that we don't end up in a series of judgment calls which lead to that incident. That way we can plan and prioritize. We also made all of our work visible. We all read the book Making Work Visible by Dominica DeGrandis. If you don't see the activity and the volume of your work in progress, you're harming your ability to make a justifiable analysis.

00:17:52

We also had a formal acknowledgement of debt and preemptively set the stage with all of our customers, saying: look, we have all this debt. We're not going to be able to provide as good an experience as you would like. Here are some steps we're going to take that are going to be disruptive in the short term but will help us in the long term. It was almost preemptively setting the stage of, come along with us on this journey, we want to make it better. We also built discipline into how we operated. It's really easy, when you're operating in chaos, not to have a formal runbook, not to bring a buddy with you, not to establish rollback, not to estimate how long a change is going to take.

00:18:44

As you transition out of chaos, you have to build in this discipline. Otherwise you're never going to get out. I also think we changed the way that we think. Any time we showed up to an incident and we knew the change that potentially caused the incident could be backed out, we would choose back out over the fail-forward mentality. I can't tell you how many calls I've been on where that is a debate. Well, we could try this patch, and we think that it'll be done in three hours, two hours past our downtime window. Or we could just back out and get to a known state in a couple of minutes. Just back out. If you're going to measure the incidents and you're going to use them to learn, you have to be able to praise the flawless execution. Now you can transition to finding the right questions that will pull you out of this hole. The questions are: Do we know what good looks like? Can we actually know if we are doing this right, and what right looks like? How can we measure ourselves for our own success? Where are our bottlenecks? Eliyahu Goldratt says there can only be one bottleneck constraining your system. Well, what is that one bottleneck? Can we not find it because we don't measure it? Well, measuring is now our new bottleneck.

00:20:13

Also consider what standards should be enforced and what standards should remain flexible, but always be open to refining them, and ask yourself: are we impeding or are we empowering? Do we have a community of support? Do we have customers that believe in us? And I think you kind of transition into, well, now that we know we're asking the right questions, can we find the right solutions to progress our journey or our transformation? There are a couple of solutions that come to my mind, at least. One is to define and refine the best practices that you can control, and then simultaneously challenge the ones that you can't for more refinements. So it's important to think ideal and unicorn, but it's also important to think iterations: here's how we're going to get there, and here's how I will evangelize the support to do that.

00:21:17

Also understand that if you're going to make large-scale directional movements, huge toolset changes, huge pattern changes, organization changes, you only have small windows of opportunity to be able to do that. Take advantage of those small windows of opportunity. Make sure you treat any feedback like gold. I scour through all of our NPS surveys to find the bad feedback. I love the bad feedback, because it instantly gets me in a position where I can take the seat and be empathetic with the individual and really harness the frustration into how I can fix that and make it a better experience, so that they never end up in that position, or the future developer won't end up in that position. I think it's really important to also factor in the cognitive load you put on a developer or your customers on an ongoing basis, and ensure that they have fewer context switches and that they're using path-to-production toolsets that focus on throughput. One of those examples is that the underpinning of our developer platform is GitLab, which happens to combine CI/CD with source control management.

00:22:37

If they are so closely involved with the path to production, why not make them the same tool? Along the same sort of lines is this idea of core versus context. What is core to your business happens to be what makes you special. But what is context is what somebody else is good at making special. We took this to heart, and with the underpinning of GitLab, we chose to allow GitLab to host their software as a SaaS for us. T-Mobile shouldn't be able to run GitLab better than they can. And if we can, I think they're in a lot of trouble. But the implementation of how T-Mobile spins GitLab and how it uses it, and how it maintains its internal network and internal automation, sharing and templatization, and maintaining ecosystem and policy and compliance, all of that stuff can remain core to our business.

00:23:41

It may potentially move to context, but if we're constantly aware of what feels like context versus core, we'll be able to make future-driven decisions that ensure that we're focusing on what makes us special. Now, there are some lessons learned, and some of them already came out. But one of the questions I really like to challenge almost everyone with who brings a problem statement to me is: is this the best thing for your team, or is this the best thing for the enterprise? If there's a unique scenario that, for some reason, a platform doesn't accommodate, I've always asked the question:

00:24:29

If you were able to contribute to the platform and accommodate this unique scenario, do you think anyone else in the entire enterprise may have a similar scenario that could benefit from that change? And the answer is almost always yes. And what was almost resentment actually turned into empowerment. Well, if you want to change how it works, belong to this community and ensure that the control is democratized throughout your entire business. One of the things that I mention a lot is that it's not about what I say goes or doesn't go. It's collective ownership through an entire community of what it means to be a software developer at T-Mobile, and then you're empowered to make decisions based on a much larger scale, rather than isolated to one team. I mentioned transformation fatigue before. There is a passive diluting that can happen to your transformation if you have many simultaneous transformations all happening at the same time. If you're not cognizant of what transformation is most important to your business, just the idea of having more simultaneous ones will dilute the important ones. I also think it's really important to focus on which constraints cannot move and which ones can. Eliyahu Goldratt mentions this in his book. He says, well, subordinate to the constraints, and you can use his framework to really determine how you can turn a constraint to your advantage rather than the disadvantage that is currently in front of you.

00:26:30

I also think it's important to obtain adoption by unlocking the passion. If your transformation is consistently hitting a brick wall, maybe there's a reason for that. If you can't organically create adoption and create a value statement, maybe there's a reason for that. Maybe it is timing. Maybe it's a perfect change and the timing's wrong, or there's transformation fatigue, or maybe it's not the right solution. Now, a lot of you may know, we just merged with Sprint, and we're all really excited to have Sprint as part of the family. What this has generated is a challenge for us to make two companies into one and to operate as one, and to extend everything that we've learned from a developer capacity perspective and delivering enterprise software at scale from two companies, and turn it into a high-economies-of-scale community between both groups. What this means is there's a lot of opportunity within T-Mobile, and I would encourage anyone who wants to be a part of our journey to visit our career site.

00:27:57

That's all I have for today. I really appreciate everyone's time. I appreciate everyone for listening, and I realize that this conference comes at a challenging point, when going virtual creates a lot of strain on what the conference has primarily been about. Now, if you've watched how IT Revolution has been able to gracefully transition into temporary workarounds, it has been a very heroic effort, and I want everyone to acknowledge the fact that putting on this DOES conference in this sort of format, and changing this quickly, is very impressive. And so I appreciate the IT Revolution staff. I appreciate everyone who has listened to me, and I would love to hear about the challenges that you have. Thank you so much.