San Francisco 2015

Metrics and Modeling – Helping Teams See How to Improve

Helping teams see and understand their process highlights areas that need improvement. This session shows how to make team data visible in ways that lead to improvement action and how to avoid the pitfalls and traps of managing by numbers.


With examples of good and bad techniques for showing data and coaching teams, this session will help you choose a path for helping teams develop an analytical angle to their technical skills.


Troy Magennis

President, Focused Objective


Julia Wester

Improvement Coach, LeanKit

Transcript

00:00:04

Hi. Thanks everybody for coming to the last breakout session slot of the day. I know it's been a long day, so we're glad you're here. I'm Julia Wester and I'm an improvement coach at LeanKit. I get the pleasure of working with Dominica every day, and I help people take the crazy out of work.

00:00:21

And I'm Troy Magennis. I run a small math consulting business, so I don't do much work,

00:00:26

<laugh>.

00:00:30

All right, we're just going to start by talking about the different types of analytics, because there's a lot of confusion out there. You hear about big data and statistics and things like that. Essentially, I'm dressed in statistical attire; if I were in big data, I'd have a MacBook and a turtleneck. There are a lot of different types of statistics and analytics you can do, and we're going to give you our definitions here, just so you understand where we're coming from. We hear a lot about metrics and a lot about statistics, but what we're trying to do is get you to understand the different styles so you can make a choice about which one to apply and when. First of all, descriptive statistics lets you look at the past and then make an assumption about the future.

00:01:16

So even though we, or your insurance company, can't say exactly when you will die, they can tell that people like you commonly die at a certain age, depending on certain factors. How do they determine those factors? Well, that's where they look at big data. That's where they look at history and start subdividing the whole population to work out what factors put you at higher risk than others. And this is the hot area: if you want to make big money, go into big data, don't do statistics. Statistics is so yesterday. <laugh> Now, what we're talking about here is prescriptive analytics. In other words, we're looking at improving your next decision. We're going to use the past to inform us, to help us make the next decision better than the previous one.

00:02:04

But once we take an action, all bets are off. The whole table gets reset, and now we're looking at the next decision we're going to make. So what prescriptive analytics does is help you answer the question: what can we do to cause an outcome that we want? The other two types of statistics look at, historically, what the chances are that something is going to occur. In coaching, what we're looking at is being prescriptive and saying what we want to happen. Now, all of this is very expensive: setting up data, making decisions, understanding what's going to happen. It can cost you a lot of money and still mislead you in the end, so you get no return on that investment. So before you spend a dollar on doing any analysis, invest wisely,
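To make the first two styles concrete, here is a minimal Python sketch; it is illustrative only, and the throughput history and forecast horizon are hypothetical numbers, not data from the talk. Descriptive statistics summarize what already happened; resampling that same history gives a simple predictive range for the future.

```python
import random
import statistics

# Hypothetical weekly throughput history (items finished per week).
history = [22, 19, 25, 21, 18, 24, 20, 23]

# Descriptive: summarize what actually happened.
print("median weekly throughput:", statistics.median(history))

# Predictive: resample the past 10,000 times to estimate a range
# for the next four weeks.
weeks_ahead = 4
trials = sorted(
    sum(random.choice(history) for _ in range(weeks_ahead))
    for _ in range(10_000)
)
print("items likely done in 4 weeks (15th-85th percentile):",
      trials[1500], "to", trials[8500])
```

A prescriptive step would then run this kind of range for each candidate decision, say two different staffing options, and pick the one with the better expected outcome.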

00:02:51

Because metrics are expensive, we want to make sure that we're using the best possible metrics we can. And one common thing we see is people using too many vanity metrics. What that means is people measuring what's easy to measure, or what's interesting, or maybe it came in the tool you just bought, so you're like, great. It makes you feel good because you're doing something. But vanity metrics don't answer either of the two key questions: does it matter to my customer? Does it help me make a decision that makes my business fitter? It's like looking into the mirror and seeing a grander view of yourself than what's really there. We fool ourselves into thinking that these vanity metrics can proxy for the information that we really need. So, one metric that I consider a vanity metric is number of tickets closed.

00:03:44

And I bet that somewhere in here, there's someone being measured by that right now. But my concern is that it falls into a class of metrics I like to call productivity metrics: it puts more focus on activity than on the progress of value. We need to look at what happened because we did the work. What value was delivered for our business or for our customers? What satisfaction did we derive because we did that work? Those are the things that really matter, much more than finding out that team A closed five tickets and team B closed ten. That actually has no meaning when you sit down and think about it, although it's interesting. Now, another metric I want to talk about is system uptime. I know that this is a ubiquitous metric that everyone measures and is expected to, so I'm not suggesting that you change that.

00:04:35

But system uptime has varied definitions across our customer base. If I were to look at five different contracts, I could get five different definitions of what uptime means. Unless you read the fine print, you're not really sure if your five nines is the same as their five nines. But more egregiously, it treats all downtime as the same, similar to number of tickets closed, which treats all activity as having the same value. We don't want to do that with our downtime. We know that an hour of downtime when no one is using our service is significantly less impactful than an hour of downtime when everyone is using our service. So we need to go beyond these generic measures and look at the actual impact of the outages on our customers. What did it cost them? What did it cost us?
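As a rough illustration of weighting downtime by impact, here is a minimal Python sketch; the outage records and the cost-per-user-minute figure are hypothetical, not numbers from the talk.

```python
# Two outages of identical length: raw downtime treats them the same,
# an impact-weighted view does not.
outages = [
    {"minutes": 60, "active_users": 12},     # overnight, almost nobody on
    {"minutes": 60, "active_users": 9500},   # peak hours
]

AVG_COST_PER_USER_MINUTE = 0.05  # hypothetical dollar figure

raw_downtime = sum(o["minutes"] for o in outages)
impact = sum(
    o["minutes"] * o["active_users"] * AVG_COST_PER_USER_MINUTE
    for o in outages
)

print(f"raw downtime: {raw_downtime} minutes")       # the two look identical
print(f"estimated customer impact: ${impact:,.2f}")  # they are not
```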

00:05:27

It might be a little more challenging to measure, but it provides you immensely more valuable information. And as a customer of services, I'm as interested in your ability to recover from an outage as I am in your ability to prevent it, because I know that no matter how hard you try, one day you're going to have an outage. And the worst thing for me at that point is if you spent all of your time on prevention and no time on learning how to recover once you have that failure. So we also need to measure our ability to respond quickly, as well as our ability to prevent outages, and the cost when we fail.
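One simple way to track recovery alongside prevention is to compute mean time to repair (MTTR) and mean time between failures (MTBF) from an incident log. A minimal sketch, using invented incident timestamps:

```python
from datetime import datetime

# Hypothetical incidents as (start, resolved) pairs.
incidents = [
    (datetime(2015, 10, 1, 9, 0), datetime(2015, 10, 1, 9, 40)),
    (datetime(2015, 10, 9, 14, 0), datetime(2015, 10, 9, 14, 10)),
    (datetime(2015, 10, 20, 2, 0), datetime(2015, 10, 20, 3, 30)),
]

repair_minutes = [(end - start).total_seconds() / 60
                  for start, end in incidents]
mttr = sum(repair_minutes) / len(repair_minutes)

gaps_days = [(incidents[i + 1][0] - incidents[i][1]).days
             for i in range(len(incidents) - 1)]
mtbf = sum(gaps_days) / len(gaps_days)

print(f"MTTR: {mttr:.0f} minutes")  # how quickly we recover
print(f"MTBF: {mtbf:.1f} days")     # how often we fail
```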

00:06:06

Again, you want your metrics to answer the question: so what? And you do that by asking those two questions again: does it matter to my customer, or does it help me make a decision that makes my business fitter? If it doesn't do either one of those, then it's a vanity metric, so just discard it and look for the next thing. Now, measuring individuals is a hugely commonplace practice, but it's one of the biggest pits you can fall into as an organization. And it really doesn't matter whether you're measuring people to give them props for doing extremely well or you're trying to motivate people to improve; measuring individuals rarely works out like you'd hope. I want to talk about Mr. Carmelo Anthony. I know he gets a bad rap, but he's a great example of why measuring individuals to give them props is a bad idea.

00:07:00

Carmelo Anthony at his peak was the eighth-highest scorer in the NBA. You'd think his team would be ecstatic because they have a superstar, and you'd think that when he played, his team would win more often. But what actually happened is that when he played, his team lost more often. Why is that? A superstar on my team should mean great team performance, right? Well, the fact is that Carmelo had to take significantly more shots to get the scores that he got, and effectively he stole scoring opportunities from other members of his team, members who could have been more capable of producing an outcome that led to team wins. So Carmelo's focus on individual statistics, whether subconscious or knowing, had a negative effect on team outcomes. What you measure shows what you value, so you need to make sure that you're valuing and showing what's really important. You don't care if Carmelo is number one; you care if the Knicks are number one. That's who needs to win. So measure team performance.

00:08:05

On the flip side of that, we like to give people motivation to improve, and there's a fine line between giving someone feedback and using metrics as a sharp weapon. What I want to talk about here is a dashboard that Troy found in a break room, a common area where everyone can see it. It's a list of team names and individual team member names who have 10 or more bugs assigned to them. We don't know anything else about these bugs. We don't know the cost of delay of working on them; we don't know any other information that helps us prioritize them. And we also don't know what else these people are working on. All we know is that a line got drawn in the sand: 10 bugs or more and you're on the naughty list, right? Goldratt had a quote that said, "Tell me how you'll measure me, and I'll tell you how I'll behave." A lot of people start there, but I really love the next part, which says, "If you measure me in an illogical way, don't complain about illogical behavior." So Troy, what did people do as a result of that dashboard?

00:09:16

Yeah. What happened over time is that they did nothing to change the actual underlying problem. All they really changed was how the data was collected, so it actually tainted the data we had to analyze. We no longer had accurate defect numbers in areas of code. We didn't know which teams were signing work off early, or trying to send it to production and having issues arise when it went to deployment. This had a terrible impact on architecture. People started dealing with defects one at a time rather than trying to find whole groups of defects and fix the root cause. So basically, what they did by putting this chart up was make the software more unstable and increase the number of defects they ended up having in production.

00:10:01

Not only that, but when you hate having your name up on a naughty list, you're going to think things like: oh, that's not really a bug; if I think about it, that's a feature, right? So let me just change it to a feature. Or: hey, John over there is really good at doing that kind of work, so I'll just reassign it to him and get my name off that board. Even worse, other parts of your organization look over and say: oh, they really fix bugs when they come up, and I need this thing done, so this could be a bug if I think about it in a certain way. Let me put it in as a bug so they'll get on it really fast. These are all ways that the system gets gamed when you measure the wrong things.

00:10:41

So we need to continue to measure team outcomes, not individual behaviors, and give feedback in an appropriate way. Now, I want to talk about balance a little bit before I hand it over to Troy for the rest of the presentation, and I want to do that by talking about restaurant tables. When I sit down at a restaurant table, one of my biggest pet peeves is when it's wobbly, right? It's super frustrating. Every time you make a move, you get this reaction, and you have no idea when it's coming because you forget between the times it moves. If I encountered that every time I went to a restaurant, I would probably go to a different restaurant if I could, right? Businesses have a similar need for balance. The key is to find the pillars that you need to be concerned with.

00:11:33

I'm going to use Larry Maccherone's four key metric quadrants as a sample, and you need to find a balance across those. You can use this quadrant model if you don't have a place to start from. It talks about doing it fast: am I keeping pace with my business? Doing it right: when I do it, are people happy, and what's the outcome when I deliver something? Doing it on time: do I deliver what I promised, when I promised it? And then keep doing it: it's not enough to do each one of those things for a short time; you need to do them in an ongoing manner. Now, there's something that they don't teach you in math class, but you certainly learn it when you work on projects in an organization. I'm going to ask for a show of hands: how many of you have ever worked on a project where each individual component was done well, but when you put it together it didn't work right?

00:12:33

<laugh> Yeah, a lot, right? It's definitely happened to me. So there's this dark matter, or glue, or whatever we want to call it, that you can't easily articulate in a project plan, but we know it needs to be done. I like to think of this fourth quadrant as that concept. It's not enough to do it fast, or do it right, or do it on time; we need to figure out what skills we need and cultivate those so we can do them all at the same time, and do them in an ongoing manner. And once we do that, we have to keep the quadrants balanced so we're not like that wobbly table. Coaching can really come in when you need to understand how to effectively tweak one quadrant without gutting another. You know that every action has an equal and opposite reaction; we have to manage that. So define your pillars of balance, whether it's three, four, whatever. Understand how you're going to measure something in each, and then focus on keeping them in relative balance, because if you forget one or ignore some, your customers will likely look elsewhere.

00:13:42

All right, I'm going to use the back button. <laugh> We were discussing whether we needed it. I guess as managers and executives, if we're above many teams, our job is to manage these trade-offs, and sometimes when you change one of these, you amplify the negative impact on the others. So it's not even a one-for-one trade-off; it can be very dire. I'm going to show you an example of how we tried to set up a dashboard within a company that helped people see the trade-offs they were making, so that they could make intelligent trade-offs economically. And the emphasis is on economically. Essentially, you're trying to set the scene inside the company where everyone understands the business, so that the decisions people make locally don't detrimentally impact someone else in your business. So this is what we built.

00:14:29

Whenever we showed metrics on any of our dashboards, in our public spaces or for teams to use inside the organization, we always showed something from each of those quadrants. We never showed them individually, because what we're trying to do is get people to understand that when something moves, something else will move too; you'll have an impact somewhere else. It might be a little bit hard to see for some of you, but you'll see the bottom corner here, the dark green throughput line. You'll see it was going along at roughly about 22 pieces of work per week. And then we put that dashboard up in the lunch rooms, and you see the drop: 11, 4, 13, 2, 2, 2. So by putting the dashboard up to say we're tough on quality, what we did was stop the company delivering valuable work.
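The design rule here, always rendering one metric from each quadrant together, can be sketched very simply. The mapping of metrics to quadrants below is our illustration, not the actual dashboard from the talk:

```python
# Hypothetical quadrant-to-metric mapping; the real dashboard's wiring
# was not shown in detail.
DASHBOARD = {
    "do it fast": "throughput (items/week)",
    "do it right": "defects delivered",
    "do it on time": "predictability (due-date hit rate)",
    "keep doing it": "responsiveness trend / sustainability",
}

def render(dashboard: dict[str, str]) -> None:
    """Refuse to show any quadrant in isolation."""
    assert len(dashboard) == 4, "show all four quadrants or nothing"
    for quadrant, metric in dashboard.items():
        print(f"{quadrant:>15}: {metric}")

render(DASHBOARD)
```

The point of the assertion is cultural as much as technical: a single metric shown alone invites over-optimizing it.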

00:15:22

The light green line in the background was the number of defects teams were delivering over time. So what happened was that the sheer fact of reporting a metric that was easy to capture had a very detrimental effect on business delivery throughput. This is what you're looking for: you're trying to help your teams understand that it's not good to overemphasize any one of these. You want to trade something you're excellent at for something where you're not trending as well as your peers. And context matters, right? It doesn't matter what the individual number is; it matters how you compare to the rest of the company. So we converted that bugs figure: rather than just a count of 10, we converted it to the number of days it would take all developers on the team to get to zero defects. It made it personal for the team.
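That conversion is simple arithmetic. A minimal sketch, where the team size and the per-developer fix rate are hypothetical:

```python
def days_to_zero_defects(open_defects: int,
                         team_size: int,
                         fixes_per_dev_per_day: float) -> float:
    """Days of the whole team fixing bugs until the backlog is empty."""
    return open_defects / (team_size * fixes_per_dev_per_day)

# "47 open bugs" is abstract; "about four days of the whole team" is
# personal. The numbers are invented for illustration.
print(days_to_zero_defects(open_defects=47,
                           team_size=6,
                           fixes_per_dev_per_day=2.0))  # ~3.9 days
```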

00:16:13

It set the team the target that before you take on any new business work, look at how much debt you have lying around that you could burn down, and make a decision as a team whether you should burn that down early. You've got some others there, like responsiveness. But again, we're trying to soften the colors. This is informational; it's not meant to scream red and green. We don't want to evoke an emotional fight-or-flight response based on color. So we made everything a nice soft shade of pastel, and it matters.

00:16:47

So that was it: bugs, made personal to the team as the number of team-days down to zero. Incidentally, there are a couple more talks on metrics. If you're in this room tomorrow at 11:30 (so you should stay here overnight so you don't miss a seat), Mark McKay goes into a plethora of metrics in each of these measures that you might want to choose from. We're not going to do that. What we're going to do is just tell you that you need something from each of those four quadrants, and you're going to ask your teams which ones they want. So this was ours: bugs; responsiveness, or cycle time, how quickly they fixed things; throughput, how fast they were getting through work; and defects and predictability. Were they working weekends? And then they stopped working weekends, and the trend started going in a different direction.

00:17:33

Everything needs to be aligned on the date axis. This is the team view; the team would use this during retrospectives or ops reviews. (What are they called? Ops reviews. See, I learned that this morning.) What you're trying to get the team to do is trade something they're best at for something they want to improve. And this is the way we did that. At another level, where the rest of the company could see, we removed the axes, so there are no numbers on this chart at all. The orange line is the team, and the gray line is everyone else in the company in the same context. So we're comparing; we're helping the team see their trend over time against all of their peers. And what we're after is them asking these questions. Down at the bottom here, in throughput, in the bottom left-hand corner,

00:18:37

they're actually above the company trend, something they might trade. On the right-hand side, you see they started better than the company trend, but they're starting to cross it; that might be somewhere we could trade away some of that good throughput. And on predictability, they're an outlier, almost the worst in the company. So we're helping them come up with some coaching advice. What we would do then, up in the top coaching list, is give them three pre-canned coaching responses saying: teams in your situation often found these things helped. <laugh> So the coaching advice was synchronized to the trend balance against the rest of the company, and then we just leave it to the team to choose what they want to do. Again, they get an immediate response.
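One way such a comparison could be wired up is sketched below. The series, the outlier threshold, and the canned advice strings are hypothetical stand-ins; the talk did not show the actual implementation:

```python
# Pre-canned coaching advice keyed by metric (invented examples).
COACHING = {
    "predictability": "Teams in your situation often found limiting WIP helped.",
    "throughput": "You are ahead of the company trend; consider trading here.",
}

def relative_trend(team: list[float], company: list[float]) -> float:
    """Average ratio of team to company over the window (1.0 = on trend)."""
    return sum(t / c for t, c in zip(team, company)) / len(team)

# Hypothetical recent values for one team versus the company.
metrics = {
    "throughput": ([24, 26, 25], [21, 22, 22]),
    "predictability": ([0.40, 0.35, 0.30], [0.70, 0.72, 0.71]),
}

for name, (team, company) in metrics.items():
    ratio = relative_trend(team, company)
    if abs(ratio - 1.0) > 0.15:  # far enough off trend to start a conversation
        print(f"{name} (x{ratio:.2f} of company): {COACHING[name]}")
```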

00:19:31

Next sprint, the next couple of weeks, the next ops review, they get to see it all again, and they get to see it in real time. So that's what we're after: trading something good for something we're not so good at, and we're just showing trends. And now we're just showing off. This company is particularly into data visualization; we can say their name: this is Tableau Software, and these are their development teams. What we wanted to show in this story was that we gave them a little tool where they could look at all the defects they closed and all the defects they had, and see where it goes. Do you want to do this slide?

00:20:12

No, you're good. <laugh>.

00:20:15

And we made the colors look like a ball of lint out of your pocket on the right-hand side, because those weren't really defects that actually got fixed; it was just process noise, so we wanted to show it as process noise. What we found, just by putting this up with no instructions, is that people were enticed to hover their mouse over these little bubbles and see which ones were big and small, and then they would start discussing root causes in clusters. And to help them do that, we started doing some text analysis on the bugs and defects, trying to put them into big, medium, and small buckets based on just the text in the bug description. So we, and the teams, could see if a certain unstable environment was causing most of the defects. There we go.
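A toy version of that kind of text analysis is below; the keyword lists and bug descriptions are invented, and the real system was presumably more sophisticated than simple keyword matching:

```python
from collections import Counter

# Hypothetical keyword buckets for clustering defects by probable cause.
BUCKETS = {
    "environment": ("timeout", "connection", "deploy", "server"),
    "ui": ("button", "layout", "font", "render"),
}

def bucket(description: str) -> str:
    text = description.lower()
    for name, keywords in BUCKETS.items():
        if any(k in text for k in keywords):
            return name
    return "other"

bugs = [
    "Connection timeout when saving a report",
    "Save button renders off screen",
    "Deploy script leaves server in a bad state",
]

print(Counter(bucket(b) for b in bugs))
# Counter({'environment': 2, 'ui': 1}): an unstable environment dominates.
```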

00:21:06

So if we were to distill this talk into five top takeaways, it would be these. Keep your metrics inventory small; inventory incurs cost, right? So keep it small. Measure valuable outcomes, not individuals, so that you can achieve the goals that you really have. Actively monitor a balanced set of metrics, and then monitor trends against those metrics to expose trade-offs. And then, as Troy just went over, provide beautiful interaction to engage big brains to help you figure out what next steps to take. Now, the ask that we have is for people to share information with us: what set of metrics are you using, and why are you using those? That will tell us a lot about what's going on with you. So, does anybody have any questions for us?

00:22:02

Wow. Questions?

00:22:04

Customer engagement. How do you successfully measure that?

00:22:10

That's a great question. So the question was customer engagement: how the hell do you measure that? <laugh> There are a lot of metrics that you get after the fact, so measuring it isn't the real problem. Often you get surveys, and certain companies aren't afraid to call in. But it's a bit late. If there's one metric, one ask, that I find I'm always at a loss chasing, it's the quality aspect: I can't find a good leading indicator for quality. The one I'm most interested in as an angle there is that a lot of companies have feature flags, and I'm finding that where you get a group of people who have to agree whether to turn a feature flag on or off, that correlates very closely to customer satisfaction. Because I think if you can get the team who built it to agree that it should be switched on in production, then you've done a survey of informed people saying this feature is actually ready to go. I can find that out afterwards through surveys and customer satisfaction scores; trying to find a way to bring it earlier is a challenge we would love your help with. Great question.

00:23:26

Why not try experimentation? Do something with some of your customers and something else with the rest, and see which one does better.

00:23:36

Yeah, so like A/B testing, and seeing what results in higher conversions. That would be a lagging indicator as well.

00:23:43

It's lagging, and there's investment in it too. And while you're running that experiment, you're not running another one. But I agree with you; it's the right choice sometimes. There

00:23:53

You go.

00:23:54

That's right.

00:23:59

Anything else?

00:24:01

One over here, Dominica.

00:24:03

Oh, second row.

00:24:06

It's hot today.

00:24:08

Any recommendations on how to get executives away from vanity metrics?

00:24:12

<laugh>

00:24:15

Oh yeah. It's all in asking them to tell you what they think that number really means. I had a situation with the director of business ops and IT at F5 Networks, where I worked before LeanKit. He wanted to know how many enhancements each development team got done in a sprint. And I said, okay, so I tell you 35: what does that mean to you? Can you tell anything from 35? What about 30? It's making them think to the next step, and they will figure it out themselves in almost every case. They're like, yeah, you're right, 35 really doesn't tell me anything.

00:24:55

Or start measuring them by vanity metrics: start putting up the number of restroom breaks for the executives.

00:25:00

Yeah. And if you can get them there without making them feel stupid along the way, then that's really going to help. <laugh>

00:25:08

With that said, sometimes it's great being a consultant and not an employee.

00:25:11

<laugh> Yeah. <laugh> Although if you want to get hired back, you still have to make them

00:25:16

Not feel stupid. I'm in math; I'm not being hired back.

00:25:19

<laugh>.

00:25:22

Yeah, listen, this is a big area. The other thing I want you to look at is probabilistic forecasting. Start helping your executives understand there is no one right answer to any of these metrics. Every one of these metrics changes over time, just based on the work that you choose to pull and the direction that the company goes. Most of the value metrics are outside your company's control: your competitors call the shots as to what they're going to do, and they change the value proposition for the features in your backlog. This is an area we need to get good at, and I guess we're here to help.

00:25:57

Yeah. And so Troy mentioned a talk tomorrow that you might want to go to. I also want to point out Dominica's talk at 3:35: same time, same bat-channel, same place, same room.

00:26:10

Yeah. Stay in the room.

00:26:11

Right, so she's going to talk a lot about the shape of uncertainty and more about those probabilistic kinds of metrics. So tune in, and thank you so much.

00:26:21

Thank you for coming. Thanks

00:26:23

Everyone.