Anyone who has worked at a large organization, especially as the head of a department or section, will have encountered the dreaded metrics. Someone in upper management decides that they need to measure precisely how effectively each part of the organization is functioning, and so they develop some sort of metric form that is sent out, which section heads are supposed to periodically fill in and return.
The problem is that unless you are dealing with highly tangible and easily measurable entities, like the number of widgets that are produced per day, metrics can turn out to be extremely frustrating to fill out and even counter-productive, as Jerry Z. Muller explains.
The key components of metric fixation are the belief that it is possible – and desirable – to replace professional judgment (acquired through personal experience and talent) with numerical indicators of comparative performance based upon standardised data (metrics); and that the best way to motivate people within these organisations is by attaching rewards and penalties to their measured performance.
The rewards can be monetary, in the form of pay for performance, say, or reputational, in the form of college rankings, hospital ratings, surgical report cards and so on. But the most dramatic negative effect of metric fixation is its propensity to incentivise gaming: that is, encouraging professionals to maximise the metrics in ways that are at odds with the larger purpose of the organisation. If the rate of major crimes in a district becomes the metric according to which police officers are promoted, then some officers will respond by simply not recording crimes or downgrading them from major offences to misdemeanours. Or take the case of surgeons. When the metrics of success and failure are made public – affecting their reputation and income – some surgeons will improve their metric scores by refusing to operate on patients with more complex problems, whose surgical outcomes are more likely to be negative. Who suffers? The patients who don’t get operated upon.
…To the debit side of the ledger must also be added the transactional costs of metrics: the expenditure of employee time by those tasked with compiling and processing the metrics in the first place – not to mention the time required to actually read them. As the heterodox management consultants Yves Morieux and Peter Tollman note in Six Simple Rules (2014), employees end up working longer and harder at activities that add little to the real productiveness of their organisation, while sapping their enthusiasm. In an attempt to staunch the flow of faulty metrics through gaming, cheating and goal diversion, organisations often institute a cascade of rules, even as complying with them further slows down the institution’s functioning and diminishes its efficiency.
As head of a center in a large university, I would periodically get metrics to fill out about how effectively the center was functioning. But the metrics were generic ones meant for all the units, which had very diverse missions. My center’s mission was to help faculty improve their teaching, but the boxes that I was supposed to fill out had little relevance to that goal. My suggestions that the metrics needed to be customized fell on deaf ears. That may be because it was too much work, and the person issuing the metrics was being evaluated using criteria that did not require that level of detail, the kind of problem Muller points out. In my case, I decided that my time was better spent trying to improve teaching than agonizing over the metrics, and I filled in the boxes somewhat cavalierly. Ultimately, I felt that whether my center was considered effective or not would be determined by the faculty grapevine: what faculty said about how helpful we were to them. If the faculty were grumbling to top management that my center was of no use, no glowing metric numbers could counter that.
The kind of gaming that Muller speaks about was evident in my seminar class as well. In such classes, participation is important. Since I discuss with the students what would be the best grading system, I asked them what we should do about assigning a participation grade. Should I create a metric that counted the number of times they spoke as well as the length and quality of their comments and keep track during class? The students overwhelmingly rejected that idea. They said that in the classes where that was the policy, they would speak just to meet the metric goals even if they had nothing they really wanted to say. They considered it a waste of time that actually diminished the classroom experience. They told me that they trusted my judgment about the level of their participation and to let them know if I felt that they needed to improve. That was what I did and there was no problem. It saved me from the tedious task of keeping track and the class could focus on having interesting discussions.
Andreas Avester says
Such grading also rewards students who are naturally talkative and have specific personality traits. It unfairly punishes people who don’t enjoy talking in class.
Bruce says
Thank you, Mano, for speaking out in favor of common sense.
A related issue I have seen as a professor is student feedback statistics that try to report the views of a class of 24 students to six-digit precision. Clearly, nobody who reviews such data is thinking about what is meaningful.
Marcus Ranum says
I used to teach classes on metrics for information security.
The problem is that few organizations are competent to use them wisely or correctly, so they collect metrics that are easy: production rates or nebulous “feedback stars.” Using metrics well requires a real understanding of the business, its work-flows, and its processes.
After 10 years at that, I gave up.
You know who else uses metrics badly? The media. “5 Million credit cards leaked!” Uh, so? Is 5 worse than 4.5? Why?
Deepak Shetty says
Offtopic: Mano -- The single sign-on buttons (Yahoo, Google) seem to be broken
Heh. In software, in the Scrum practice, one of the metrics is “velocity”: roughly the sum of the points for the items you deliver, where each item is assigned points based on its value and complexity. Every efficient Scrum team will show a gradual increase in velocity. If, however, you make a Scrum team’s performance depend on it, the velocity does increase -- usually because each item gets estimated at more points, rather than because more is actually delivered (true story!)
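The point-inflation gaming described above can be sketched in a few lines. This is a minimal illustration, not any real Scrum tool; the item names and story-point values are made up:

```python
# Sketch of how Scrum "velocity" is computed (sum of story points for
# completed items) and how padding estimates games it: the same delivered
# work, re-estimated with bigger points, shows a higher velocity.

def velocity(sprint_items):
    """Velocity = sum of story points over the items completed in a sprint."""
    return sum(points for _, points in sprint_items)

# Hypothetical honest estimates for one sprint's delivered items.
honest = [("login fix", 2), ("report export", 5), ("API paging", 3)]

# The same three items, with padded estimates once velocity becomes
# a performance target.
padded = [("login fix", 3), ("report export", 8), ("API paging", 5)]

print(velocity(honest))  # 10
print(velocity(padded))  # 16 -- "velocity" rose, actual delivery did not
```

The metric goes up while the output stays identical, which is exactly why tying rewards to velocity defeats its purpose as an estimation aid.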
machintelligence says
Uh, so? Is 5 worse than 4.5? Why?
Not to be overly snarky, but it obviously is, because potentially half a million more people are affected.
Roj Blake says
A couple of observations on this managing by Managerialism.
A library application form asked if I spoke a language other than English. I queried the need for such a question and was told it was so the library could gauge the potential need for books in LOTE. I found it horrifying that I needed to point out to a librarian that there are many people who can <b>speak</b> a language but cannot read it. She agreed they needed to rephrase the question.
Whenever I have an online interaction with a business or make a purchase, I am sent a “survey” asking a bunch of basic questions. These I complete dutifully, giving the company a poor score on every metric. When I finally get to the bit where I can write my opinions, I tell them that the way my transactions work is that if they have pleased me, they will have won a customer. If they have not pleased me, I will be in touch with a specific claim that requires a specific resolution. Only one has contacted me after such a survey to discuss why I feel that way. They are measuring something, but I don’t know why.
Reminds me a bit of the scene from “Alice’s Restaurant” when Officer Obie lays out in court twenty-seven eight-by-ten colour glossy pictures with circles and arrows and a paragraph on the back of each one, and then realises that Justice is blind.
flex says
I try to tell my bosses that metrics should not be seen as a measure of individual performance but as indicators of how well a process is working. When I’ve used metrics in that way, it becomes easier to map the weak points in a process. Of course, once you’ve fixed the process you can stop collecting the metric, but that never seems to happen.
But as far as stupid metrics go, I’ve got a couple stories.
At work, a new change management process was rolled out, and a metric was established to measure how long a change would take to get through the new system. Not a bad thing to measure, but the measurement ran from the time the initial change submission was made to the time the drawings were released, and it was used to measure the efficiency of the change management team, who shepherded these changes through the system. What it didn’t take into account was that changes may take widely different lengths of time depending on the engineering teams, the purchasing teams, the suppliers, and even our customers, all of whom had activities in making a change happen. The metric was then used not to show that the process needed to include all these other activities, but as a reason to give poor performance reviews to the people whose job was to overcome these obstacles and get changes through the system. When that occurred, there was almost a mass resignation, which made management sit up a little and pay attention. The solution? Allow the people who make the changes to adjust the starting dates on changes to compensate for the time it takes all the other groups to add their input, approvals, etc. That made the metric look very good, but also worthless.
As for the other story, there is a little background. I learned a couple of years ago that when checking out of a supermarket, I could use my ATM card and enter all the information while the cashier was still ringing up the items. Great! This allowed me to get out my wallet, enter all the payment information, and put everything away before the cashier was done. It saved time all around. So I was doing that. Until recently. I was about to do it at one store and the cashier asked me not to. It turns out that one of the metrics individual cashiers are judged by is the time from a card swipe to the final closing of the transaction. I suppose it would be an indication that something was wrong at the station if it took more than a couple of minutes from the total being registered to the completion of the transaction. But that shouldn’t automatically be seen as the cashier’s problem; there are plenty of things that could cause a delay. Since trying to save myself, the cashier, other customers, and the store time in getting through the checkout could hurt the chances of a cashier keeping their job, I’ve started waiting until the total is completed before starting payment.
I’ve seen a lot of metrics in my time, and I doubt that 10% of them measure anything accurately or that the information gathered is used properly.
rockwhisperer says
I once attended an engineering management class (I was already doing the job, maybe I should learn how?) where a classmate said that his performance/compensation reviews as an engineer were mostly based on project bug-tracking metrics: how many were assigned to him, and how fast he could clear them. Now, unless the employee in question was extremely lazy or clueless--characteristics which should reveal themselves in any number of obvious ways--this is a ridiculous way to manage engineers. At the moment, my husband (who is a very good, very well-respected, and well-compensated firmware engineer) is clearing 3-year-old bugs. These have lingered in the bug tracker because they required some effort from another group, or they are only needed in operating modes that have not yet been sold to customers, or they are difficult to fix and low priority as defined by management.
Bug fixes ought to be assigned in priority order to whoever can do the job best, where “best” has many meanings and will vary, sometimes minute by minute, based on tons of project management variables. Software/firmware bugs are often most easily and quickly fixed by the author of the code, but not always. You, the engineer, may get assigned the bug because the original code author no longer works for the group, or is snowed under by a higher-priority task, or some such. You, the manager, want your team members to cheerfully take on whatever needs doing, rather than squeal “But it isn’t my code!” knowing that their next raise depends on that objection.
Back in the classroom, I think my jaw clanged to the floor when my classmate shared his story with us, and THEN several other heads in the room nodded agreement. Never being one to let decorum get in the way of an objection, I found myself saying, out of turn and in a very loud voice, “But…but that’s STUPID.” The professor nodded agreement, and proceeded to repeat what I said, only with greater politeness and clarity.
starskeptic says
One of my favorites is the number of ‘administrative meetings per patient-bed’ -- not well-known or used…