Chapter 12 of How to Measure Anything: Finding the Value of "Intangibles" in Business, Third Edition is the second chapter in the final section of the book. Hubbard titled Chapter 12 "The Ultimate Measurement Instrument: Human Judges." The majority of HTMA has focused on statistical tools and techniques; this chapter examines the human as a measurement tool. Here is a summary of the chapter in a few bullet points:
- Expert judgement is often impacted by cognitive biases.
- Improve unaided expert judgment by using simple statistical techniques.
- Above all else, don’t use a method that adds more error to the initial estimate.
Hubbard begins the chapter by pointing out that the human mind has some remarkable advantages over the typical mechanical instrument. It has a unique ability to assess complex situations, but the human mind also falls prey to a long list of common biases and fallacies that generate error. If we want to use the human mind as a measurement instrument (and every shred of evidence suggests that we will), we need to develop techniques that exploit its strengths while adjusting for its weaknesses.
The rationale we humans use for many decisions is, in Hubbard's words, "weird." Cognitive biases hamper the decision-making process. Cognitive biases are patterns of behavior that reflect a deviation in judgment occurring in particular situations. Biases affect how people perceive information, how teams and individuals behave, and even our perception of ourselves. Hubbard identifies a few biases that affect how we interpret information and how we make decisions. They include:
- Anchoring bias refers to the tendency to rely too heavily on one piece of information when making decisions. This type of bias is often seen in early estimates for a project or task.
- Halo/horns effect is the tendency for either positive or negative traits of an individual to overwhelm the perception of other traits by those around him or her.
- Bandwagon bias (bandwagon effect) occurs when there is a tendency to adopt an idea (or to do something) because an external group or crowd believes the same thing.
- Engineering preference is a form of bias that comes into play once a decision has been made. Respondents will actually change their minds about information to provide a supporting rationalization for the decision. They fit the facts to the decision, generating more support for it. This bias holds true even for people who did not originally support the decision.
- The illusion of learning occurs when we believe that our judgment must be getting better with experience and time. This bias is common in Agile teams as they estimate stories and accept work into sprints if there is no feedback loop comparing estimates to actuals.
As we have seen in past essays, these represent only a few of the biases that we exhibit or experience in our day-to-day lives. Biases affect how we interpret data, which data we will use in decision making, and even how we support the decision once it is made. The saving grace is that we can account for biases in decision making if we are aware of them and use structured methods.
Structured decision models are generally better than unaided expert judgment; however, experts often continue to fall prey to techniques that increase their confidence without improving their predictions. For example, feedback (test) loops can contribute to the illusion of learning, making it possible to increase an expert's confidence without generating better outcomes. Sorting collected data in Excel is a simple organizing technique that might be useful and interesting, but it rarely improves the decision model.
Simple statistical techniques can increase the efficacy of decision and prediction models beyond unaided human performance. For example, a simple weighted average model can be an improvement on uncalibrated expert judgment. These models outperform human judgment because they smooth out the variability that bias introduces. A simple linear estimation model that multiplies each component's size by a productivity rate and then sums the results is an example of a weighted model that often outperforms expert judgment. Hubbard suggests that even simple techniques can have an effect because people are starting from such a bad place.
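A minimal sketch of the kind of linear estimation model described above (the component sizes and productivity rates are hypothetical, not from the book):

```python
# Simple linear estimation model: multiply each component's size by a
# weight (here, a productivity rate in hours per unit of size) and sum.
# All numbers below are illustrative.

def linear_estimate(components):
    """components: list of (size, hours_per_unit) pairs."""
    return sum(size * rate for size, rate in components)

# Three hypothetical components sized in function points
work = [(120, 2.0), (80, 1.5), (45, 3.0)]
print(linear_estimate(work))  # 495.0 estimated hours
```

Because the weights are fixed in advance, the model applies the same policy to every case, which is exactly the consistency that unaided experts lack.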
Techniques that Hubbard suggests for improving expert performance include:
- For simple weighted average comparison models (deciding which option is better than another, for example comparing five houses based on size and value), Robyn Dawes recommends converting each weighted value to a z-score ((value - mean) / standard deviation). The conversion removes the inadvertent weighting that occurs when attributes are measured on different scales.
- Rasch models are useful for analyzing categorized data (e.g., questionnaire responses). Most satisfaction questionnaires collect categorized data (a fixed set of possible values). In the Rasch model, the probability of a specific response is a function of the ability of the person responding (person parameter) and the difficulty of the question (item parameter). Rasch models reflect the premise that if one measurement instrument rates "A" greater than "B," then another measurement instrument should give the same ordering. Rasch models allow us to compare the outputs of different expert measurement models to determine whether an outcome is biased. A manager of a corporate PMO recently presented a model in which three panels of senior project managers interviewed and rated (using a standard questionnaire) several hundred project managers. A Rasch analysis in this circumstance would make sense to calibrate the responses.
- The Lens Model developed by Egon Brunswik is another method to remove human inconsistencies. The process begins by asking experts to make judgments about a set of cases, mapping those judgments to the inputs (cues) they were based on, and then building a regression model from the data. Brunswik found that the model performed better than any of the experts because it removes the error caused by expert or judge inconsistency. Lens Models are useful for developing internal estimation models that synthesize the perspectives of multiple human experts, and for avoiding the illusion of learning bias. (Note: Hubbard provides a great seven-step process for creating a Lens Model on page 322.)
- Professionally, I have reviewed many estimation programs. In nearly every scenario where a program produced more than one estimate, model-based estimates outperformed expert judgment. Research cited in HTMA by Robyn Dawes supports this observation.
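Dawes' z-score conversion can be sketched in a few lines; the house sizes below are made up for illustration:

```python
import statistics

def z_scores(values):
    """Convert raw values to z-scores: (value - mean) / standard deviation.
    On this common scale, an attribute with large raw numbers (price)
    cannot inadvertently outweigh one with small raw numbers (bedrooms)."""
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical sizes (square feet) of five houses being compared
sizes = [1400, 1600, 1800, 2100, 2600]
print([round(z, 2) for z in z_scores(sizes)])  # mean 0, standard deviation 1
```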
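The heart of the dichotomous Rasch model is a logistic function of the gap between the person parameter and the item parameter; a minimal sketch:

```python
import math

def rasch_probability(theta, b):
    """Probability that a person with ability theta gives a positive
    response to an item with difficulty b (dichotomous Rasch model):
    P = exp(theta - b) / (1 + exp(theta - b))."""
    return math.exp(theta - b) / (1 + math.exp(theta - b))

# When ability exactly matches item difficulty, P = 0.5
print(rasch_probability(1.0, 1.0))  # 0.5
```

Fitting the person and item parameters to observed response patterns is what allows ratings from different panels to be placed on one calibrated scale.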
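A Lens Model regression can be sketched with a single cue; the project sizes and expert judgments below are hypothetical (this uses `statistics.linear_regression`, available in Python 3.10+):

```python
import statistics

def fit_lens_model(cues, judgments):
    """Regress the experts' own judgments on the cue (ordinary least
    squares). The fitted line reproduces the experts' judgment policy
    while stripping out their case-to-case inconsistency."""
    slope, intercept = statistics.linear_regression(cues, judgments)
    return lambda cue: intercept + slope * cue

# Hypothetical data: project size (cue) vs. an expert's effort judgments
sizes = [10, 20, 30, 40, 50]
judged_effort = [25, 42, 66, 78, 105]
model = fit_lens_model(sizes, judged_effort)
print(round(model(35), 1))  # the model's (not the expert's) judgment for size 35
```

With several cues and several experts, the same idea extends to multiple regression, which is the form Hubbard's seven-step process produces.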
Hubbard wraps up Chapter 12 by introducing "The Big Measurement Don't: Above all else, don't use a method that adds more error to the initial estimate." If a measure does not reduce uncertainty, the measure does not have value. Defining measurement as a tool that reduces uncertainty makes measuring many different scenarios feasible (which might be why Hubbard chose the title). The definition might feel squishy, as if anything could be a measure; however, requiring that a measure reduce uncertainty imposes a hard constraint. Humans can be a valuable measurement tool; however, that value requires using techniques to correct for the errors common in unaided human judgment.
Past installments of the Re-read Saturday of How To Measure Anything, Third Edition:
Introduction
Chapter 1: The Challenge of Intangibles
Chapter 2: An Intuitive Measurement Habit: Eratosthenes, Enrico, and Emily
Chapter 3: The Illusions of Intangibles: Why Immeasurables Aren’t
Chapter 4: Clarifying the Measurement Problem
Chapter 5: Calibrated Estimates: How Much Do You Know Now?
Chapter 6: Quantifying Risk Through Modeling
Chapter 7: Quantifying The Value of Information
Chapter 8 The Transition: From What to Measure to How to Measure
Chapter 9: Sampling Reality: How Observing Some Things Tells Us about All Things
Chapter 10: Bayes: Adding To What You Know Now
Chapter 11: Preferences and Attitudes: The Softer Side of Measurement
Adding to (p. 325) "The Big Measurement Don't – Above all else, don't use a method that adds more error to the initial estimate," Hubbard also warns against using arbitrary scores (e.g., a scale of 1–5): (p. 327) "I've always considered an arbitrary score to be a sort of measurement wannabe," and he lists six reasons to support that statement.