Sloth plagues many measurement programs as they age. As time goes by, it is easy for practitioners to drift away from the passionate pursuit of transforming data into knowledge. Sloth in measurement programs is typically not caused by laziness. Leaders of measurement groups begin as true believers, full of energy. However, over time, many programs fall prey to wandering relevance. When relevance is allowed to waver, it is very difficult to maintain the same level of energy as when the program was new and shiny. Relevance can slip away if measurement goals are not periodically challenged and validated. An overall reduction in energy can also occur, even when goals are synchronized, if there is a conflict between any of the stakeholder classes (the measurement team, management or the measured) over how the data will be used and analyzed. Your energy will wane if your work results in public floggings or fire drills (and at the very least it will make you unpopular).

The drift into sloth may be a reflection of a metrics palette that is not relevant to the organization’s business and is therefore unlikely to produce the revelations that create excitement and interest. This can cause a cascade of further issues. Few metrics programs begin life by selecting irrelevant metrics, except by mistake; however, over time relevance can wander as goals and organizational needs change. Without consistent review, relevance will wane and it will be easy for metrics personnel to lose interest and become indifferent and disengaged.

In order to avoid sloth due to drifting goals, or to reclaim your program from it, synchronize measurement goals with the organization’s goals periodically. I suggest mapping each measurement goal and measure to the organization’s goals. If a direct link can’t be traced, I suggest that you replace the measure. Note: measurement goals should be reviewed and validated any time a significant management change occurs.
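As a minimal illustration of that traceability check, the sketch below (in Python, with invented metric and goal names) flags any measure that cannot be traced to an organizational goal:

```python
# Hypothetical traceability audit: map each measure to the organizational goal
# it supports and flag any measure without a direct link. All names are invented.

organizational_goals = {"reduce time-to-market", "improve delivered quality"}

measure_to_goal = {
    "lead time (days)": "reduce time-to-market",
    "delivered defect density": "improve delivered quality",
    "lines of code per developer": None,  # no direct link could be traced
}

for measure, goal in measure_to_goal.items():
    if goal in organizational_goals:
        print(f"KEEP    {measure} -> {goal}")
    else:
        print(f"REVIEW  {measure}: no traceable link; candidate for replacement")
```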

When usage is the culprit, your job is to counsel all stakeholders on proper usage. However, if management wants to use measurement as a stick, that is their prerogative. Your prerogative is to change fields or to act out and accept the consequences. If usage is a driver of the lack of energy, you probably failed much earlier in the measurement program, and turning the ship will be very difficult. Remember that it pays to spend time counseling the organization about how to use measurement data from day one rather than getting trapped in a reactionary mode.

The same symptoms occur when management is either disinterested (not engaged and not disposed positively or negatively toward the topic) or has become uninterested (disengaged). The distinction between disinterested and uninterested is important because the solutions are different. Disinterest requires marketing to find a reason to care, to be connected. A stakeholder that has become uninterested needs to be reconnected by being given information that makes their decisions matter. Whatever the reason for actively disengaging or losing interest, losing passion for metrics will sap the vitality of your program and begin a death spiral. Keep your metrics relevant, and that relevance will provide protection against waning interest. Metrics professionals should ensure there is an explicit linkage between the metrics palette and the business goals of the organization. Periodically audit your metrics program; as part of the audit, map the linkages between each metric and the organization’s business goals. Make sure you are passionate about what you do. Sharing your passion for developing knowledge and illustrating truth will help generate a community of need and support.

Synchronizing goals, making metrics relevant and instilling passion may not immunize your metrics program from failure, but they will certainly stave off the deadly sin of sloth. If you can’t generate passion, or can’t generate the information and knowledge from the metrics program that creates relevance, consider a new position, because in the long run not making the change isn’t really an option.


Wrath

Wrath is the inordinate and uncontrolled feeling of hatred and anger. I suspect that you conjure a picture of someone striking out with potentially catastrophic results. When applied to measurement, wrath is the use of data in a negative or self-destructive manner. Very few people are moved to measure by wrath; rather, they are moved by wrath to use measurement badly. Wrath causes people to act in a manner that might not be in their own or the organization’s best interest. Both scenarios are bad. Data, and the information (good or bad) derived from that data, can be used as a weapon in a manner that destroys the credibility of the program and the measurement practitioners.

Anger impairs one’s ability to process information and to exert cognitive control over behavior. An angry person may lose his or her objectivity, empathy, prudence or thoughtfulness and may cause harm to others. Actions driven by extreme anger are easily recognized by observers, but rarely by those perpetrating the behavior. This is an example of being blind with rage. There is no room in the workplace for rage. Protect your measurement program and your career by staying in control. When confronted with scenarios that induce rage, you need to learn how to step back and see the whole situation. Being mad or angry is fine if those emotions do not cloud your judgment. Teaching yourself to see things more calmly will help you realize the harm that you cause to yourself and others through rage. I once saw a CIO fly off the handle when a project shared its measurement dashboard, reporting that it was behind schedule, defects were above projections and the number of potential risks was rising. The uncontrolled rant was awe-inspiring; however, the CIO lost the support of his senior leaders and within a month he was gone. Control puts you in a position to react in a more rational manner.

Measurement data, and the information derived from that data, deliver the ability to understand why things happen: why a project is late, why a project costs what it does, or even why a specific level of quality was achieved. Measurement is a tool for taking action to improve how work is done. What it should not be is a weapon of indiscriminate destruction. Acting in a rage changes all of that. When you strike out in an uncontrolled manner, you have transformed that data into a weapon with very little guidance. Think of the difference between the indiscriminate nature of a land mine and the precision of the phasers of the Starship Enterprise. Wrath turns a potentially valuable tool into something far less reliable. For example, a purposeful misrepresentation of the meaning of data can lead to a team or organization making wrong decisions. Other examples include errors of omission (leaving out salient facts) or inclusion (including irrelevant data that changes the conclusions drawn from the data). Whether by omission or inclusion, poor use of data erodes the value of the measurement program through politicization or by planting doubt about the value of measurement in people’s minds. Remember that all analysis requires interpretation; however, the interpretations are generally based on an assumption that people will act logically and consistently. That includes your behavior. Analysis based on an obviously false assumption just to make a point does no one any good in the long run. For example, assuming productivity is constant across all sizes of projects so that you can show a project under-performed to get back at someone will destroy your credibility even if you win the argument. Be true to the data or become the point of a failure in trust.

Do not confuse passion and rage; they are not the same. You must have passion to be effective, but what you can’t do is lose control of your emotions to the point that you stop thinking before you act. The deadly sin of wrath is a reflection of bad behavior; if you let wrath affect your behavior, you will begin a spiral that ends with a failure of trust.


Don’t trip the runner next to you just to win.

The results of software measurement can be held up as a badge of honor. It is not uncommon for a CIO, department manager, project manager or even technical lead to hold up the performance of their projects in front of others, engendering envy from other projects. Envy is a feeling of discontent and resentment aroused by, and in conjunction with, desire for the possessions or qualities of another. Measurement is a spotlight that can focus others’ envy if the situation is right. That can occur when bonuses are tied to measurement or when the assignment and staffing of projects is driven by unknown factors. There are two major types of metrics-based envy: one must be addressed at the personnel level and the second must be addressed organizationally.

Envy can be caused when the metrics of projects managed by others in your peer group (real or perceived) are held up as examples to be emulated. The active component of envy at this level is triggered by a social comparison that threatens a person’s self-image, and it can be exacerbated when the attributes that impact performance are outside of the team’s control. The type or complexity of the work coming to a team is generally not negotiable. Teams that get the really tough problems will generally not have the highest productivity, even though they may have solved an intractable business problem. Envy generated by this type of problem translates into a variety of harmful behaviors. In benign cases, we might just pass it off as office politics (which everybody loves, not); in a worst-case scenario it can generate a self-destructive spiral of negative behavior that is not helpful to anyone. Typical envy-driven behaviors to watch for include loss of will, poor communication, withdrawal and hiding. While the amateur psychologist in me would be happy to pontificate on the personal side of envy, I am self-aware enough to know that I shouldn’t. If you have fallen into the trap of envy, get professional help. If you are the manager of a person that is falling into this hole, get them help or get them out of the organization.

The other category of triggers is organizational. These are the triggers that, as managers, we have more control over and have an obligation to address. As leaders we have a chance to mold the organizational culture to be supportive of efficiency and effectiveness. Cultures and environments can facilitate and foster both good and bad behaviors. Cultures that prize individual competition above collaboration create an atmosphere where envy will flourish, which acts as a feedback loop to further deepen silos and the possibility of envy. For example, Sid may feel that Joe always gets the best recruits and that he is powerless to change the equation (for whatever reason), therefore he can’t compete. Envy may cause him to focus on stealing Joe’s recruits rather than coaching his own. This culture can disrupt communication and collaboration and create silos. In this type of environment, otherwise positive behaviors, such as displaying measurement data, can act as a feedback loop that deepens the competitive culture rather than generating collaboration and communication. Typical behaviors generated by envy triggered by organizational issues include those noted earlier, outright sabotage of projects and careers (tripping the runner next to you so you can win), and, just as bad, the pursuit of individual goals at the expense of the overall business goals.

Measurement programs can take the lead in developing a culture where teams can perform, be recognized for that performance, and then share the lessons that delivered that performance when it is truly special. An important way to understand what type of performance really should be held up and emulated is based on the work of W. Edwards Deming. In his seminal work Out of the Crisis, Deming suggested that only variation caused by special causes should be specifically reviewed, rather than normal or common cause performance. Understanding and using the concepts of common and special causes of variation as tools in your analysis will help ground your message in a reality that focuses on where specific performance is different enough to be studied. Common cause variation is generated by outcomes that are within the capability of the system, whereas special cause outcomes represent performance outside the normal capacity of the system. In every case, performance outside of the norm should be studied and, where positive, held up for others to emulate. By focusing your spotlight on these outcomes you have the opportunity to identify new cutting-edge ideas as well as ideas that should be avoided. Another technique for fostering collaboration (an environment where envy is less likely to take root) is to invite all parties to participate in the analysis of measurement data using tools such as a wiki. The measurement group should provide the first wave of analysis, then let the stakeholders participate in shaping the final analysis, using the crowdsourcing techniques made famous by Jimmy Wales and Wikipedia. Getting everyone involved creates a learning environment that uses measurement not only as a tool to generate information, but also as a tool to shape the environment and channel the corporate culture.
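As a minimal sketch of how this kind of analysis might work in practice, the Python snippet below applies individuals (XmR) control-chart limits to hypothetical sprint throughput data; the 2.66 constant is standard for XmR charts, but the observations are invented for illustration:

```python
import statistics

# Hypothetical throughput observations (e.g., story points completed per sprint).
observations = [28, 31, 27, 30, 29, 30, 26, 30, 28, 48]

mean = statistics.mean(observations)
moving_ranges = [abs(b - a) for a, b in zip(observations, observations[1:])]
avg_mr = statistics.mean(moving_ranges)

# Individuals (XmR) chart limits: mean +/- 2.66 * average moving range.
ucl = mean + 2.66 * avg_mr
lcl = mean - 2.66 * avg_mr

for sprint, value in enumerate(observations, start=1):
    if value > ucl or value < lcl:
        # Outside the limits: likely special cause variation, worth studying.
        print(f"Sprint {sprint}: {value} outside [{lcl:.1f}, {ucl:.1f}] - investigate")
    else:
        # Inside the limits: common cause variation, within system capability.
        print(f"Sprint {sprint}: {value} within normal variation")
```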

Measurement and measurement programs don’t cause the sin of envy. People and organizational cultures foster this sin in equal measure. Done correctly, measurement programs can act as a tool to tame the excesses that lead to this sin. However, the corollary is also true: done incorrectly or poorly, measurement ceases to be a positive tool and becomes part of the problem. Measurement that fosters transparency and collaboration will help an organization communicate, grow and improve.


Can’t see forest for the trees

The first deadly sin is pride. In the canon of deadly sins, pride is the sin from which all others spring. In the world of metrics programs, the sin of pride occurs when a metrics program settles on a single metric that is used to reflect the value or well-being of a project, a group or an organization. Examples abound of metrics programs that fixate on cost or productivity to the exclusion of a broader palette of metrics and work attributes. Most metrics professionals quickly learn that one metric cannot be used for all projects. If you can’t easily answer the question, “Does this relate?” each time you use a metric, and for each metric you use, the information generated through measurement and analysis will provide little or no value. The goal is to understand the differences between groups of work so that when comparisons are made, you can discern what is driving the difference (or even whether there is a difference). Comparing package implementations, hardware-intensive projects or custom development is rational only if you understand that there will be differences and what those differences mean. The bottom line is that a single metric rarely delivers the deep level of understanding that generates value from measurement.

Another example of the single-metric syndrome generated by the sin of pride occurs when an organization uses a single metric to value performance in a contractual arrangement. While entire contracts are rarely stipulated on a single metric, it is easy for a single metric to be given disproportionate weight due to the framers’ lack of understanding or a disconnect between the framers and the people who administer the contract. Poor understanding of the relationship between the numbers and the concepts they represent is akin to a failure of punctuation in writing. The resulting meaning can be garbled as the contract is negotiated, implemented and managed. We won’t get into an existential argument over whether something is a sin if it is inadvertent; the result is the same. Garbled concepts can lead to a single-metric focus which, once discovered, will beg to be taken advantage of. This usually causes an overemphasis on a specific portion of the value chain, such as productivity being emphasized over time-to-market, quality or cost.


In Christianity, the seven deadly sins are the root of all other sins. This concept has been used as an analogy for the ills or risks of many professions. The analogy fits software metrics just as well, focusing attention on the behaviors that could sap your program’s integrity, effectiveness and lifespan. Here we will look at the deadly sins from the point of view of a person or group that is creating or managing a metrics program. As with many things in life, forewarned is forearmed, and knowledge is a step toward avoidance.

Here are the seven deadly sins of metrics programs:

  • Pride – Believing that a single number/metric is more important than any other factor.
  • Envy – Instituting measures that facilitate the insatiable desire for another team’s people, tools or applications.
  • Wrath – Using measures to create friction between groups or teams.
  • Sloth – Unwillingness to act on or care about the measures you create.
  • Greed – Allowing metrics to be used as a tool to game the system for gain.
  • Gluttony – Application of an excess of metrics.
  • Lust – Pursuit of the number rather than the business goal.

All of the deadly sins have an impact on the value a metrics program can deliver. Whether any one sin is more detrimental than another is often a reflection of where a metrics program is in its life cycle. For instance, pride, the belief that one number is more important than all other factors, is more detrimental than sloth (a lack of motivation) as a program begins, whereas sloth becomes more of an issue as a program matures. These are two very different issues with two very different impacts, but neither should be sneezed at if you value the long-term health of a metrics program. Pride can lead to overestimating your capabilities, and sloth can lead to not using the capabilities you have; in the end, self-knowledge is the greatest antidote.

Over the next few days we will visit the seven deadly sins of metrics!


Baseline, not base line…

Measuring a process generates a baseline. By contrast, a benchmark is a comparison of a baseline to another baseline. Benchmarks can compare baselines to other internal baselines or to external baselines. I am often asked whether it is possible to externally benchmark measures and metrics that have no industry definition or that are occasionally team specific. Without developing a common definition of the measure or metric so that the data is comparable, the answer is no. A valid baseline and benchmark require that the measure or metric being collected is defined and consistently collected by all parties using the benchmark.

Measures or metrics used in external benchmarks need to be based on published standards or standards agreed upon between the parties involved in the benchmark. Most examples of standards are obvious. For example, in the software field there are a myriad of standards that can be leveraged to define software metrics; examples of standards groups include IEEE, ISO, IFPUG, COSMIC and OMG. Metrics that are defined by these standards can be externally benchmarked, and there are numerous sources of data. Measures without international standards require all parties to specifically define what is being measured. I recently ran across a simple example in which the definition of a month caused a lot of discussion. An organization compared function points per month (a simple throughput metric) to benchmark data they had purchased. The organization’s throughput was remarkably below the benchmark. The problem was that the benchmark used the common definition of a month (12 in a year), while their data used an internal definition based on a 13-period year. One data set or the other should have been normalized so that they were comparable.
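A small worked example of the normalization that was missing, using invented numbers, shows how large the distortion can be:

```python
# Hypothetical conversion of throughput reported against a 13-period internal
# calendar into calendar months (12 per year) so it can be benchmarked fairly.

fp_per_internal_period = 60        # function points delivered per internal "month"
periods_per_year = 13
calendar_months_per_year = 12

fp_per_year = fp_per_internal_period * periods_per_year            # 780 FP per year
fp_per_calendar_month = fp_per_year / calendar_months_per_year     # 65 FP per month

print(f"Comparable throughput: {fp_per_calendar_month:.1f} FP per calendar month")
```

Left unconverted, the 60 function points per internal period reads roughly eight percent lower than the equivalent 65 function points per calendar month, which is exactly the kind of apples-to-oranges gap that made the organization’s data look worse than it was.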

Applying the defined metric consistently is also critical, and not always a given. For example, when discussing the cost of an IT project, understanding what is included is important for consistency. Project costs could include hardware, software development and changes, purchased software, management costs, project management costs, business participation costs, and the list could go on ad infinitum. Another example is the use of story points (a relative measure based on team perception). While a team may well be able to apply the measure consistently because it is based on comparisons of the team’s own perceptions, using it outside of the team would be at best valueless and at worst dangerous.

The data needed to create a baseline and to make a benchmark comparison must be based on a common definition that is understood by all parties, or the results will generate misunderstandings. A common definition is only a step along the route to a valuable baseline or benchmark; the data collection must also be done on a consistent basis. It is one thing to agree upon a definition and another to have that definition consistently applied during data collection. Even metrics like IFPUG Function Points, which have a standard definition and rigorous training, can show up to a five percent variance between counters. Less rigorously defined and trained metrics are unknowns that require due diligence from anyone who uses them.


Measure data that incentivizes behavior that moves you towards your goals.

There are three basic goals for measurement. The first goal is to drive change in an organization. The second goal is to use measurement as an enforcement tool. And the third is to provide data for other processes or decisions (estimation, for example). The third of these goals is the most mundane and the easiest to implement, and is therefore where most metrics implementations start.

Metrics programs that are part of a goal to drive change are considered to have reached the pinnacle of metrics program value (goal one). Metrics programs reach this level when they participate in identifying opportunities to make changes through data analysis. Sifting through data, identifying potential changes and using measurement to guide change is a strategic approach for organizations that want to evolve toward higher levels of capability. Unfortunately, most metrics programs do not have the time to follow this evolutionary approach. Organizations find it difficult to achieve the constancy of purpose required to gather the data and wait for the amount of data to reach the threshold needed for solid statistical analysis and data mining. A second approach is to follow a more aggressive strategy: go looking for areas or groups in pain. When you identify a group in pain, use measures and metrics to help them find and prove the change that would remove that pain. This is an opportunistic approach to measurement that links measurement with process improvement to create valuable change.

The second goal, measurement as an enforcement mechanism, is a bad idea. Avoid being forced into this role. Making measurement personnel into process police will politicize measurement and require tons of effort for little value. This usually happens in organizations where line management is spread too thin to actively manage people and processes. In this role, measures are often used to answer the questions:

  • Are changes being adopted?
  • Are changes providing the expected rate of return?
  • Who is playing, and who is not?

Metrics programs spend the majority of their time and effort gathering data and supporting other processes and decisions (the third goal). While important, the support role is the least visible to the decision-making portions of an organization. Metrics groups that only service the data needs of other groups in the organization are apt to be viewed as overhead, not as strategic assets. The question will always be asked whether measurement overhead can be cut; therefore, the measurement team will always be at risk.

Bottom line: Metrics programs deliver the most value when they are focused on finding and actively participating in delivering change within organizations. When metrics programs are the measurement arm of the process police, or merely support other, more valued teams (like estimators), they can easily be branded as overhead. All metrics organizations need to work on refocusing their efforts toward getting involved in change while still supporting other processes. Being branded as overhead is dangerous for your career.


How fast are you getting to where you’re going?

What is the difference between productivity and velocity? Productivity is the rate of production using a set of inputs over a defined period of time. In a typical IT organization, productivity gets simplified to the amount of output generated per unit of input; function points per person month is a typical expression of productivity. For an Agile team, productivity could very easily be expressed as the amount of output delivered per time box, and average productivity would be equivalent to the team’s capacity to deliver output. Velocity, on the other hand, is an Agile measure of how much work a team can do during a given iteration. Velocity is typically calculated as the average number of story points a team completes per sprint. Conceptually the two metrics are very similar; the most significant differences relate to how effort is accounted for and how size is defined.

The conventional calculation for IT productivity is:


Productivity = units of work delivered (e.g., function points) ÷ effort expended (e.g., person months)

Function points, use case points, story points or lines of code are typical size measures. Work in progress (incomplete units of work) and defective units generally do not count as “delivered.” Effort expended is the total effort for the time box being measured.

The typical calculation for velocity for a specific sprint is:

Velocity = story points completed in the sprint ÷ one team-sprint of effort

Note that, as a general rule, both metrics are reported as averages; one observation of performance may or may not be representative.

In both cases the denominator represents the team’s effort for a specific sprint; however, when using velocity the unit of measure is the team rather than hours or months. Using a team’s average velocity assumes that the team’s size and composition are stable. This tends to be a stumbling block in many organizations that have not recognized the value of stable teams.

The similarities between the two metrics can be summarized as:

  • Velocity and productivity measure the output a team delivers in a specific timeframe.
  • Both metrics can be used to reflect team capacity for stable teams.
  • Both measures only make sense when they reflect completed units of work.

The differences between the two metrics are mostly a reflection of the units of measure being used. Productivity generally uses size measures that allow the data to be consolidated for organizational reporting, while velocity uses size measures, such as story points, that are team specific. A second difference is convention: productivity is generally stated as a number of units of work per unit of effort (e.g., function points per person month), while velocity is stated as an average rate per sprint (average story points per sprint). While there are differences, they are more a representation of the units of measure being used than of the ideas the metrics represent.
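To make the parallel concrete, here is a minimal sketch that computes both metrics from the same hypothetical sprint history; the team data, the 160-hour person month and the sizes are all invented for illustration:

```python
# Hypothetical sprint history for one stable team.
# Each entry: (story points completed, function points delivered, person hours spent)
sprints = [
    (30, 24, 480),
    (28, 22, 460),
    (34, 27, 500),
]

total_points = sum(s[0] for s in sprints)
total_fp = sum(s[1] for s in sprints)
total_hours = sum(s[2] for s in sprints)

# Velocity: average completed story points per sprint (the denominator is the team-sprint).
average_velocity = total_points / len(sprints)

# Productivity: delivered size per unit of effort (here, function points per person month,
# assuming roughly 160 person hours per person month).
person_months = total_hours / 160
productivity = total_fp / person_months

print(f"Average velocity: {average_velocity:.1f} story points per sprint")
print(f"Productivity: {productivity:.1f} function points per person month")
```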


Measure Twice or Good Numbers Can Go Bad

Mark Twain popularized the saying, “There are lies, damned lies, and statistics.” The same numbers can be used to support many causes. Even though numbers are just numbers, they can be used to tell a story.

What you do with the messages developed from the metrics you collect is important in its own right. Messages become tools (or weapons) to motivate. Motivation can range from the positive (look how well you are doing) to the negative (look how badly you are doing) to the ultimatum (do better or else). Here we will discuss what happens when there is no message or when the message and the data aren’t synchronized (part two!).

Using Team Measures for Individuals:

Measurement is an intimate subject because it exposes the person being measured to praise or ridicule. Management will many times begin with a group or team-level focus only to shift inexorably to a focus on the individual. The individual view is fraught with difficulties, such as gaming and conflict, which typically become the norm and cause anti-team behavior, which in the long run will reduce quality, productivity and time-to-market (short-term gains, but long-term pain). The focus of measurement must stay at the team level for measures that reflect the results of team behavior, and evolve to individual measures only when the measure relates to the output of an individual.

Don’t We Want To Be Average?:

Another classic mistake made with numbers is regression to the mean: performance will tend to approximate the average performance demonstrated by the measures chosen. A method to address this mistake is to:

1. Select the metric based on the behavior you want to induce,

2. Set goals to incent movement in the proper direction and away from average.

The proper direction is never to be average in the long run.

It’s All About People:

It is difficult to ascribe a motive to a number; it is merely the tool of the person wielding it. Put a number in a corner and it will stay there minding its own business, not attracting attention or detracting from anyone else. Add a person and the scenario begins to change. The wielder, whether Lord Voldemort, Dumbledore or some other high lord of metrics, becomes the determining factor in how the number will be represented. Measures and metrics can be used for good or evil. Even when the measures are properly selected and match the organization’s culture, “badness” can still occur through poor usage (people). There is one theory of management that requires public punishment of perceived laggards (keelhaul the lubber) as a motivation technique. The theory is that punishment, or the fear of punishment, will lead to higher output. In real life, fire drills (the team running around like crazy to explain the numbers) are the more natural output, absorbing time that could be used to create value for IT’s consumers. The fire drills and attempts to game the numbers reduce the value of measurement specifically and of management more generally.

Are Report Cards A Silver Bullet?:

Report cards are a common tool used to present a point-in-time view of an organization, project or person. At their best, report cards are benign tools that are able to consolidate large quantities of data into a coherent story. However, creating coherence is not an easy feat. It requires a deft hand to produce a balanced, comprehensive view that is integrated with what is really important to the firm and the entity being measured. Unfortunately, since this is difficult, often a stilted view is given based on the data that is easily gathered or important only to a few parties. This stilted view is a surefire prescription for when Good Numbers Go Bad. The solution is to build report cards based on the input from all stakeholders. The report card needs to include flexibility in the reporting components so they can be tailored to include the relevant facts that are unique to a project or unit without disrupting its framework. Creating a common framework helps rein in out-of-control behavior by making it easy to compare performance (peer pressure and comparison being the major strengths of report cards).

Most of us were introduced to report cards during school. In most cases, they were thrust upon us on a periodic basis, each report card presenting a summary of the basic school accounting. While we did not get to choose the metrics, at least we understood the report card, and the performance it represented seemed to be linked to our efforts. Good Numbers Go Bad when corporate report cards are implemented using team-level metrics as a proxy for individual performance. As I noted above, balance is critical to elicit expected behavior, as is application of metrics at the proper level of aggregation (teams to teams, people to people). Team metrics present information on how the whole team performed; unless the metrics are applied to the unit that controlled performance, they miss the mark.

The Beatings Will Continue Until . . .:

“One characteristic of a bad metrics program is to beat people up for reporting true performance.” — Miranda Mason, Accenture

Terms like “world-class” and “stretch” get used when setting goals. These types of goals are set to push teams or individuals to over-perform for a period of time. This thought process can cause inappropriate behaviors, in which the goal seekers act more like little children playing soccer, with everyone chasing the ball rather than playing their positions as a coordinated unit. Goals that make you forget teamwork are a perfect example of when Good Numbers Go Bad. Good measurement programs challenge this method of goal setting. Do not be impressed when you hear quotes like “we like to put it out there and see who will make it.”

Goals are an important tool for organizations and can be used to shape behavior. Used correctly, both individual and team behaviors can be synchronized with the organization’s needs. However, when used incorrectly, high-pressure goals can create opportunities for unethical behavior. An example of unethical behavior I heard about recently was in an organization that promoted people for staying at work late. The thinking was that working more hours would increase productivity. A manager would check in at approximately 8 PM every evening, ostensibly to do something in the office, but in reality to see who was there. Many people did nothing more than go out to dinner and then come back to work, or just read a newspaper until the appointed hour; when the manager checked, there were many people working away at their desks. I suspect that little additional time was applied to delivering value to the organization or its customers. The manager should have spent time determining the behavior (good and bad) that the metrics and the goals for those metrics would incent. Spending the time on the psychology of measures will increase the likelihood that you will get what you want.

Politics:

As noted earlier, the way numbers are used has a dramatic impact on whether long-term goodwill is generated by the use and collection of metrics. The way numbers are used is set by the intersection of organizational policy and politics. The mere mention of the word politics connotes lascivious activities typically performed inside a demonic pentagram. However, all human interactions are political in nature; interactions are political, and organizations are collections of individuals. Many times the use of the word “political” is a code to indicate a wide range of negative motives (usually attributed to the guy in the next cube) or to hide the inability to act. When you are confronted with the line “we can’t challenge that, it is too political,” step back and ask what you are really being shown. Is it:

  • lack of power or will;
  • lack of understanding; or
  • lack of support?

Once you have identified the basis for the comment, you can build a strategic response.

Metrics, A Tool For Adding More Pressure?:

There are many types of pressure that can be exerted using metrics. Pressure is not necessarily a bad thing; rather, it is the intent of the pressure that determines whether the metric/pressure combination is good or bad. Good Numbers Go Bad when measurement pressure is used to incent behavior that is outside of ethical norms. Pressure to achieve specific metrics, rather than a more constructive goal, can create an environment where the focus is misplaced. School systems that have shifted from the goal of creating an educated community to the goal of passing specific tests are a good example. An IT example once described to me was an organization that measured productivity (with the single-metric problem described before) in which a project gamed its productivity performance by under-reporting effort (actually, they hid it in another project). As in the discussion of using single metrics, creating a balanced view that targets organizational goals and needs is the prescription. When a balanced approach is applied, pressure can be used to move the team or individual toward the organizational goals in a predictable (and ethical) manner.

Measure Twice or Good Numbers Can Go Bad

You can’t just hope that mistakes will go away…

Mistakes come in many flavors: errors of commission and omission; calculation mistakes or errors in mathematics (wrong formulas, faulty logic or just ignoring things like covariance); and just plain stupid mistakes. As a group, mistakes are the single biggest reason Good Numbers Go Bad. Mistakes by definition occur by accident and are not driven by direct animus. The grace and speed with which you recognize and recover from a mistake will determine the long-term prognosis for the practitioner and his or her program (assuming you don’t make the same mistake more than once or twice). Ignoring a mistake is bad practice; if you need to make a habit of brazening out the impact of mistakes, you should consider a new career, as you have lost the long-term battle over the message.

Collection Mistakes:

Collection mistakes are a category that covers a lot of ground, ranging from gathering the wrong data to erratic data collection. While collecting the wrong information can lead to many other kinds of mistakes, it is the recognition of and recovery from collection errors, which lead to credibility issues, that will be explored in this section.

“In order to capture metrics, the procedures, guidelines, templates, and databases need to be in sync with the standard practices.”

— Donna Hook, Medco

Data collection errors typically represent errors of omission (data not collected); however, occasionally the wrong information is collected. Collecting the wrong data (or data you do not understand) will create situations where your analysis is wrong (garbage in), with the possibility that you won’t know it (gospel out). Someone will usually discover this error at the worst possible time, leading to profuse sweating and embarrassment. Gathering the wrong or incomplete data is a nontrivial mistake that makes good numbers go bad. However, what you do about it will say a lot about your program.

Begin by making sure you have specified the data to a level that allows you to ascertain that what you collect is correct. Periodically auditing the collection process against the collection criteria helps ensure that you collect the correct data and collect it correctly. Create rules (or at least rules of thumb) that support validation; rules of thumb will help you quickly interpret the data. Did you get the quantity of data you expected? Has the process capability apparently changed more than you would reasonably expect?
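A minimal sketch of what such rules of thumb might look like in code, using invented counts, baselines and tolerances:

```python
# Hypothetical sanity checks applied to a monthly data collection cycle.
# Counts, baselines and tolerances are illustrative, not prescriptive.

expected_record_count = 40          # roughly one record per active project
prior_mean_productivity = 8.2       # baseline from earlier periods (FP per person month)

def validate(records):
    """Return warnings suggesting the collection should be checked before use."""
    problems = []
    # Rule of thumb 1: did we get roughly the quantity of data we expected?
    if abs(len(records) - expected_record_count) > expected_record_count * 0.2:
        problems.append("record count differs sharply from expectation; was collection complete?")
    # Rule of thumb 2: has apparent process capability shifted more than is plausible?
    mean_productivity = sum(records) / len(records)
    if abs(mean_productivity - prior_mean_productivity) > prior_mean_productivity * 0.25:
        problems.append("apparent capability shift is implausibly large; check definitions and sources")
    return problems

# Example month: only 25 records arrived and productivity looks far higher than the baseline.
this_month = [12.5] * 25
for warning in validate(this_month):
    print("CHECK:", warning)
```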

Erratic Collection:

Measures and metrics can be perceived to be so important that panicked phone calls are known to precede collection. Equally as interesting are the long periods of silence that occur before the panic. Erratic data collection sends a message that the data (and therefore the results) are only as important as whoever goosed the caller (or slightly less important than whatever the caller was doing right before he or she called). Inconsistent collection leads to numerous problems, including rushed collection (after the call), mistakes and an overall loss of face for the program (fire drills and metrics ought to be kept separate). Consistency spreads a better message of quiet importance.

Mathematical Mistakes:

“We accidentally used one number instead of a correct value. Now our stakeholders ask for a second source.”

— Rob Hoerr, Formerly Fidelity Information Services

“Mathematical mistakes happen! We are all human!” These excuses are a familiar anthem, which means all measurement programs must take the time and effort to validate the equations they use. Equations must be mathematically and intellectually sound. Inaction in the face of mistakes in the equations or results makes good numbers go bad; neither your results nor your equations should be so ingrained that your program freezes into inaction when a mistake is found. The need to avoid math mistakes driven by not understanding the data places a lot of stress on the need to create measurement and metrics specifications. Once the specification, including items like a description, formulas and data definitions, is created, it is easier to make sure you are measuring what you want and that you get the behavior you anticipate. The spec provides a tool to gauge the validity of the math, the validity of the presentation and, by inference, the validity of the analysis.
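As a hedged illustration, a specification for a single metric might capture something like the following; the field names and values are invented for this sketch, not a standard template:

```python
# Hypothetical measurement specification record. A real program would hold one
# of these for every metric and review it whenever the metric or its use changes.
metric_specification = {
    "name": "delivered defect density",
    "description": "Defects found in production within 90 days of release, "
                   "normalized by delivered size.",
    "formula": "production defects (90 days) / delivered function points",
    "data_definitions": {
        "production defect": "severity 1-3 incident traced to the release",
        "delivered function points": "IFPUG-counted size of the release",
    },
    "collection_frequency": "per release, closed 90 days after go-live",
    "intended_use": "trend quality by product line; not an individual performance measure",
}
```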

Liars, Damn Liars and Statisticians:

Statistics has long been a staple of business schools, which instill the belief that numbers can prove anything. Numbers, however, require an understanding of the underlying equations that flies in the face of this mentality. When simple relationships are ignored to make a point, good numbers go bad. Examples of questionable math include graphs with the same variable (in different forms) on both axes, presented with linear regression lines driven through them. The created covariance goes unrecognized, leaving the analysts speculating on what the line means without recognizing that the relationship is self-inflicted.
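The self-inflicted relationship is easy to demonstrate with a small simulation: generate size and effort figures that are statistically independent, then correlate size with a ratio built from size. The snippet below is illustrative only; the data is random and the ranges are invented:

```python
import random

random.seed(1)

# Independent, purely random "size" and "effort" observations.
size = [random.uniform(100, 1000) for _ in range(200)]
effort = [random.uniform(5, 50) for _ in range(200)]

def correlation(xs, ys):
    # Pearson correlation coefficient computed from scratch.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = (sum((x - mx) ** 2 for x in xs) / n) ** 0.5
    sy = (sum((y - my) ** 2 for y in ys) / n) ** 0.5
    return cov / (sx * sy)

# A ratio metric that reuses size in its numerator.
productivity = [s / e for s, e in zip(size, effort)]

print(f"size vs effort:       r = {correlation(size, effort):.2f}")        # typically near zero
print(f"size vs size/effort:  r = {correlation(size, productivity):.2f}")  # noticeably positive
```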

Developing a simple understanding of the concepts of covariance, r-squared values and standard error is an easy step toward sorting out basic conceptual errors. A corollary to this is that knowledge of statistics will not necessarily stop your other mistakes, like adding the wrong Excel cells together, but it can’t hurt. Always check your equations, check your statistics, and never fail to check the math!