Measuring TDD is a lot like measuring a cyclone!

Teams and organizations adopt test-driven development for many reasons, including improving software design, improving functional quality, shortening time to market, or because everyone is doing it (well, maybe not that last reason…yet).  To justify the investment of time, effort, and even cash for consultants and coaches, most organizations want some form of proof that there is a return on investment (ROI) from leveraging TDD. The measurement issue is less whether something needs to be measured (I am ignoring the “you can’t measure software development” crowd) and more what constitutes an impact, and therefore what really should be measured. Erik van Veenendaal, an internationally recognized testing expert, stated in an interview that will be published on SPaMCAST 406, “unless you spend the time to link your measurement or change program to business needs, they will be short-lived.”  Simply adopting someone else’s measurement best practices tends to be counterproductive because every organization has different goals and needs.  Organizations therefore adopt TDD for different reasons and will need different evidence to assure themselves that they are getting a benefit.  There is NO single measure or metric that proves you are getting the benefit you need from TDD.  That is not to say that TDD can’t or should not be measured.  A palette of commonly used measures, organized by the generic goal they address:

Goal:  Improve Customer Satisfaction

  •  Customer Satisfaction Index – Measure the satisfaction of the customers of the product or project by asking a series of questions, and then measure how their responses change over time.
  •  Net Promoter – Ask customers, “How likely are you to recommend the product or organization to a friend or colleague?” The difference between the percentage that would recommend the product or project and the percentage that would not shows how customer satisfaction is changing.
  •  Delivered Defects – A count (or a scaled count, such as defects per unit of work) of delivered defects is often used as a proxy for customer satisfaction.
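As a sketch, Net Promoter can be computed from the 0–10 survey responses; the thresholds below (promoters rate 9–10, detractors 0–6) follow the common convention, and the function name is illustrative:

```python
def net_promoter_score(ratings):
    """Net Promoter Score from a list of 0-10 survey ratings.

    Promoters rate 9-10, detractors 0-6; passives (7-8) count toward
    the total but toward neither bucket.  Returns percentage points
    in the range -100 to +100.
    """
    if not ratings:
        raise ValueError("no ratings supplied")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


# Three promoters, one passive, one detractor out of five responses:
print(net_promoter_score([9, 9, 10, 7, 3]))  # 40.0
```

Tracking this score survey over survey, rather than reading any single value in isolation, is what reveals whether satisfaction is moving.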

Goal:  Decrease the Cost of Development

  •  Development and Delivered Defects (combined) – The change in the count (or a scaled count, such as defects per unit of work) of defects discovered during development and delivered across the development process is a direct measure of quality.  As the number of defects created falls, the cost of finding and fixing them goes down.
  •  Labor Productivity – The ratio of output per person. Labor productivity measures the efficiency of labor in transforming inputs into a product of higher value.  Improving efficiency is typically linked to decreasing the cost of creating a product or delivering a project.
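Both measures above are simple ratios; a minimal sketch, with illustrative function names and with size expressed in whatever consistent unit the organization uses (story points, function points, etc.):

```python
def labor_productivity(output_size, person_months):
    """Output delivered per person-month of effort."""
    return output_size / person_months


def defect_density(defects, output_size):
    """Defects per unit of work -- a scaled defect count."""
    return defects / output_size


# 120 story points delivered by a team spending 6 person-months,
# with 12 defects found along the way:
print(labor_productivity(120, 6))  # 20.0 points per person-month
print(defect_density(12, 120))     # 0.1 defects per point
```

The value comes from comparing these ratios before and after adopting TDD, not from the absolute numbers.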

Goal: Improve Product or Project Time-to-Market

  •  Time to Market – A measure of the speed at which an item moves through the development process from backlog to production. This measure is always denominated in calendar time, but the numerator can be either value or size, depending on the specific question the organization needs to answer.
  •  Concept to Cash – A measure of the calendar time it takes for an idea to go from being accepted into the portfolio backlog until it is first sold (or delivered) in the marketplace.
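Since both measures are denominated in calendar time, the core calculation is just date arithmetic between two recorded milestones; a minimal sketch with illustrative dates:

```python
from datetime import date


def lead_time_days(accepted, delivered):
    """Calendar days between backlog acceptance and delivery."""
    return (delivered - accepted).days


# A story accepted January 4 and delivered March 1, 2016:
print(lead_time_days(date(2016, 1, 4), date(2016, 3, 1)))  # 57
```

The same calculation serves concept-to-cash by swapping in the portfolio-acceptance and first-sale dates.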

Goal: Improve Product or Project Quality

  •  Delivered Defects – A count (or a scaled count, such as defects per unit of work) of delivered defects.  All things being equal, defects that customers (or users) experience negatively affect the perception of product quality.  The higher the number of delivered defects, the lower the perception of quality.
  •  Defect Removal Efficiency – Defect Removal Efficiency (DRE) is the ratio of the defects found and removed before delivery to the total defects found through some period after delivery (typically thirty to ninety days).
  •  Test Code Coverage – A measure of the branches or statements that are exercised by a group of tests. For example, in TDD, when a developer pulls a story from the backlog, he or she writes a series of tests that would prove the story is complete, runs the tests (they should all fail), then writes the code and re-runs the tests, which should all pass. Tests should exercise each line of code written or changed (100% coverage).  In TDD, generally, as code coverage goes up, fewer defects escape discovery and get delivered to someone else.
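The DRE ratio described above can be sketched directly; the function name and the handling of the zero-defect edge case are illustrative choices:

```python
def defect_removal_efficiency(found_before_delivery, found_after_delivery):
    """DRE as a percentage: defects found and removed before delivery
    divided by all defects found, including those reported in the
    post-delivery window (typically thirty to ninety days).
    """
    total = found_before_delivery + found_after_delivery
    if total == 0:
        return 100.0  # nothing found anywhere; treat as fully efficient
    return 100.0 * found_before_delivery / total


# 90 defects caught before delivery, 10 reported in the 90-day window:
print(defect_removal_efficiency(90, 10))  # 90.0
```

If TDD is working, the pre-delivery count should grow relative to the post-delivery count, pushing DRE upward over successive releases.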

Goal: Improve Software Design

  •  Improved Design – Measuring design is a can of worms.  Attributes that can be measured include reliability, efficiency, maintainability, and usability.  Which design attribute should be measured depends on the needs of the business.  For example, for consumer products, increased usability (how easy the product is to use) might be a critical measure.

Goal: Improve Compliance to Development Techniques (note: compliance is an internal goal and only tangentially relates to business goals; therefore, it should be adopted only if the indirect measures can be traced to delivering stated business goals.) 

  •  Ask and Count – A simple approach to measuring TDD compliance: when the code for a story is checked in, either ask whether TDD test cases were created and run, or validate that test cases were committed (before and after) along with the code.
  •  Change in the Automated Test Suite – Count the number of tests that have been added, changed, or deleted on a daily basis. This simple accounting approach can be easily tracked.  As stories are accepted and worked, changes to the automated test base will be apparent.
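The daily-count idea above can be sketched by diffing two snapshots of test names; this is an assumed approach (set comparison), and detecting *changed* tests would additionally require content hashes, which the sketch omits:

```python
def suite_changes(yesterday, today):
    """Compare two daily snapshots of test names (as sets) and
    report how many tests were added and how many were deleted.
    """
    return {
        "added": len(today - yesterday),
        "deleted": len(yesterday - today),
    }


# Two tests existed yesterday; one was removed and two were added:
snapshot_mon = {"test_login", "test_logout"}
snapshot_tue = {"test_logout", "test_reset_password", "test_lockout"}
print(suite_changes(snapshot_mon, snapshot_tue))  # {'added': 2, 'deleted': 1}
```

A flat or shrinking test base while stories are being accepted is an early signal that TDD is not actually being practiced.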

TDD is an important mechanism that puts the onus for unit testing directly on the members of the development team.  If you code, you test.  When adopting TDD, compliance measures may be important for showing progress; however, they are not as important as measuring business value.  Measures and metrics for TDD need to focus on changing something that is important to the business, such as cost, quality, time-to-market, or perhaps usability. Because in the end, that is really what counts!

Are there other options assuming you are going to measure your TDD implementation?