Story points make a poor organizational measure of software size.

Recently I did a webinar on User Stories for my day job as Vice President of Consulting at the David Consulting Group. While preparing for the webinar, I asked everyone who had registered to send in the questions they wanted addressed, and I received quite a few responses. I did my best to answer the questions; however, I thought it would be a good idea to circle back and address a number of them more formally. Several of the questions concerned the use of story points.

The first set of questions focused on using story points to compare teams with one another and with other organizations.

Question Set 1: Story Points as an Organizational Measure of Software Size

Story points make a poor organizational measure of software size because they represent an individual team’s perspective and can’t be used to benchmark performance between teams or organizations.

Story points (unlike function points) are a relative measure based on the team’s perception of the size of the work. The team determines size from its level of understanding of the work, how complex the work is, and how much effort it requires compared to other units of work. Every team will have a different perception of the size of work. For example, one team thinks that adding a backup to their order entry system is fairly easy and calls the work five story points, while a second team might size the same work as eight story points. Does the difference mean that the second team thinks the work is nearly twice as difficult, or does it reflect a different frame of reference? Story points do not provide that level of explanatory power and should not be used in this fashion. Inferring the real degree of difficulty, or the length of time required to deliver the function, from an outsider’s perception of the reported story point size will lead to wrong answers.
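The frame-of-reference problem can be sketched in a few lines of Python. This is only an illustration: the five and eight point values for the backup work come from the example above, while the other work items, their point values, and the relative_sizes helper are hypothetical and invented for this sketch. Dividing by a shared reference item is not a way to make story points comparable across teams; it simply shows how a different baseline alone can account for much of the gap between five and eight.

```python
# Hypothetical estimates from two teams for the same three work items.
# Only the "add backup" values (5 and 8) come from the example in the text;
# every other number is invented for illustration.
team_a = {"add backup": 5, "new report": 2, "tax rule change": 13}
team_b = {"add backup": 8, "new report": 3, "tax rule change": 21}


def relative_sizes(estimates, reference):
    """Express each item as a multiple of the team's own reference item."""
    base = estimates[reference]
    return {item: points / base for item, points in estimates.items()}


# Within a team, the ratios are meaningful: each one compares a piece of
# work against that team's own baseline.
a_rel = relative_sizes(team_a, "new report")
b_rel = relative_sizes(team_b, "new report")

# Both teams rank the work in the same order, yet their raw numbers differ.
same_order = sorted(team_a, key=team_a.get) == sorted(team_b, key=team_b.get)

for item in team_a:
    print(f"{item}: A={team_a[item]} pts ({a_rel[item]:.2f}x), "
          f"B={team_b[item]} pts ({b_rel[item]:.2f}x)")
print("Same ordering of work items:", same_order)
```

In this made-up data, the raw estimates suggest Team B sees the backup work as much larger (eight versus five), but relative to each team's own smallest item the two estimates are close. An outsider who sees only the raw numbers has no way to tell a real difference in difficulty from a difference in scale.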

There are many published and commercially available benchmarks for function points, including the IFPUG, COSMIC, NESMA and Mark II varieties (all of which are ISO standards). These benchmarks represent data collected or reported using a set of internationally published standards for sizing software. Because story points are by definition a measure based on a specific team’s perception and not on a set of published rules, there are no industry standards for story point performance.

To benchmark and compare performance between groups, an organization needs to adopt a measure or metric based on a set of published and industry-accepted rules. Story points, while valuable at a team level, by definition fail on this point. As they are currently defined, story points can’t be used to compare teams or organizations. Any organization that publishes industry performance standards based on story points has either redefined story points or does not understand what story points represent.