Estimating is a risky business.

Hey! How long is it gonna take… You know what, I’m not even gonna finish that sentence. It is a question I personally dread and sometimes even hate with the passion of a sugared-up 8-year-old throwing a tantrum. Time or “relative size” (more on that later) estimations are all fun and games until a manager puts a finger on your calendar and says: “You said it would take three days.”

But let’s all agree that planning and therefore harnessing the dark art of predicting the future is an important part of any project. 

In this blogpost we’ll outline a new-to-us method of estimating: one that allows accurate estimations even if the backlog items are not comparable. We use risk factors to quickly and methodically estimate the amount of work backlog items represent.

Why, why, why

Why do we need another estimation method? Well, most estimation methods require you to compare tasks to each other. In our case this comparability is not present, hence the need for a different method.

Why do you use risk as the central element? Because if hardly anything is comparable, the only thing left is the risk that each task carries.

Why should I care about this method? That highly depends on your situation. If you have very similar tasks in your backlog, this method is not for you. If, on the other hand, you have a random forest of tasks all screaming to get done, I suggest you keep reading.

Estimation units

Assuming you know a few estimation techniques, you are probably aware of the estimation units of measure: “Story Points”, “Ideal Days”  or (my favorite ahm..) “Person Days”. The latter completely ignores the fact that a “person day” is a chaotically fragmented collection of working hours, so we will leave that unit for now. 

The Ideal Days unit represents the unicorns of working days: an uninterrupted, focused, healthy, motivated and inspired work day. Using Ideal Days as a unit addresses some of the problems with the Person Day unit but not all of them. You’re still estimating time. Time is too tangible, too easily interpreted as a number of days added to the current date and used as a deadline or implicit expectation. In other words, Ideal Days are easily mistaken for Person Days.

The Agile Community, innovative and not letting a pesky unit spoil all the fun, made a different kind of unit popular: Story Points, named after the popular agile representation of a requirement, the User Story. Story Points represent an amount of work. I understand this does not seem like a big deal, but it is! Even in everyday speech an amount of work is our unit of choice: “I got a lot of work done” or “I have tons of work left to do”. You do not hear Marco from accounting mentioning that he has 6 hours of work left, or Linda from controlling casually dropping by to inform you she squeezed 3.45 hours of work into 2.7.

The amount of work a story point represents can therefore mean anything. For example, a story point can mean the full implementation of a user story, including testing and updating the user documentation. The important central element is that we can compare tasks or backlog items and go: “that one is at least twice as much work as the other one”. The underlying and fundamental aspect of a Story Point is the relative amount of work it represents, not the time it takes to complete the work.

Size units, looking for similarities 

With story points we entered the realm of relative size, an amount of work. All fine if tasks are similar (e.g. coding a database interaction compared to coding some user interaction) making them somewhat easily comparable. 

But how do I estimate the relative size if my project tasks are all completely different? Answering this question becomes problematic, especially if you and your team are confronted with a plethora of different tasks, encompassing different technologies, environments or usages. Those tasks seem to have almost nothing in common that would aid you in estimating the relative size of each task. How do you compare coding a batch script that triggers multiple programs to execute a macro with configuring an export from a complex requirement management tool?

We have a suggestion: Use Risk Factors as a unit of size. 

Establishing Risk Factors 

So we know we need estimates for effective project management, and therefore need to select an estimation unit in combination with a unit of size. This is very difficult if the project you’re in is, figuratively speaking, “all over the place”. One aspect project tasks always have in common is risk. Any Systems Engineer will tell you that managing risk is a big part of systems engineering.

How do you find those risk factors? First, you’ll need a number of finished tasks and the total hours spent on those tasks. Second, you proceed as follows:

  1. Group historical backlog data in hour groups.
  2. Extract common group characteristics.
  3. Assign T-shirt sizes to the hour group characteristics.
  4. Create an algorithm that produces the Risk Points.

1. Group Historical Backlog Item Data

Don’t overthink this step: you only need a few data points to start with, and you can always refine over time. List the backlog items or tasks you or your team have completed and the total hours spent on each. Group the tasks that have similar spent hours. The following table shows a fictional list of Product Backlog Items (PBI) and the corresponding spent hours that we will use throughout the blogpost.

Task | Spent Hours
(PBI-42) Import data from Programm XYZ | 16
(PBI-02) Create batch script for multiple document creation | 14
(PBI-15) Validate UML via JavaScript | 8
(PBI-20) Create basic report with Integrity Reporter | 4
(PBI-09) Create interface documentation | 5
(PBI-07) Export data to Programm XYZ | 19

After grouping the PBIs, we end up with the following table:

Hour Group | Tasks | Characteristics
15-20+ | PBI-07 & PBI-42 | We had no idea how Programm XYZ works; while working, we figured out what the requirements really were; lots of exceptions that we didn’t anticipate.
10-15 | PBI-02 | Batch script black belt! No problems here; had a good understanding of what I needed to do, but the fine details were for me to discover; some exceptions to the normal data flow popped up.
5-10 | PBI-15 & PBI-09 | We knew how to code the validation; knew exactly what to do; most of it was manual work; some exceptions that we anticipated.
0-5 | PBI-20 | We know exactly how to create the report; not a lot of manual work; we knew exactly what to do.
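
If your completed items live in a spreadsheet export rather than in your head, the grouping step can be sketched in a few lines of Python. This is only an illustration using the fictional PBIs from this blogpost. The names `COMPLETED_ITEMS`, `HOUR_GROUPS`, `hour_group` and `group_by_hours` are made up for the example, and the choice to include the lower bound of each group and exclude the upper bound is our assumption.

```python
from collections import defaultdict

# Fictional completed backlog items from the example: (PBI id, spent hours).
COMPLETED_ITEMS = [
    ("PBI-42", 16),
    ("PBI-02", 14),
    ("PBI-15", 8),
    ("PBI-20", 4),
    ("PBI-09", 5),
    ("PBI-07", 19),
]

# Assumption: each group includes its lower bound and excludes its upper
# bound, and the last group is open-ended.
HOUR_GROUPS = [
    (0, 5, "0-5"),
    (5, 10, "5-10"),
    (10, 15, "10-15"),
    (15, float("inf"), "15-20+"),
]


def hour_group(hours):
    """Return the label of the hour group that contains `hours`."""
    for low, high, label in HOUR_GROUPS:
        if low <= hours < high:
            return label
    raise ValueError(f"no hour group for {hours!r}")


def group_by_hours(items):
    """Map each hour-group label to the PBI ids that fall into it."""
    groups = defaultdict(list)
    for pbi_id, hours in items:
        groups[hour_group(hours)].append(pbi_id)
    return dict(groups)


if __name__ == "__main__":
    for label, pbis in group_by_hours(COMPLETED_ITEMS).items():
        print(label, pbis)
```

The characteristics themselves still have to come from the people who did the work; the script only takes care of the sorting.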

2. Extract common group characteristics 

In the table above we already entered the common characteristics of the completed tasks. While doing this you will probably notice a few patterns popping up. The more completed Product Backlog Items you have, the more accurate the grouping and patterns become. In the table above we see some similarities in the task characteristics:

  • Knowing how a specific program or technology works is key. (Technology Risk)
  • The level of understanding of the actual task at hand.  (Functional Risk)
  • The amount of exceptions. (Exception Risk)
  • Amount of manual work. (Work factor)

In parentheses are the risks that relate to each characteristic. Finally, this blogpost is starting to come together. The work factor is a simple way of getting the amount of work into the mix. You could estimate it by making the work factor equal to the number of people working on the task.

3. Assign T-Shirt Sizes to Hour Group Characteristics

Of course we keep the whole lightweight-keep-it-simple-agile mindset all the way to the end. After you have identified the common problematic characteristics and corresponding risks as mentioned above, assign the well known T-shirt sizes to each of them: Small, medium, and large. 

Hour Group | Tasks | Characteristics | Risk and Size
15-20+ | PBI-07 & PBI-42 | We had no idea how Programm XYZ works; while working, we figured out what the requirements really were; lots of exceptions that we didn’t anticipate. | Technology Risk (large), Functional Risk (large), Exception Risk (large), Work Factor = 1
10-15 | PBI-02 | Batch script black belt! No problems here; had a good understanding of what I needed to do, but the fine details were for me to discover; some exceptions to the normal data flow popped up. | Technology Risk (small), Functional Risk (medium), Exception Risk (medium), Work Factor = 2
5-10 | PBI-15 & PBI-09 | We knew how to code the validation; knew exactly what to do; some exceptions that we anticipated; most of it was manual work. | Technology Risk (small), Functional Risk (small), Exception Risk (medium), Work Factor = 2
0-5 | PBI-20 | We know exactly how to create the report; we knew exactly what to do; not a lot of manual work. | Technology Risk (small), Functional Risk (small), Exception Risk (small), Work Factor = 1

4. Create the Risk Points Algorithm. 

So far this approach is super simple and lightweight. Here is my attempt to make it a bit more scientific by introducing the term “Algorithm”. With the risks and their T-shirt sizes, together with the work factor, we can now apply some high-level math and calculate what we want. Let’s start with the following product:

RiskPoints = WorkFactor * (TechnologyRisk(size) + FunctionalRisk(size) + ExceptionRisk(size))

We assign the following values to the T-shirt sizes: Small = 1, Medium = 2, and Large = 5.

Each of the risk functions remains simple: FunctionalRisk(size) = 1 * size, and the same for the other risks. Use a program that already rules the world, like Excel or Google Sheets, to create a table that calculates the results.

Hour Group | Work Factor | Technology Risk | Functional Risk | Exception Risk | Risk Points
15-20+ | 1 | large = 5 | large = 5 | large = 5 | 15
10-15 | 2 | small = 1 | medium = 2 | medium = 2 | 10
5-10 | 2 | small = 1 | small = 1 | medium = 2 | 8
0-5 | 1 | small = 1 | small = 1 | small = 1 | 3
Resulting Risk Points with all risks weighted equally
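
If you prefer a script over a spreadsheet, the product can be sketched in Python as well. This is a minimal sketch that applies the formula literally, with the size values Small = 1, Medium = 2 and Large = 5. The names `SIZE_VALUES` and `risk_points` are ours and only serve the example.

```python
# T-shirt size values from the text: Small = 1, Medium = 2, Large = 5.
SIZE_VALUES = {"small": 1, "medium": 2, "large": 5}


def risk_points(work_factor, technology, functional, exception):
    """RiskPoints = WorkFactor * (TechnologyRisk + FunctionalRisk + ExceptionRisk)."""
    total_risk = sum(
        SIZE_VALUES[size] for size in (technology, functional, exception)
    )
    return work_factor * total_risk


# All three risks large, work factor 1 (the 15-20+ hour group).
print(risk_points(1, "large", "large", "large"))  # 15
# All three risks small, work factor 1 (the 0-5 hour group).
print(risk_points(1, "small", "small", "small"))  # 3
```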

Wait, what? That 15-20+ hour group really jumps out! Yes, and that is exactly what we want. Remember, we are working with a very limited data set: we only used 6 Product Backlog Items to get to this point. With more backlog items you can tune the algorithm further. For example, if you find that functional risk really affects you or your team, you can adjust that risk function to FunctionalRisk(size) = 3 * size. With the extra weight on the functional risk, the resulting Risk Points change dramatically:

Hour Group | Work Factor | Technology Risk | Functional Risk | Exception Risk | Risk Points
15-20+ | 1 | large = 5 | large = 15 | large = 5 | 25
10-15 | 2 | small = 1 | medium = 6 | medium = 2 | 18
5-10 | 2 | small = 1 | small = 3 | medium = 2 | 12
0-5 | 1 | small = 1 | small = 3 | small = 1 | 5
Resulting Risk Points with more weight on functional risk
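
Tuning is easier to play with when the weights are explicit. The sketch below is one possible way to do that in Python, not the only one. `WEIGHTS` and `weighted_risk_points` are hypothetical names, and the weight of 3 on functional risk mirrors FunctionalRisk(size) = 3 * size.

```python
SIZE_VALUES = {"small": 1, "medium": 2, "large": 5}

# Hypothetical weights: functional risk counts three times, as in
# FunctionalRisk(size) = 3 * size; the other risks keep weight 1.
WEIGHTS = {"technology": 1, "functional": 3, "exception": 1}


def weighted_risk_points(work_factor, sizes, weights=WEIGHTS):
    """Multiply the work factor by the weighted sum of the risk sizes.

    `sizes` maps each risk name to a T-shirt size, e.g.
    {"technology": "small", "functional": "medium", "exception": "medium"}.
    """
    total = sum(weights[risk] * SIZE_VALUES[size] for risk, size in sizes.items())
    return work_factor * total


# The 10-15 hour group: small technology risk, medium functional and
# exception risk, work factor 2.
pbi_02 = {"technology": "small", "functional": "medium", "exception": "medium"}
print(weighted_risk_points(2, pbi_02))  # 2 * (1 + 6 + 2) = 18
```

Changing a weight is now a one-line edit, which makes it cheap to try a different weighting after every few completed backlog items.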

What is the main benefit here? Risks are dangerous: they can occur or not, and their likelihood can be well known or totally unknown. You can see this in our table, for instance, with exceptions. The main benefit of this method is that you no longer have to compare tasks to each other. Comparing tasks only has real merit if the tasks are comparable. Using the risk factors, you can still work without heavy upfront analysis and documentation, yet have a way of estimating quickly and identifying problem areas. The team can say: all backlog items with over 30 Risk Points have to be analyzed further before we start.
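
Such a team rule is easy to automate. The snippet below is a small sketch with a made-up backlog. The item ids, their Risk Points and the names `ANALYSIS_THRESHOLD` and `needs_analysis` are all invented for illustration.

```python
# Hypothetical threshold, taken from the example rule "over 30 Risk Points".
ANALYSIS_THRESHOLD = 30


def needs_analysis(backlog, threshold=ANALYSIS_THRESHOLD):
    """Return the ids of backlog items whose Risk Points exceed the threshold."""
    return [item_id for item_id, points in backlog.items() if points > threshold]


# Made-up backlog: item id -> estimated Risk Points.
backlog = {"PBI-101": 12, "PBI-102": 45, "PBI-103": 30, "PBI-104": 31}
print(needs_analysis(backlog))  # ['PBI-102', 'PBI-104']
```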

This risk-based method helps us, as consultants in a broad range of companies and domains, to still have some idea of what we are up against. The method is easy, fast, flexible, and above all as objective as we can get it. It also makes it possible to have multiple people compare risk factors. This way you can quickly identify differences in estimation and discuss the best way forward.
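
To find those differences without staring at two lists, you could compare the sizes per risk. The following is a rough sketch under the assumption that every estimator hands in one T-shirt size per risk. The function name `disagreements` and the two estimators are hypothetical.

```python
def disagreements(estimates):
    """Return the risks on which the estimators chose different sizes.

    `estimates` maps an estimator's name to a dict of risk name -> size.
    """
    risks = set()
    for sizes in estimates.values():
        risks.update(sizes)

    result = {}
    for risk in sorted(risks):
        chosen = {name: sizes.get(risk) for name, sizes in estimates.items()}
        # More than one distinct size means the estimators disagree.
        if len(set(chosen.values())) > 1:
            result[risk] = chosen
    return result


# Made-up estimates from two hypothetical estimators for one backlog item.
estimates = {
    "estimator_a": {"technology": "small", "functional": "medium", "exception": "medium"},
    "estimator_b": {"technology": "small", "functional": "large", "exception": "medium"},
}
print(disagreements(estimates))
# {'functional': {'estimator_a': 'medium', 'estimator_b': 'large'}}
```

Only the risks that show up here need a conversation; everything else is already agreed upon.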

You can also try to translate Risk Points back to hours. With our first algorithm we would just use Hours = RiskPoints * 2 to get the estimated hours we need for a backlog item. With the second algorithm we can say that Hours = RiskPoints. Whichever factor you pick, check it against the spent hours of your completed backlog items.
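
Written out as code, the translation is a single multiplication. The sketch below only restates the two factors mentioned here. `estimated_hours` and `hours_per_point` are names we made up, and the Risk Point inputs are example values.

```python
def estimated_hours(risk_points, hours_per_point):
    """Translate Risk Points to hours with a simple linear factor."""
    return risk_points * hours_per_point


# First algorithm (all risks weighted equally): Hours = RiskPoints * 2.
print(estimated_hours(3, 2))   # 6
# Second algorithm (functional risk weighted three times): Hours = RiskPoints.
print(estimated_hours(18, 1))  # 18
```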

If you apply this method and tune it over the course of a few weeks, it is remarkable how fast and, above all, accurate you become!

Try it out: our risk-based method is simple but highly effective.


If you cannot compare backlog items, how are you going to estimate their relative size or required time? We use a method that combines risk factors, such as technology risk or functional risk, with T-shirt sizes, which quickly allows us to calculate Risk Points. Those Risk Points are then used to identify backlog items that require more analysis.

We effectively use this method to quickly and objectively estimate work.