Product Development, in the shape of an enterprise’s typical delivery centre enhancing its core technical platforms, is best understood as a queuing system, in which maximum value is generally obtained through short lead times from concept to consumption. The most important management task for such a delivery centre is therefore to level the demand entering it to match its capacity.

This post is a first response to reading Chapter 3 of The Principles of Product Development Flow (Reinertsen, 2009), which powerfully brings together a number of concepts I’ve known for a while.

This is a fairly raw set of thoughts, filtering the queuing concepts of that chapter through my own experience, to help me understand it better. Dialogue and critique are therefore very welcome to help my learning.

Delivery centres generally have four sources of demand:

Business Projects
New functionality and features, requested by the business to deliver additional value to customers in the expectation of improved conversion, retention, revenue, margin and so on. This includes all kinds of failure demand, where current functionality does what it was designed to do, but not what the market wants.
Technology Projects
Non-functional changes sponsored by technical stewards of the platform to support long-term sustainability, such as upgrades, API changes and replatforming (physical or vendor).
Small Change
Black-box internal changes that shouldn’t have non-local impact, often at the level of data rather than deep logic. Ruleset changes, image updates and limited refactoring of technical debt all sit here.
Defects
Anything not conforming to design.

The demand comes in from a range of stakeholders, and each type conforms reasonably well to a standard M/M/c queue: Markovian (i.e. random) arrivals, Markovian service times, and c servers working in parallel. Simply put, work arrives unpredictably and comes in a range of sizes, but each piece generally doesn’t fill the entire capacity on its own. This actually models pretty well to cars on a multilane highway, so feel free to use your experience of motorway traffic to help you through this.
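If you want a feel for how such a queue behaves, here’s a minimal Python sketch of an M/M/c system; the team size and rates are purely illustrative:

```python
import heapq
import random

random.seed(42)

def mmc_mean_wait(arrival_rate: float, service_rate: float, servers: int,
                  jobs: int = 100_000) -> float:
    """Toy M/M/c simulation: exponential inter-arrival and service times,
    `servers` people working in parallel, first come first served.
    Returns the mean queuing wait as a multiple of the mean work time."""
    free_at = [0.0] * servers                  # when each person next comes free
    heapq.heapify(free_at)
    now, total_wait = 0.0, 0.0
    for _ in range(jobs):
        now += random.expovariate(arrival_rate)        # next piece of work arrives
        earliest = heapq.heappop(free_at)              # first person to come free
        start = max(now, earliest)                     # queue if everyone is busy
        total_wait += start - now
        heapq.heappush(free_at, start + random.expovariate(service_rate))
    return (total_wait / jobs) * service_rate          # in units of mean work time

# A team of 10, each completing one piece of work per unit time,
# loaded to 85% of capacity:
print(f"mean wait ≈ {mmc_mean_wait(8.5, 1.0, 10):.2f}x the work time")
```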

However, experience shows that this kind of input across a complex set of technologies leads to huge variation in demand for technical specialties. Worse, it’s nearly always compounded with a demand to ensure that the people in those teams are kept busy. Every delivery centre I’ve seen has had an imperative for utilisation, often in the form of cost recovery from project sponsors and ultimately from CapEx budgets to fund the delivery centre’s OpEx spend with suppliers.

Pain for Suppliers

This inevitably causes problems for the technology teams, because it leaves you with an unpleasant choice: err on the side of overstaffing, which brings financial pressure from both the customer and internal management, or on the side of understaffing, which massively overburdens the dedicated, skilled individuals working on the ground.

Trying to square this circle creates huge pressure to upsize and downsize the team at fairly short notice as demand varies. Small, short-term variation can just about be absorbed, as can long-term, plannable trends, but anything significant and short-term just can’t be. So you lose experienced people, and once they are swallowed back into the wider pool of a supplier with a bad taste in their mouths, you’re not getting them back. And it’s harder still when you have split the types of demand into separate teams, with separate funding sources.

Pain for Sponsors

The pain for delivery centres’ customers is even greater. High utilisation rates in the context of M/M/c queues lead to horribly long queues at all stages of the work, and to inevitable, costly delays in delivery to production and value realisation by consumption. At 95% utilisation of a team of 10, you’re already queuing for around three times the time the actual work takes, and for a critical specialism of one person, you’re hitting that wait time long before you reach 70% utilisation.

Queue curves showing wait:work proportions for teams of 1 to 10 people (known as servers in queuing theory). Source: Computer Measurement Group
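The shape of those curves falls out of the textbook Erlang C formula for M/M/c queues. Here’s a small Python sketch that computes the wait:work ratio; the chart’s exact assumptions may differ, so treat these numbers as indicative:

```python
from math import factorial

def erlang_c(servers: int, utilisation: float) -> float:
    """Probability that an arriving piece of work has to queue
    (the textbook Erlang C formula for an M/M/c queue)."""
    offered = servers * utilisation                      # load in Erlangs
    queued = offered**servers / (factorial(servers) * (1 - utilisation))
    total = sum(offered**k / factorial(k) for k in range(servers)) + queued
    return queued / total

def wait_to_work(servers: int, utilisation: float) -> float:
    """Mean queuing delay as a multiple of the mean work time."""
    return erlang_c(servers, utilisation) / (servers * (1 - utilisation))

for c in (1, 2, 5, 10):
    row = "  ".join(f"{u:.0%}: {wait_to_work(c, u):6.2f}"
                    for u in (0.70, 0.85, 0.95))
    print(f"{c:2d} server(s) -> wait/work at {row}")
```

Note how a one-person specialism degrades far earlier than a team of ten.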

When you’re boasting to your bosses about nearing the desired utilisation level of 100%, queue lengths and wait times are approaching intolerable levels, and, like a motorway full to the brim with traffic, the tiniest problem that reduces the number of people in the team (i.e. reduces the server count) jams the entire road solid.

So when you fulfil perfectly rational economic imperatives about full cost recovery, you end up with significantly suboptimal systemic results. We can wish all we like for a different world where those imperatives don’t apply. But we get to work in this world instead, and changing it will take longer than we’ve got.

We need a way out.

What’s the solution?

I can think of two options:

1. Overrecover

The reason we recover costs at all is to avoid the delivery centre running at a net OpEx cost. Simple arguments about whether anyone is slacking off to play Tetris can be dealt with, as long as the hard financials are taken care of.

There are a couple of ways to meet the financial need simply, without changing much else.

a) Use a higher chargeout rate

Generally, the teams will be recording time against each project, which results in reporting and chargeback to project sponsors (and we’ll gently put the degree of value-add involved in doing all of that in any kind of detail down on the table and quietly back away from it). Assuming the fully loaded daily cost of your people blends out at £495, increasing the chargeout rate to £550 buys you the capacity to reduce utilisation by 10%, and £620 buys 20%.
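The arithmetic is just rate = fully loaded cost ÷ target utilisation; a quick sketch:

```python
def chargeout_rate(daily_cost: float, target_utilisation: float) -> float:
    """Daily rate that recovers a person's full cost while billing
    out only the target fraction of their time."""
    return daily_cost / target_utilisation

daily_cost = 495.0                          # fully loaded daily cost (GBP)
for target in (1.00, 0.90, 0.80):
    rate = chargeout_rate(daily_cost, target)
    print(f"target utilisation {target:.0%} -> charge £{rate:.0f}/day")
# target utilisation 100% -> charge £495/day
# target utilisation 90% -> charge £550/day
# target utilisation 80% -> charge £619/day
```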

Utilisation reductions of that order make the difference between intolerable wait times and free flowing work, and will allow you to absorb demand variation within a stable team size.

To avoid sponsor revolt, you might offer a prioritisation deal: paying the higher rate buys queue priority, so you don’t suffer the wait times of a highly utilised team. Once sponsors start understanding the financial impact of delay, chances are that demand for the priority service will increase, and you can fund a full increase in the team size.

b) Recover based on output, not effort

Rather than going through the whole rigmarole of complex time recording, which is always subject to randomness, error and rounding when it’s conducted at any level more granular than all day today I did this, think about recovering per unit of customer value: per MMF (minimal marketable feature) or per story point. This generally gives sponsors a much stronger link between cost and benefit, and starts to shortcut much of the need to time-estimate each piece. You’ll need strong historical data to calculate a price that funds your teams at a sensible level of utilisation, but it factors out the whole question of are the team busy today?
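As a sketch of the arithmetic, with entirely hypothetical team figures:

```python
# All figures are hypothetical, for illustration only.
team_size = 8                  # people
daily_cost = 495.0             # fully loaded cost per person-day (GBP)
sprint_days = 10               # working days per sprint
historical_velocity = 40       # story points per sprint at a sustainable pace

sprint_cost = team_size * daily_cost * sprint_days
price_per_point = sprint_cost / historical_velocity
print(f"charge ≈ £{price_per_point:,.0f} per story point")   # ≈ £990
```

Sponsors pay for what ships, and whether the team looked busy on any given Tuesday drops out of the conversation entirely.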

2. Level Demand and Match Capacity via a Single Product Owner

The other way to control all this pain is to present a single stream of work to the team, matched to their rate of throughput (aka capacity). When you control the rate of flow like this, you eliminate variation in the demand on the team, and with it the risk of erring on either side of a rapidly fluctuating workload. Obviously, long-term forecasting combined with a sensible mobilisation horizon means that long-term trends can be accommodated via career mobility/progression and natural people-change cycles.

This works because, from a queuing theory perspective, what you’re doing is reducing the number of sources of demand, which has a very positive effect on queue lengths and wait times.

Impact of number of demand sources on queue curves. Source: Computer Measurement Group

If you only have one source of demand, only ever presenting a single item (please work on this next), your maximum queue length is 1, delivering a much more effective flow regardless of utilisation factors.
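A toy simulation makes the contrast vivid. This is my own illustrative model (a single server with exponentially distributed work sizes), not anything from the book:

```python
import random

random.seed(7)

def mean_wait(pull: bool, jobs: int = 50_000, load: float = 0.95) -> float:
    """Push: sponsors push work in at 95% of capacity regardless of state.
    Pull: the Product Owner releases the next item only when the team is
    free, so nothing ever queues inside the centre (the queue lives in
    her backlog instead)."""
    arrival, team_free, waited = 0.0, 0.0, 0.0
    for _ in range(jobs):
        if pull:
            arrival = max(arrival, team_free)          # released on the pull signal
        else:
            arrival += random.expovariate(load)        # arrivals ignore capacity
        start = max(arrival, team_free)
        waited += start - arrival
        team_free = start + random.expovariate(1.0)    # exponential work, mean 1
    return waited / jobs

print(f"pushed at 95% load:   mean wait ≈ {mean_wait(False):.1f}x the work time")
print(f"pulled one at a time: mean wait = {mean_wait(True):.1f}x the work time")
```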

From a Lean perspective, this is all to be expected. You have just eliminated Muri (overburden), Mura (unevenness) and the Mudas of Overproduction (described as a crime by Masaaki Imai), Inventory (likewise described as an enemy to be exterminated) and Waiting. And our Agile friends are happy that we’ve achieved the principle of Sustainability.

Generally, this is achieved by saying No to all demands that exceed the delivery centre’s capacity; otherwise you end up with an M/M/c queue building up in front of the flow control, just pushing many of the same problems upstream. A single, empowered individual with a strong understanding of value must be able to manage sponsors’ expectations until the delivery capacity approaches their demand on it.
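In code terms, the Product Owner acts as an intake gate. A toy sketch (the class, capacity figure and sponsors are all invented for illustration):

```python
class ProductOwner:
    """Flow control for a delivery centre: accept new demand only while
    the committed work fits within forecast capacity; otherwise say No
    (or 'not yet'), so no queue builds up inside the centre."""

    def __init__(self, capacity_points: int):
        self.capacity = capacity_points     # forecast throughput this period
        self.committed = 0

    def request(self, sponsor: str, points: int) -> bool:
        if self.committed + points > self.capacity:
            print(f"No for now, {sponsor}: {points} points exceeds what's left")
            return False
        self.committed += points
        print(f"Yes, {sponsor}: {self.committed} of {self.capacity} points committed")
        return True

po = ProductOwner(capacity_points=40)
po.request("Marketing", 25)        # Yes: 25 of 40 committed
po.request("Platform team", 20)    # No for now: would take us to 45
```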

Hang On, What About Prioritisation?

I said up front that capacity-matched demand leveling is the most important thing a Product Owner can do. I’m sure there are voices out there matching my own first reaction to this proposition, which was: but isn’t prioritising what’s next important too?

It is important, but the relative sensitivity of leveled demand versus prioritised work leans heavily towards leveling. The cost of delay is generally such that when wait times tend towards the infinite, it doesn’t matter how valuable the work you put into the queue is; you’re never getting it out. On the other hand, if you’re getting rapid cycle times, letting work through without filtering for value will still give you a better economic result. Having the Product Owner say No is far more important than what she says No to.
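A toy cost-of-delay calculation shows why; all the numbers here are invented for illustration:

```python
# Ten items, each delivering a weekly benefit once shipped, realised
# over a one-year horizon. Figures are purely illustrative.
weekly_benefit = [13, 8, 5, 3, 2, 1, 1, 1, 1, 1]

def benefit_realised(order, weeks_per_item, horizon=52):
    """Total benefit when items ship one at a time, each accruing its
    weekly value from its ship date to the end of the horizon."""
    shipped_at, total = 0.0, 0.0
    for value in order:
        shipped_at += weeks_per_item
        total += value * max(0.0, horizon - shipped_at)
    return total

# Fast flow in the worst possible order vs slow flow, perfectly prioritised:
fast = benefit_realised(sorted(weekly_benefit), weeks_per_item=2)
slow = benefit_realised(sorted(weekly_benefit, reverse=True), weeks_per_item=10)
print(f"fast + unprioritised: {fast:.0f} benefit-weeks")   # 1292
print(f"slow + prioritised:   {slow:.0f} benefit-weeks")   # 952
```

Even shipping in the worst possible order, the fast queue realises more value than the perfectly prioritised slow one.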

Fast-flowing queues of unprioritised, mixed-value work are more effective than slow or non-flowing queues of high-value work. Achieve flow and reduce pain by leveling demand and matching capacity before any other activity, without sacrificing financial imperatives.


CC BY-NC-SA 4.0 The Product Owner’s Prime Directive by Martin Burns is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
