Queueing Theory: An introduction for software development

Queueing Theory: An introduction for software development




Was anyone else unsatisfied with this article? Yeah sure, queueing theory is cool and fun and interesting but this article is really just a short list of used abbreviations and falls woefully short of genuinely interesting content.

You'll get more out of reading the Wikipedia article on queueing theory and on M/M/1 queues.


I came first across queuing theory in the context of managing software development processes. it looked at the flow of bug fixes and feature requests through the hands and brains of people out to users, and how task switches of those doing the work and task and context switches of teams and set-up times for steps like testing or doing a single change how all these timings and little queues at each step add up and define team productivity.

"The Principles of Product Development Flow" by Donald Reinertsen.

it made me realize queuing theory as a framework for "agile" or "lean" workflows, pull (kanban) or push (scrum), without ever using any of these buzzwords.


Came here to post the same. I found it to be a valuable resource all around.



Hmm. This page is 90% trivial definitions (names of concepts), Little's Law and a bunch of references? Knowing definitions without knowing what conclusions you can draw is not going to help your software development. It will, perhaps, help you sound smart(er) at dinner, though.


This reminds me of one of my favorite times I got Cunninghammed[1]. It took me a minute to realize just who was correcting me. Obviously I’m grateful.

[1] https://news.ycombinator.com/item?id=37072825


As someone with zero prior knowledge on queuing theory I read this and it was very well written and gently introduced the notation which I appreciated, but as soon as I felt like I'd been told everything I needed to know to learn the theory, the article ended.


This is well timed - I was just wondering the other day: could you scale an app to infinity with just a distributed queue and a distributed transactional data store?

From my experience at big tech - I’m starting to become convinced but need to investigate more.


>> Little's law assumptions:

>> All measurement units are consistent.

>> Conservation of flow, meaning the average arrival rate equals the average departure rate.

Yeah, that checks out as the first and only two parameters most modern customer service behemoths are concerned about...

Reading this whole thing, my mind goes to how to wring optimal customer loyalty from a given queue, or how to relieve a queue in the way that would keep the most customers loyal. I guess that's why you shouldn't outsource.

Arguably, the theory of queues goes a long way toward explaining why the angriest people in a line are also the least coherent. It's funny to think about, in a Rowan Atkinson sort of way, that the more apoplectic you become while waiting the further you go back in the line. hah.


Queue Theory is known as Management Theory in Soviet Union. Soviet Union did not have managers and US did not have queues though.


Not to be too snarky, but it kinda feels like an Airtag might have helped here ...

The total U.S. military budget for fiscal year 2022 was approximately $778 billion. I'm genuinely surprised that the US Air force needs our help finding ANYTHING.


Wrong article, also not USAF.


Could someone recommend a book (ideally including problems sets with solutions) to get basic understanding of queuing theory?


It’s not exactly what you’ve asked for, but “Principles of Product Development Flow” by Don Reinertsen is a seminal work that covers a lot of queuing theory in the context of software development and project management.


I learnt it as part of discrete event simulation and modelling, though I can't find the book I learnt off, someone may have a suggestion. It was a very interesting course, covered queuing theory, random number generators and tests, probability, Markov processes and more. It was one of those, I need 10 points and its on Wednesday afternoon so I can have Friday off courses, and one of the most useful courses I studied.


I would recommend a guided series of lectures to get started with and then pick up a book or lectures pdf's once you are comfortable with the basics to expand.

In my opinion, there is a bit of a cliff of learning, where the beginning starts out harder, then begin the application of the theory. Others have mentioned, Little's Law, Erlang model, Kendall's notation etc..

Starting to learn, I would look at the following (since subject matter experts can explain the concepts and analogies much better):

    * MAP6264 Queueing Theory, Prof. Robert B. Cooper.

    * MIT OCW, 15.072J | Spring 2006 | Graduate
    * Queues: Theory And Applications.
For books, performance engineering/benchmarking topics usually contain a refresher on the basics and application of queuing theory which I think makes it easy to see its use.

To answer your question regarding one book to start with:

    * An Introduction to Queueing Systems by Sanjay K.Bose

    * Slides for Lectures Based on the Book

    * Sample tests and solutions

This book was a great exercise in understanding some of these concepts: https://www.oreilly.com/library/view/feedback-control-for/97...


I've taught queuing theory before, and my favorite part is teaching about cumulative flow diagrams. I will preach it to anyone who will listen.

The CMF is the single best way to monitor your queues in your monitoring system, and most people don't do it. Almost everyone has a graph showing the current queue depth. They might even have alarms on it.

But if you see the queue depth going up, how do you know the cause? There are two possibilities: arrival rate increased quickly, or service rate dropped quickly.

But with a CMF, you can tell just by looking! One of the two sections will be getting fatter, telling you immediately where to look for the problem. If service rate is slowing, then you know to scale up your processing fleet, or look for a pathological case that is blocking processing. If the arrival rate is increasing, you can put in some backoff or load shedding, or just be happy about how busy your product is!

[0] https://en.wikipedia.org/wiki/Cumulative_flow_diagram


Is http://brodzinski.com/2013/07/cumulative-flow-diagram.html good?

If so, I'll send a repost invite for https://news.ycombinator.com/item?id=18903965. Then please come back and post about it again :)


CFDs are almost always used for kanban and development, so almost every article is about that. That article is perfectly good in that regard.

I've never actually seen an article about using them for operational monitoring of queues. I've only ever given talks about them. Maybe I should just write one...

But if you do invite that article to repost, I'll comment on it anyway!


How about you write one? That sounds much better! If you do, and are willing to email hn@ycombinator.com, I'll be happy to give it the SCP treatment (https://news.ycombinator.com/item?id=26998308).


Well how could I turn that down? I'll let you know!


Any way to do that with grafana?


Probably just a stacked line graph?


Reading out the delay as a leading indicator is also a neat thing you can do with cumulative flow diagrams.

If you have stacked CFDs, you can immediate spot when a stage is the bottleneck, or when flow is jerky across one of the interfaces.


I wanna know, please send links senpai.

Edit: Couple of YT videos told me everything I need to know, I guess? A bit sad it seems mostly used for "Agile".


Are cumulative flow diagrams useful for serving systems, where arrival rate > departure rate causes immediate user visible issues?


If you like this stuff, play video games, and somehow haven't heard of Factorio, kiss your next few weeks goodbye!



If you're interested in Queuing theory and systems then this book will help. The link only discusses trivial definitions, nothing special. The book is by Prof. Mor Harchol-Balter of CMU. I referred this book extensively during my master's and it is still my favorite academic book. I can open it any day and start reading. The writing is very good. give it try if you're interested. https://www.cs.cmu.edu/~harchol/PerformanceModeling/book.htm...


I would recommend Warteschlangensimulator to simulate processes involving queues. It's often faster to simulate things than to build mathematical models.



Thanks for that. Looks super useful.

Probably more accessible than the theory (alone).


I enjoy knowing a little bit of queue theory because it is a subject where not only can you gain an advantage in many areas but few people will be able to figure out why. In many cases that lack of figuring will continue even if the trick is explained.

Nearly any queuing system can be considered as an M/M/1 queue (in the same way as a linear model fits everything in practice). M/M/1 queues compresses a huge number of observations down into a 2 parameter model. So being able to see that some situation is an M/M/1 queue lets you store 2 numbers, shut down your brain and move on to other things. Compare that to the cognitive load of someone who has to start deriving outcomes with statistical formula!

Someone who doesn't know about queuing theory has 3 options, all wasteful:

1) Re-derive queue theory from first principles (this is the best option)

2) Specifically connect 2 observations that matter - ie, probably do a lot of guessing, experimentation and / or solving equations that don't generalise well.

3) Overprovision out of fear.

Therefore the advantage a queue theorist has - speaking competitively - is that it is possible to work out any queuing theory insight from first principles in a little bit of time. That gives the competition lots of opportunities to waste time and resources trying to work out things which are in fact well known outcomes of M/M/1 queues.



A few companies back, there was a problem where we were provisioning phone numbers that appeared directly in ads and too few numbers caused problems, but of course it costs money to do the provisioning.

I don't know much about queue theory, but I immediately recognized that this was a queue theory problem, and recommended we figure out the answer as such. I got blank stares and "how about we just provision 10 numbers more than we think we need" or something like that.

Sometimes a quick answer like that is better than wasting a lot of engineering time over optimising, but this was actually fairly critical to their business and they were spending significant amounts on the ads and phone numbers and needed both to work well.


Could you please give a concrete example on this, sounds interesting?


Say you're assigned yea many tickets to complete each sprint. You have a tough time completing them all in one sprint and expect your boss is going to be unhappy about work-not-done.

If you've got a good grasp on queue theory, so you prepare for the talk by assuming an M/M/1 and working out the arrival rate of tickets vs. the rate you complete work at. You work out probability (uncompleted items >= tickets not done).

Now you're in a great position to negotiate workload because you have all the figures to work out what just happened - is the problem that you completed tickets too slowly, that there were too many tickets, that even basic variance in task completion rates would result in this happening 5% of the time, et cetera. You have on hand immediately how common this must be so you are in a position to guess at how you compare to colleagues. You can make relatively low-ego determinations about whether this is a sprint-specific mistake you made or if it was all but certain to happen due to basic task variance.

There are a lot of conclusions to be drawn there just by a cursory review of past performance. There isn't anything magic to it, but you're going to be able to spend the conversation worrying about the social aspects of how to manage said boss and don't have to waste valuable brain cyles working out what just happened yourself. You only need to remember 2 numbers and a few napkin-level formula. Works even better if you happen to be the boss because now you can make some quick guesses about whether there is a problem here or just statistical variance.


This relies on the assumption that work time per ticket follows a distribution described by the second M. If you get tickets that require much more work, i.e. tickets that are outliers and not from the distribution described by the M, then your estimates will end up wrong, too.


If you're able to multiplex a little bit (which most humans naturally do) the shape of the service time distribution does not matter as much as it may seem. (In the limit, perfect timesharing across tasks makes even the fattest of tails look M/M/k.)

The more severe constraint is actually the first M. Fortunately, that holds true in practise very often.


Eh, not necessarily. Queuing is a function of throughput, which by definition takes processing time into account.

  Queue length = arrival rate * processing time 
Transitive properties allow you to solve for one variable, given you have the other two. This is known as Little’s Law[0].

With this, you can now deduce/estimate how long a ticket will be in the backlog, how fast you need to complete a task, how long until everything is finished.

[0]: https://en.m.wikipedia.org/wiki/Little%27s_law


> This is known as Little’s Law

Little's law lets you calculate the third variable easily, given that you know the other two. You can calculate from existing data.

As such, it doesn't help you calculate forecasts. Or you can, if you make assumptions on the distributions. But some outliers can then turn the reality very different from what your forecast was.

> how fast you need to complete a task

You can calculate how fast you need to complete a task. But if an outlier task comes your way, it doesn't help you to actually complete that particular task in time.


Indeed! A great deal of mathematics has this property when it comes to applying it to the real world. Queuing theory might stand out as an especially high ROI example, given how simple its starting assumptions are and how easy it is to roughly model a lot of situations off of it.


Can almost use it as a checklist while balancing service level against the cost of resources needed to achieve the service.

A great tool in the toolbox to have while building software!


Hm, I wonder what other examples of high-ROI math are?


Simple random sampling. You can do anything with meaningful precision -- often at a fraction of the cost.

The key insight is you rarely need an exact number, just something in the right ballpark. And sometimes that ballpark is surprisingly big.

(More advanced sampling methods are basically just variance-reducing techniques which give you better precision at cost or lower cost for the same precision, but the big leap in ROI is learning to sample in the first place.)


Set theory might seem almost mundane in many cases, but it's really powerful to express certain solutions in a very concise way. I've refactored code into much simpler code by just thinking about it in terms of set manipulation, vastly reducing the number of classes/tables necessary. Operations became self-explanatory (it's "just a set").

It also gives you a good foothold to realize whether the solution works or not into the future, a bit of a barebones formalism. You get to play with it in math terms before writing any code. This particular refactoring became the only thing in the codebase that didn't need to change as new requirements came along.


Good pick, I'm a big fan of using set() in Python. The fact that they're unordered and do not allow duplicates communicates a lot about the nature of the problem you're working with.


Ah nice! In fact a neat little trick most people don't realize is system designs interviews are effectively queue designs!


> I enjoy knowing a little bit of queue theory because it is a subject where not only can you gain an advantage in many areas but few people will be able to figure out why

Yes! I’ve been building web applications for 15 years, but queuing systems were rarely discussed with any importance. I only picked up on it in the past few years, but since then, I consider it one of the most commercially-beneficial ways to scale and grow most kinds of software.

When it comes to scaling, it seems like 90% of the advice just focusses on RDMS, optimising indexes, etc. Database theory is vitally important to understand, but I now consider queue theory essential knowledge for modern application developers too.

Case in point: over the years I’ve reviewed thousands of PHP projects, articles, blogs, and other resources in the ecosystem. There are more guides to scaling Nginx or optimising MariaDB than I could ever need.

But I simply never see any discussion or usage of the low-level queue features offered by the language, like Deques: https://www.php.net/manual/en/class.ds-deque.php

I appreciate these have niche use cases, but a background queue handler builds on this kind of understanding, and is general purpose.

Knowledge of queue theory directly helped me scale my own projects, in ways I couldn’t have imagined otherwise.


> But I simply never see any discussion or usage of the low-level queue features offered by the language, like Deques

I respectfully disagree:

1. Deques is not a "feature[s] offered by the language". Its part of an external extension (meaning it's a C compiled plug-in to the main engine). It's development is outside of the language scope, see https://pecl.php.net/package/ds

2. Deques is barely a "low-level queue feature". It's just a data structure that can be performant for some implementations of queues. Native arrays provide slower but similar features.

3. It's sane that this (Deques) implementation detail (the in-memory data structure) is less discussed than more central and generic subjects, like scaling MariaDB.

If I consider the first two problems mentioned in the article (systems for issues and for kanban), I think Deques is unsuitable to them, because it's meant to replace arrays for in-memory works. The problems require a persistent storage, and the optimization of a PHP storage for the queue is probably irrelevant.


You're totally right and I agree too :) I forgot Deques (and DS) were via PECL.

It's a poor example to give in relation to the article's real-world cases. Nonetheless, it is relevant to queue theory, and knowledge of how, why, and when these data structures can be more optimal than simple arrays has helped me more broadly in the past.

A much more real-world example would be Laravel Horizon, which uses Redis queues to build scalable distributed systems very easily. The last time I set it up I think it took me 5 minutes at most... https://laravel.com/docs/10.x/horizon


> But I simply never see any discussion or usage of the low-level queue features offered by the language...

Assuming we're talking about something that serves requests over the network, it rarely makes sense to define or use queues within the program itself. The network stack already contains many layers of queues which will almost always manage load concerns (e.g. throttling, backpressure, etc.) more effectively than your application can do itself. In general you want your app to accept and process requests as fast as it can, and leave queueing concerns to other authorities.


I used the question, "is this a queue?" a fair bit in product and in engineering.

Question I have is, as an abstract object, what other things are there in that category? e.g. is this phenomenon governed by, a channel w/ information, a differential, an integral, a state machine, a fluid, - and are these objects at the same logical level of abstraction as a queue?


> a state machine

If it's Markov, then maybe. Simpler queues are often analysed as Markov chains.

> a fluid

I had this exact same question ("is queueing theory just discrete fluid dynamics?") and set out to learn some fluid dynamics to see what I could find out, but my math background was not (yet) sufficient for fluid dynamics. My hunch with the little I know is that "no, the connection is not obvious".

> a channel w/ information

Interesting lead! I have not yet studied information theory but when I do I'll keep this question in mind. Given the neat link between information theory and betting I wouldn't be surprised if there's something approaching queueing too.


Murat had an interesting review on a seminal queuing theory book: http://muratbuffalo.blogspot.com/2023/09/review-performance-.... It's surprising to me that these experts in distributed systems didn't find the book useful in practice.

I was wondering if there are other similar books that are as comprehensive and deep while offering practical values as well.


That is a fantastic book. I'm almost hesitant to recommend it because it's key to one of my secret superpowers.