@josephg 13d
Yeah; expected limits are also fantastically useful in performance engineering. It’s very common that your code needs to handle an arbitrarily sized input, but 99% of the time the input will be bounded (or generally simpler). Special-casing the common code path can make a lot of code run much faster.

For example, in some code I’m writing at the moment I have lists of integers all over the place. I call them lists - usually they only have 1 element. Sometimes they have 2 elements (10%) and very occasionally more than 2 elements or they’re empty (<1%).

If I used a language like JavaScript, I’d use Arrays. But arrays are quite expensive performance-wise - they need to be allocated and tracked by the GC, and the array contents are stored indirectly.

Instead, I’m using an array type which stores up to 2 items inline in the container object (or the stack) without allocating. It only allocates memory on the heap when there are 3 or more items. This decreases allocations by 2 orders of magnitude, which makes a really big difference for performance in my library. And my code is just as readable.

I’m using the smallvec crate. There are plenty of libraries in C and Rust for this sort of thing, for both arrays and strings. Swift (like Obj-C before it) builds small string optimizations into the standard library. I think that’s a great idea.
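The inline-until-it-spills idea can be sketched in plain Rust. This is a hand-rolled illustration of what the smallvec crate does properly (the type and method names here are made up for the sketch): up to 2 items live inline in the container, and only a 3rd item forces a heap allocation.

```rust
// Sketch of a small-vector: 0-2 items stored inline, 3+ spill to a Vec.
enum SmallList {
    Inline { len: u8, items: [i64; 2] }, // common case: no heap allocation
    Heap(Vec<i64>),                      // rare case: spilled to the heap
}

impl SmallList {
    fn new() -> Self {
        SmallList::Inline { len: 0, items: [0; 2] }
    }

    fn push(&mut self, v: i64) {
        match self {
            // Room left inline: just write into the fixed-size array.
            SmallList::Inline { len, items } if (*len as usize) < items.len() => {
                items[*len as usize] = v;
                *len += 1;
            }
            // Third element arriving: copy the inline items to the heap.
            SmallList::Inline { len, items } => {
                let mut vec = items[..*len as usize].to_vec();
                vec.push(v);
                *self = SmallList::Heap(vec);
            }
            SmallList::Heap(vec) => vec.push(v),
        }
    }

    fn len(&self) -> usize {
        match self {
            SmallList::Inline { len, .. } => *len as usize,
            SmallList::Heap(vec) => vec.len(),
        }
    }

    // smallvec calls this `spilled()`: has the list moved to the heap?
    fn spilled(&self) -> bool {
        matches!(self, SmallList::Heap(_))
    }
}

fn main() {
    let mut list = SmallList::new();
    list.push(1);
    list.push(2);
    assert!(!list.spilled()); // the 90%+ case: still inline, zero allocations
    list.push(3);
    assert!(list.spilled()); // the rare case pays for the allocation
    println!("len = {}, spilled = {}", list.len(), list.spilled());
}
```

The payoff is exactly the distribution described above: if 99% of lists have ≤2 elements, 99% of lists never touch the allocator.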


@solveit 12d
I think a better way to formulate the same rule of thumb is that zero, one, and infinity are the only numbers that don't need to be justified. You should always have an answer to the question "why 5000 and not 6000?"

Of course, the justification can be as simple as "it has to be some number and 5000 is as good as any", but that opens the door to the discussion of whether 5000 really is as good as any other number, which is often surprisingly enlightening.

@crazygringo 13d
Totally agreed. I think it's absolutely a best practice to set limits.

Code is generally designed to operate within certain "reasonable" performance boundaries, and when it goes outside those you need to consider whether the code should be rewritten to accommodate it.

Just a tiny example, but I regularly deal with long (800+ page) PDFs on my iPad, reading parts of them in the stock Books app. When I select text to highlight, the context menu unfortunately puts "select all" directly next to "highlight". Of course, every so often I accidentally hit "select all", and then I have to force-close the app because otherwise it just freezes for 10 minutes as it extracts and selects the text on every single page.

When really, it needs a limit to detect that, hey, if the PDF is over 20 pages long then just don't allow "select all" anymore, because this isn't the right tool for the job.

@pyuser583 13d
The Wiki page says the problem isn’t with limits, but with arbitrary limits.

When a program limits me to 256 of something, it doesn’t seem arbitrary.

I’ve heard stories of programmers setting limits to multiples of two simply so nobody asks why.

@pixl97 13d
What should the limits be? And are the limits clear to the users of the system? Are they clear to other components of the system?

I agree we should have limits in software because we don't have unlimited memory and processing time, but I commonly find these limits are encoded by the imagination of the programmer working on the software at the time, and you often find limits that were never considered in the systems design of the product.

@hinkley 12d
Someone implemented a fairly naive DAG in our system that sometimes sprouts cycles. The cheapest way to catch cycles is to limit the graph size to 10x nominal. If we ever encountered a request that was legitimately that far from normal, we would most likely time out during the subsequent work anyway, because processing time grows faster than n and our timeout is only about 20x our mean response time.

If there is a loop, we will hit the limit while evaluating the cycle because it’s a DFS. With a BFS we could hit the limit in a sibling of the problematic node.
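A minimal sketch of that pattern in Rust (the sizes, names, and graph shape here are illustrative, not the actual system): a naive DFS with no visited set will keep re-expanding a cycle, so a node budget set at a fixed multiple of nominal graph size catches the cycle as a side effect.

```rust
use std::collections::HashMap;

const NOMINAL_SIZE: usize = 10;         // typical graph size (illustrative)
const LIMIT: usize = NOMINAL_SIZE * 10; // 10x nominal, as described above

/// Naive DFS over an adjacency list. Returns Err once it has expanded
/// more nodes than LIMIT: either the graph sprouted a cycle (the DFS
/// keeps re-visiting the loop) or the input is degenerately large.
fn walk(graph: &HashMap<u32, Vec<u32>>, start: u32) -> Result<usize, &'static str> {
    let mut expanded = 0usize;
    let mut stack = vec![start];
    while let Some(node) = stack.pop() {
        expanded += 1;
        if expanded > LIMIT {
            return Err("node budget exceeded: cycle or degenerate input");
        }
        if let Some(children) = graph.get(&node) {
            stack.extend(children); // no visited set: cycles loop forever
        }
    }
    Ok(expanded)
}

fn main() {
    // A 2-node graph that "sprouted" a cycle: 1 -> 2 -> 1.
    let mut g = HashMap::new();
    g.insert(1u32, vec![2u32]);
    g.insert(2, vec![1]);
    // The budget trips instead of the walk running forever.
    assert!(walk(&g, 1).is_err());
    println!("cycle caught by node budget");
}
```

Because it is depth-first, the budget is exhausted while grinding around the cycle itself, as the comment above notes; a breadth-first walk could just as easily blow the budget on an innocent sibling.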

Scheduling budgets are how most people avoid the halting problem. You set a fixed multiple of expected halting time and work hard to make sure that’s your p99 time instead of your p75 time, and stop trying to violate core principles of computational theory.

@Cthulhu_ 12d
A lot of limits are set explicitly to ensure things don't go haywire, think things like rotating log files. Other ones have to do with display logic, that is, there's only space for X characters in a title section. Although that can be handled with CSS, it's still sane to set a limit so you don't get megabytes of garbage data cut off by CSS.

@hammock 12d
>limits are quite useful to catch bugs

Can be solved with tests instead though.

Speed limits on roads are useful for catching unsafe driving behavior; but if every car actually had a speed governor installed that couldn’t be overridden, it should be clear that would be a suboptimal solution.

Arbitrary limits written into code will eventually be refactored into ZOI, one way or another.

@hakre 11d
> limits are quite useful to catch bugs (or prevent degenerate cases)

Sure, this is why code designed by the ZOI rule has those straightforward test cases: none, one, some, many, aaaand crash (at most).

And to safeguard correct use (treating limit violations as undefined behaviour), many languages have assertions.
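For instance, in Rust (a hypothetical helper to illustrate the point; `debug_assert!` is checked in debug builds and compiled out in release, so the limit is a documented contract rather than a runtime cost):

```rust
/// Hypothetical helper: callers promise `items` respects the documented
/// limit. Violations are caught in debug/test builds via the assertion.
fn process(items: &[i64]) -> i64 {
    debug_assert!(items.len() <= 2, "caller exceeded the documented limit");
    items.iter().sum()
}

fn main() {
    // Within the limit: fine in every build.
    println!("{}", process(&[1, 2]));
}
```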

@renox 12d
Yes, that's why I think 'checked int64' is a far better default for language integers than 'big ints': if you overflow an int64, it's most likely a bug or a hacking attempt.
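Rust's integer API shows the distinction (a minimal sketch: plain `+` on `i64` panics on overflow in debug builds and wraps in release, while the `checked_*` methods surface overflow explicitly as `None` instead of silently promoting to a big integer):

```rust
fn main() {
    let big: i64 = i64::MAX;

    // Checked arithmetic: overflow becomes None rather than a wrong value.
    assert_eq!(big.checked_add(1), None);
    assert_eq!(1_i64.checked_add(2), Some(3));

    // A suspicious computation can then fail loudly instead of quietly
    // producing garbage or an unexpectedly huge bignum.
    match big.checked_mul(2) {
        Some(v) => println!("result = {v}"),
        None => eprintln!("overflow: likely a bug or malicious input"),
    }
}
```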