O(1) in many cases involves a hashing function which is a non-trivial but constant cost. For smaller values of N it can be outperformed in terms of wall clock time by n^2 worst case algorithms.
For example, in applications where the sorted form can be readily maintained, a decent B+-tree tends to massively outperform a hashmap as soon as you get the streamed/non-indexed side (of what's in a way a lookup join) to opportunistically batch items:
as when you sort your lookups you can use exponential forward search (compare at exponentially increasing distances from the cursor; once you're past the target, run binary search between this now upper bounds and the previous probe point as lower bound) in your index for each next key to reduce the per-lookup cost to be logarithmic in distance from the last-looked-up key (asymptotically always better than single isolated lookups; in practice tends to cap out at 2x the cost in pathological cases if you respect page locality of B+-tree structures).
If you aren't ignorant of your LRU cache set during such, you'll get by with overall significantly fewer memory accesses to e.g. fresh DRAM pages (let alone NAND pages) than with a hashmap setup.
I've severely sped up a C++ program by replacing an `std::unordered_map` with a `std::vector` (iirc technically a pair of; for SoA purposes) by realizing that I could collect the items unordered; sort the structure; then use binary search instead of hashmap lookups. The function that I modified ran something like 3x faster as a result, and that's without anything like vectorization-friendly structures or such.
I mean, true obviously, but don't say that too loud lest people get the wrong ideas. For most practical purposes n^2 means computer stops working here. Getting people to understand that is hard enough already ;)
Besides, often you're lucky and there's a trivial perfect hash like modulo.
> What do you mean? Modulo is not a perfect hash function
It's a perfect hash function for the case where you work with ints and know the maximal min-max range beforehand; then you can modulo by the size of the range as long as it's not too large. In your example 33-21 --> mod 12
This comes up for me surprisingly often but obviously depends a lot on what you work on. It is often tempting to reach for a hashtable, but it's a lot less efficient in this case.
The challenge with big-O is you don’t know how many elements results in what kind of processing time because you don’t have a baseline of performance on 1 element. So if processing 1 element takes 10 seconds, then 10 elements would take 16 minutes.
In practice, n^2 sees surprising slowdowns way before that, in the 10k-100k range you could be spending minutes of processing time (10ms for an element would only need ~77 elements to take 1 minute).
Well, it doesn't seem that obvious, at least not to everyone, because several times I've seen people rewrite things to make it O(1) when it was actually slower for the vast majority of cases (or even all cases).
> In many cases it's guaranteed to be small or small-ish.
And in many cases it's assumed to be small, but not guaranteed. That's where the trouble lies.
A classic example is Windows using a quadratic algorithm for arranging desktop icons that caused severe performance issues when users had many icons on their desktop.