in reply to An informal introduction to O(N) notation
- As hash is so heavily used in Perl, it is worth to explain why hash search is O(n).
One runs into the worst case of a hash search when all the elements calculate to the same hash key. In this case, a hash becomes no more than a one-dimentional chain of elements.
Even worse, the element one trying to look up happens to be the last element in the sequence. Now the searching engine has to go throguh the entire chain, all n elements, to find what one wants.
- (In rest of the post, I will just use small o instead of big O, as now I am more focused on complexity, doesn't matter whether it is for worst case, best case or whatever. Big O is just a notion saying that it is for the worst case.)
In an average case, if n is the number of elements we have in a hash, and k is the number of all possible hash key values. Ideally all elements would spread nearly even among possible hash keys, so the chain length under each hash key is n/k. Averagely you need to go though half of the chain to get what you want, thus you need to go thru n/2k elements.
So averagely a hash search would close to o(n/2k) (be very careful, n is a variable when k is a constant, this is serious), which ~ o(n).
How come the average case is the same as the worst case, NO they are not the same, but they are ~.
- Some time it is seriously dangerous to casually simplify things that is seriously complex.
o(n) is not o(n/2k), but o(n) ~ o(n/2k) (again, n is variable, and k is constant, this is very serious), the easiest way to explain ~, yet does not lose seriousness too much is that: the speed o(n) and o(n/2k) approach infinite is the same.
Although o(n) ~ o(n/10^100), what takes o(n/10^100) algorithm 1 sec, might take a o(n) alogorithm 10^100 sec to finish. They are far from the same.