Yes, I will jump in with you, enthusiastically. Thank you
for the cogent explanation.
However, I wouldn't necessarily say that big O notation is a
measure of the rate of change. (It is, sort of, but I don't
think that's the best way to think of it.)
Another way of thinking of it is this: Take two algorithms,
algorithm A, which is O(N*log N), and algorithm B, which is
O(N^2). Now let's say that the execution time for Algorithm
A is 100 seconds when N=1000, and that the execution time for
Algorithm B is 1 second when N=1000. At first glance, B might
appear faster than A, and it certainly is for N=1000. However,
what big O tells us is that no matter what
execution times we measure for A and B at a given value of N,
A will be faster than B for every N beyond some finite threshold.
This is essentially what the definition behind big O implies: an
algorithm with a 'smaller' big O will run faster than an
algorithm with a 'larger' big O for all N > X, where X is finite
but may be very large. (Strictly, this needs both bounds to be
tight, since big O by itself is only an upper bound.)
B may run 10^100 times faster than A for N = 10^1000, but
since A has a 'smaller' big O, there is some N beyond which
A will always run faster than B.
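
To put rough numbers on the A-versus-B example, here is a small
Python sketch. It rests on an assumption that is mine, not part of
big O itself: that the running times are exactly c_a*N*log2(N) and
c_b*N^2, with the constants fitted to the 100-second and 1-second
measurements at N=1000. Real algorithms have lower-order terms, so
treat the result as an illustration only. The function names are
made up for this example.

```python
import math

def fit_constants(n0=1000, secs_a=100.0, secs_b=1.0):
    """Pick c_a and c_b so the idealized models match the times at n0."""
    c_a = secs_a / (n0 * math.log2(n0))   # model A: c_a * N * log2(N)
    c_b = secs_b / (n0 * n0)              # model B: c_b * N^2
    return c_a, c_b

def time_a(n, c_a):
    return c_a * n * math.log2(n)

def time_b(n, c_b):
    return c_b * n * n

def crossover(c_a, c_b, n0=1000):
    """Smallest integer N > n0 at which model A is faster than model B.

    Assumes B is at least as fast as A at n0. The search is valid
    because time_a/time_b is proportional to log2(N)/N, which only
    decreases for N >= 3.
    """
    lo, hi = n0, 2 * n0
    # Double hi until A finally beats B.
    while time_a(hi, c_a) >= time_b(hi, c_b):
        lo, hi = hi, 2 * hi
    # Invariant: A is not faster at lo, A is faster at hi.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if time_a(mid, c_a) < time_b(mid, c_b):
            hi = mid
        else:
            lo = mid
    return hi

c_a, c_b = fit_constants()
n_star = crossover(c_a, c_b)
print("A overtakes B at N =", n_star)
```

Under these idealized models the crossover lands a little under
200,000, so B's 100x head start at N=1000 is gone well before N
reaches a million. A bigger head start for B only pushes the
crossover further out; it never removes it.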