|Problems? Is your data what you think it is?|
Re: Shift, Pop, Unshift and Push with Impunity!by jonadab (Parson)
|on Feb 04, 2004 at 16:21 UTC||Need Help??|
David Eppstein claims that it is possible to solve the TSP in O(n*2^n) time, which would simultaneously be worse than O(2^n) and yet much better than O(n!). In any event, these are clearly the sorts of algorithms we want to avoid. I'm going to go off on a qualifying tangent and come back to explain why algorithms worse than about O(n^2) are bad news, even for those of us who will happily implement an O(n) algorithm without worrying about whether it's possible to do it in O(1).
I'm not generally terribly concerned about algorithm efficiency most of the time, for several reasons. Perhaps most important, I'm more concerned with maintainability; I'd usually rather have a function that runs in half a second and is clear, versus one that runs in 3 picoseconds and frightens people who look at the source. Since Perl provides built-in functions for sort, push, pop, shift, and unshift, I would never think about coding my own replacements for them. That would be inane, a huge waste of time. It would be inane and a huge waste of time *even* if I could slightly improve on the efficiency (which, of course, I can't), unless I were writing the replacement in C to be included into Perl (which, if you know anything about my relationship with C, is not going to happen).
A point comes, though, when you have to draw a line and say, "this algorithm is too slow". I personally draw that line after O(n^2). Anything slower than that, I call too slow.
Take a dataset size of 200, for example, a nice medium-sized dataset. (Maybe it's the number of files in a directory, or something.) O(n^2) comes to 40000 somethings; if we say that's tenths of milliseconds (in reality, it would be less than that on modern hardware, but go with me here), the operation completes in 4 seconds, which for something that doesn't happen very often (especially, something that only happens when the user deliberately orders it, such as starting an app) is reasonable. Now, let's say we did the same thing in O(2^n) time. It would take aeons. Lots of aeons. I'd say millenia, but that would be underestimating it. You can see the problem. Even if we get faster hardware so that the 4-second operation completes in 4 nanoseconds, the O(2^n) algorithm is still taking multiple lifetimes. O(n!) is even worse.
So that's why even people who don't generally care about efficiency still have to think about algorithm efficiency just enough to avoid the really *bad* algorithms. With O(n^2), the size of the data set could blow up to 1000 (quite a lot of somethings, for a desktop user), and we'd still be running in under two minutes (on the same scale that takes 4 seconds for n=200). Two minutes may be a minute and fifty seconds too long, but the operation will complete. Start messing around with O(2^n) or O(n!) algorithms and you could be waiting until you die.
Note that I'm not saying O(n^2) is "fast enough" for all purposes. (When n can blow up to hundreds of thousands, it gets to be a very significant nuisance (though it will still complete within your lifetime, and quite possibly within your computer's lifetime even, and if you sic a good-sized cluster on it you can brute force it in days).) What I am saying is that O(n^2) is "fast enough" for situations where brute force will do, but some algorithms are too slow even for that.