Re^2: list of unique strings, also eliminating matching substrings
by BrowserUk (Pope) on May 23, 2011 at 13:23 UTC
"Since memory-size is a ruling constraint here,"
100,000 strings of max. 400 characters gives 40MB.
Even with the overhead of an array with 64-bit pointers, the total memory requirement is at most 44.25MB.
Even my 233MHz/128MB ThinkPad 770 from 1997 could have handled that.
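The arithmetic behind those figures can be checked in a few lines. This is only a sketch in Python, not the Perl the numbers presumably came from. It assumes decimal megabytes (1MB = 10**6 bytes), and the function names are made up for illustration. The post does not say how the 44.25MB total breaks down, so the per-string overhead below is merely what that total implies, not a measured value.

```python
# Figures quoted in the post.
N_STRINGS = 100_000       # number of strings
MAX_LEN = 400             # maximum characters per string (1 byte each assumed)
QUOTED_TOTAL_MB = 44.25   # worst-case total stated in the post

MB = 10**6                # assumption: decimal megabytes


def payload_bytes(n=N_STRINGS, max_len=MAX_LEN):
    """Worst-case bytes needed for the string data alone."""
    return n * max_len


def implied_overhead_per_string(total_mb=QUOTED_TOTAL_MB, n=N_STRINGS, max_len=MAX_LEN):
    """Bytes per string left over once the payload is subtracted from the total."""
    return (total_mb * MB - payload_bytes(n, max_len)) / n


if __name__ == "__main__":
    print(payload_bytes() / MB)            # 40.0 (MB of raw string data)
    print(implied_overhead_per_string())   # 42.5 (bytes of overhead per string)
```

Either way, the whole data set fits comfortably in the RAM of any machine built in the last couple of decades.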
"thus taking advantage of the fact that (1) disk-based sorts are very efficient"
No! They are not!
Not when compared to memory based sorts.
And given that the cheapest commodity PC you can buy can trivially handle sorting 44.25MB in memory in the blink of an eye (0.404149055480957 seconds on my machine), there is absolutely no point whatsoever in writing the stuff to disk in order to sort it.
Just writing it to disk (cache) takes almost exactly as long (0.361000061035156 seconds). And that's before you've loaded up another process, to read it back to memory, sort it, write it back to disk and then read it back in.
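Anyone who wants to repeat the comparison on their own hardware could use something like the following. It is a rough sketch in Python rather than the code behind the timings above, which the post does not show. The random test data and the helper names (`make_strings`, `time_sort`, `time_write`) are assumptions made up for illustration. The numbers it prints will vary by machine, interpreter and filesystem cache.

```python
import random
import string
import tempfile
import time


def make_strings(n=100_000, max_len=400, seed=0):
    """Build n random lowercase strings of 1..max_len characters."""
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase
    return ["".join(rng.choices(alphabet, k=rng.randint(1, max_len)))
            for _ in range(n)]


def time_sort(strings):
    """Sort a copy of the list in memory; return (sorted list, seconds)."""
    start = time.perf_counter()
    result = sorted(strings)
    return result, time.perf_counter() - start


def time_write(strings):
    """Write the strings, one per line, to a temporary file; return seconds."""
    with tempfile.TemporaryFile("w") as fh:
        start = time.perf_counter()
        fh.write("\n".join(strings))
        fh.write("\n")
        fh.flush()  # push the data out to the OS (typically the disk cache)
        return time.perf_counter() - start


if __name__ == "__main__":
    data = make_strings()
    _, sort_secs = time_sort(data)
    write_secs = time_write(data)
    print(f"in-memory sort: {sort_secs:.3f}s")
    print(f"write to disk:  {write_secs:.3f}s")
```

Note that the write timing covers only the first step of a disk-based sort; the external process still has to read, sort and write the data, and the caller has to read it back.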
I just upvoted one of your answers (re:COW) and then read this garbage. Why do you post this? It's like your brain is caught in a time warp.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.