Re^2: list of unique strings, also eliminating matching substrings

by BrowserUk (Pope)
on May 23, 2011 at 13:23 UTC ( #906297=note )

in reply to Re: list of unique strings, also eliminating matching substrings
in thread list of unique strings, also eliminating matching substrings

Utter garbage!

"Since memory-size is a ruling constraint here,"

100,000 strings of max. 400 characters gives 40MB.

Even with the overhead of an array with 64-bit pointers, the total memory requirement is 44.25MB. (MAX)
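The post does not show how the 44.25MB total was worked out, so the following Python sketch is only a guess at the arithmetic. It assumes 64 bytes of per-string overhead and a megabyte of 2^20 bytes, which happens to reproduce the figure. The constant names are made up for illustration.

```python
# Back-of-the-envelope memory estimate for the worst case in the thread.
# ASSUMPTIONS (not stated in the post): 64 bytes of overhead per string,
# and "MB" in the 44.25 figure meaning 2**20 bytes.
N_STRINGS = 100_000
MAX_LEN = 400            # bytes per string, worst case
ASSUMED_OVERHEAD = 64    # hypothetical per-string overhead in bytes
POINTER_BYTES = 8        # size of one 64-bit pointer

# Raw character data only: 100,000 * 400 bytes.
payload_bytes = N_STRINGS * MAX_LEN

# Total with the assumed overhead, expressed in units of 2**20 bytes.
total_with_overhead_mib = N_STRINGS * (MAX_LEN + ASSUMED_OVERHEAD) / 2**20

# Alternative reading: one 8-byte pointer per string, decimal megabytes.
total_with_pointers_mb = (payload_bytes + N_STRINGS * POINTER_BYTES) / 1_000_000

if __name__ == "__main__":
    print(f"payload:            {payload_bytes / 1_000_000:.2f} MB")
    print(f"with 64B overhead:  {total_with_overhead_mib:.2f} MiB")
    print(f"with 8B pointers:   {total_with_pointers_mb:.2f} MB")
```

Whichever reading is right, the worst case stays well under 50MB.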

Even my 233MHz/128MB Thinkpad 770 from 1997 could have handled that.

"thus taking advantage of the fact that (1) disk-based sorts are very efficient"

No! They are not!

Not when compared to memory based sorts.

And given that the cheapest commodity PC you can buy can trivially sort 44.25MB in memory in the blink of an eye (0.404149055480957 seconds on my machine), there is absolutely no point whatsoever in writing the stuff to disk in order to sort it.

Just writing it to disk (cache) takes almost exactly as long (0.361000061035156 seconds). And that's before you've loaded up another process to read it back into memory, sort it, write it back to disk and then read it back in.
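The timings quoted above are from my machine and the benchmark itself is not shown here. For anyone who wants to check the comparison themselves, this is a rough sketch in Python (not the original code). The function names, the random lowercase test data and the temp-file handling are all assumptions made for illustration, and the numbers you get will differ with hardware and OS caching.

```python
import os
import random
import string
import tempfile
import time


def make_strings(n=100_000, max_len=400, seed=0):
    """Build n random lowercase strings of 1..max_len characters (made-up test data)."""
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase
    return ["".join(rng.choices(alphabet, k=rng.randint(1, max_len)))
            for _ in range(n)]


def time_sort(strings):
    """Sort entirely in memory; return (sorted list, elapsed seconds)."""
    start = time.perf_counter()
    result = sorted(strings)
    return result, time.perf_counter() - start


def time_write(strings):
    """Write the strings to a temp file, one per line; return elapsed seconds.

    This mostly measures writing into the OS cache, not a forced flush to disk.
    """
    fd, path = tempfile.mkstemp(text=True)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "w") as fh:
            fh.write("\n".join(strings))
            fh.write("\n")
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return elapsed


if __name__ == "__main__":
    data = make_strings()
    _, sort_secs = time_sort(data)
    write_secs = time_write(data)
    print(f"in-memory sort: {sort_secs:.3f}s")
    print(f"write to file:  {write_secs:.3f}s")
```

Note that the write step is only the first leg of a disk-based sort; the external process still has to read, sort, write and hand the data back.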

I just upvoted one of your answers (re: COW) and then read this garbage. Why do you post this? It's like your brain is caught in a time warp.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.