Re: Slurping BIG files into Hashes

by BrowserUk (Pope)
on Jun 18, 2003 at 20:23 UTC (#266977)


in reply to Slurping BIG files into Hashes

I have to concur with waswas-fng: your data, or rather your keys, must be such that you're approaching worst-case behaviour. I tried your original code on my 233MHz P2, and loading the hash took a little under 8 seconds.

To achieve the load times of 30 minutes you are quoting, you are either

  1. extremely unlucky,
  2. using an Apple II or ZX 81 :), or
  3. using data that has been picked to deliberately induce the pathological behaviour.

If 1 is the case, my commiserations, but I do have a work-around for you. If you are currently using 5.8, revert to 5.6.1. If you're currently using 5.6.1 or earlier, switch to 5.8. The hashing function changed between these two versions, so if your data is genuinely inducing the O(n^2) insertion behaviour on one, it will perform as you would hope on the other.
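
If you want to check whether your keys really are degrading insertion before you switch versions, something along these lines might help. It is only a rough sketch: time_inserts is a name I made up, the batch size is arbitrary, and the sequential keys are stand-in data that you would replace with the keys read from your real file. It times each batch of inserts separately, so a per-batch time that keeps climbing as the hash grows would point at the pathological case.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: insert keys into a hash and record the elapsed
# time for each batch, so any slowdown as the hash grows becomes visible.
sub time_inserts {
    my ($keys, $batch) = @_;
    my (%h, @timings);
    my $start = time;
    my $n     = 0;
    for my $key (@$keys) {
        $h{$key} = 1;
        if (++$n % $batch == 0) {
            my $now = time;
            push @timings, $now - $start;   # seconds spent on this batch
            $start = $now;
        }
    }
    return (\%h, \@timings);
}

# Stand-in data (an assumption): substitute the keys from your own file.
my @keys = map { "key$_" } 1 .. 100_000;
my ($hash, $timings) = time_inserts(\@keys, 10_000);

printf "batch %2d: %.4fs\n", $_ + 1, $timings->[$_] for 0 .. $#$timings;
```

With well-behaved keys the batch times should stay roughly flat. Running the same script under both 5.6.1 and 5.8 with your real keys would show whether the change of hashing function makes the difference.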


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller



Re: Re: Slurping BIG files into Hashes
by waswas-fng (Curate) on Jun 19, 2003 at 02:10 UTC
    ++. Also, if the data is not valuable or meaningful, put it up on a site and send me a message with the URL. I will verify the behavior across platforms (plus it might make a good dataset for seeing how vulnerable the hash keys are).

    -Waswas
