PerlMonks  

Re^2: "Just use a hash": An overworked mantra?

by blakew (Monk)
on Nov 17, 2011 at 18:00 UTC (#938657)


in reply to Re: "Just use a hash": An overworked mantra?
in thread "Just use a hash": An overworked mantra?

"An array on the other hand can only be used when your data is an integer"

I think you meant "maps 1:1 with integers."


Re^3: "Just use a hash": An overworked mantra?
by Tux (Monsignor) on Nov 17, 2011 at 18:38 UTC

    In this case, the "data" is a bunch of integers. When moving from a hash to an array, the "keys" do have to be integers. If you're just counting, nothing else matters, but if it is about key-value pairs, the move is still valid as long as the key is a (non-negative) integer. The value(s) in that pair do not have to be.
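To make the distinction concrete, here is a minimal sketch of both cases. The sample data and the variable names (`@data`, `%count_h`, `@count_a`, `@name_by_id`) are made up for illustration and assume small, dense, non-negative integer keys.

```perl
use strict;
use warnings;

# Hypothetical sample data: small non-negative integers.
my @data = (3, 1, 3, 0, 2, 3, 1);

# Counting with a hash: each integer is stringified and used as a key.
my %count_h;
$count_h{$_}++ for @data;

# Counting with an array: the integer itself is the index.
my @count_a;
$count_a[$_]++ for @data;

# Slots that were never seen stay undef, so read them as zero.
for my $n (0 .. $#count_a) {
    printf "%d: %d\n", $n, $count_a[$n] // 0;
}

# Key-value pairs: only the key has to be an integer, the value can be anything.
my @name_by_id;
$name_by_id[42] = 'forty-two';
```

Note that assigning to index 42 extends the array to 43 slots, so this only pays off when the keys are reasonably dense.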

    Another thing not yet mentioned is that with datasets this large, it is not only the data itself that strains the available memory: the overhead of perl's internal structures adds to it. Just today I checked how much memory a 1 MB .csv file takes when represented as an array(ref) of array(ref)s: it grew to 10 MB! A hash has slightly more overhead than an array (most of the overhead goes into converting a single number into a refcounted SV), so when on the verge of swapping, an array might actually be much faster than a hash.


    Enjoy, Have FUN! H.Merijn
      Your data can be characters, in which case use ord to map them to integers for the key. The point is that your data just needs to be mappable to integers, not necessarily integers themselves.
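A rough sketch of the ord idea, counting characters in an array indexed by code point. The input string and the names `$text` and `@count` are assumptions for illustration only.

```perl
use strict;
use warnings;

my $text = 'hello world';    # hypothetical input

# ord maps each character to its integer code point,
# which then serves directly as the array index.
my @count;
$count[ ord($_) ]++ for split //, $text;

# Walk only the slots that were actually hit; chr maps the index back.
for my $code (grep { defined $count[$_] } 0 .. $#count) {
    printf "'%s': %d\n", chr($code), $count[$code];
}
```

This works well for a small, bounded alphabet such as ASCII. With arbitrary Unicode input the indices can get large and the array sparse, at which point a hash is probably the better fit again.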
