Beefy Boxes and Bandwidth Generously Provided by pair Networks
XP is just a number
 
PerlMonks  

Re^3: "Just use a hash": An overworked mantra?

by Tux (Monsignor)
on Nov 17, 2011 at 18:38 UTC ( #938659=note: print w/ replies, xml ) Need Help??


in reply to Re^2: "Just use a hash": An overworked mantra?
in thread "Just use a hash": An overworked mantra?

In this case, "data" is a bunch of integers. In moving from a hash to an array, the "keys" do have to be integers. I you're just counting, nothing else matters, but if it is about key-value pairs, that move is still valid if just the key is a (positive) integer. The value(s) in that pair do not have to be.

Another thing not yet mentioned is that with datasets this large, not only the data itself may put a limit on the internal available memory footprint, but the overhead in perl structures add to that. Just today I checked what the internal representation of a 1 Mb .csv file was represented as an array(ref) of array(ref)s: it grew to 10Mb! A hash takes slightly more overhead than an array (most overhead goes into converting a single number into a refcounted SV), so when on the verge of swapping, an array might actually be much faster than a hash.


Enjoy, Have FUN! H.Merijn


Comment on Re^3: "Just use a hash": An overworked mantra?
Replies are listed 'Best First'.
Re^4: "Just use a hash": An overworked mantra?
by blakew (Monk) on Nov 17, 2011 at 19:47 UTC
    Your data can be characters; in which case use ord to map to integers for the key. The point is your data just needs to be mappable to integers, not necessary integers themselves.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://938659]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others imbibing at the Monastery: (16)
As of 2015-07-29 20:03 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (267 votes), past polls