PerlMonks  

Re^3: "Just use a hash": An overworked mantra?

by Tux (Abbot)
on Nov 17, 2011 at 18:38 UTC (#938659)


in reply to Re^2: "Just use a hash": An overworked mantra?
in thread "Just use a hash": An overworked mantra?

In this case, the "data" is a bunch of integers. In moving from a hash to an array, the "keys" do have to be integers. If you're just counting, nothing else matters, but if it is about key-value pairs, the move is still valid as long as the key is a (non-negative) integer. The value(s) in that pair do not have to be.
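A minimal sketch of that move, with made-up sample data and variable names: counting small non-negative integers works the same way with an array as with a hash, and the values stored under integer keys can be any scalar.

```perl
use strict;
use warnings;

# Hypothetical sample data: non-negative integers to be counted.
my @data = (3, 1, 3, 7, 1, 3);

# Hash version: the integers are stringified and used as keys.
my %count_h;
$count_h{$_}++ for @data;

# Array version: the integer itself is the index.
my @count_a;
$count_a[$_]++ for @data;

# Indices that were never seen stay undef, so default them to 0 when reading.
for my $n (0 .. $#count_a) {
    printf "%d: %d\n", $n, $count_a[$n] // 0;
}

# Key-value pairs: only the key has to be a non-negative integer.
my @name_of;
$name_of[42] = 'answer';
$name_of[7]  = { colour => 'red' };   # the value may be any scalar, including a ref
```

The array only pays off when the keys are reasonably dense: a single key of 10_000_000 makes perl allocate a slot for every index below it.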

Another thing not yet mentioned is that with datasets this large, it is not only the data itself that eats into the available memory: the overhead of perl's internal structures adds to it. Just today I checked how much memory a 1 MB .csv file took once read into an array(ref) of array(ref)s: it grew to 10 MB! A hash carries slightly more overhead than an array (most of the overhead goes into converting a single number into a refcounted SV), so when you are on the verge of swapping, an array might actually be much faster than a hash.
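One way to check this overhead on your own perl is sketched below. It assumes the CPAN module Devel::Size is installed (it is not part of core perl) and uses an arbitrary element count; the numbers it prints depend on the perl version and build, so none are quoted here.

```perl
use strict;
use warnings;

# Build the same number of integer values as an array and as a hash.
my $n = 100_000;                        # arbitrary size, chosen for illustration
my @as_array = (1) x $n;
my %as_hash  = map { $_ => 1 } 0 .. $n - 1;

# Devel::Size comes from CPAN, so load it defensively at run time.
if (eval { require Devel::Size; 1 }) {
    printf "array: %d bytes\n", Devel::Size::total_size(\@as_array);
    printf "hash:  %d bytes\n", Devel::Size::total_size(\%as_hash);
}
else {
    print "Devel::Size not installed; skipping measurement\n";
}
```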


Enjoy, Have FUN! H.Merijn

Re^4: "Just use a hash": An overworked mantra?
by blakew (Monk) on Nov 17, 2011 at 19:47 UTC
    Your data can be characters, in which case use ord to map them to integers for the key. The point is that your data just needs to be mappable to integers; it does not necessarily have to consist of integers itself.
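A short sketch of the ord idea, using a made-up input string: each character is mapped to its code point, which then serves as the array index, and chr maps it back for reporting.

```perl
use strict;
use warnings;

# Hypothetical input: count how often each character occurs.
my $text = 'abracadabra';

# ord turns each character into an integer usable as an array index.
my @count;
$count[ ord($_) ]++ for split //, $text;

# chr maps the index back to its character; skip indices never seen.
for my $code (grep { defined $count[$_] } 0 .. $#count) {
    printf "%s: %d\n", chr($code), $count[$code];
}
```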
