PerlMonks  

Re: List overhead

by jeroenes (Priest)
on Oct 18, 2001 at 12:52 UTC (#119630)


in reply to List overhead

I analyze data consisting of many items but little information per item, and I quickly ran into perl's hunger for memory. In my experience (linux 2.2/perl5.6/gcc) perl needs about 43 bytes of overhead per item (I got that figure from the total memory footprint with large lists in memory, so it is not really accurate). If your list has 10 million items, you need more than 400 megabytes of memory. That's simply too much.
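To make the arithmetic concrete, here is a rough sketch of the estimate. The 43 bytes per item is my measured figure from above; the helper name `overhead_mb` is made up for this illustration, and the real number will differ per perl build and platform.

```perl
use strict;
use warnings;

# Hypothetical helper: estimated overhead in megabytes (10^6 bytes)
# for a list of $items elements at $bytes_per_item overhead each.
sub overhead_mb {
    my ($items, $bytes_per_item) = @_;
    return $items * $bytes_per_item / 1e6;
}

my $items    = 10_000_000;  # list size from the example above
my $per_item = 43;          # rough measurement, not an exact figure

my $mb = overhead_mb($items, $per_item);
printf "about %.0f MB of overhead for %d items\n", $mb, $items;
```

That works out to about 430 MB before a single byte of real data is counted.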

There are solutions:

  1. BerkeleyDB
  2. pack/unpack bytestrings
  3. vec on bitstrings
I coincidentally started using the last one this week, and it works well. For large lists, I have to keep a few flags per item. vec reads or sets a bit field of the width you ask for, so you do not need pack conversions, and that really cleans up the code.
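A minimal sketch of the vec approach, assuming two flag bits per item and one million items (both numbers are picked just for this example, as are the variable names). Note that vec wants the bit width to be a power of two.

```perl
use strict;
use warnings;

my $n     = 1_000_000;  # hypothetical number of items
my $bits  = 2;          # flag bits per item; must be a power of two
my $flags = '';         # the bitstring that holds all flags

# Touch the last element once so the string is extended to full size.
vec($flags, $n - 1, $bits) = 0;

# Set flags for a few items: vec(STRING, INDEX, BITS) is an lvalue.
vec($flags, 0,      $bits) = 3;
vec($flags, 42,     $bits) = 2;
vec($flags, $n - 1, $bits) = 1;

# Read them back with the same call.
printf "item 42 has flags %d\n", vec($flags, 42, $bits);
printf "%d items stored in %d bytes\n", $n, length($flags);
```

With these numbers the whole flag table fits in 250,000 bytes, and there is no pack/unpack call in sight.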

For really large lists, BerkeleyDB is the way to go. Find it at http://www.sleepycat.com.

Cheers,

Jeroen
"We are not alone"(FZ)

