I analyze data consisting of many items but little information
per item, and I quickly ran into Perl's hunger for memory.
In my experience (Linux 2.2 / perl 5.6 / gcc), Perl asks
for about 43 bytes of overhead per item (that is estimated from
the total memory footprint with
large lists in memory, so not really accurate). If your list has 10 million
items, you need over 400 megabytes of memory. That's simply too much.
(in reply to "List overhead")
There are solutions:
- pack/unpack bytestrings
- vec on bitstrings
I coincidentally started to use the latter one
this week, and it works well. For large lists,
I have to keep a few flags per item. vec returns
or sets bits (the desired number of bits, actually),
so you do not need pack conversions, and that really
cleans up the code.
For really large lists, BerkeleyDB is the way to go.
Find it at http://www.sleepycat.com.
"We are not alone"(FZ)