
Optimizing Memory consumption

by PerlingTheUK (Hermit)
on Nov 09, 2006 at 20:19 UTC

PerlingTheUK has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,
I have been maintaining a fairly big object-oriented application for three years now. It has probably a person-year of development under its hood. While everything was fine for a long time, the requirements have suddenly changed. The amount of data to be stored has increased considerably: where memory consumption used to close in on the half-gigabyte mark in a few cases, I now have test cases where it exceeds 2 GB and the application subsequently crashes. The data used by this application lives in a text file that is merely 200 MB large.
While I want to keep the overall structure of this application, I would like to revisit my classes and make the data storage more memory efficient. Ideally this does not mean reverting to I/O, but rather storing and accessing the data differently. I seem to have heard that a scalar requires at least 32 bytes in Perl. As a large number of my values are Boolean or very small integers, I am considering using a central scalar with an integer value that can be accessed as a bit field. Similarly, several short strings could be joined into one string and accessed using substr or unpack.
Before I can analyze which way to go and how small or big the possible benefit is, I need to understand how much memory Perl uses for what. For instance, there is no point in optimizing the data storage if the real bottleneck is the number of accessors to the objects' values. I have googled for sources and hints on how to reduce memory consumption for several days now, but so far I have not found any sensible solutions. Any help is greatly appreciated.
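To illustrate what I have in mind for the bit field (just a sketch; the flag names are made up):

    use strict;
    use warnings;

    # Several Boolean attributes packed into one integer scalar.
    use constant {
        IS_ACTIVE => 0,    # bit positions, invented for illustration
        IS_CACHED => 1,
        IS_DIRTY  => 2,
    };

    my $flags = 0;
    $flags |=  (1 << IS_ACTIVE);             # set a flag
    $flags &= ~(1 << IS_DIRTY);              # clear a flag
    my $cached = ($flags >> IS_CACHED) & 1;  # read a flag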

Cheers,
PerlingTheUK

Replies are listed 'Best First'.
Re: Optimizing Memory consumption
by Corion (Patriarch) on Nov 09, 2006 at 20:29 UTC

    Devel::Size will tell you the size of the Perl variables.
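    A minimal example:

        use Devel::Size qw(size total_size);

        my %record = ( id => 42, name => "foo", flags => [ 0, 1, 1 ] );

        print size(\%record), "\n";        # the hash structure itself
        print total_size(\%record), "\n";  # hash plus everything it references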

    I think your idea of storing the boolean values or integer values of an object together in a string is sensible. If the size gains make it worthwhile and you still want fast access, you can also stuff all your objects into one string/array, for example by using Tie::Array::PackedC, but modifying these values becomes ugly.

    If you have a large part of your needed memory in a hash, you can always "just" tie that hash to DB_File or one of the other btree implementations. Maybe you can also lazy-load some data, or unload data when it's not needed, but as all of that needs modification of the program, I'd go for one of the tie solutions, which allow your program to run slower but otherwise unchanged.
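    A tie along these lines (the filename is made up):

        use DB_File;
        use Fcntl qw(O_CREAT O_RDWR);

        # The hash keeps its interface, but the data lives on disk in a btree.
        tie my %data, 'DB_File', 'data.dbm', O_CREAT|O_RDWR, 0644, $DB_BTREE
            or die "Cannot tie: $!";

        $data{some_key} = 'some value';    # written to disk, not kept in RAM
        print $data{some_key}, "\n";
        untie %data;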

Re: Optimizing Memory consumption
by xdg (Monsignor) on Nov 09, 2006 at 20:29 UTC
    I need to understand how much memory Perl uses for what

    Look at Devel::Size.

    To the larger question, I've had some luck optimizing a large data structure by storing records using pack/unpack. I.e. rather than a hash-of-hash, I switched to hash-of-packed-array. In this case, all the data elements were just integers, so they packed down pretty well. On any particular run, I only needed some of them, so the cost of unpacking the ones I needed to access was small relative to the savings of keeping the entire data set in memory. (Plus, I could save/load the entire packed structure using Storable, too.)
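    Schematically (a simplified sketch; the pack template and fields are invented):

        use strict;
        use warnings;

        my %record;
        my $id = "abc123";

        # store: four small integers packed into one compact string
        $record{$id} = pack "NnnC", 1_163_000_000, 640, 480, 1;

        # fetch: unpack only the record you actually need
        my ($time, $x, $y, $flag) = unpack "NnnC", $record{$id};
        print "$time $x $y $flag\n";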

    -xdg

    Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

Re: Optimizing Memory consumption
by davido (Cardinal) on Nov 10, 2006 at 01:00 UTC

    Using bit fields instead of individual scalars to represent boolean data will save you a measure of memory consumption, but the same architecture that is getting you into trouble now will get you into trouble in the future as your needs grow again.

    I consider moving from individual scalars to bit fields a micro-optimization. But you're not micro-challenged; you've got a design issue.

    It may be easier to deal with all this than you think. Perhaps you could create helper classes to handle the behind-the-scenes storage and retrieval, so that the majority of your script can remain unchanged. Of course, without seeing the details, my suggestion has to stay vague.
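    For instance, something like this (necessarily vague as well; all names are invented):

        package CompactFlags;
        use strict;
        use warnings;

        # All Boolean attributes of an object live in one bit string
        # instead of one scalar each.
        my %BIT = ( active => 0, cached => 1, dirty => 2 );

        sub new { my $class = shift; my $flags = ""; return bless \$flags, $class }

        sub get { my ($self, $name) = @_; return vec($$self, $BIT{$name}, 1) }
        sub set { my ($self, $name, $val) = @_; vec($$self, $BIT{$name}, 1) = $val ? 1 : 0 }

        package main;
        my $obj = CompactFlags->new;
        $obj->set( dirty => 1 );
        print $obj->get("dirty"), "\n";    # 1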


    Dave

Re: Optimizing Memory consumption
by brig (Scribe) on Nov 10, 2006 at 00:02 UTC

    This isn't answering the question you asked, mostly because I'm not convinced you are asking the right question.

    My Rule of Thumb is that anytime there is a significant change to requirements, there needs to be a significant review of the architecture. I would say that a quadrupling of your memory requirements and the fact that you are using a 200MB text file confirm this.

    There isn't really enough information about your situation to come to any realistic conclusion, because you have already decided to optimize (fair enough). However, when I see a monstrous text file, I always wonder whether a relational DB and clever queries would make for a more efficient solution. You mention that the data to be stored has increased considerably; that is often the point at which the data abstraction needs to be reviewed.

    Finally, please forgive me if I am off base.

    (update: recomposed 1 sentence.)

    Love,
    Brig

Re: Optimizing Memory consumption
by perrin (Chancellor) on Nov 09, 2006 at 21:25 UTC
    In a similar situation, I saved memory by avoiding unintentional auto-vivification of nested hash structures. That helped a lot, since checking if $foo->{bar}->{baz}->[0] is true will create a hash and an array if they didn't exist already. Careful use of exists() can avoid this. Ultimately though, I had to switch to a more sensible system that didn't try to load all the data into RAM at once.
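    For example:

        use strict;
        use warnings;

        my %foo;
        # This innocent-looking check creates $foo{bar} and $foo{bar}{baz}:
        if ( $foo{bar}{baz}[0] ) { }
        print scalar keys %foo, "\n";    # 1 -- 'bar' was autovivified

        my %clean;
        # Short-circuiting on exists() leaves the structure untouched:
        if ( exists $clean{bar} && $clean{bar}{baz}[0] ) { }
        print scalar keys %clean, "\n";  # 0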
Re: Optimizing Memory consumption
by dave_the_m (Monsignor) on Nov 10, 2006 at 00:47 UTC
    I am considering using a central scalar with an integer value that can be accessed as a bit field
    In that case, you may find vec useful.
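    For example:

        use strict;
        use warnings;

        my $bits = "";                   # one string holds all the flags

        vec($bits, 1000, 1) = 1;         # set bit 1000
        print vec($bits, 1000, 1), "\n"; # 1
        print vec($bits, 999, 1), "\n";  # 0

        # over a thousand flags in a ~126-byte string, not 1000+ scalars
        print length($bits), "\n";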

    In perl 5.8.x on a 32-bit platform, a scalar holding an integer will typically use 16 bytes; a float 20; a string 28 + length of string; an array 52; an array slot 4; a hash 60; a hash slot (24 to 48) + length of key.

    Dave.

Re: Optimizing Memory consumption
by jfroebe (Parson) on Nov 09, 2006 at 20:29 UTC

    Hi,

    That's definitely pretty vague. How much memory perl uses varies a great deal (like any other language) depending on what you do with it, what platform you're running it on, and which version of perl and supporting libraries (OS libraries, for example) you use.

    I don't think the problem is with perl itself, though I could be wrong. The overzealous memory consumption probably comes from how you are accessing the data files. Are you reading the file entirely into memory, or are you accessing it piecemeal?
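    The difference I mean is roughly this (a sketch; the filename is made up):

        open my $fh, '<', 'data.txt' or die "open: $!";

        # Slurping: the whole 200 MB file lands in memory at once
        my @all_lines = <$fh>;

        # Reading piecemeal: only one line is in memory at a time
        seek $fh, 0, 0;
        while ( my $line = <$fh> ) {
            # process $line, keep only what you need
        }
        close $fh;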

    Can you provide some example code of how you're accessing the file?

    Jason L. Froebe

    Team Sybase member

    No one has seen what you have seen, and until that happens, we're all going to think that you're nuts. - Jack O'Neil, Stargate SG-1

Re: Optimizing Memory consumption
by talexb (Chancellor) on Nov 10, 2006 at 01:34 UTC

    I haven't seen any mention of putting this 200M of data into a database. Is that a possibility?
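    With DBD::SQLite, for instance, it could be as simple as this (a sketch; the table layout is invented):

        use DBI;

        my $dbh = DBI->connect( "dbi:SQLite:dbname=data.db", "", "",
                                { RaiseError => 1 } );

        $dbh->do("CREATE TABLE IF NOT EXISTS records (id INTEGER PRIMARY KEY, value TEXT)");

        # Pull in only the rows you need instead of holding all 200 MB in RAM
        my $sth = $dbh->prepare("SELECT value FROM records WHERE id = ?");
        $sth->execute(42);
        my ($value) = $sth->fetchrow_array;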

    Alex / talexb / Toronto

    "Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds

Re: Optimizing Memory consumption
by zentara (Archbishop) on Nov 10, 2006 at 12:45 UTC
    DBM::Deep has something in its README:
    REAL-TIME COMPRESSION EXAMPLE

    Here is a working example that uses the Compress::Zlib module to do real-time compression / decompression of keys & values with DBM::Deep Filters. Please visit <http://search.cpan.org/search?module=Compress::Zlib> for more on Compress::Zlib.

        use DBM::Deep;
        use Compress::Zlib;

        my $db = new DBM::Deep(
            file               => "foo-compress.db",
            filter_store_key   => \&my_compress,
            filter_store_value => \&my_compress,
            filter_fetch_key   => \&my_decompress,
            filter_fetch_value => \&my_decompress,
        );
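    The my_compress / my_decompress filters referenced above are defined in that same example, along these lines:

        sub my_compress {
            return Compress::Zlib::memGzip( $_[0] );
        }

        sub my_decompress {
            return Compress::Zlib::memGunzip( $_[0] );
        }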

    I'm not really a human, but I play one on earth. Cogito ergo sum a bum
Re: Optimizing Memory consumption
by mkirank (Chaplain) on Nov 11, 2006 at 15:53 UTC
    If your data structures are of a similar kind, you could use Data::Reuse to reduce the memory footprint.
Re: Optimizing Memory consumption
by sandfly (Beadle) on Nov 13, 2006 at 23:29 UTC
    I agree with most of the earlier posts that you may have a design issue. But there's a nice, crude alternative which may be of interest: where I work, we use an ActiveState build of Perl on Solaris. I assume it's a 64-bit build, because we can and do run scripts with memory usage over 2 GB; the same scripts run out of memory on Linux.
