Perl will release memory back to the OS -- if the circumstances are right. You can see this for yourself by trying the code snippets I posted here and here.
When your program calls for a large array to be allocated, perl goes to the OS for a new chunk of memory if it doesn't already have enough available. However, the new chunk of memory isn't allocated directly to the array being created; it is allocated to perl's general memory pool. If the logic of the program then goes on to fill the new chunk of memory with the array, and nothing else (like temporary variables, stack frames etc.) is allocated from that chunk of memory, then when you free the array it is possible that it will be returned to the OS.
However, if the logic of your program is such that the array is filled in dribs and drabs, then it is also possible that the new chunk of memory allocated will not be big enough and yet another, larger chunk may be called for. Also, if in the process of filling the array other demands for memory (those temp vars etc.) must be satisfied, then it is quite likely that some of the chunk will be used for that purpose as well. If those "other" sub-allocations from the big chunk requested from the OS persist past the point in your code where you free the large array, then although the array is no longer being used, the chunk allocated for it may still contain other variables that have not yet been freed. The array's memory will be available to perl to satisfy subsequent allocations, but the chunk as a whole cannot be released back to the OS until those "other" allocations have been freed. And by the time they are, some of the original array's memory may have been reused for other purposes.
As you can see, the picture is complicated. However, there are some steps that you can take to maximise the chance that the memory used for a large array can be released back to the OS. The major one is to pre-allocate the array to its final size in a single step. The easiest way to do this is to assign to $#array. Eg.
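A minimal sketch of what that assignment looks like (the size here is just an arbitrary example):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @array;

# Assigning to $#array (the index of the last element) extends the
# array to its final size in a single step, so perl can request one
# contiguous chunk from its pool rather than growing piecemeal.
$#array = 999_999;                 # @array now holds 1,000,000 (undef) slots

print scalar @array, "\n";         # 1000000

# Fill the preallocated slots by direct index assignment:
$array[ $_ ] = $_ * 2 for 0 .. $#array;
```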
However, this is not entirely without problems, nor is it the total solution. The problem is that once you have done it, you can no longer use some of the normal techniques for manipulating the array. If you push or unshift to the array having preallocated as above, you will be extending the array rather than using the space you've already allocated. This forces you into maintaining your own pointer into the array, explicitly assigning to the next "free" element and incrementing the pointer.
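That explicit-pointer discipline might look something like this (a sketch; the data being stored is invented for illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @array;
$#array = 9;                       # preallocate 10 slots

## push @array, $value;            # DON'T: this would extend the array
##                                 # past the space already allocated.

# Instead, keep your own "next free element" pointer:
my $next = 0;
for my $value ( map { $_ ** 2 } 1 .. 10 ) {
    $array[ $next++ ] = $value;    # fill the preallocated slots in order
}

print scalar @array, "\n";         # still 10: no extension occurred
```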
Note: I'm not advocating anyone should do this, but if you really need to maximise the chances of memory being returned to the OS, then this is the kind of step required to do it.
The second caveat is that when you pre-allocate the array this way, you are preallocating the space for the internal infrastructure of the array, but not for whatever you subsequently store in the elements of the array! The latter will be allocated from a different chunk of memory.
There is also an optimisation built in to perl that will cause it not to release the memory used by an array immediately, even if the programmer has apparently taken steps to indicate that the array is no longer required, if the logic of the program indicates that it may need to be recreated at some point in the future. As an example, if you have a sub that allocates a lexically scoped array and then discards it when the sub exits, the memory used by that array may not be returned to perl's memory pool immediately. This is because if you call the sub again, it is more efficient to re-use the array's structures and memory on the second and subsequent calls than to reconstruct everything from scratch. If you have a sub that creates a large array for working storage and that sub is called many times, then you will be benefiting from this optimisation without realising it.
All-in-all, it is generally better to let perl get on with managing its own memory allocation. Taking checkpoints of the OS's view of a perl process's memory allocation at different points in the life of that program is fraught with problems. I know, because I've (mis)spent some considerable time exploring this. :)
If you have a real need to manage the amount of memory used by your perl process, as I do for one of my pet projects that uses prodigious amounts of the stuff, then you have a long and somewhat tedious task in working out ways of achieving this. In most cases, I've found that the best way to minimize memory usage is to think hard about your algorithms and try to avoid allocating memory in the first place. There are a surprising number of ways that re-casting the obvious perl idioms allows you to manage large volumes of data without incurring the additional overhead of perl's basic datatypes. These mostly revolve around loading and storing the large volumes of data in scalars instead of arrays or hashes.
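As one hedged illustration of that idea (the sizes and values here are made up): a million integers held as individual array elements each pay the overhead of a full perl scalar, whereas vec() can pack them into a single string at a few bytes per value.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $N = 1_000_000;

# One scalar holding N 32-bit unsigned integers. Each perl scalar in a
# normal array costs tens of bytes of infrastructure; the packed form
# costs 4 bytes per value, all in one allocation.
my $packed = '';
vec( $packed, $N - 1, 32 ) = 0;      # preallocate ~4MB in a single step

# Store and fetch by index, array-style:
vec( $packed, 42, 32 ) = 12_345;
print vec( $packed, 42, 32 ), "\n";  # 12345

printf "packed storage: %d bytes\n", length $packed;
```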
If you have a particular reason for wishing to reduce or minimize the memory usage for a particular application, then try posting (small) chunks of your code where you are using large amounts of data and request assistance in finding ways to minimize those. Best of luck!
To pre-empt the question "Why not use a different language than perl for those applications that use large volumes of data?": I don't want to give up access to all of perl's great features -- regexes, memory management, built-ins, OO, CPAN etc. -- and I don't want to have to try to re-implement these in C. I did consider, and play with, embedding a perl interpreter into a C version of my application, but it creates more problems than it cures IMO.
In reply to Re: Freeing memory used by arrays