Re: What are the symptoms of a memory leak?
by oshalla (Deacon) on Feb 07, 2009 at 13:26 UTC
I sense your frustration and feel for you.
I would expect a leak to cause ever increasing memory use, at some rate related to the program's demand for new memory, and caused either by a failure to release memory properly or by an inability to reuse memory that has been released.
I agree that it is hard to see why memory use should suddenly balloon when processing $cnt++. So hard, in fact, that I would assume it's an artifact of the tools you have for observing the effects. I note from other postings that the $cnt++ is both immediately before and immediately after a subroutine exit. On subroutine exit Perl will be doing memory things, which look like more plausible suspects. Mind you, the effect may have occurred earlier, but was not immediately visible.
I also note from other postings that there is a hash involved in this problem. I know that the growth of a hash can cause sudden jumps in memory use. When an empty hash is created it is allocated a small number of chain bases. As the hash grows, that number is doubled as the number of entries approaches the number of bases. You can see how a large growing hash could cause a jump in memory demand.
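To watch the doubling happen, here is a minimal sketch. It assumes a Perl recent enough for Hash::Util to export bucket_info (which returns the key count and bucket count, among other things); bucket_growth is just a name I made up for the illustration.

```perl
use strict;
use warnings;
use Hash::Util qw(bucket_info);

# Insert $max keys one at a time and record the bucket count
# every time it changes.
sub bucket_growth {
    my ($max) = @_;
    my %h;
    my @steps;    # each entry: [ keys at the time of the change, buckets ]
    for my $i (1 .. $max) {
        $h{$i} = 1;
        my ($keys, $buckets) = bucket_info(\%h);
        push @steps, [ $keys, $buckets ]
            if !@steps || $steps[-1][1] != $buckets;
    }
    return @steps;
}

printf "%8d keys -> %8d buckets\n", @$_ for bucket_growth(100_000);
```

Each line printed marks the point at which the number of chain bases changed. The exact thresholds may differ between Perl versions.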
Some more background, based on observation:
As far as I can see, during each growth step, Perl needs the previous chain bases and the new set at the same time: it releases the previous chain bases only once it has finished rearranging the contents of the hash across the new bases. So a hash that's grown to 256M of chain bases has a trail of 128M, 64M, 32M and so on behind it.
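The arithmetic of that trail is easy to check. A small sketch, assuming sizes are powers of two in whatever unit you like; trail_total and its $start parameter are made up for the illustration, not anything Perl provides.

```perl
use strict;
use warnings;

# Sum the sizes of all the smaller arrays of chain bases a hash passed
# through while doubling its way up to $final (assumes it started at
# $start, default 1, and that both are powers of two).
sub trail_total {
    my ($final, $start) = @_;
    $start //= 1;
    my $total = 0;
    for (my $size = $final / 2; $size >= $start; $size /= 2) {
        $total += $size;
    }
    return $total;
}

printf "final %d, trail %d\n", 256, trail_total(256);
```

The trail always adds up to a little less than the final size, so if none of the holes were reused, the hash would have touched close to twice the space its final set of chain bases needs.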
Perl's memory management does not (as far as I know) attempt any kind of memory reorganisation in order to reclaim memory. So, when memory is released it leaves holes, all or part of which may be reused later -- memory will fragment. Much of the time, when lots of items of similar size are repeatedly allocated and released, things remain reasonably stable. You can see how a growing hash will leave holes of increasing size across the memory pool. A hash that grows up to some size, is deallocated and then grows again, may reuse the holes it used last time. But if one of "its" holes has had even a small piece chewed out of it, a new piece of memory will be required for the hash when it reaches its largest size.
Now, different systems behave differently, but Perl does something different when allocating very large lumps of memory. As the hash grows, Perl will allocate memory for it from a "general pool". When it gets big enough, it will go and get a separate lump of memory from the system. Further, Perl does not appear to give those lumps back, presumably hoping to reuse them later.
However, these unused lumps are only occupying virtual memory space, not real memory. System tools will show you the virtual space allocation, and may show you the real memory occupation. A large lump of memory which was used, but is no longer in use, will at first occupy some real memory but that will reduce over time as the system recycles it.
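If you want to see both figures from inside the program, here is a Linux-only sketch that reads them from /proc/self/status. The VmSize and VmRSS fields are what Linux reports there; memory_snapshot is a name made up for the illustration, and on other systems it simply returns an empty hash.

```perl
use strict;
use warnings;

# Linux-only sketch: read this process's virtual size (VmSize) and
# resident set size (VmRSS), in kB, from /proc/self/status.
# Returns a reference to an empty hash where that file is unavailable.
sub memory_snapshot {
    my %mem;
    open my $fh, '<', '/proc/self/status' or return \%mem;
    while (my $line = <$fh>) {
        $mem{$1} = $2 if $line =~ /^(VmSize|VmRSS):\s+(\d+)\s+kB/;
    }
    close $fh;
    return \%mem;
}

my $before = memory_snapshot();
my %big    = map { $_ => 1 } 1 .. 200_000;    # build a largish hash
my $after  = memory_snapshot();

printf "%-6s before %s kB, after %s kB\n",
    $_, $before->{$_} // 'n/a', $after->{$_} // 'n/a'
    for qw(VmSize VmRSS);
```

Taking snapshots at a few points in the program, rather than watching an external tool, makes it easier to tie a jump to a particular step.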
Given an apparently intractable and mysterious problem, I would set about trying to cut down the program that demonstrates the problem: removing as much as possible, until I find something that cannot be removed without the problem going away.
Unfortunately, this does not instantly solve the problem. From what I have seen so far, though, I would start by eliminating everything not related to the growth of a hash.