Why bother with the mark and sweep garbage collection on non-embedded interpreter shutdown?

Brethren,

I have read much about perl's garbage collection infrastructure, and I am familiar with the advantages and disadvantages of mark&sweep vs. reference counting. I understand that the reference counting takes care of most of the memory deallocation throughout the program's execution, and that upon shutdown of the interpreter, a full-on mark&sweep gc run is executed, and thus self-referential and circular-referential nodes can be collected and released to the kernel.

This makes sense to me in the context of a host program that runs an embedded interpreter, such as Apache running a mod_perl system that re-uses modules and would not want to accumulate memory leaks.

But for a normal perl program, let's say a simple non-interactive script running from the command line as a separate process, what benefit is there from doing exhaustive and time-consuming garbage collection just before exiting and after all productive work is done? When a process exits, ALL the memory is released back to the kernel - interpreter memory, program-space memory - all of it. Circular references or not. Isn't this last-minute gc just a big waste of time for the cases (outside of persistent runtime environments) where any garbage would be collected?

Or am I missing something?

Your enlightenment would be most appreciated.

Thanks,
- Paul

Re: Why bother with the mark and sweep garbage collection on non-embedded interpreter shutdown? by ikegami (Patriarch) on Jun 15, 2005 at 22:21 UTC
At the very least, it's needed in order to call the DESTROY of objects, which may release non-memory resources gracefully (e.g. closing a database connection without forcing the database to time out) and may take actions (e.g. logging the object's destruction).
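To make the point about DESTROY concrete, here is a minimal sketch. The Resource class, its field names and the @events log are made up for illustration, and ${^GLOBAL_PHASE} needs Perl 5.14 or later. It shows an ordinary object being destroyed by reference counting when its scope ends, while two objects that refer to each other are left over until the interpreter shuts down.

```perl
use strict;
use warnings;

my @events;    # records "name:phase" for every DESTROY call (illustrative only)

package Resource;    # hypothetical class, stands in for e.g. a database handle

sub new {
    my ($class, $name) = @_;
    return bless { name => $name }, $class;
}

sub DESTROY {
    my ($self) = @_;
    # ${^GLOBAL_PHASE} is "RUN" during normal execution and
    # "DESTRUCT" during global destruction (Perl 5.14+).
    my $event = $self->{name} . ':' . ${^GLOBAL_PHASE};
    push @events, $event;
    print "DESTROY $event\n";
}

package main;

{
    # No cycle: the refcount drops to zero at the end of this block,
    # so DESTROY runs immediately.
    my $plain = Resource->new('plain');
}

{
    # Circular reference: each object keeps the other alive, so neither
    # refcount reaches zero when the block ends.
    my $left  = Resource->new('cycle-left');
    my $right = Resource->new('cycle-right');
    $left->{peer}  = $right;
    $right->{peer} = $left;
}

print "end of main program, destroyed so far: @events\n";
```

When run as a script, the two cycle objects should report their DESTROY only after the "end of main program" line, during global destruction. The order in which leftover objects are destroyed at that point is not something to rely on.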
Re^2: Why bother with the mark and sweep garbage collection on non-embedded interpreter shutdown? by dynamo (Chaplain) on Jun 15, 2005 at 22:51 UTC
DESTROY methods are certainly a sufficient reason, at least in cases where they have to be called in a certain order (assuming that the M&S produces a tree of referenced nodes as a side effect, so the roots can be DESTROYed first and pass that along to leaves). If they don't need a certain order, I think just the sweep would do it. Any other known rationales? I had imagined that the reason we'd have to do the gc at the end _would_ be somehow related to memory resource freeing. Anyway, thank you for your answer.
Re: Why bother with the mark and sweep garbage collection on non-embedded interpreter shutdown? by merlyn (Sage) on Jun 15, 2005 at 22:22 UTC
Perhaps to ensure that all DESTROY methods get called, in case those do something non-trivial?
-- Randal L. Schwartz, Perl hacker
Be sure to read my standard disclaimer if this is a reply.
Re: Why bother with the mark and sweep garbage collection on non-embedded interpreter shutdown? by mstone (Deacon) on Jun 16, 2005 at 00:28 UTC
Slightly off-topic, I recall Dennis Ritchie talking about the Plan 9 garbage collection system, which (IIRC) balanced a fairly aggressive reference counter against a very mellow mark-and-sweep system. The basic idea was that the ref counter would do most of the heavy lifting, with the mark-and-sweep system getting rid of the rare loop before too awfully long.
Re: Why bother with the mark and sweep garbage collection on non-embedded interpreter shutdown? by adrianh (Chancellor) on Jun 16, 2005 at 23:19 UTC
> But for a normal perl program, let's say a simple non-interactive script running from the command line as a separate process, what benefit is there from doing exhaustive and time-consuming garbage collection just before exiting and after all productive work is done? When a process exits, ALL the memory is released back to the kernel - interpreter memory, program-space memory - all of it. Circular references or not. Isn't this last-minute gc just a big waste of time for the cases (outside of persistent runtime environments) where any garbage would be collected?

Lots of reasons :-)

Speed. Reference counting is usually slower than mark and sweep.

Memory leaks are important even for stand-alone scripts. Sure, decent memory management might not be an issue with a script if it's processing eight files. What about eight thousand? What about eight million?

Simplicity. You get to write simpler code if you have to deal with self-referential data structures. No more tedious mucking about with weak references or proxy objects to avoid memory leaks.
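For anyone who hasn't met the weak references mentioned above, here is a rough sketch of the workaround that reference counting currently asks for. The Node class and the parent/child field names are invented for the example; weaken comes from the core Scalar::Util module. The back-reference from child to parent is weakened, so it no longer keeps the parent alive and both objects are freed as soon as the block ends.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my @destroyed;    # names of objects whose DESTROY has run (illustrative only)

package Node;     # hypothetical class for the example

sub new {
    my ($class, $name) = @_;
    return bless { name => $name }, $class;
}

sub DESTROY {
    my ($self) = @_;
    push @destroyed, $self->{name};
}

package main;

{
    my $parent = Node->new('parent');
    my $child  = Node->new('child');

    $parent->{child} = $child;     # strong reference, parent owns child
    $child->{parent} = $parent;    # back-reference would complete a cycle...
    weaken($child->{parent});      # ...so make it weak: it no longer counts
}

# With the weak back-reference, both objects are released at scope exit
# rather than lingering until interpreter shutdown.
print "destroyed at scope exit: ", join(', ', sort @destroyed), "\n";
```

Leave out the weaken call and neither DESTROY runs until the program exits, which is the bookkeeping a cycle-aware collector would spare you.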

