PerlMonks  

Why bother with the mark and sweep garbage collection on non-embedded interpreter shutdown?

by dynamo (Chaplain)
on Jun 15, 2005 at 22:18 UTC ( [id://467091]=perlmeditation )

Brethren,

I have read much about Perl's garbage collection infrastructure, and I am familiar with the advantages and disadvantages of mark & sweep vs. reference counting. I understand that reference counting takes care of most memory deallocation throughout the program's execution, and that upon shutdown of the interpreter a full mark & sweep gc run is executed, so that self-referential and circularly referenced nodes can be collected and released to the kernel.

This makes sense to me in the context of a host program that runs an embedded interpreter, such as Apache running a mod_perl system that re-uses modules and would not want to accumulate memory leaks.

But for a normal Perl program, let's say a simple non-interactive script running from the command line as a separate process, what benefit is there from doing exhaustive and time-consuming garbage collection just before exiting and after all productive work is done? When a process exits, ALL the memory is released back to the kernel - interpreter memory, program-space memory - all of it. Circular references or not. Isn't this last-minute gc just a big waste of time for the cases (outside of persistent runtime environments) where any garbage would be collected?

Or am I missing something?

Your enlightenment would be most appreciated.

Thanks,
- Paul


Replies are listed 'Best First'.
Re: Why bother with the mark and sweep garbage collection on non-embedded interpreter shutdown?
by ikegami (Patriarch) on Jun 15, 2005 at 22:21 UTC
    At the very least, it's needed in order to call the DESTROY of objects, which may release non-memory resources gracefully (e.g. closing a database connection without forcing the database to time out) and may take actions (e.g. logging the object's destruction).
      DESTROY methods are certainly a sufficient reason, at least in cases where they have to be called in a certain order (assuming that the M&S produces a tree of referenced nodes as a side effect, so the roots can be DESTROYed first and pass that along to leaves). If they don't need a certain order, I think just the sweep would do it.

      Any other known rationales? I had imagined that the reason we'd have to do the gc at the end _would_ be somehow related to memory resource freeing.

      Anyway, thank you for your answer.
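
      The DESTROY point above can be sketched as follows. This is a minimal illustration, not a description of the interpreter's internals: the Node class and its @destroyed log are hypothetical names, and ${^GLOBAL_PHASE} assumes Perl 5.14 or later. A plain object is destroyed as soon as its reference count hits zero, while two objects that reference each other are not destroyed while the program is running.

```perl
use strict;
use warnings;

{
    package Node;

    our @destroyed;    # hypothetical log of DESTROY calls, for illustration only

    sub new {
        my ($class, $name) = @_;
        return bless { name => $name, peer => undef }, $class;
    }

    sub DESTROY {
        my ($self) = @_;
        # Record which interpreter phase the destructor ran in (RUN, DESTRUCT, ...).
        push @destroyed, "$self->{name}:${^GLOBAL_PHASE}";
        print STDERR "DESTROY $self->{name} during ${^GLOBAL_PHASE}\n";
    }
}

sub make_plain {
    my $n = Node->new('plain');
    return;    # last reference goes away here, so DESTROY runs immediately
}

sub make_cycle {
    my $x = Node->new('x');
    my $y = Node->new('y');
    $x->{peer} = $y;
    $y->{peer} = $x;
    return;    # each node still holds the other, so neither count reaches zero
}

make_plain();
make_cycle();
print "destroyed so far: @Node::destroyed\n";    # prints "destroyed so far: plain:RUN"
```

      Run as a script, this should also report DESTROY calls for x and y in the DESTRUCT phase, i.e. only at interpreter shutdown; the order of those two calls is not guaranteed.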

Re: Why bother with the mark and sweep garbage collection on non-embedded interpreter shutdown?
by merlyn (Sage) on Jun 15, 2005 at 22:22 UTC
Re: Why bother with the mark and sweep garbage collection on non-embedded interpreter shutdown?
by mstone (Deacon) on Jun 16, 2005 at 00:28 UTC

    Slightly off-topic, I recall Dennis Ritchie talking about the Plan9 garbage collection system, which (IIRC) balanced a fairly aggressive reference counter against a very mellow mark-and-sweep system. The basic idea was that the ref counter would do most of the heavy lifting, with the mark-and-sweep system getting rid of the rare loop before too awfully long.

Re: Why bother with the mark and sweep garbage collection on non-embedded interpreter shutdown?
by adrianh (Chancellor) on Jun 16, 2005 at 23:19 UTC
    But for a normal Perl program, let's say a simple non-interactive script running from the command line as a separate process, what benefit is there from doing exhaustive and time-consuming garbage collection just before exiting and after all productive work is done? When a process exits, ALL the memory is released back to the kernel - interpreter memory, program-space memory - all of it. Circular references or not. Isn't this last-minute gc just a big waste of time for the cases (outside of persistent runtime environments) where any garbage would be collected?

    Lots of reasons :-)

    • Speed. Reference counting is usually slower than mark and sweep.
    • Memory leaks are important even for standalone scripts. Sure, decent memory management might not be an issue for a script that processes eight files. What about eight thousand? What about eight million?
    • Simplicity. You get to write simpler code if you have to deal with self-referential data structures. No more tedious mucking about with weak references or proxy objects to avoid memory leaks.
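
    The "mucking about with weak references" mentioned in the list above looks roughly like this under Perl's reference counting. It is a sketch under stated assumptions: the Tracked class, its $live counter, and build_pair are hypothetical names for illustration; weaken comes from the core Scalar::Util module.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

{
    package Tracked;

    our $live = 0;    # hypothetical counter of objects not yet destroyed

    sub new {
        my ($class) = @_;
        $live++;
        return bless {}, $class;
    }

    sub DESTROY { $live-- }
}

# Build a parent/child pair that point at each other.
# With $use_weak set, the child's back-reference does not count
# towards the parent's reference count, so the cycle cannot leak.
sub build_pair {
    my ($use_weak) = @_;
    my $parent = Tracked->new;
    my $child  = Tracked->new;
    $parent->{child} = $child;
    $child->{parent} = $parent;
    weaken($child->{parent}) if $use_weak;
    return;
}

build_pair(0);
print "live after strong cycle:   $Tracked::live\n";    # 2: both objects leaked
build_pair(1);
print "live after weakened cycle: $Tracked::live\n";    # still 2: the new pair was freed
```

    Call build_pair(0) in a loop over eight million files and the leaked pairs add up; with the weakened back-reference each pair is freed as soon as the sub returns.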
