PerlMonks  

Re^4: How to deal with the fact that Perl is not releasing memory

by carlbolduc (Novice)
on Jul 07, 2013 at 18:01 UTC (#1042993)


in reply to Re^3: How to deal with the fact that Perl is not releasing memory
in thread How to deal with the fact that Perl is not releasing memory

I edited my reply to include the additional subs.

grab_page() can sometimes fetch a humongous email thread. It is also called several hundred times a day at work.

A bit of background... Our search engine indexes the emails that we exchange with our clients. In doing so, it also generates an HTML version of each email. By querying the search engine with the support case number (I'm a support agent), I can get all the exchanges of a support ticket in the correct order. By fetching the HTML version and extracting the first part of each email thread, I can reconstruct a perfect flow of communication. This is very useful when taking over another support agent's case, for example.

After watching the video mentioned by Dave, it seems that all the memory used by those ajax calls will stay in malloc's pool inside the process and will not be returned to the OS.

Am I missing something?


Re^5: How to deal with the fact that Perl is not releasing memory
by Anonymous Monk on Jul 07, 2013 at 21:10 UTC
    As a rule, memory that has been allocated is not released back to the operating system; once freed, it stays in the process and is reused for later allocations. See e.g. http://c-faq.com/malloc/free2os.es.html. The exception to this is mmapped memory, typically used for fast file access.

    I suspect your primary problem is the actual parsing of the humongous HTML pages. It just takes a lot of memory -- but the memory should still be freed and reused by the Perl interpreter. But I found this choice quote from HTML::TreeBuilder:

    4. previous versions of HTML::TreeBuilder required you to call $tree->delete() to erase the contents of the tree from memory when you're done with the tree. This is not normally required anymore. See "Weak References" in HTML::Element for details.

    One general technique for keeping the memory use to a minimum is to fork off a child for every (large) request, and pass the bare-minimum result (here, @messages) to the parent. It's far from elegant and relatively difficult to implement, and the peak memory use will still be the same -- just not in the parent process. A last-resort option.
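The fork-per-request idea above could look roughly like the following. This is only a minimal sketch, assuming a Unix-like system where fork() is available. run_in_child() is a hypothetical helper name, and the anonymous sub merely stands in for the real fetch-and-parse step (grab_page() and the HTML parsing). The result travels back over a pipe, serialized with the core Storable module.

```perl
use strict;
use warnings;
use POSIX ();
use Storable qw(freeze thaw);

# Hypothetical helper: run $work (a code ref returning an array ref) in a
# child process and hand only its result back to the parent.
sub run_in_child {
    my ($work) = @_;

    pipe(my $reader, my $writer) or die "pipe failed: $!";

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: do the memory-hungry work, send back the small result.
        close $reader;
        binmode $writer;
        my $result = eval { $work->() };
        POSIX::_exit(1) unless defined $result;
        print {$writer} freeze($result);
        close $writer;        # flushes the buffered data into the pipe
        POSIX::_exit(0);      # skip END blocks and destructors in the child
    }

    # Parent: read everything first, then reap the child.
    close $writer;
    binmode $reader;
    my $frozen = do { local $/; <$reader> };
    close $reader;
    waitpid($pid, 0);
    die "child exited with status $?" if $? != 0;

    return thaw($frozen);
}

# Stand-in for the expensive step: build something big, keep only a little.
my $messages = run_in_child(sub {
    my @big = map { "message $_" } 1 .. 100_000;
    return [ @big[0 .. 2] ];
});

print "$_\n" for @$messages;
```

The large @big array only ever exists in the child, so whatever the child allocated goes back to the OS when it exits. The parent holds just the three strings that came through the pipe. Peak memory use is unchanged, as noted above; it just moves out of the long-running process.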
