
Re^3: How to deal with the fact that Perl is not releasing memory

by Anonymous Monk
on Jul 06, 2013 at 21:02 UTC (#1042946)

in reply to Re^2: How to deal with the fact that Perl is not releasing memory
in thread How to deal with the fact that Perl is not releasing memory

Everything about the code you posted looks fine -- I'm not seeing any stray variables or anything and everything looks self-contained. You need to post more.

How large can the result of grab_page() be? Expect your peak memory use to be that number multiplied by 3 or so. (Even then it should not slow the script down.)

One bit that looks wasteful is this part, which causes a fairly large string copy (I think):

$content = template 'messages' => {messages => \@messages};
{ content => $content }
Throw away the temporary variable. I don't think it'll fix your problem, but it might slow down the memory bloat a bit:
{ content => template( 'messages' => {messages => \@messages} ) }

Re^4: How to deal with the fact that Perl is not releasing memory
by carlbolduc (Novice) on Jul 07, 2013 at 18:01 UTC

    I edited my reply to include the additional subs.

    grab_page() can sometimes fetch a humongous email thread. It is also called several hundred times a day at work.

    A bit of background... Our search engine indexes the emails that we exchange with our clients. In doing so, it also generates an HTML version of each email. By querying the search engine with the support case number (I'm a support agent), I can get all the exchanges of a support ticket in the correct order. By fetching the HTML version and extracting the first part of each email thread, I can reconstitute a perfect flow of communication. This is very useful when taking over another support agent's case, for example.

    After watching the video mentioned by Dave, it seems that all the memory used by those AJAX calls will stay in malloc's pool and not be returned to the OS.

    Am I missing something?

      As a rule, anything allocated will not be released to the operating system while the process is running. The exception to this is mmapped memory, typically used for fast file access.

      I suspect your primary problem is the actual parsing of the humongous HTML pages. It just takes a lot of memory -- but the memory should still be freed and reused by the Perl interpreter. That said, I found this choice quote in the HTML::TreeBuilder documentation:

      4. previous versions of HTML::TreeBuilder required you to call $tree->delete() to erase the contents of the tree from memory when you're done with the tree. This is not normally required anymore. See "Weak References" in HTML::Element for details.

      One general technique for keeping the memory use to a minimum is to fork off a child for every (large) request, and pass the bare-minimum result (here, @messages) to the parent. It's far from elegant and relatively difficult to implement, and the peak memory use will still be the same -- just not in the parent process. A last-resort option.
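For what it's worth, here is a minimal sketch of that fork approach, using only core modules (Storable and POSIX). The names run_in_child and extract_messages are made up for illustration; extract_messages is just a stand-in for the real fetch-and-parse step, and I have not tried this inside your web app. The child does the memory-hungry work, serializes only the small result over a pipe, and exits, so the large temporaries never live in the parent.

```perl
use strict;
use warnings;
use POSIX ();
use Storable qw(store_fd fd_retrieve);

# Hypothetical helper: run $work->(@args) in a child process and hand its
# result (which must be a reference) back to the parent over a pipe.
sub run_in_child {
    my ($work, @args) = @_;

    pipe(my $reader, my $writer) or die "pipe failed: $!";

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: do the heavy work, write the serialized result, then leave.
        close $reader;
        my $ok = eval {
            store_fd($work->(@args), $writer) or die "store_fd failed\n";
            close($writer) or die "close failed: $!\n";
            1;
        };
        warn "child error: $@" unless $ok;
        # _exit skips END blocks and destructors inherited from the parent.
        POSIX::_exit($ok ? 0 : 1);
    }

    # Parent: read the result, then reap the child.
    close $writer;
    my $result = eval { fd_retrieve($reader) };
    close $reader;
    waitpid($pid, 0);
    die "child failed (wait status $?)\n" if $? != 0 || !defined $result;
    return $result;
}

# Hypothetical stand-in for the expensive step (fetching and parsing a page).
sub extract_messages {
    my ($case_number) = @_;
    my $size = 10_000_000;
    my $page = 'x' x $size;    # simulates a large temporary string
    return [
        "first message of case $case_number",
        "second message of case $case_number",
    ];
}

my $messages = run_in_child(\&extract_messages, 12345);
print "$_\n" for @$messages;
```

The peak memory use is the same as before, but it is paid by a short-lived child whose memory goes back to the OS when it exits. Storable is used here only because it ships with Perl; any serialization format that round-trips your @messages would do.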
