Beefy Boxes and Bandwidth Generously Provided by pair Networks BBQ
No such thing as a small change
 
PerlMonks  

How to deal with the fact that Perl is not releasing memory

by carlbolduc (Novice)
on Jul 06, 2013 at 00:42 UTC ( #1042824=perlquestion: print w/ replies, xml ) Need Help??
carlbolduc has asked for the wisdom of the Perl Monks concerning the following question:

Dear knowledgeable PerlMonks,

I recently learned the hard way that Perl is not releasing memory back to the OS (at least on Windows). I create a small Dancer web app that renders a list of emails following an Ajax request. After several days of usage, I noticed that the web page was getting very slow and found out that the Perl process was using 4 GB of memory.

Besides killing and restarting the app on a regular basis, is there a good practice to deal with this situation?

I currently use Plack to start my Dancer web app.

Carl

Comment on How to deal with the fact that Perl is not releasing memory
Re: How to deal with the fact that Perl is not releasing memory
by syphilis (Canon) on Jul 06, 2013 at 01:12 UTC
    Sounds like the program is designed (either intentionally or accidentally) to need more memory as it runs. Either way, that's not related to the memory-releasing characteristics of perl. (You can't expect a process to release memory that it thinks it's still using.)

    Do you think it should need extra memory as it runs ? If not, then there's probably a memory leak (unused memory not being freed for re-use by the process) that needs to be fixed.

    Cheers,
    Rob
Re: How to deal with the fact that Perl is not releasing memory
by davido (Archbishop) on Jul 06, 2013 at 01:46 UTC

    Watching Tim Bunce - Profiling Memory Usage (YAPC-NA 2013) is an excellent way to come to a better understanding of how Perl manages memory.

    Most memory leaks can be traced back to XS code, or to circular references. There are other possible sources (some are mentioned in his talk), but those are the big ones. At any rate, I really enjoyed that talk, and although I thought I knew a lot about Perl's memory management, the talk showed me that I still had a lot to learn, while also managing to provide a good dose of enlightenment. :)


    Dave

      Thank you, I will watch this talk.
Re: How to deal with the fact that Perl is not releasing memory
by kcott (Abbot) on Jul 06, 2013 at 02:36 UTC

    G'day Carl,

    Welcome to the monastery.

    There are a number of potential reasons why you're seeing this behaviour; it may not be limited to a single cause. There may be problems in your code, in modules you're using or both. Perhaps data is being continually added to some variable which is simply growing over time; perhaps you have an issue with Circular References; perhaps it's something else.

    So, without having seen your code, it's rather difficult (as I'm sure you'll appreciate) to either identify the problem or advise how to deal with it.

    If you can track down the problem to some part of your code, that would be a start. Test::LeakTrace may be helpful here. There's also a number of tools in CPAN: Development Support that might be useful.

    Take a look at the documentation for any modules you're using. There may be caveats regarding memory issues (and possibly workarounds). Also check for any bug reports that might be related: there may be fixes in more recent modules, patches you can apply or workarounds you can use. There may be modules with similar functionality (but without the same problems) that you could use instead.

    If you identify where in your code the problem lies but don't know how to fix it, you can post that code here and ask a more specific question. If you do this, you'll get the best answers if you follow the guidelines in "How do I post a question effectively?".

    -- Ken

      I will investigate potential memory leaks.

      ***EDIT*** Here is the complete code:

      package CommunicationStream; use Dancer ':syntax'; use Dancer::Plugin::Ajax; use URI::Escape; use WWW::Mechanize; use HTML::TreeBuilder; use HTML::FormatText; use DateTime; use DateTime::Format::Strptime; use SOAP::Lite; our $VERSION = '0.1'; $ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0; # SOAP Lite configuration #my $soap = SOAP::Lite... my $firstResult = 0; get '/:ref' => sub { my @messages = (); my ($modifiedDate, $quickViewLinks, $linkToDocuments, $outlookProt +ocolLinks, $containsAttachment) = getCachedDocumentUris(params->{ref} +, 0, 5); #fix this to use the number of documents my $i = 0; foreach my $quickViewLink (@{$quickViewLinks}) { my $text = grab_page($quickViewLink); my $datetime = getDisplayDate(${$modifiedDate}[$i]); my ($date, $timestamp) = split(/\|/, $datetime); if (index($text, "class=\"email") != -1) { if (${$containsAttachment}[$i] eq "true") { push(@messages, {date => $date, timestamp => $timestamp, l +ink => "<a href=\"" . ${$outlookProtocolLinks}[$i] . "\"><img src=\"/ +images/emailInbound.gif\" /></a>", text => $text, attachment => "<img + src=\"/images/emailHasAttach.gif\" />"}); } else { push(@messages, {date => $date, timestamp => $timestamp, + link => "<a href=\"" . ${$outlookProtocolLinks}[$i] . "\"><img src=\ +"/images/emailInbound.gif\" /></a>", text => $text}); } } elsif (index($text,"||") != -1) { my @linkAndText = split(/\|\|/, $text); push(@messages, {date => $date, timestamp => $timestamp, l +ink => $linkAndText[0], text => $linkAndText[1]}); } else { push(@messages, {date => $date, timestamp => $timestamp, l +ink => "<a href=\"" . ${$linkToDocuments}[$i] . "\"><img src=\"/image +s/comment_icon.gif\" /></a>", text => $text}); } $i++; } template 'index' => {messages => \@messages}; }; ajax '/getNextResults:firstResultAndRef' => sub { my $content = ''; my ($salesforceRef, $firstResult) = split(/\|/, params->{firstResu +ltAndRef}); my @messages = (); my ($modifiedDate, $quickViewLinks, $linkToDocuments, $outlookProtoc +olLinks) = getCachedDocumentUris($salesforceRef, $firstResult, 5); my $i = 0; foreach my $quickViewLink (@{$quickViewLinks}) { my $text = grab_page($quickViewLink); my $datetime = getDisplayDate(${$modifiedDate}[$i]); my ($date, $timestamp) = split(/\|/, $datetime); if (index($text, "class=\"email") != -1) { push(@messages, {date => $date, timestamp => $timestamp, l +ink => "<a href=\"" . ${$outlookProtocolLinks}[$i] . "\"><img src=\"/ +images/emailInbound.gif\" /></a>", text => $text}); } elsif (index($text,"||") != -1) { my @linkAndText = split(/\|\|/, $text); push(@messages, {date => $date, timestamp => $timestamp, l +ink => $linkAndText[0], text => $linkAndText[1]}); } else { push(@messages, {date => $date, timestamp => $timestamp, l +ink => "<a href=\"" . ${$linkToDocuments}[$i] . "\"><img src=\"/image +s/comment_icon.gif\" /></a>", text => $text}); } $i++; } $content = template 'messages' => {messages => \@messages}; { content => $content } ; }; ajax '/getAllResults:firstResultAndRef' => sub { my $content = ''; my ($salesforceRef, $firstResult) = split(/\|/, params->{firstResu +ltAndRef}); my @messages = (); my ($modifiedDate, $quickViewLinks, $linkToDocuments, $outlookProtoc +olLinks) = getCachedDocumentUris($salesforceRef, $firstResult, 1000); my $i = 0; foreach my $quickViewLink (@{$quickViewLinks}) { my $text = grab_page($quickViewLink); my $datetime = getDisplayDate(${$modifiedDate}[$i]); my ($date, $timestamp) = split(/\|/, $datetime); if (index($text, "class=\"email") != -1) { push(@messages, {date => $date, timestamp => $timestamp, l +ink => "<a href=\"" . ${$outlookProtocolLinks}[$i] . "\"><img src=\"/ +images/emailInbound.gif\" /></a>", text => $text}); } elsif (index($text,"||") != -1) { my @linkAndText = split(/\|\|/, $text); push(@messages, {date => $date, timestamp => $timestamp, l +ink => $linkAndText[0], text => $linkAndText[1]}); } else { push(@messages, {date => $date, timestamp => $timestamp, l +ink => "<a href=\"" . ${$linkToDocuments}[$i] . "\"><img src=\"/image +s/comment_icon.gif\" /></a>", text => $text}); } $i++; } $content = template 'messages' => {messages => \@messages}; { content => $content } ; }; sub getResults { my ($query, $firstResult, $numberOfResults) = ($_[0], $_[1], $_[2] +); my @params = (SOAP::Data->name("p_Params"=> \SOAP::Data->value(SOAP::Data-> +name("BasicQuery" => $query) -> type("string"), SOAP::Data-> +name("NumberOfResults" => $numberOfResults), SOAP::Data-> +name("SortCriteria" => "ModifiedDateDescending"), SOAP::Data-> +name("NeedCachedDocumentUris" => "true"), SOAP::Data-> +name("FirstResult" => $firstResult), SOAP::Data-> +name("TimeZoneOffset" => 4), SOAP::Data-> +name("NeededFields" => + \SOAP::Data->name("string" => "\@sysoutlookuri"))))); my $result = $soap->call($method => @params); return $result; } sub getCachedDocumentUris { my @URIs; my ($salesforceRef, $firstResult, $numberOfResults) = ($_[0], $_[1 +], $_[2]); my $SearchResults = getResults("$salesforceRef \@sysfiletype==(exc +hangemessage,SFCaseComment)", $firstResult, $numberOfResults); my @modifiedDate = $SearchResults->valueof('//PerformQueryResponse +/PerformQueryResult/Results/QueryResult/ModifiedDate'); my @tmpURIs = $SearchResults->valueof('//PerformQueryResponse/Perf +ormQueryResult/Results/QueryResult/CachedDocumentUri'); my @clickableURIs = $SearchResults->valueof('//PerformQueryRespons +e/PerformQueryResult/Results/QueryResult/TargetUri'); my @outlookURIs = $SearchResults->valueof('//PerformQueryResponse/ +PerformQueryResult/Results/QueryResult/Fields/ResultField/Value'); my @containsAttachment = $SearchResults->valueof('//PerformQueryRe +sponse/PerformQueryResult/Results/QueryResult/ContainsAttachment'); foreach my $URI (@tmpURIs) { my ($LeftURI, $RemainingURI) = split('&docid=', $URI); my ($MiddleURI, $RightURI) = split('&q=', $RemainingURI); $MiddleURI =~ s/:/%3A/g; $MiddleURI =~ s/@/%40/g; $MiddleURI =~ s/\//%2F/g; $MiddleURI =~ s/\$/%24/g; $MiddleURI =~ s//%C3%A9/g; $MiddleURI =~ s/\+/%2b/g; $MiddleURI =~ s/Messages%2bTrait%C3%A9s/Messages\+Trait%C3%A9s +/g; #push (@URIs, uri_escape($URI)); push (@URIs, $LeftURI . '&docid=' . $MiddleURI . '&q=' . $Righ +tURI); } return (\@modifiedDate, \@URIs, \@clickableURIs, \@outlookURIs, \@ +containsAttachment); } sub grab_page { my @SupportAgents = ("Carl Bolduc", "Second Dude", "Third Dude"); # using shift to accept parameter passed to method my $urlString = shift; my $username = "domain\\user"; my $password = "great password"; my $mech = WWW::Mechanize->new(autocheck => 0); $mech->agent('Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.1) Gecko/ +20100101 Firefox/4.0.1'); $mech->credentials($username, $password); my $response = $mech->get($urlString); my $page_contents = $mech->content(); my $Format = HTML::FormatText->new(leftmargin => 0); my $TreeBuilder = HTML::TreeBuilder->new; $TreeBuilder->parse($page_contents); my $Parsed = $Format->format($TreeBuilder); my (@EmailsEnglish, @EmailsFrench); my $FirstEmail; # is this an email exchange if (index($Parsed,"From:") != -1) { @EmailsEnglish = split(/From:/,$Parsed); $FirstEmail = $EmailsEnglish[0]; $FirstEmail =~ s/\n/<br>/g; $FirstEmail =~ s/-+\s?Original Message\s?-+//g; $FirstEmail =~ s/(<br>){3,}/<br>/g; return "<div class=\"email\">$FirstEmail</div>"; } elsif (index($Parsed,"De:") != -1) { @EmailsFrench = split(/De:/,$Parsed); $FirstEmail = $EmailsFrench[0]; $FirstEmail =~ s/\n/<br>/g; return "<div class=\"email\">$FirstEmail</div>"; } # is this a mantis exchange elsif (index($Parsed,"================") != -1) { my $LastNote = substr($Parsed,rindex($Parsed,"---------------- +------------------------------------------------------",rindex($Parse +d, "----------------------------------------------------------------- +-----") -1)); #$LastNote =~ s/-{11,}//g; $LastNote = substr($LastNote,0,index($LastNote,"===========")) +; my $LastNoteAuthor = substr($LastNote,0,rindex($LastNote,"---- +------------------------------------------------------------------")) +; $LastNoteAuthor =~ s/-{11,}//g; # Extract the link to the mantis comment my $MantisLink = ""; if ($LastNoteAuthor =~ m/http:\/\/mantis\/view.php\?id=\d*#c\d +*/) { $MantisLink = $&; } # Detect if Dev or Support is speaking foreach (@SupportAgents) { if (index($LastNoteAuthor,$_) != -1) { $LastNoteAuthor = "Support:"; last; } else { $LastNoteAuthor = "Dev:"; } } my $LastNoteMessage = substr($LastNote,rindex($LastNote,"----- +-----------------------------------------------------------------")); $LastNoteMessage =~ s/-{11,}//g; $LastNoteMessage =~ s/\n/<br>/g; $LastNoteMessage =~ s/(<br>){3,}/<br>/g; return "<a href=\"" . $MantisLink . "\" target=\"_blank\"><img + src=\"/images/mantis.gif\" /></a>||<div class=\"mantis\">" . $LastNo +teAuthor . "<br>" . $LastNoteMessage . "</div>"; #return "<div class=\"mantis\">" . $LastNoteAuthor . "<br>" . +$LastNoteMessage . "</div>"; } else { $Parsed =~ s/\n/<br>/g; $Parsed =~ s/(<br>){3,}/<br>/g; #salesforce comment return "<div class=\"comment\">$Parsed</div>"; } } sub getDisplayDate { my $modifiedDate = $_[0]; #"20090103 12:00"; my $format = new DateTime::Format::Strptime( pattern => '%Y-%m-%dT%H:%M:%S', time_zone => 'GMT', ); my $date = $format->parse_datetime($modifiedDate); #print $date->strftime("%Y%m%d %H:%M %Z")."\n"; $date->set_time_zone("America/New_York"); return $date->day_abbr().", ".$date->month_abbr()." ".$date->strft +ime("%d|%H:%M"); } true;

      My understanding was that each Ajax request was reassigning the variables. Reading all the comments here, I am definitely doing something wrong.

      Thank you kcott.

        Everything about the code you posted looks fine -- I'm not seeing any stray variables or anything and everything looks self-contained. You need to post more.

        How large can the result of grab_page() be? Expect your peak memory use to be that number multiplied by 3 or so. (Even then it should not slow the script down.)

        One bit that looks wasteful are these parts that cause a fairly large string copy (I think):

        $content = template 'messages' => {messages => \@messages}; { content => $content }
        Throw away the temporary variable and I don't think it'll fix your problem but it might slow down the memory bloat a bit:
        { content => template( 'messages' => {messages => \@messages} ) }

        Firstly, I'm not a Dancer user. What follows is just a suggestion for something to try; it may not help at all!

        In each of your coderefs (get ... sub { ...} [1 instance] and ajax ... sub { ...} [2 instances]), you have code like this:

        my @messages = (); ... template '...' => {messages => \@messages};

        When they (implicitly) return, @messages will not be garbage-collected while the reference (\@messages) still exists.

        I don't know why that reference might still exist. I followed it through Dancer (1.3116) source:

        sub template { Dancer::Template::Abstract->template(@_) }

        and then through Dancer::Template::Abstract (1.3116) source:

        sub template { my ($class, $view, $tokens, $options) = @_; ...

        but then, not knowing what templating engine you were using, was unable to continue. I'll leave that as an exercise for you.

        You can ensure @messages is garbage-collected by using this code instead (which contructs an anonymous arrayref):

        my @messages = (); ... template 'XXX' => {messages => [ @messages ]};

        Expanding @messages will incur some overhead but that appears to be very small. Also, you might just be deferring your memory leak problem.

        Be aware that memory may not be leaked on every cycle: a linear increase in memory usage may not be occurring at all. It might happen in code that's only run conditionally or perhaps as a result of some exception.

        -- Ken

Re: How to deal with the fact that Perl is not releasing memory
by sundialsvc4 (Monsignor) on Jul 06, 2013 at 19:12 UTC

    Start by moving the my @messages = (); declaration outside of all subs.   The variable is shared among several subs.

    The code in each case appears to just be pushing messages onto an array.   Maybe those “not-listed subroutines” actually contain the bug.

      It's building a response to an HTTP request. By moving the variable declaration outside, it would surely "leak" memory and the "responses" would soon likely be gigabytes in size...

        (Instead of blindly down-voting ...) would you please point out my error?   Seeing a negative number tells me nothing, although I am quite used to it by now.   Obviously, the my variable must be declared somewhere, and in addition, it must be set to empty-list or undef (if this is running in a persistent environment such as mod_perl).   It is genuinely my understanding that declaring the variable within one (in-lined) sub, but using it in another, is Not A Good Thing.™   Please explain the error of my ways.

        Edit:   Was the source-code revised since the first time I saw it, or is it just that my glasses are dirty?   I don’t think I saw multiple my declarations for the array.   More cof-fee... more cof-fee...

        It occurs to me also in re-reading that the my declaration is also the only place where the variable in question is set to empty-list.   Therefore, if the code repeatedly followed another path, the array would continue to grow.   (Although I’d expect that you would be able to see that happening very easily.)

        I also don’t see anything in the fragment that was posted that would account for a runaway memory leak.

        I really want to know, as any Monk would do, where I went wrong on this comment!   Thanks in advance.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://1042824]
Approved by Old_Gray_Bear
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others drinking their drinks and smoking their pipes about the Monastery: (8)
As of 2014-04-17 19:31 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    April first is:







    Results (454 votes), past polls