Beefy Boxes and Bandwidth Generously Provided by pair Networks
"be consistent"
 
PerlMonks  

How to deal with the fact that Perl is not releasing memory

by carlbolduc (Novice)
on Jul 06, 2013 at 00:42 UTC ( [id://1042824]=perlquestion: print w/replies, xml ) Need Help??

carlbolduc has asked for the wisdom of the Perl Monks concerning the following question:

Dear knowledgeable PerlMonks,

I recently learned the hard way that Perl is not releasing memory back to the OS (at least on Windows). I create a small Dancer web app that renders a list of emails following an Ajax request. After several days of usage, I noticed that the web page was getting very slow and found out that the Perl process was using 4 GB of memory.

Besides killing and restarting the app on a regular basis, is there a good practice to deal with this situation?

I currently use Plack to start my Dancer web app.

Carl

Replies are listed 'Best First'.
Re: How to deal with the fact that Perl is not releasing memory
by davido (Cardinal) on Jul 06, 2013 at 01:46 UTC

    Watching Tim Bunce - Profiling Memory Usage (YAPC-NA 2013) is an excellent way to come to a better understanding of how Perl manages memory.

    Most memory leaks can be traced back to XS code, or to circular references. There are other possible sources (some are mentioned in his talk), but those are the big ones. At any rate, I really enjoyed that talk, and although I thought I knew a lot about Perl's memory management, the talk showed me that I still had a lot to learn, while also managing to provide a good dose of enlightenment. :)


    Dave

      Thank you, I will watch this talk.
Re: How to deal with the fact that Perl is not releasing memory
by syphilis (Archbishop) on Jul 06, 2013 at 01:12 UTC
    Sounds like the program is designed (either intentionally or accidentally) to need more memory as it runs. Either way, that's not related to the memory-releasing characteristics of perl. (You can't expect a process to release memory that it thinks it's still using.)

    Do you think it should need extra memory as it runs ? If not, then there's probably a memory leak (unused memory not being freed for re-use by the process) that needs to be fixed.

    Cheers,
    Rob
Re: How to deal with the fact that Perl is not releasing memory
by kcott (Archbishop) on Jul 06, 2013 at 02:36 UTC

    G'day Carl,

    Welcome to the monastery.

    There are a number of potential reasons why you're seeing this behaviour; it may not be limited to a single cause. There may be problems in your code, in modules you're using or both. Perhaps data is being continually added to some variable which is simply growing over time; perhaps you have an issue with Circular References; perhaps it's something else.

    So, without having seen your code, it's rather difficult (as I'm sure you'll appreciate) to either identify the problem or advise how to deal with it.

    If you can track down the problem to some part of your code, that would be a start. Test::LeakTrace may be helpful here. There's also a number of tools in CPAN: Development Support that might be useful.

    Take a look at the documentation for any modules you're using. There may be caveats regarding memory issues (and possibly workarounds). Also check for any bug reports that might be related: there may be fixes in more recent modules, patches you can apply or workarounds you can use. There may be modules with similar functionality (but without the same problems) that you could use instead.

    If you identify where in your code the problem lies but don't know how to fix it, you can post that code here and ask a more specific question. If you do this, you'll get the best answers if you follow the guidelines in "How do I post a question effectively?".

    -- Ken

      I will investigate potential memory leaks.

      ***EDIT*** Here is the complete code:

      package CommunicationStream; use Dancer ':syntax'; use Dancer::Plugin::Ajax; use URI::Escape; use WWW::Mechanize; use HTML::TreeBuilder; use HTML::FormatText; use DateTime; use DateTime::Format::Strptime; use SOAP::Lite; our $VERSION = '0.1'; $ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0; # SOAP Lite configuration #my $soap = SOAP::Lite... my $firstResult = 0; get '/:ref' => sub { my @messages = (); my ($modifiedDate, $quickViewLinks, $linkToDocuments, $outlookProt +ocolLinks, $containsAttachment) = getCachedDocumentUris(params->{ref} +, 0, 5); #fix this to use the number of documents my $i = 0; foreach my $quickViewLink (@{$quickViewLinks}) { my $text = grab_page($quickViewLink); my $datetime = getDisplayDate(${$modifiedDate}[$i]); my ($date, $timestamp) = split(/\|/, $datetime); if (index($text, "class=\"email") != -1) { if (${$containsAttachment}[$i] eq "true") { push(@messages, {date => $date, timestamp => $timestamp, l +ink => "<a href=\"" . ${$outlookProtocolLinks}[$i] . "\"><img src=\"/ +images/emailInbound.gif\" /></a>", text => $text, attachment => "<img + src=\"/images/emailHasAttach.gif\" />"}); } else { push(@messages, {date => $date, timestamp => $timestamp, + link => "<a href=\"" . ${$outlookProtocolLinks}[$i] . "\"><img src=\ +"/images/emailInbound.gif\" /></a>", text => $text}); } } elsif (index($text,"||") != -1) { my @linkAndText = split(/\|\|/, $text); push(@messages, {date => $date, timestamp => $timestamp, l +ink => $linkAndText[0], text => $linkAndText[1]}); } else { push(@messages, {date => $date, timestamp => $timestamp, l +ink => "<a href=\"" . ${$linkToDocuments}[$i] . "\"><img src=\"/image +s/comment_icon.gif\" /></a>", text => $text}); } $i++; } template 'index' => {messages => \@messages}; }; ajax '/getNextResults:firstResultAndRef' => sub { my $content = ''; my ($salesforceRef, $firstResult) = split(/\|/, params->{firstResu +ltAndRef}); my @messages = (); my ($modifiedDate, $quickViewLinks, $linkToDocuments, $outlookProtoc +olLinks) = getCachedDocumentUris($salesforceRef, $firstResult, 5); my $i = 0; foreach my $quickViewLink (@{$quickViewLinks}) { my $text = grab_page($quickViewLink); my $datetime = getDisplayDate(${$modifiedDate}[$i]); my ($date, $timestamp) = split(/\|/, $datetime); if (index($text, "class=\"email") != -1) { push(@messages, {date => $date, timestamp => $timestamp, l +ink => "<a href=\"" . ${$outlookProtocolLinks}[$i] . "\"><img src=\"/ +images/emailInbound.gif\" /></a>", text => $text}); } elsif (index($text,"||") != -1) { my @linkAndText = split(/\|\|/, $text); push(@messages, {date => $date, timestamp => $timestamp, l +ink => $linkAndText[0], text => $linkAndText[1]}); } else { push(@messages, {date => $date, timestamp => $timestamp, l +ink => "<a href=\"" . ${$linkToDocuments}[$i] . "\"><img src=\"/image +s/comment_icon.gif\" /></a>", text => $text}); } $i++; } $content = template 'messages' => {messages => \@messages}; { content => $content } ; }; ajax '/getAllResults:firstResultAndRef' => sub { my $content = ''; my ($salesforceRef, $firstResult) = split(/\|/, params->{firstResu +ltAndRef}); my @messages = (); my ($modifiedDate, $quickViewLinks, $linkToDocuments, $outlookProtoc +olLinks) = getCachedDocumentUris($salesforceRef, $firstResult, 1000); my $i = 0; foreach my $quickViewLink (@{$quickViewLinks}) { my $text = grab_page($quickViewLink); my $datetime = getDisplayDate(${$modifiedDate}[$i]); my ($date, $timestamp) = split(/\|/, $datetime); if (index($text, "class=\"email") != -1) { push(@messages, {date => $date, timestamp => $timestamp, l +ink => "<a href=\"" . ${$outlookProtocolLinks}[$i] . "\"><img src=\"/ +images/emailInbound.gif\" /></a>", text => $text}); } elsif (index($text,"||") != -1) { my @linkAndText = split(/\|\|/, $text); push(@messages, {date => $date, timestamp => $timestamp, l +ink => $linkAndText[0], text => $linkAndText[1]}); } else { push(@messages, {date => $date, timestamp => $timestamp, l +ink => "<a href=\"" . ${$linkToDocuments}[$i] . "\"><img src=\"/image +s/comment_icon.gif\" /></a>", text => $text}); } $i++; } $content = template 'messages' => {messages => \@messages}; { content => $content } ; }; sub getResults { my ($query, $firstResult, $numberOfResults) = ($_[0], $_[1], $_[2] +); my @params = (SOAP::Data->name("p_Params"=> \SOAP::Data->value(SOAP::Data-> +name("BasicQuery" => $query) -> type("string"), SOAP::Data-> +name("NumberOfResults" => $numberOfResults), SOAP::Data-> +name("SortCriteria" => "ModifiedDateDescending"), SOAP::Data-> +name("NeedCachedDocumentUris" => "true"), SOAP::Data-> +name("FirstResult" => $firstResult), SOAP::Data-> +name("TimeZoneOffset" => 4), SOAP::Data-> +name("NeededFields" => + \SOAP::Data->name("string" => "\@sysoutlookuri"))))); my $result = $soap->call($method => @params); return $result; } sub getCachedDocumentUris { my @URIs; my ($salesforceRef, $firstResult, $numberOfResults) = ($_[0], $_[1 +], $_[2]); my $SearchResults = getResults("$salesforceRef \@sysfiletype==(exc +hangemessage,SFCaseComment)", $firstResult, $numberOfResults); my @modifiedDate = $SearchResults->valueof('//PerformQueryResponse +/PerformQueryResult/Results/QueryResult/ModifiedDate'); my @tmpURIs = $SearchResults->valueof('//PerformQueryResponse/Perf +ormQueryResult/Results/QueryResult/CachedDocumentUri'); my @clickableURIs = $SearchResults->valueof('//PerformQueryRespons +e/PerformQueryResult/Results/QueryResult/TargetUri'); my @outlookURIs = $SearchResults->valueof('//PerformQueryResponse/ +PerformQueryResult/Results/QueryResult/Fields/ResultField/Value'); my @containsAttachment = $SearchResults->valueof('//PerformQueryRe +sponse/PerformQueryResult/Results/QueryResult/ContainsAttachment'); foreach my $URI (@tmpURIs) { my ($LeftURI, $RemainingURI) = split('&docid=', $URI); my ($MiddleURI, $RightURI) = split('&q=', $RemainingURI); $MiddleURI =~ s/:/%3A/g; $MiddleURI =~ s/@/%40/g; $MiddleURI =~ s/\//%2F/g; $MiddleURI =~ s/\$/%24/g; $MiddleURI =~ s/é/%C3%A9/g; $MiddleURI =~ s/\+/%2b/g; $MiddleURI =~ s/Messages%2bTrait%C3%A9s/Messages\+Trait%C3%A9s +/g; #push (@URIs, uri_escape($URI)); push (@URIs, $LeftURI . '&docid=' . $MiddleURI . '&q=' . $Righ +tURI); } return (\@modifiedDate, \@URIs, \@clickableURIs, \@outlookURIs, \@ +containsAttachment); } sub grab_page { my @SupportAgents = ("Carl Bolduc", "Second Dude", "Third Dude"); # using shift to accept parameter passed to method my $urlString = shift; my $username = "domain\\user"; my $password = "great password"; my $mech = WWW::Mechanize->new(autocheck => 0); $mech->agent('Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.1) Gecko/ +20100101 Firefox/4.0.1'); $mech->credentials($username, $password); my $response = $mech->get($urlString); my $page_contents = $mech->content(); my $Format = HTML::FormatText->new(leftmargin => 0); my $TreeBuilder = HTML::TreeBuilder->new; $TreeBuilder->parse($page_contents); my $Parsed = $Format->format($TreeBuilder); my (@EmailsEnglish, @EmailsFrench); my $FirstEmail; # is this an email exchange if (index($Parsed,"From:") != -1) { @EmailsEnglish = split(/From:/,$Parsed); $FirstEmail = $EmailsEnglish[0]; $FirstEmail =~ s/\n/<br>/g; $FirstEmail =~ s/-+\s?Original Message\s?-+//g; $FirstEmail =~ s/(<br>){3,}/<br>/g; return "<div class=\"email\">$FirstEmail</div>"; } elsif (index($Parsed,"De:") != -1) { @EmailsFrench = split(/De:/,$Parsed); $FirstEmail = $EmailsFrench[0]; $FirstEmail =~ s/\n/<br>/g; return "<div class=\"email\">$FirstEmail</div>"; } # is this a mantis exchange elsif (index($Parsed,"================") != -1) { my $LastNote = substr($Parsed,rindex($Parsed,"---------------- +------------------------------------------------------",rindex($Parse +d, "----------------------------------------------------------------- +-----") -1)); #$LastNote =~ s/-{11,}//g; $LastNote = substr($LastNote,0,index($LastNote,"===========")) +; my $LastNoteAuthor = substr($LastNote,0,rindex($LastNote,"---- +------------------------------------------------------------------")) +; $LastNoteAuthor =~ s/-{11,}//g; # Extract the link to the mantis comment my $MantisLink = ""; if ($LastNoteAuthor =~ m/http:\/\/mantis\/view.php\?id=\d*#c\d +*/) { $MantisLink = $&; } # Detect if Dev or Support is speaking foreach (@SupportAgents) { if (index($LastNoteAuthor,$_) != -1) { $LastNoteAuthor = "Support:"; last; } else { $LastNoteAuthor = "Dev:"; } } my $LastNoteMessage = substr($LastNote,rindex($LastNote,"----- +-----------------------------------------------------------------")); $LastNoteMessage =~ s/-{11,}//g; $LastNoteMessage =~ s/\n/<br>/g; $LastNoteMessage =~ s/(<br>){3,}/<br>/g; return "<a href=\"" . $MantisLink . "\" target=\"_blank\"><img + src=\"/images/mantis.gif\" /></a>||<div class=\"mantis\">" . $LastNo +teAuthor . "<br>" . $LastNoteMessage . "</div>"; #return "<div class=\"mantis\">" . $LastNoteAuthor . "<br>" . +$LastNoteMessage . "</div>"; } else { $Parsed =~ s/\n/<br>/g; $Parsed =~ s/(<br>){3,}/<br>/g; #salesforce comment return "<div class=\"comment\">$Parsed</div>"; } } sub getDisplayDate { my $modifiedDate = $_[0]; #"20090103 12:00"; my $format = new DateTime::Format::Strptime( pattern => '%Y-%m-%dT%H:%M:%S', time_zone => 'GMT', ); my $date = $format->parse_datetime($modifiedDate); #print $date->strftime("%Y%m%d %H:%M %Z")."\n"; $date->set_time_zone("America/New_York"); return $date->day_abbr().", ".$date->month_abbr()." ".$date->strft +ime("%d|%H:%M"); } true;

      My understanding was that each Ajax request was reassigning the variables. Reading all the comments here, I am definitely doing something wrong.

      Thank you kcott.

        Firstly, I'm not a Dancer user. What follows is just a suggestion for something to try; it may not help at all!

        In each of your coderefs (get ... sub { ...} [1 instance] and ajax ... sub { ...} [2 instances]), you have code like this:

        my @messages = (); ... template '...' => {messages => \@messages};

        When they (implicitly) return, @messages will not be garbage-collected while the reference (\@messages) still exists.

        I don't know why that reference might still exist. I followed it through Dancer (1.3116) source:

        sub template { Dancer::Template::Abstract->template(@_) }

        and then through Dancer::Template::Abstract (1.3116) source:

        sub template { my ($class, $view, $tokens, $options) = @_; ...

        but then, not knowing what templating engine you were using, was unable to continue. I'll leave that as an exercise for you.

        You can ensure @messages is garbage-collected by using this code instead (which contructs an anonymous arrayref):

        my @messages = (); ... template 'XXX' => {messages => [ @messages ]};

        Expanding @messages will incur some overhead but that appears to be very small. Also, you might just be deferring your memory leak problem.

        Be aware that memory may not be leaked on every cycle: a linear increase in memory usage may not be occurring at all. It might happen in code that's only run conditionally or perhaps as a result of some exception.

        -- Ken

        Everything about the code you posted looks fine -- I'm not seeing any stray variables or anything and everything looks self-contained. You need to post more.

        How large can the result of grab_page() be? Expect your peak memory use to be that number multiplied by 3 or so. (Even then it should not slow the script down.)

        One bit that looks wasteful are these parts that cause a fairly large string copy (I think):

        $content = template 'messages' => {messages => \@messages}; { content => $content }
        Throw away the temporary variable and I don't think it'll fix your problem but it might slow down the memory bloat a bit:
        { content => template( 'messages' => {messages => \@messages} ) }
A reply falls below the community's threshold of quality. You may see it by logging in.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://1042824]
Approved by Old_Gray_Bear
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others studying the Monastery: (6)
As of 2024-03-19 11:26 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found