http://www.perlmonks.org?node_id=1054248


in reply to Re^2: Memory Leak HTML::FormatText
in thread Memory Leak HTML::FormatText

What is $content->delete(new) supposed to be or do? (What is the string "new"?)

Never mind.

Here is my tip: run Data::Dumper on an object after one or ten files, and look for references.

Note especially the bless'ed package names.

Then go write some destructors; it's what I did for bugs in HTML::TableExtract / HTML::TableExtract Memory Usage.

Since you're using format_file, I'd copy/paste its source (sub format_file { ... }) and Dumper the objects involved to find circular references, which show up as $VAR1 = { ... \$VAR1 };
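To make the tip concrete, here is a minimal sketch using only core modules. The Node class and the build_pair helper are hypothetical and exist only for this illustration; they are not part of HTML::FormatText or HTML::TreeBuilder. It shows what a circular reference looks like in Dumper output (a nested $VAR1 next to the bless'ed package name), and how a DESTROY method reveals whether the objects are actually freed.

```perl
use strict;
use warnings;
use Data::Dumper;
use Scalar::Util qw(weaken);

# Hypothetical class, used only for this illustration.
package Node {
    our $destroyed = 0;                        # counts DESTROY calls
    sub new {
        my ($class, %args) = @_;
        return bless {%args}, $class;
    }
    sub DESTROY { $destroyed++ }               # the "destructor"
}

# Build a parent/child pair that point at each other, dump it,
# and let both lexicals go out of scope at the end of the sub.
sub build_pair {
    my ($use_weak_ref) = @_;
    my $parent = Node->new(name => 'parent');
    my $child  = Node->new(name => 'child', parent => $parent);
    $parent->{child} = $child;                 # closes the cycle
    weaken($child->{parent}) if $use_weak_ref; # breaks the cycle again
    return Dumper($parent);
}

my $dump = build_pair(0);
print $dump;                                   # look for 'Node' and a nested $VAR1
my $freed_with_cycle = $Node::destroyed;

build_pair(1);
my $freed_with_weak_ref = $Node::destroyed - $freed_with_cycle;

print "freed with cycle: $freed_with_cycle\n";
print "freed with weak reference: $freed_with_weak_ref\n";
```

If the first count stays at zero while the second does not, the cycle is what keeps the objects alive; weakening one side of it, or breaking it explicitly in a cleanup method, is the usual fix.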

Re^4: Memory Leak HTML::FormatText
by PerlNovice999 (Novice) on Sep 16, 2013 at 09:23 UTC

    The following code (adding the delete command in the eval call) works (no error message: "can't call method..."), but there is still a memory leak.

    use warnings;
    use strict;
    use diagnostics;
    use HTML::FormatText;
    use HTML::TreeBuilder 5 -weak;
    use constant HAS_LEAKTRACE => eval { require Test::LeakTrace };
    use Test::More HAS_LEAKTRACE ? (tests => 1) : (skip_all => 'require Test::LeakTrace');
    use Test::LeakTrace;

    leaks_cmp_ok {
        open INPUT, "< D:/websiteadresses.txt" or die "Problem: $!";
        # The file contains the addresses of 28 000 websites
        my @INPUT = <INPUT>;
        close INPUT;
        while (@INPUT) {
            my $input = shift(@INPUT);
            #my $proposal;
            chomp $input;
            print $input;
            my $content = HTML::FormatText->format_file($input, leftmargin => 0, rightmargin => 50);
            eval { $content->delete; };
            # followed by regular expressions, the results of which are saved in a different file - all now disabled
        }
    } '<', 1;
Re^4: Memory Leak HTML::FormatText
by PerlNovice999 (Novice) on Sep 16, 2013 at 09:17 UTC

    Sorry, my mistake, this was supposed to read

    $content->delete();

    The hope was that this would destroy the TreeBuilder object, which apparently gets built via HTML::FormatText, and thus prevent the memory leak. There is a reference to a delete function in the CPAN documentation, but I just get the error message "Can't call method...". The same goes for this line, which I just tried now after reading your earlier entry:

    $content->eof;
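
    "Can't call method" is the error Perl raises when a method is called on something that is not an object. A minimal sketch, assuming that format_file hands back the formatted text as a plain string rather than a tree object (the Dumper result reported further down the thread is consistent with that). FakeTree and delete_if_object are hypothetical names used only for illustration:

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical stand-in for a tree object that has a delete method.
package FakeTree {
    sub new    { return bless { deleted => 0 }, shift }
    sub delete { $_[0]{deleted} = 1; return }
}

# Hypothetical helper: call delete only on objects that provide it.
sub delete_if_object {
    my ($thing) = @_;
    return 0 unless blessed($thing) && $thing->can('delete');
    $thing->delete;
    return 1;
}

my $text = "formatted page text\n";   # stand-in for what format_file returned
my $tree = FakeTree->new;

my $text_deleted = delete_if_object($text);   # plain string: nothing to call
my $tree_deleted = delete_if_object($tree);   # object: delete is called

print "string handled as object? $text_deleted\n";
print "tree handled as object? $tree_deleted\n";
```

    Under that assumption, wrapping $content->delete in eval only hides the error; there is no object in $content to delete or to call eof on.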
Re^4: Memory Leak HTML::FormatText
by PerlNovice999 (Novice) on Sep 18, 2013 at 12:25 UTC

    From my reading of the CPAN documentation of Data::Dumper (not pretending that I understood most of it...), I would first need to know the names of the variables I want to track. But I guess my problem is exactly that I do not know them. It seems that HTML::TreeBuilder is creating something in the background.

    I do not think that this is what you were suggesting, but I tried the following code:

    use warnings;
    use strict;
    use diagnostics;
    use HTML::FormatText;
    use Data::Dumper;

    open INPUT, "< D:/websiteadresses.txt" or die "Problem: $!";
    my @INPUT = <INPUT>;
    close INPUT;
    while (@INPUT) {
        my $inputfile = shift(@INPUT);
        chomp $inputfile;
        my $content = HTML::FormatText->format_from_file($inputfile);
        print Dumper($_, $`);
    }

    The output is $VAR1 = undef; and $VAR2 = undef; - no problem there I guess...

      Where do $_ and $` appear in your code before now?

      Try Dumper()ing $content

        Dumper()ing $content returns the content of the files, exactly as it should. No references to other variables or self-references.