http://www.perlmonks.org?node_id=652516

awy has asked for the wisdom of the Perl Monks concerning the following question:

This question has been asked on a few occasions in the past, most recently (that I can find) about three years ago. Actually, I am pretty surprised that there does not seem to be an answer along the lines of: "Oh yes, you do this ...".

Is there a way that I can get a complete heap dump of all application data from a Perl program, like an hprof dump from a Sun JVM? I realise that this might be a rather blunt instrument but sometimes sifting through the data is the best way to get to the bottom of a problem.

Why? I'm working on an application - a long-running server program - which, under poorly-defined circumstances, exhibits massive memory growth. This could be a 'leak' (as in a discarded circular reference chain) or it could just be uncontrolled growth of some data structure. The program is large and complex and written by many people and I have little clue which module(s) may be responsible. So picking through details of a dump seems like a good strategy; except that I cannot work out how to get such a dump.

Re: Getting a memory dump
by eyepopslikeamosquito (Archbishop) on Nov 23, 2007 at 08:46 UTC
      If you want to use Devel::Leak with recent Perl versions, apply this patch: http://rt.cpan.org/Public/Bug/Display.html?id=22587. Took me a few hours to find that out a couple of weeks ago. Otherwise the module complains about requiring a Perl recompilation with -DDEBUGGING and after that it still does not output anything with sv_dump().
Re: Getting a memory dump
by okram (Monk) on Nov 23, 2007 at 08:18 UTC

    I see this as an XY problem: you have a problem with X (high memory), and decide to do Y (a dump of all the data) to try and solve it, based on the false assumption that Y will actually help you solve X.

    Many of the modules in the Devel:: category will help you "solve" Y: Devel::Peek will show you the Dump() of any variable you use, i.e. the exact data that Perl holds for it: the value of the variable, its reference count, its flags, and so on.

    Devel::Size, instead, will tell you the exact memory allocated for each of the entities (watch out for references, though):

        my $ref = \$data;
        size($ref);     # will give you the same size as $data

        my $ref = \$data; my $ref2 = \$ref;
        size($ref2);    # will give you the size of the reference itself

    Somehow, I don't think that Devel::Peek or Devel::Size will help you much.

    Now, off to solving X: have you tried using DProf and analysing where your code is spending its time? Are you using warnings/strict/diagnostics, and have you tried hooking the functions you think might cause the problem, via Hook::LexWrap? You can then wrap pieces of debugging code around any function, time it, and check anything you want before and after the execution of any function.

    Just my 0.01
      You may be right to some extent in your XY supposition but nonetheless I think that a structured dump could be very helpful. This is especially the case because of the problem that I have of memory growth, where a (logical) diff between two snapshots could prove informative.

      The application in question has nearly 300 Perl modules and uses several hundred from CPAN. I do not really have much idea of the source of the problem. Yes, I may be able to narrow it down using DProf but memory use and CPU-use may not be all that well correlated.

      Interactive debugging is pretty much out as the application is single-threaded and has to operate in real time - delays of more than a few hundred milliseconds are fatal, although I can probably get into a sufficiently quiesced state for the few seconds that would be necessary to generate a dump.

      I have read perlguts (twice) and other similar material but not found what I am looking for (yet). What I need is a way to walk the heap, finding every Perl variable. If I could work out how to do that then writing a dumper should not be too hard.
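
      For what it's worth, the "logical diff between two snapshots" idea can be sketched independently of how the snapshots are produced. The following is only an illustration: it assumes each snapshot is a plain hash mapping a variable label to a size in bytes, and the labels, the numbers and the snapshot_growth() helper are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical snapshots: variable label => size in bytes.
my %before = ('%Session::cache' => 4_096,   '@Queue::pending' => 1_024);
my %after  = ('%Session::cache' => 262_144, '@Queue::pending' => 1_024,
              '%Log::seen'      => 8_192);

# Return a hash ref of label => growth in bytes for every entry that grew
# (or newly appeared) between the two snapshots.
sub snapshot_growth {
    my ($old, $new) = @_;
    my %growth;
    for my $name (keys %$new) {
        my $delta = $new->{$name} - ($old->{$name} // 0);
        $growth{$name} = $delta if $delta > 0;
    }
    return \%growth;
}

# Report the biggest growers first.
my $growth = snapshot_growth(\%before, \%after);
for my $name (sort { $growth->{$b} <=> $growth->{$a} } keys %$growth) {
    printf "%-20s +%d bytes\n", $name, $growth->{$name};
}
```

      Sorting by growth rather than by absolute size is deliberate: a large but stable structure is less interesting here than a small one that keeps getting bigger.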

Re: Getting a memory dump
by erroneousBollock (Curate) on Nov 23, 2007 at 08:07 UTC
    CORE::dump() ?

    -David

      Thanks. I was rather thinking of something a bit more structured that would list each variable with its attributes and references broken out.
        I was rather thinking of something a bit more structured that would list each variable with its attributes and references broken out.
        Do you have an example of such a facility in some other language?

        Historically, various tools were used to examine a dump corefile, but I've not seen language-specific nor human-readable dumps before.

        Aside from perl variables, all kinds of things will be on the heap for your perl process (eg: things allocated by libraries wrapped in XS interfaces).

        -David

Re: Getting a memory dump
by naikonta (Curate) on Nov 23, 2007 at 13:56 UTC
    Using the Data::Dumper and Devel::Peek modules, does print Data::Dumper::Dumper(\%main::) or Devel::Peek::Dump(\%main::) give you what you want?
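
    As a rough sketch of what starting from %main:: amounts to, the symbol table can be walked recursively to list package arrays and hashes by name. This is an assumption-laden illustration, not a heap dump: it only sees package variables, so lexicals (my), anonymous structures and anything allocated by XS code are invisible to it. The package_vars() helper and the two sample variables are hypothetical.

```perl
use strict;
use warnings;

our @queue = (1, 2, 3);     # sample package variables for the walk to find
our %cache = (a => 1);

# Recursively collect the names of package arrays and hashes reachable
# from the given stash name (e.g. 'main::').
sub package_vars {
    my ($pkg, $found) = @_;
    no strict 'refs';                          # stashes are reached by name
    for my $name (sort keys %{$pkg}) {
        next if $name eq 'main::';             # main:: contains itself
        if ($name =~ /::\z/) {                 # nested package: recurse
            package_vars($pkg . $name, $found);
            next;
        }
        my $entry = \${$pkg}{$name};
        next unless ref $entry eq 'GLOB';      # skip non-glob stash entries
        push @$found, '@' . $pkg . $name if defined *{$entry}{ARRAY};
        push @$found, '%' . $pkg . $name if defined *{$entry}{HASH};
    }
    return $found;
}

my $found = package_vars('main::', []);
# Show only the lower-case names in main:: to keep the output short.
print "$_\n" for grep { /\A[\@%]main::[a-z]\w*\z/ } @$found;
```

    Scalars are left out on purpose, because a glob's scalar slot cannot be reliably distinguished from an unused one this way.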

    Open source softwares? Share and enjoy. Make profit from them if you can. Yet, share and enjoy!

Re: Getting a memory dump
by BrowserUk (Patriarch) on Nov 23, 2007 at 10:11 UTC
      I'm running this on Linux. It is possible I could try running it in Win-XP but it would take quite a bit of setup.
        It is possible I could try running it in Win-XP but it would take quite a bit of setup.

        Hm. It's possible that the code could be adapted for use on Linux, but I do not have that expertise.

        The basic idea is that I have a module (Devel::MemWatch) that hooks the DB::DB() interface and pushes the current caller information (package/file/line no) onto a Thread::Queue as each line executes. It has a settable threshold value above which it discards old information as it pushes new, effectively turning the queue into a circular buffer of the last N lines executed.

        It also starts a background thread that wakes up every N milliseconds (settable) and queries the process memory size from the system. If the memory expands beyond a settable limit, it dumps the circular buffer to stderr, doubles the memory limit, and goes back to monitoring.

        The idea is that you get a dump of the last N lines that executed when the memory expanded beyond your preset limit. (And again every time it doubles again). Thus, you can quickly get an idea of what code was executing when the memory started to snowball.
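
        The bookkeeping part of that idea is platform-independent and can be sketched on its own. What follows is not the Devel::MemWatch code, just a minimal single-threaded illustration under stated assumptions: the RecentLines package and its record() and check() methods are hypothetical names, and the caller is assumed to supply the current memory size in Kbytes from whatever source the platform offers.

```perl
use strict;
use warnings;

# Hypothetical stand-in: a bounded buffer of recent "package file line"
# records plus a memory limit that doubles each time it is exceeded.
package RecentLines;

sub new {
    my ($class, %opt) = @_;
    return bless {
        lines     => $opt{lines}     // 100,         # max records kept
        threshold => $opt{threshold} // 100 * 1024,  # limit in Kbytes
        buffer    => [],
    }, $class;
}

# Remember one executed line; discard the oldest records when full.
sub record {
    my ($self, $package, $file, $line) = @_;
    push @{ $self->{buffer} }, "$package $file $line";
    shift @{ $self->{buffer} } while @{ $self->{buffer} } > $self->{lines};
    return;
}

# Given the current memory size in Kbytes, return the buffered records
# and double the limit if the limit was exceeded; otherwise return nothing.
sub check {
    my ($self, $kbytes) = @_;
    return if $kbytes <= $self->{threshold};
    $self->{threshold} *= 2;
    return @{ $self->{buffer} };
}

package main;

my $watch = RecentLines->new(lines => 3, threshold => 1000);
$watch->record('main', 'junk9.pl', $_) for 1 .. 5;
my @quiet = $watch->check(900);     # below the limit: nothing to report
my @dump  = $watch->check(1500);    # above the limit: last 3 records
print STDERR "$_\n" for @dump;
```

        In a real tool record() would be driven from DB::DB() and check() from a timer or watcher thread; those parts are left out here because they depend on the platform and the threading model.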

        The (unbelievably crude but functional) Win32 code looks like this:

        Usage is:

        perl -d:MemWatch=LINES=100,THRESHHOLD=100*1024,FREQUENCY=2000 yourscript.pl

        Where

        • LINES: no. of caller traces held in the circular buffer.
        • THRESHHOLD is in Kbytes. (default 100MB)
        • FREQUENCY (at which the watchthread awakes) is in milliseconds. (Default:2 seconds)

        The output looks like:

        C:\test>perl -d:MemWatch=FREQUENCY=5000 junk9.pl
        FREQUENCY 5000
        LINES 100
        THRESHHOLD 102400
        Watchthread started
        Memsize: 103156
        main junk9.pl 8
        main junk9.pl 9
        main junk9.pl 9
        main junk9.pl 8
        main junk9.pl 9
        main junk9.pl 9
        ...

        Maybe you can adapt it to your needs?


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.