http://www.perlmonks.org?node_id=632643

kyle has asked for the wisdom of the Perl Monks concerning the following question:

Monks,

I am here to ask the same question asked two years ago in Memory Profiling. That question didn't get the kind of answer I'm looking for, but it's exactly the question I want to ask. Is there a tool that will tell me how a program uses the memory that it uses?

I know about Devel::Size, which is good for when you have a particular data structure that you want to watch. I also know about Memchmark, which will compare the memory used by different algorithms. I'm not asking about those, and I'm not asking how to find a memory leak.

What I want is something like Devel::DProf, but for memory. Imagine you receive a huge piece of code that's using too much memory. How do you tell what parts are using how much?

I'd like a tool that will list things that are using memory along with how much (max, mean, median). I want it to tell me how to find that structure in the code that I'm working on (by name, or by the line on which it's created).

If it's true that nothing like this exists, is it possible to create it? Could an XS module (or even a pure Perl module) somehow trap and track the creation and modification of every hash, array, or scalar?

Your wisdom will be greatly appreciated.

Replies are listed 'Best First'.
Re: Profiling memory
by jbert (Priest) on Aug 15, 2007 at 06:07 UTC
    Outside of the perl world (and hence of little direct use to you), you have valgrind's massif tool, which does pretty much what you ask. There are also Memprof (if it is still working) and exmap (which is more focussed on fairly apportioning actually-used RAM amongst multiple processes, accounting for the vagaries of virtual memory).

    Within perl, I have in the past written code which walks all package variables in all packages (I did it directly, starting at the %main:: hash, but there's probably a module to make that easier), using Devel::Size::total_size to work out the size of each. This worked fairly well (sorry - I was unable to release it, but it was a fairly quick hack).

    This approach would of course miss any leaked memory (which was no longer rooted in a package var) as well as anything only reachable via a lexical - although that might be fixable with PadWalker. It would also double-count any memory which was reachable via more than one place.

    The code basically took a snapshot of all vars and sizes and then gave diffs against it on each subsequent invocation, looking for growing data structures over time.

    It was all pure perl and I guess it wouldn't take more than a day or two to recreate from the above description.
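
    As a rough illustration of the approach described above (not the original code, which was never released), a minimal sketch might look like the following. The names snapshot_sizes and diff_sizes are hypothetical, and the pluggable $sizer argument is an assumption added here so the walker can be tried without Devel::Size installed; by default it uses Devel::Size::total_size. It only looks at the scalar, array and hash slots of package variables, so it has the same blind spots mentioned above (lexicals, leaked memory, double counting).

```perl
use strict;
use warnings;

# Hypothetical helper: returns { 'Pkg::name (SLOT)' => bytes } for every
# package variable reachable from the stash named $pkg (e.g. 'main::').
sub snapshot_sizes {
    my ($pkg, $sizer, $sizes, $seen) = @_;
    $pkg   ||= 'main::';
    # Assumption: default to Devel::Size::total_size, loaded on demand.
    $sizer ||= do { require Devel::Size; \&Devel::Size::total_size };
    $sizes ||= {};
    $seen  ||= {};
    return $sizes if $seen->{$pkg}++;          # guard against revisiting a stash

    my $stash = do { no strict 'refs'; \%{$pkg} };
    for my $name (keys %$stash) {
        if ($name =~ /::\z/) {                 # nested package: recurse into it
            next if $name eq 'main::';         # %main:: contains itself
            snapshot_sizes($pkg . $name, $sizer, $sizes, $seen);
            next;
        }
        # Some stash entries are not real globs (e.g. constants); skip them.
        next unless ref(\$stash->{$name}) eq 'GLOB';
        my $glob = \$stash->{$name};
        for my $slot (qw(SCALAR ARRAY HASH)) {
            my $ref = *{$glob}{$slot} or next;
            next if $slot eq 'SCALAR' && !defined $$ref;
            $sizes->{"$pkg$name ($slot)"} = $sizer->($ref);
        }
    }
    return $sizes;
}

# Report the entries that grew between two snapshots, with the growth in bytes.
sub diff_sizes {
    my ($old, $new) = @_;
    my %growth;
    for my $key (keys %$new) {
        my $delta = $new->{$key} - ($old->{$key} // 0);
        $growth{$key} = $delta if $delta > 0;
    }
    return \%growth;
}
```

    Taking one snapshot early (my $before = snapshot_sizes();), another later, and printing diff_sizes($before, $after) sorted by value gives a list of the package variables that are growing over time.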

Re: Profiling memory
by clinton (Priest) on Aug 15, 2007 at 08:02 UTC
    I realise that this isn't what you're asking, but it seemed a relevant place to post this link.

    With mod_perl, it is very useful to know how much memory in each child process is shared with the other processes. Having this knowledge makes it easier to calculate the maximum number of processes you can run without starting to swap. On Linux this is difficult to assess, because top doesn't take copy-on-write page sharing into account.

    This script (smem.pl) uses Linux::Smaps (linux >= 2.6.14) to report what memory in use by a process is shared or private, and clean or dirty. The author has posted a blog entry about using smem.pl.
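
    To give an idea of the kind of numbers involved, here is a minimal sketch that sums the shared/private and clean/dirty counters straight from /proc/<pid>/smaps, the same kernel file that Linux::Smaps reads. This is neither smem.pl nor the Linux::Smaps API; sum_smaps and smaps_for_pid are hypothetical names used only for illustration, and the code assumes a Linux kernel that provides smaps.

```perl
use strict;
use warnings;

# Sum the per-mapping counters from smaps-formatted text on a filehandle.
# All values reported by the kernel are in kB.
sub sum_smaps {
    my ($fh) = @_;
    my %total = map { $_ => 0 }
        qw(Rss Shared_Clean Shared_Dirty Private_Clean Private_Dirty);
    while (my $line = <$fh>) {
        # Counter lines look like "Private_Dirty:        12 kB";
        # mapping header lines and flag lines do not match and are skipped.
        next unless $line =~ /^(\w+):\s+(\d+)\s+kB/;
        $total{$1} += $2 if exists $total{$1};
    }
    return \%total;
}

# Hypothetical convenience wrapper: defaults to the current process.
sub smaps_for_pid {
    my ($pid) = @_;
    $pid //= $$;
    open my $fh, '<', "/proc/$pid/smaps"
        or die "Cannot read smaps for pid $pid: $!";
    my $total = sum_smaps($fh);
    close $fh;
    return $total;
}
```

    For a set of mod_perl children, the Private_Dirty total is roughly what each extra child costs, while the Shared_* totals are paid only once across all of them.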

    Apologies for hijacking your question, but I thought it worth posting this script to PM.

    Clint

      Smaps is better than not taking shared memory into account but it only differentiates between 'shared' and 'not shared'. If you're only looking at a collection of mostly-identical processes (e.g. all your mod_perls) then this is probably good enough.

      exmap does per-page reckoning of how many processes have that page mapped, then maps that to ELF section and symbols for you.

      Exmap is more complex to use and lacks a programmatic interface, but if you have a more heterogeneous selection of processes to analyse I'd probably use that. (Actually, I'd probably use it anyway, since I wrote it...)