PerlMonks  

Dreaming of a Better Profiler

by samtregar (Abbot)
on May 25, 2004 at 19:27 UTC ( #356343=perlmeditation )

Perl's profilers bite. I should know, I wrote one: Devel::Profiler. It's marginally better than the most common alternative, Devel::DProf, but it's very slow and breaks in many common scenarios. I've got an idea that might produce something better, but I want to get a reality check before I go and dump time into it.

I got the idea while reading an article about Intel's VTune profiler in Dr. Dobb's Journal. VTune works via an interrupt handler which wakes up periodically and samples the instruction pointer of the process. These samples are used to construct a profile of the activity of the process. It's not a "perfect" profile in that it will miss things that happen infrequently or take very little time, but that's entirely acceptable in a profiler.

The up-side is that a sampling profiler like VTune is minimally invasive. It doesn't instrument the code being profiled and it can run so fast that it doesn't slow down the code being profiled. This means that the profiling data should be very reliable, corresponding closely to the behavior of the code when it's not being profiled.

So I started wondering what VTune-for-Perl would be like. Here are some random ideas:

  • Something would run at the start of the process and put the pointer for the currently executing opcode somewhere predictable (on disk? on a fifo?). I think that's called 'curcop' in the Perl core, but I could be wrong.
  • The interrupt handler would wake up and save the value of curcop each time it runs. The interval would be configurable just like it is in VTune.
  • At the end of the process code would need to be run to turn the curcop values into subroutine names and (if possible) line numbers. It seems like a walk of the op-tree with the B:: tools might allow this, but I'm not sure. I do know that this has to happen at the end because of eval"", dynamic requires, autoloading, etc.
Given the above, I have a few problems:

  • I don't know how to write an interrupt handler. I think it would require a kernel module under Linux, which would be entirely new ground for me.
  • I don't know how I would get ahold of &curcop at the start of the process. Maybe write a tiny XS module to do it?
  • I don't know how to map curcop values to subroutine names.
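As a starting point for the last of these problems, here is a minimal sketch of what the B module can already report about a subroutine from pure Perl: its package-qualified name, its source file, and the line of its first statement. It does not touch curcop at all; it only illustrates the kind of lookup table (sub name against file and starting line) that sampled positions might later be matched against. The helper describe_sub and the subs alpha and beta are made up for illustration.

```perl
use strict;
use warnings;
use B ();

# Two hypothetical subs to look up.
sub alpha { return 1 }
sub beta {
    my $x = shift;
    return $x * 2;
}

# Given a code reference, return its qualified name, source file,
# and the line of its first statement (undef for XS subs, which
# have no Perl-level ops to start from).
sub describe_sub {
    my ($code) = @_;
    my $cv    = B::svref_2object($code);   # B::CV object for the sub
    my $gv    = $cv->GV;                   # glob that holds the sub's name
    my $start = $cv->START;                # first op executed in the sub
    my $line  = $start->isa('B::COP') ? $start->line : undef;
    return {
        name => $gv->STASH->NAME . '::' . $gv->NAME,
        file => $cv->FILE,
        line => $line,
    };
}

for my $code (\&alpha, \&beta) {
    my $info = describe_sub($code);
    printf "%s starts at %s line %s\n",
        $info->{name}, $info->{file}, $info->{line} // '?';
}
```

Because of eval "", dynamic requires and autoloading, a table like this would have to be built (or refreshed) at the end of the run, as noted above.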

Help or pointers to RTFM on any of these topics would be greatly appreciated.

-sam

PS: Please don't suggest I use the Perl debugger (it's broken) or fix the Perl debugger (I can't and I think the only person that can is Ilya and he's too busy). If someone did fix the debugger I could just use Devel::DProf and it would probably be good enough. If you want to know what's wrong with the debugger, do a search on perl5-porters for bug-reports concerning seg-faults from Devel::DProf.

Re: Dreaming of a Better Profiler
by vek (Prior) on May 25, 2004 at 21:29 UTC

    PS: Please don't suggest I use the Perl debugger (it's broken) or fix the Perl debugger (I can't and I think the only person that can is Ilya and he's too busy).

    I thought that MJD was working on the Perl debugger.

    -- vek --
      I wish him all the luck in the world. He'll need it!

      -sam

Re: Dreaming of a Better Profiler
by Zaxo (Archbishop) on May 25, 2004 at 21:33 UTC

    An actual interrupt handler on Linux would indeed take a kernel module. Probably any other Unix, as well. There aren't that many unused interrupts around on x86, either.

    How about something like this?

    1. fork a child which issues kill USR1, $parent now and then.
    2. In the parent define a $SIG{USR1} handler which increments a global hash of counts of curcop.

    That is not perfect. Generally speaking, neither is the interrupt sampler you describe. With safe signals, the opcode following a long-running instruction like gethostbyname will be way oversampled. In either case sampling gets you running time statistics, but you give up coverage data.
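The two steps above can be sketched in pure Perl. This is only an approximation under a loud assumption: plain Perl cannot read curcop directly, so the handler uses caller(0), which reports the file and line that were executing when the signal was delivered, as a stand-in. The sub busy, the 1 ms interval and the %samples hash are illustrative choices, not part of the proposal.

```perl
use strict;
use warnings;
use POSIX ();
use Time::HiRes qw(usleep);

my %samples;          # "file:line" => number of times we were seen there
my $parent = $$;

# Step 2: the parent counts its current position on every USR1.
# ASSUMPTION: caller(0) stands in for the curcop value.
$SIG{USR1} = sub {
    my (undef, $file, $line) = caller(0);
    $samples{"$file:$line"}++ if defined $file;
};

# Step 1: a child that pokes the parent "now and then" (every ~1 ms here).
my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    while (kill 0, $parent) {          # stop if the parent has gone away
        kill 'USR1', $parent;
        usleep(1_000);
    }
    POSIX::_exit(0);
}

# Hypothetical workload to be profiled.
sub busy {
    my $x = 0;
    $x += sqrt($_) for 1 .. 3_000_000;
    return $x;
}
busy();

# Stop sampling before reaping the child so waitpid is not interrupted.
$SIG{USR1} = 'IGNORE';
kill 'TERM', $pid;
waitpid $pid, 0;

printf "%6d %s\n", $samples{$_}, $_
    for sort { $samples{$b} <=> $samples{$a} } keys %samples;
```

Since Perl delivers signals only between opcodes, the counts show where the handler got to run, which is exactly the oversampling caveat described above.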

    My own notion of a neat profiler for perl would make use of the performance counter registers of modern processors. I've worked on that some. Maybe I should resurrect that project.

    After Compline,
    Zaxo

      How about something like this?

      That's a reasonable idea, and it might just work. It might also have too much overhead to run fast enough to get a good sampling rate. At the very least it could be a good way to get the curcop-logging and subroutine-name resolution code working without tackling the interrupt handler.

      My own notion of a neat profiler for perl would make use of the performance counter registers of modern processors. I've worked on that some. Maybe I should resurrect that project.

      That sounds very interesting. Can you suggest any reading material on the subject?

      Thanks,
      -sam

        The Intel processor manuals are probably the best reference for the low-level stuff. They are available online in pdf format.

        Perfctr is a Linux kernel patch and user library which permits users to access the performance counters on a per-process basis. One drawback to that approach is that it is very platform-specific. It would be difficult to write a perl module distribution of acceptable generality.

        After Compline,
        Zaxo

Re: Dreaming of a Better Profiler
by adrianh (Chancellor) on May 25, 2004 at 21:41 UTC
    These samples are used to construct a profile of the activity of the process. It's not a "perfect" profile in that it will miss things that happen infrequently or take very little time, but that's entirely acceptable in a profiler.

    I used an in-house profiler with similar-sounding functionality a few years back (with some C/assembler, not Perl). While the non-invasiveness is handy, it does come with its own set of hassles.

    The period of the interrupt can sometimes mesh with the period of the events in your code giving misleading results. At its most basic you can have something like:

    while ( 1 ) {
        foo();  # takes 10 microseconds
        bar();  # takes 100 microseconds
    }

    and have the interrupt occur in foo() every time because the period of the interrupt matches the period of the loop.

    Of course there are ways around this (vary the timing of the interrupt, run multiple profiles) - the technique is certainly not a bad idea in general. However unless you're paying attention the results can occasionally be a lot worse than 'perfectly acceptable'.
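A small simulation makes the effect concrete. It does no real profiling: it models the loop above as 10 microseconds of foo() followed by 100 microseconds of bar(), then "samples" simulated time either at a fixed 110 microsecond period or with a random offset added to each interval. The function names where_at and sample, the starting offset of 5 microseconds, and the jitter range are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Simulated loop body: foo() fills [0,10) and bar() fills [10,110)
# microseconds of every 110-microsecond iteration.
my ($FOO_US, $BAR_US) = (10, 100);
my $LOOP_US = $FOO_US + $BAR_US;

# Which function is running at simulated time $t (in microseconds)?
sub where_at {
    my ($t) = @_;
    return ($t % $LOOP_US) < $FOO_US ? 'foo' : 'bar';
}

# Take $n samples spaced $period_us apart, adding a random extra delay
# of 0 .. $jitter_us-1 microseconds to each interval (0 means no jitter).
sub sample {
    my ($n, $period_us, $jitter_us) = @_;
    my %hits = (foo => 0, bar => 0);
    my $t = 5;                                   # first sample lands inside foo()
    for (1 .. $n) {
        $hits{ where_at($t) }++;
        my $jitter = $jitter_us ? int(rand($jitter_us)) : 0;
        $t += $period_us + $jitter;
    }
    return \%hits;
}

my $fixed    = sample(100_000, 110, 0);     # period meshes with the loop
my $jittered = sample(100_000, 110, 110);   # randomised interval

printf "fixed:    foo=%d bar=%d\n", @{$fixed}{qw(foo bar)};
printf "jittered: foo=%d bar=%d\n", @{$jittered}{qw(foo bar)};
```

With the fixed period every sample lands in foo(), even though foo() accounts for only about 9% of the run time; with the randomised interval the split comes out close to the true 10:100 ratio.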

      Yeah, that's mentioned in the article I read. It seems particularly unlikely to bite in Perl given the much longer runtimes for your average Perl subroutine and the variable overhead of all the memory allocation inherent in even well-tuned Perl code.

      -sam

Re: Dreaming of a Better Profiler
by Anonymous Monk on May 26, 2004 at 01:43 UTC
      Yes. It uses Perl's debugger hooks so it will almost certainly suffer from the same problems as Devel::DProf. As far as I was able to determine the bugs in Devel::DProf are actually bugs in Perl's debugger.

      -sam

        I don't think so. I've used both. I've never had much luck with Devel::DProf. I've never had a problem with Devel::Profile.

        My understanding of the problem with Devel::DProf is that it can't handle subs that don't return (exceptions, gotos, etc). Devel::Profile handles such situations.

Re: Dreaming of a Better Profiler
by DrHyde (Prior) on May 26, 2004 at 07:49 UTC
    Could you use the alarm function from Time::HiRes to generate an "interrupt" (actually a signal) every Nth of a second?
      Not if the process being profiled wants to use alarms. Unfortunately there's only one available per process.

      -sam
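For reference, the timer-driven variant might look like the sketch below. It uses setitimer from Time::HiRes with ITIMER_REAL, which delivers SIGALRM repeatedly, so the objection above applies in full: the profiled program can no longer use alarms itself. As an assumption for illustration, the handler attributes each tick to the enclosing subroutine via caller(1) rather than to curcop, and the subs cheap and costly are hypothetical workloads.

```perl
use strict;
use warnings;
use Time::HiRes qw(setitimer ITIMER_REAL);

my %hits;    # subroutine name => number of timer ticks seen inside it

# ASSUMPTION: caller(1) inside the handler names the sub that was
# running when the signal arrived; top-level code has no such frame.
$SIG{ALRM} = sub {
    my $sub = (caller(1))[3];
    $hits{ $sub // '(top level)' }++;
};

# First tick after 1 ms, then one every 1 ms. This takes over the
# process's single real-time alarm.
setitimer(ITIMER_REAL, 0.001, 0.001);

# Hypothetical workloads of different cost.
sub cheap  { my $x = 0; $x += $_       for 1 .. 20_000;  return $x }
sub costly { my $x = 0; $x += sqrt($_) for 1 .. 200_000; return $x }

for (1 .. 20) { cheap(); costly(); }

setitimer(ITIMER_REAL, 0, 0);    # cancel the timer

printf "%6d %s\n", $hits{$_}, $_
    for sort { $hits{$b} <=> $hits{$a} } keys %hits;
```

Time::HiRes also exports ITIMER_PROF on platforms that support it, which signals with SIGPROF instead of SIGALRM; whether that sidesteps the conflict depends on what the profiled code does with its own timers and signals.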

Re: Dreaming of a Better Profiler
by Dominus (Parson) on May 26, 2004 at 14:40 UTC
    SamTregar says:
    I don't know how to write an interrupt handler. I think it would require a kernel module under Linux, which would be entirely new ground for me.
    I wouldn't use a kernel module; I think that would be hard. I would replace Perl's standard 'runops' loop with a custom one. See perlguts: "Pluggable runops".

    Hope this helps.

Re: Dreaming of a Better Profiler
by toma (Vicar) on May 26, 2004 at 15:39 UTC
    Even though it was designed for audio work, JACK will provide you with a real-time framework that has low-latency interrupts. Use it with a low-latency Linux kernel.

    I've used it for C++ profiling and it works great. You could use the same technique with Perl.

    Since I'm a hardware guy, I flip a bit on the parallel port at the entrance and the exit of the process, and watch the duty cycle of the waveform with an oscilloscope. You could use a cheap voltmeter to do the same trick, just add a resistor and capacitor to convert the pulse widths into a DC voltage.

    It should work perfectly the first time! - toma
Re: Dreaming of a Better Profiler
by andyf (Pilgrim) on May 26, 2004 at 23:18 UTC
    To add a little to what Adrian says, periodicity could be a problem. Many code segments exhibit very periodic behaviour.
    As in all sampling, Nyquist says that once the execution rate approaches or exceeds half the sampling rate you will get an error, or alias, given by the difference of the two. If any two metrics you are measuring are opposite in magnitude on opposite sampling cycles you will get a zero average and completely lose the magnitude. I know we don't get negative memory usage etc, but the point is illustrated. Since the profiler is running at a fraction of the execution frequency, the results may vary wildly between runs.
    Introducing a random time offset to every sample will give you an effect similar to dithering, which actually increases the accuracy given a few runs to average over.
    Sounds like a useful tool, good work.
Re: Dreaming of a Better Profiler
by theorbtwo (Prior) on May 27, 2004 at 15:10 UTC

    If I were you, and were a Unix man (the two being somewhat redundant), then I'd look at using a process separate from the one running perl, and using the ptrace system call (probably from C) to periodically find the curcop. You'll have to be able to find the address of the curcop variable, which probably means using a perl compiled for debugging. Note that this requires a fair bit of system-specific code (ptrace(2) is documented, on Linux, as conforming to a fair number of standards, and as being quite system-specific. Win32, OTOH, does its debugging in a completely different way.)

    This would allow you to use all the normal methods to sleep, write, whatever. It, however, has a fair bit of slowness associated with it.

Node Type: perlmeditation [id://356343]
Approved by petdance
Front-paged by petdance