Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw
 
PerlMonks  

Profiling your code.

by kyle (Abbot)
on Aug 20, 2007 at 03:26 UTC ( #633699=perltutorial: print w/ replies, xml ) Need Help??

This tutorial will cover the basics of profiling your code. I will tell you how to get profiling information and how to interpret it.

Modules you can use for profiling

I generally try Devel::DProf first. If it doesn't work (i.e., it crashes), I go to Devel::Profiler.

How to run existing code under the profiler

With Devel::DProf, it's

perl -d:DProf foo.pl

With Devel::Profiler, it's

perl -MDevel::Profiler foo.pl

...or you can "use Devel::Profiler" anywhere in your code.

It's worth noting that the profiler will slow down the code you're running, so expect a slow program to get slower while you're trying to figure out why it's so slow. Also, the raw output files can get very large, especially if your program runs a long time.

How to get the results of profiling

With profiling turned on, a file called "tmon.out" will be created in the current directory. That file contains all the raw timing information for your run. To get a condensed report, run

dprofpp

That will read "tmon.out" in the current directory and give you a summary of profiling information.

Interpreting the results of profiling

Now that we know how to get profiling data, we have to know what it means. I'll take you through a series of examples to show you the meaning of the numbers you get.

The basics

Consider the following program:

main(); sub main { my $now = time(); for( my $x = 0; $x <= 1_000_000; $x++ ) { print scalar localtime, "\n" if $now != time(); $now = time(); } return; }

This is a single call to main(), which loops a million times. On my system, I run this, and I get the following output from dprofpp:

Total Elapsed Time = 1.859881 Seconds User+System Time = 1.389881 Seconds Exclusive Times %Time ExclSec CumulS #Calls sec/call Csec/c Name 98.5 1.370 1.370 1 1.3700 1.3700 main::main 0.72 0.010 0.010 2 0.0050 0.0050 main::BEGIN

Let's talk about what each of those numbers mean.

Total Elapsed Time
This is wall clock time from the program's start to finish, no matter how that time was spent.
User+System Time
This is how much CPU time the program took. This does not include time spent waiting on the disk, the network, or other tasks. ("User time" is time spent in your code and "system time" is time the operating system spent serving your code.)
%Time
This is the percent of "User+System Time" spent exclusively in the specified function. It does not include time spent in functions that this function called.
ExclSec (exclusive seconds)
Time spent just in this function, not in functions it called.
CumulS (cumulative seconds)
Time spent in this function, including time spent in functions that it called.
#Calls (number of calls)
Number of times the function was called.
sec/call (seconds per call)
The average number of seconds (in user+system time) spent in each call to this function.
Csec/c (cumulative seconds per call)
The average number of seconds spent in each call to this function, including time spent in the functions it called.

Exclusive seconds vs. cumulative seconds

The next example is meant to show the difference between exclusive seconds and cumulative seconds.

main(); sub main { all_the_work() } sub all_the_work { my $now = time(); for( my $x = 0; $x <= 1_000_000; $x++ ) { print scalar localtime, "\n" if $now != time(); $now = time(); } return; }

In this program, main() does nothing but call a single function all_the_work(), which loops a million times. Here's the data:

Total Elapsed Time = 1.809848 Seconds User+System Time = 1.379848 Seconds Exclusive Times %Time ExclSec CumulS #Calls sec/call Csec/c Name 98.5 1.360 1.360 1 1.3600 1.3600 main::all_the_work 0.00 - 1.360 1 - 1.3600 main::main 0.00 - 0.010 2 - 0.0050 main::BEGIN

Notice that in the ExclSec column, all_the_work takes most of the time, and main takes so little time, it's not even noted. However, in CumulS, both of them take most of the time. The time spent in all_the_work is included in the cumulative time spent in main.

Elapsed time vs. User+system time

Here's the example:

main(); sub main { do_nothing(); all_the_work(); } sub do_nothing { sleep 5 } sub all_the_work { my $now = time(); for( my $x = 0; $x <= 1_000_000; $x++ ) { print scalar localtime, "\n" if $now != time(); $now = time(); } return; }

In this one, main() calls do_nothing(), which doesn't do any work, but it takes five seconds to finish (using sleep). Then main() calls all_the_work(), which loops a million times.

Here's the data:

Total Elapsed Time = 6.809867 Seconds User+System Time = 1.459867 Seconds Exclusive Times %Time ExclSec CumulS #Calls sec/call Csec/c Name 97.9 1.430 1.430 1 1.4300 1.4300 main::all_the_work 0.68 0.010 0.010 2 0.0050 0.0050 main::BEGIN 0.00 - -0.000 1 - - main::do_nothing 0.00 - 1.430 1 - 1.4300 main::main

Notice that the elapsed time is a lot higher than the user+system time. Notice also that the elapsed time doesn't show up in any of the other statistics. The numbers for do_nothing do not show that five seconds passed while it was "running."

When there's a large gap between elapsed and user+system time like that, it is often useful to run dprofpp with the -r option. Here's the same raw data used above, but displayed with dprofpp -r:

Total Elapsed Time = 6.799829 Seconds Real Time = 6.799829 Seconds Exclusive Times %Time ExclSec CumulS #Calls sec/call Csec/c Name 73.5 5.000 5.000 1 5.0000 5.0000 main::do_nothing 26.1 1.780 1.780 1 1.7800 1.7800 main::all_the_work 0.15 0.010 0.020 2 0.0050 0.0100 main::BEGIN 0.00 - 6.780 1 - 6.7800 main::main

With the -r option, dprofpp reports elapsed times for all functions instead of just the CPU time. With that, it's easy to see that most of the real time was spent in do_nothing().

What causes differences between elapsed time and user+system time? This can be anything that your program waits on when it's not actively running. Typically this is one of:

  • Network connections
  • Disk access
  • Other processes running on the same machine
  • Database access (which is often a combination of the above)

Other results available via dprofpp

The dprofpp tool has options to make its output more useful in different situations. By default it will show you the top 15 functions sorted by user+system time. With the -g option, you can have dprofpp focus on one function. With the -O option, you can have it show any number of functions.

Sorting options

-a
Alphabetical by name
-z
(default) Sort by user+system time used.
-l
Sort by number of calls to the subroutines.
-v
Sort by average time spent in subroutines.

Time measurements to display

-s
Display system times rather than user+system times.
-r
Display elapsed real times rather than user+system times.
-u
Display user times rather than user+system times.

More options

There are also options to show call trees and do other interesting things. See the dprofpp documentation for the full scoop.

Things I haven't covered

Comment on Profiling your code.
Select or Download Code
Re: Profiling your code.
by missingthepoint (Friar) on Jan 12, 2009 at 03:01 UTC

    There's also Devel::NYTProf, which is more up-to-date and feature-rich than the three modules above.


    Life is denied by lack of attention,
    whether it be to cleaning windows
    or trying to write a masterpiece...
    -- Nadia Boulanger
Re: Profiling your code.
by targetsmart (Curate) on Jan 28, 2009 at 06:03 UTC
    There's also Devel::sdb, which I frequently use (handles processes that does fork, with an additional script)

    -- In accordance with the prarabdha of each, the One whose function it is to ordain makes each to act. What will not happen will never happen, whatever effort one may put forth. And what will happen will not fail to happen, however much one may seek to prevent it. This is certain. The part of wisdom therefore is to stay quiet.
Re: Profiling your code.
by dHarry (Abbot) on Feb 06, 2009 at 10:46 UTC

    IMHE (In My Humble Experience), the sub based profilers often tell you what you already know, i.e. one particular sub is to blame. I found Devel::FastProf a line based profiler, very useful. It gives you information on what parts of the evil sub are the real performance killers. You use fprofpp to read the profile information.

    Example output

    # fprofpp output format is: # filename:line time count: source test_index.pl:567 7.33688 4248810: next LINE; test_index.pl:100 0.02172 10231: sub trim test_index.pl:154 0.02120 10230: sub rm_cmnts { test_index.pl:468 0.02108 6138: $KEYWORDS[$i]{'max_width'} = length($v +alue); test_index.pl:373 0.01980 7411: return unless (!(-d $_));
    This also gives you useful debugging information, i.e. a certain line of code is executed less/more often than anticipated. So in practice I combine DProf and FastProf.

    However... then I read about Devel::NYTProf, as suggested by missingthepoint above. It seems to combine the best of both DProf and FastProf and on top of that has tons of extra features. Well, thatís the next one on my TO DO list...

      What makes NYTProf so brilliant is that it puts a huge amount of information that you need to understand what's going on right in front of you. You don't need to go hunt it down yourself. And then it provides navigation to where you might want to go in order to continue the chase.

      /J

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perltutorial [id://633699]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others exploiting the Monastery: (8)
As of 2014-07-11 03:40 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    When choosing user names for websites, I prefer to use:








    Results (218 votes), past polls