Say you have two Perl statements, A and B. A runs in 10us and is executed 20 times; B runs in 100us and is executed 10 times. Now suppose the overhead introduced by Devel::SmallProf is 100us per statement.
So B is where the real time goes (1000us total vs. 200us for A), but once the overhead is added the profiler reports A at 20 × (10 + 100) = 2200us and B at 10 × (100 + 100) = 2000us. You should be optimizing B, but Devel::SmallProf tells you to look at A.
The current implementation of Devel::SmallProf performs, for every statement, a sub call, a call to caller, a hash lookup, a couple of tests and conditionals, and two calls to Time::HiRes::time(), making the timings completely unreliable.
I tried to reduce that overhead as much as possible in Devel::FastProf by writing the DB sub in C/XS. Devel::NYTProf goes further (IIRC), replacing the Perl OP dispatch loop with its own and eliminating much of the overhead of the DB interface.
Very true. “The presence of the experiment will affect the outcome.” But if you can, in fact, presume that it does so in exactly the same way, the various test results will be comparable to each other, even though they will not match performance outside of the profiler.
That can, however, be a big “if.”
Many of the things which affect the perceived speed of an application are external to it; they are environmental. For garden-variety applications (vs., say, the much more intensive ones that BrowserUK commonly deals with at his place of business), it is these environmental factors that tend to produce a human-perceptible difference in speed. So you might really be testing the virtual-storage system: how fast your disks can thrash (if they are overloaded), or how long it takes to page in a very large working set (after which performance will probably stabilize and speed up considerably, but if you stop the test too soon you won’t know it). (The way I heard it said, by an actual biologist, was this: “be sure that you are actually measuring the mice, not the temperature of the room or the color of your shirt.”)
In addition, actual pragmatic performance is very dependent upon the pattern of inputs, which often leads me to find that macroeconomics produces more usable intelligence than microeconomics. Simply observing and logging the run time of “a complete job,” whatever that is, might be more useful in the long term, and considerably easier. Unless you truly suspect that the performance hog is located in “a particular function or set of functions,” and can be measurably confined to it, profiling at the function level might not be terribly informative.
I mostly use profiling to discover “hot spots” in unfamiliar code. Then I look for ways to apply the advice in The Elements of Programming Style: “don’t diddle code to make it faster ... find a better algorithm.”