Very true. “The presence of the experiment will affect the outcome.” But if you can, in fact, presume that it does so in exactly the same way on every run, the various test results will be comparable to each other, even though they will not match performance outside of the profiler.
That can, however, be a big “if.”
Many of the things that affect the perceived speed of an application are external to it: they are environmental. For garden-variety applications (as opposed to, say, the much more intensive ones that BrowserUK commonly deals with at his place of business), these tend to be the only things that produce a human-perceptible difference in speed. So you might really be testing the virtual-storage system: how fast your disks can thrash (if they are overloaded), or how long it takes to page in a very large working set (after which performance will probably stabilize and speed up considerably, but if you stop the test too soon you won’t know it). The way I heard it said, by an actual biologist, was: “Be sure that you are actually measuring the mice, not the temperature of the room or the color of your shirt.”
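To make the “stopped the test too soon” point concrete, here is a minimal sketch in Python (chosen only for illustration; nothing in this thread prescribes it). The names `time_repeated_runs` and `split_warmup`, and the default of three warm-up runs, are my own assumptions. The idea is simply to time the same job several times and look at the early runs separately from the later ones.

```python
import time


def time_repeated_runs(job, repeats=10):
    """Call job() `repeats` times and return the wall-clock duration of each call.

    `job` is any zero-argument callable standing in for the work under test.
    """
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        job()
        durations.append(time.perf_counter() - start)
    return durations


def split_warmup(durations, warmup=3):
    """Split durations into (warm-up runs, later runs).

    The warm-up count is an arbitrary assumption; pick it by looking at the data.
    """
    return durations[:warmup], durations[warmup:]
```

If the first few durations are consistently much larger than the rest, you were probably timing page-ins or cache fills rather than the code itself. This only tells you that a warm-up effect exists; it does not tell you which environmental factor caused it.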
In addition, actual performance in practice depends heavily on the pattern of inputs, which often leads me to find that the “macroeconomic” view (the whole job) produces more usable intelligence than the “microeconomic” one (individual functions). Simply observing and logging the run time of “a complete job,” whatever that is, might be more useful in the long term, and considerably easier. Unless you truly suspect that the performance hog is located in “a particular function or set of functions,” and can be measurably confined to it, profiling at the function level might not be terribly informative.
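As a rough sketch of what “logging the run time of a complete job” could look like, here is one way to do it in Python (again, only for illustration). The helper name `log_job_time`, the CSV layout, and the log path are all hypothetical; adapt them to whatever your job actually is.

```python
import csv
import time
from contextlib import contextmanager
from datetime import datetime, timezone


@contextmanager
def log_job_time(job_name, log_path):
    """Time the enclosed block and append one CSV row: timestamp, job name, seconds.

    The row is written even if the job raises, so failed runs are logged too.
    """
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        with open(log_path, "a", newline="") as fh:
            csv.writer(fh).writerow(
                [datetime.now(timezone.utc).isoformat(), job_name, f"{elapsed:.6f}"]
            )
```

You would wrap the whole job, for example `with log_job_time("nightly-report", "job_times.csv"): run_report()`, where `run_report` is a stand-in for your own entry point. After a few weeks of real inputs, the log shows whether run time is drifting and on which days, which is often the question you actually wanted answered.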
I mostly use profiling to discover “hot spots” in unfamiliar code. Then, I look for ways to apply the advice in The Elements of Programming Style: “don’t diddle code to make it faster ... find a better algorithm.”