Say you have two Perl statements, A and B. A runs in 20us and is executed 20 times. B runs in 100us and is executed 10 times. Now suppose that the overhead introduced by Devel::SmallProf is 100us per statement.
real_time(A) = 20 * 20 = 400us
real_time(B) = 100 * 10 = 1000us
measured_time(A) = ( 20 + 100) * 20 = 2400us
measured_time(B) = (100 + 100) * 10 = 2000us
So, you should be optimizing B, but Devel::SmallProf tells you to look at A.
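To make the arithmetic easy to replay with other numbers, here is a minimal sketch of the same calculation. It takes 20us for A, as the calculation lines above do. The variable names (`%stmt`, `$overhead_us`) are made up for illustration, and the 100us overhead is the assumed figure from the example, not a measurement of Devel::SmallProf.

```perl
use strict;
use warnings;

# Assumed per-statement profiler overhead from the example (not measured).
my $overhead_us = 100;

# Hypothetical statements: cost per execution and number of executions.
my %stmt = (
    A => { cost_us => 20,  runs => 20 },
    B => { cost_us => 100, runs => 10 },
);

my (%real, %measured);
for my $name (sort keys %stmt) {
    my ($cost, $runs) = @{ $stmt{$name} }{qw(cost_us runs)};

    # Time the statement really takes vs. what the profiler would report.
    $real{$name}     = $cost * $runs;
    $measured{$name} = ($cost + $overhead_us) * $runs;

    printf "%s: real %5d us, measured %5d us\n",
        $name, $real{$name}, $measured{$name};
}

# The ranking flips once the fixed overhead is added to every execution.
my ($real_top)     = sort { $real{$b}     <=> $real{$a}     } keys %real;
my ($measured_top) = sort { $measured{$b} <=> $measured{$a} } keys %measured;
print "really slowest: $real_top, reported slowest: $measured_top\n";
```

The cheaper a statement is and the more often it runs, the more a fixed per-statement overhead inflates its reported time.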
For every statement, the current implementation of Devel::SmallProf performs a sub call, a call to caller, a hash lookup, a couple of tests and conditionals, and two calls to Time::HiRes::time(). That overhead makes the timings completely unreliable.
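To get a rough feel for what that kind of per-statement work costs on your own machine, you can time a stand-in for it. The sketch below is not Devel::SmallProf's code: `fake_hook` and `%seen` are hypothetical names, and the sub only imitates the operations listed above (a sub call, caller, a hash lookup, a test, two time() calls).

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

my %seen;    # hypothetical per-line counters, standing in for a profiler's tables

# Hypothetical stand-in for a per-statement hook. It is NOT the real DB::DB
# sub of any profiler; it just performs a similar mix of operations.
sub fake_hook {
    my $t0 = time();                       # first Time::HiRes::time() call
    my ($pkg, $file, $line) = caller;      # who called us
    $seen{"$file:$line"}++ if defined $line;   # hash lookup plus a test
    my $t1 = time();                       # second Time::HiRes::time() call
    return $t1 - $t0;
}

my $n     = 100_000;
my $start = time();
fake_hook() for 1 .. $n;
my $per_call = (time() - $start) / $n;

printf "approx. %.3f us per hook call\n", $per_call * 1e6;
```

The figure will vary with the machine and the perl build. What matters is how it compares with the cost of the statements being profiled.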
I tried to reduce that overhead as much as possible in Devel::FastProf by writing the DB sub in C/XS. Devel::NYTProf went further (IIRC), replacing Perl's OP loop with its own and eliminating much of the overhead of the DB interface.