Beefy Boxes and Bandwidth Generously Provided by pair Networks Frank
Just another Perl shrine
 
PerlMonks  

Re^8: "Automated" benchmarking

by salva (Monsignor)
on Jan 24, 2012 at 17:06 UTC ( #949733=note: print w/ replies, xml ) Need Help??


in reply to Re^7: "Automated" benchmarking
in thread "Automated" benchmarking

A nulltime compensation is effective when the overhead is relatively small compared to the sentence execution time. But when it is much bigger as in this case, the compensated overhead (aka error) can still be of the same order of magnitude of the time we want to measure.

Guess why the last line you have quoted from the source is there!

package DB; use Time::HiRes; my %testDB = ( null => sub { my($pkg,$filename,$line) = caller; $profile || return; %packages && !$packages{$pkg} && return; }, work => sub { $do_something{foo} = 'bar'; my($pkg,$filename,$line) = caller; $profile || return; %packages && !$packages{$pkg} && return; }); my $n = 10; for (1..$n) { for my $t (keys %testDB) { my $testDB = $testDB{$t}; my $nulltime = 0; for (1..100) { my($u,$s,$cu,$cs) = times; $cstart = $u+$s+$cu+$cs; $start = Time::HiRes::time; &$testDB; ($u,$s,$cu,$cs) = times; $cdone = $u+$s+$cu+$cs; $done = Time::HiRes::time; $diff = $done - $start; $nulltime += $diff; } $time{$t} = $nulltime / 100 * 1e6; } printf "null: %.2f, work: %.2f, delta: %.2f\n", $time{null}, $time{work}, $time{work} - $time{null}; }


Comment on Re^8: "Automated" benchmarking
Download Code
Re^9: "Automated" benchmarking
by BrowserUk (Pope) on Jan 24, 2012 at 18:32 UTC

    I get this output from your code. Is that good, bad or ugly?

    null: 6.63, work: 7.41, delta: 0.77 null: 6.45, work: 7.60, delta: 1.14 null: 6.41, work: 7.36, delta: 0.95 null: 6.49, work: 7.25, delta: 0.77 null: 6.47, work: 7.41, delta: 0.93 null: 6.52, work: 7.37, delta: 0.85 null: 6.38, work: 7.30, delta: 0.93 null: 6.44, work: 7.22, delta: 0.78 null: 6.49, work: 7.38, delta: 0.89 null: 6.50, work: 7.29, delta: 0.79

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    The start of some sanity?

      Well, it seems that on your system the output is pretty consistent as all the delta values are very similar. On my computer it generates a broader range of values:
      null: 4.34, work: 4.53, delta: 0.19 null: 4.70, work: 4.67, delta: -0.03 null: 5.33, work: 6.38, delta: 1.05 null: 4.83, work: 6.05, delta: 1.22 null: 4.68, work: 5.27, delta: 0.58 null: 6.56, work: 5.12, delta: -1.44 null: 4.42, work: 9.26, delta: 4.85 null: 5.26, work: 4.58, delta: -0.68 null: 6.63, work: 5.86, delta: -0.78 null: 4.34, work: 7.54, delta: 3.20

        If I decrease the count on the inner loop (and correspondingly the divisor), then I begin to see much wider variations. Including some negative values if I drop the count to one. But with the count set to 100, the results I posted are typical even over quite long time frames. Even when I have a video playing in the background.

        Could this be a benefit of using a multicore system? (I'm not doing anything trick with the affinity mask or similar.)


        As an aside, I was exploring the differences between smallprof and NYTprof and found a couple of interesting things. This is the (cut-down) output from the two profilers run against code designed to highlight discrepancies and overheads. Basically, just a bunch of millisecond accurate sleeps of various lengths run in loops:

        count wall tm cpu time line 0 0.00000 0.00000 1:#! perl -slw 0 0.00000 0.00000 2:use strict; 0 0.00000 0.00000 3:use Time::HiRes qw[ time ]; 0 0.00000 0.00000 4: 0 0.00000 0.00000 5:sub aaa { 5003 4.90137 0.00000 6: Win32::Sleep( 1 ); 0 0.00000 0.00000 7:} 0 0.00000 0.00000 8: 0 0.00000 0.00000 9:sub bbb { 500 4.98694 0.00000 10: Win32::Sleep( 10 ); 0 0.00000 0.00000 11:} 0 0.00000 0.00000 12: 0 0.00000 0.00000 13:sub ccc { 50 4.99772 0.00000 14: Win32::Sleep( 100 ); 0 0.00000 0.00000 15:} 0 0.00000 0.00000 16: 0 0.00000 0.00000 17:sub ddd { 5 4.99970 0.00000 18: Win32::Sleep( 1000 ); 0 0.00000 0.00000 19:} 0 0.00000 0.00000 20: 0 0.00000 0.00000 21:sub eee { 10 0.00002 0.00000 22: aaa() for 1 .. 1000; 10 0.00001 0.00000 23: bbb() for 1 .. 100; 10 0.00002 0.00000 24: ccc() for 1 .. 10; 10 0.00002 0.00000 25: ddd() for 1 .. 1; 0 0.00000 0.00000 26:} 0 0.00000 0.00000 27: 2 0.00000 0.00000 28:eee() for 1 .. 5; 0 0.00000 0.00000 29: # Profile data generated by Devel::NYTProf::Reader # Version: v2.08 # More information at http://search.cpan.org/dist/Devel-NYTProf/ # Format: time,calls,time/call,code 0.003000 1 0.003000 #! perl -slw 0.002000 3 0.000667 use strict; 0.003000 3 0.001000 use Time::HiRes qw[ time ]; 0.000000 0 0.000000 0.000000 0 0.000000 sub aaa { 5.010000 5003 0.001001 Win32::Sleep( 1 ); 0.000000 0 0.000000 } 0.000000 0 0.000000 0.000000 0 0.000000 sub bbb { 5.000000 500 0.010000 Win32::Sleep( 10 ); 0.000000 0 0.000000 } 0.000000 0 0.000000 0.000000 0 0.000000 sub ccc { 5.000000 50 0.100000 Win32::Sleep( 100 ); 0.000000 0 0.000000 } 0.000000 0 0.000000 0.000000 0 0.000000 sub ddd { 5.000000 5 1.000000 Win32::Sleep( 1000 ); 0.000000 0 0.000000 } 0.000000 0 0.000000 0.000000 0 0.000000 sub eee { 0.000000 10 0.000000 aaa() for 1 .. 1000; 0.000000 10 0.000000 bbb() for 1 .. 100; 0.000000 10 0.000000 ccc() for 1 .. 10; 0.000000 10 0.000000 ddd() for 1 .. 1; 0.000000 0 0.000000 } 0.000000 0 0.000000 0.000000 2 0.000000 eee() for 1 .. 5; 0.000000 0 0.000000
        1. The first thing I notice is that smallprof is less accurate.

          The interesting thing is that many of the lowest timecost lines are being clamped to zero.

          And looking at the timing of the highest timecost lines, the reason is obvious. It is consistently over compensating with its nulltime calculation. There looks to be ready scope there for improvements by introducing a fudge factor.

        2. The second, and rather baffling thing I noticed is that iteration counts are wrong in some cases. By both profilers.

          For the last line eee() for 1 .. 5;, the count is two but ought be one. That in trun has a multipler effect on the iteration counts for the 4 lines in sub eee().

          But strangely, that multiplier effect disappears for the subs called from eee(), which are as they should be if the last line was calle only once and eee() is called only 5 times.

          But most mysterious of all, the count for the Win32::Sleep(1) line inside aaa(), is out by 3, from its expected value.

          These results would be mysterious enough were they confined to one profiler; but are (for me at least) totally baffling as they are consistently wrong for both profiles.

      Have you any thoughts, wisdoms or cluebats as to what gives?

      If not, I'll probably lift this into a new thread and seek a wider audience.


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

      The start of some sanity?

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://949733]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others chanting in the Monastery: (10)
As of 2014-04-21 14:05 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    April first is:







    Results (495 votes), past polls