Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask
 
PerlMonks  

Algorithm Efficiency vs. Specialized Hardware?

by Russ (Deacon)
on Jun 20, 2000 at 11:39 UTC ( #18944=perlquestion: print w/ replies, xml ) Need Help??
Russ has asked for the wisdom of the Perl Monks concerning the following question:

Aighearach and I have been sharing some interesting code.

I have been exploring the efficiency differences between a few algorithms. The benchmarking results I see are difficult to explain. (See the top of the previous thread for more background.)

I would like to ask other Monks to try this code and post the results, along with the Perl version (perl -v) and OS/hardware info.

Thank you all,

Russ

#!/usr/bin/perl -w use strict; my $now = 8; my %url = ( monday => { @{[map(($_,1), (1..1000))]} } ); use Benchmark; timethese(1000000, { Grep => q{ $now = (sort grep {$_ <= $now} keys %{$url{'monday'}})[-1]; }, Ternary => q{ $now = ($now < $_ && $_ < 8 ? $_ : $now) for keys %{$url{'monday'} +}; }, Max => q{ foreach ( keys %{$url{'monday'}} ) { $now = $_ if $_ > $now }; } });

Comment on Algorithm Efficiency vs. Specialized Hardware?
Download Code
RE: Algorithm Efficiency vs. Specialized Hardware?
by Russ (Deacon) on Jun 20, 2000 at 11:42 UTC
    P.S. to the above post: grep should be slower than the other two algorithms. Aighearach's results were predictable, where mine are distinctly opposite...
    Update:
    Having seen a few posts, I have to "update my version of reality" (as a friend always chides me for doing slowly). The grep sort implementation is running faster on everyone else's machines, too. Algorithm analysis would conclude that since sort is in O(N log N), it should be slower than an O(N) algorithm (Max and Ternary should both be in O(N)). Real-world results don't show this.

    What's going on? Somebody help me update my version of reality! :-)

    Here is what happens on my system:

    Benchmark: timing 1000000 iterations of Grep, Max, Ternary... Grep: 10 wallclock secs (10.75 usr + 0.05 sys = 10.80 CPU) Max: 31 wallclock secs (31.10 usr + 0.05 sys = 31.15 CPU) Ternary: 32 wallclock secs (31.29 usr + 0.01 sys = 31.30 CPU)
    It is a dual PPro 200 (180, overclocked), RedHat 5.2 (seriously upgraded - 2.2.12 kernel).
    perl -v says:
    This is perl, version 5.005_57 built for i686-linux-thread
Re: Algorithm Efficiency vs. Specialized Hardware?
by le (Friar) on Jun 20, 2000 at 11:45 UTC
    Perl Version: 5.005_03
    OS/Hardware: FreeBSD 4.0-STABLE, Pentium II 233, 98MB RAM, running many other things including X, Apache, MySQL... (the also running xmms didn't stutter one moment :)
    Skript Output:
    Benchmark: timing 1000000 iterations of Grep, Max, Ternary... Grep: 13 wallclock secs ( 9.12 usr + 0.01 sys = 9.13 CPU) Max: 36 wallclock secs (26.21 usr + 0.02 sys = 26.23 CPU) Ternary: 35 wallclock secs (26.66 usr + 0.04 sys = 26.70 CPU)
Re: Algorithm Efficiency vs. Specialized Hardware?
by mdillon (Priest) on Jun 20, 2000 at 11:51 UTC
    [mike@prometheus PerlMonks]$ ./Russ.pl Benchmark: timing 1000000 iterations of Grep, Max, Ternary... Grep: 16 wallclock secs (14.74 usr + 0.10 sys = 14.84 CPU) @ 67 +385.44/s (n=1000000) Max: 37 wallclock secs (36.62 usr + 0.00 sys = 36.62 CPU) @ 27 +307.48/s (n=1000000) Ternary: 39 wallclock secs (38.23 usr + 0.01 sys = 38.24 CPU) @ 26 +150.63/s (n=1000000) [mike@prometheus PerlMonks]$ perl -v This is perl, v5.6.0 built for i386-linux [mike@prometheus PerlMonks]$ uname -mrs Linux 2.2.16-1 i586 [mike@prometheus PerlMonks]$ free total used free shared buffers cac +hed Mem: 63032 61552 1480 20196 2012 23 +872 -/+ buffers/cache: 35668 27364 [mike@prometheus PerlMonks]$ cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 5 model : 2 model name : Pentium 75 - 200 stepping : 12 cpu MHz : 199.436
Re: Algorithm Efficiency vs. Specialized Hardware?
by dempa (Friar) on Jun 20, 2000 at 11:57 UTC

    Sun Ultra 10 440MHz with 320MB RAM with Solaris 7 running Perl 5.6.0:
    Benchmark: timing 1000000 iterations of Grep, Max, Ternary... Grep: 9 wallclock secs ( 8.85 usr + 0.00 sys = 8.85 CPU) @ 11 +2994.35/s (n=1000000) Max: 12 wallclock secs (12.10 usr + 0.00 sys = 12.10 CPU) @ 82 +644.63/s (n=1000000) Ternary: 12 wallclock secs (12.17 usr + 0.00 sys = 12.17 CPU) @ 82 +169.27/s (n=1000000)

    Update: I killed a few processes, exited X and so on... Got this result:
    Benchmark: timing 1000000 iterations of Grep, Max, Ternary... Grep: 5 wallclock secs ( 5.32 usr + 0.00 sys = 5.32 CPU) @ 18 +7969.92/s (n=1000000) Max: 9 wallclock secs ( 8.64 usr + 0.00 sys = 8.64 CPU) @ 11 +5740.74/s (n=1000000) Ternary: 8 wallclock secs ( 8.63 usr + 0.01 sys = 8.64 CPU) @ 11 +5740.74/s (n=1000000)
RE: Algorithm Efficiency vs. Specialized Hardware?
by redmist (Deacon) on Jun 20, 2000 at 12:00 UTC
    Benchmark: timing 1000000 iterations of Grep, Max, Ternary...
    Grep: 2 wallclock secs ( 2.33 usr + 0.00 sys = 2.33 CPU)
    Max: 7 wallclock secs ( 6.27 usr + 0.00 sys = 6.27 CPU)
    Ternary: 6 wallclock secs ( 6.20 usr + 0.00 sys = 6.20 CPU)

    This was tested on Win2K, dual Celies 366MHz->523MHz, 128MB PC133 SDRAM, while playing some MP3s (if that matters), using Perl 5.005_03.

    redmist
Re: Algorithm Efficiency vs. Specialized Hardware?
by Aighearach on Jun 20, 2000 at 14:10 UTC
    Benchmark: timing 1000000 iterations of Grep, Max, Ternary...
          Grep:  4 wallclock secs ( 3.02 usr +  0.00 sys =  3.02 CPU) @ 331125.83/s (n=1000000)
           Max:  7 wallclock secs ( 5.79 usr +  0.00 sys =  5.79 CPU) @ 172711.57/s (n=1000000)
       Ternary:  5 wallclock secs ( 5.72 usr +  0.00 sys =  5.72 CPU) @ 174825.17/s (n=1000000)
    

    Which is very different than what I had before. To shed some light on this, let me first point out that the code above is almost but not quite the code I got my previous results from.

    The only difference is, my code put each snippet to be benchmarked inside a sub, and then game Benchmark the function calls. I did this because I thought it more acurately reflected real-world max() usage, but this is certainly not true for all programs. Also different is that I changed my to our, so as not to change the function calls or add use vars. Would this make any difference speed difference? I know our is new to 5.6.0, and I am already loving it, but I don't know it's internals.

    Now I'll take Russ's code given above and just make that one change. The function calls are SO expensive compared to the rest of the code, that I will turn down the iterations to 10_000, but we will see it take much, much longer. Here is the code:

    #!/usr/bin/perl -w use strict; our $now = 8; our %url = ( monday => { @{[map(($_,1), (1..1000))]} } ); use Benchmark; timethese(10000, { Grep => q{ Grep(); }, Ternary => q{ Ternary(); }, Max => q{ Max(); } }); sub Grep { $now = (sort grep {$_ <= $now} keys %{$url{'monday'}})[-1]; } sub Ternary { $now = ($now < $_ && $_ < 8 ? $_ : $now) for keys %{$url{'monday'} +}; } sub Max { foreach ( keys %{$url{'monday'}} ) { $now = $_ if $_ > $now }; }

    And the results:

    Benchmark: timing 10000 iterations of Grep, Max, Ternary...
          Grep: 72 wallclock secs (68.98 usr +  0.00 sys = 68.98 CPU) @ 144.97/s (n=10000)
           Max: 64 wallclock secs (60.01 usr +  0.00 sys = 60.01 CPU) @ 166.64/s (n=10000)
       Ternary: 66 wallclock secs (63.24 usr +  0.01 sys = 63.25 CPU) @ 158.10/s (n=10000)
    

    And so this is more consistant with my earlier resutls.

    Platform is an AMD K6-II/400 128megs SDRAM, running linux 2.2.12 and perl 5.6.0.

    I really don't understand these results. I understand that function calls are very, very expensive( the reason OO is often slow; accessor methods vs lookups ), but I don't understand why the proportions are different. Why doesn't the function call add the same amount of proc time for each call? Or is there something else?

    Paris Sinclair    |    4a75737420416e6f74686572
    pariss@efn.org    |    205065726c204861636b6572
    I wear my Geek Code on my finger.
    
      aighearach, I saw the same thing, but forgot to post anything about it. Thank you.

      I wonder about this as well. I suspected it was the profiler causing the extra overhead, but your results show it happening with just Benchmark. I have run it this way myself and found similar results.

      This is funny. I "preach" to co-workers to forget their 'C' days, and ignore performance. "Perl is about programmer efficiency, not CPU efficiency," I say. Now, I am fascinated with this discussion, which is all about esoteric trivia of Perl efficiency.

      :-)

      Russ

        No, it's not the extra weight of the subroutine calls, although sub calls do cause more overhead. What's causing this is what I mentioned earlier: you have scoping problems. None of the code you're benchmarking is actually using a valid value for either %url or $now. So your results are completely bogus.

        The valid results are those that use a properly scoped variable. Aighearach's code did this, though I'm not sure it did so intentionally.

        You can accomplish this by declaring your vars as package globals:

        use vars qw/$now %url/; $now = 8; %url = ( monday => { @{[map(($_,1), (1..1000))]} } );
        This gives valid results.
        Benchmark: timing 1000 iterations of Grep, Max, Ternary... Grep: 6 wallclock secs ( 5.49 usr + 0.00 sys = 5.49 CPU) Max: 5 wallclock secs ( 5.81 usr + 0.00 sys = 5.81 CPU) Ternary: 7 wallclock secs ( 6.42 usr + 0.00 sys = 6.42 CPU)
Re: Algorithm Efficiency vs. Specialized Hardware?
by Ovid (Cardinal) on Jun 20, 2000 at 14:20 UTC
    Benchmark: timing 1000000 iterations of Grep, Max, Ternary... Grep: 5 wallclock secs ( 4.44 usr + 0.00 sys = 4.44 CPU) Max: 12 wallclock secs (12.42 usr + 0.00 sys = 12.42 CPU) Ternary: 12 wallclock secs (12.09 usr + 0.00 sys = 12.09 CPU)
    Win98, 96 MB RAM, AMD K6-350 3D processor
    ActiveState Perl 5.005_03

    (Nice to know it matched the Solaris machine in Performance :)

    Update: So, Ol' Mister Sun/Solaris (see dempa above) thinks he gonna outperform my Windows box, huh? (Um, well...duh!)

    So, I just closed everything I was working on, booted to DOS and reran the benchmarks. My results:

    This program does not run in DOS mode.
    *sniff, sniff*
      Gotta love (sarcasm) Microsoft.
Re: Algorithm Efficiency vs. Specialized Hardware?
by t0mas (Priest) on Jun 20, 2000 at 14:27 UTC
    Win2k on P233/128Mb RAM.
    Benchmark: timing 1000000 iterations of Grep, Max, Ternary... Grep: 9 wallclock secs (10.02 usr + 0.00 sys = 10.02 CPU) @ 99 +760.57/s (n=1000000) Max: 21 wallclock secs (20.92 usr + 0.00 sys = 20.92 CPU) @ 47 +801.15/s (n=1000000) Ternary: 22 wallclock secs (21.31 usr + 0.00 sys = 21.31 CPU) @ 46 +926.33/s (n=1000000) This is perl, v5.6.0 built for MSWin32-x86-multi-thread
    Linux RH6.0 on same machine (but different perl).
    Benchmark: timing 1000000 iterations of Grep, Max, Ternary... Grep: 8 wallclock secs ( 8.39 usr + 0.00 sys = 8.39 CPU) Max: 26 wallclock secs (25.82 usr + 0.00 sys = 25.82 CPU) Ternary: 27 wallclock secs (25.67 usr + 0.00 sys = 25.67 CPU) This is perl, version 5.005_03 built for i386-linux


    /brother t0mas
Re: Algorithm Efficiency vs. Specialized Hardware?
by Michalis (Pilgrim) on Jun 20, 2000 at 14:54 UTC
    Perl Version: 5.005_03

    CPU Pentium III 450MHz

    Memory 128 Mb

    Benchmark: timing 1000000 iterations of Grep, Max, Ternary... Grep: 5 wallclock secs ( 4.41 usr + 0.01 sys = 4.42 CPU) Max: 10 wallclock secs ( 9.32 usr + 0.02 sys = 9.34 CPU) Ternary: 10 wallclock secs ( 9.43 usr + -0.02 sys = 9.41 CPU)
Re: Algorithm Efficiency vs. Specialized Hardware?
by httptech (Chaplain) on Jun 20, 2000 at 16:29 UTC
    Benchmark: timing 1000000 iterations of Grep, Max, Ternary... Grep: 3 wallclock secs ( 2.68 usr + 0.00 sys = 2.68 CPU) Max: 7 wallclock secs ( 6.14 usr + 0.33 sys = 6.47 CPU) Ternary: 7 wallclock secs ( 6.54 usr + 0.00 sys = 6.54 CPU) Perl 5.005_03 Linux 2.2.9-19mdk i686 Dual Celeron 466Mhz 128MB
Re: Algorithm Efficiency vs. Specialized Hardware?
by lhoward (Vicar) on Jun 20, 2000 at 16:29 UTC
    I've done some benchmarking of my own over various dataset sizes.
    dataset sizeGrepMaxTernary
    100319704.72/s167783.44/s160956.63/s
    1000324017.59/s158933.12/s157394.83/s
    10000328377.50/s162308.63/s159661.13/s
    100000332964.44/s160964.22/s157891.99/s
    You will notice that the times for dataset sizes of 100, 1000, 10000 and 100000 are nearly identical for each algorithm. This clearly shows that for datasets of that size it is not the size of the dataset that is the limiting factor in performance. I suspect the preformance issue is "Algorithmic Friction"; the overhead involved in doing the calls themselves. Also remember that the sort is implemented in the C library, so it is really fast. (still O(n log(n)), but a very fast O(n log(n))).

    What I expect you're seeing is that it is much quicker to sort (implemented in binary code) and index the last element than it is to iterate through an array (even once) in an interpreted language for datasets of the size we are considering.

    You see similar results with sorting algorithms. Quicksort is well know to be the fastest general purpose sorting algorithm for large sets O(n log(n)). However the overhead incurred by quicksort is overkill for small datasets. For datasets with very few elements bubblesort is often faster, even though bubble sort is O(n^2).

      I did a test where I did this, and increasing the %url population from 1000 to 100_000 caused each algorithm to take over 100 times longer.

      the change is:

      old:

      my %url = ( monday => { @{[map(($_,1), (1..1000))]} } );
      new:
      my %url = ( monday => { @{[map(($_,1), (1..100_000))]} } );

      I turned the iterations down for the large dataset, because otherwise I would be posting tomorrow instead of today. ;)
      results are:

      Large dataset
      Benchmark: timing 100 iterations of Grep, Max, Ternary...
            Grep: 83 wallclock secs (80.90 usr +  0.02 sys = 80.92 CPU) @  1.24/s (n=100)
             Max: 74 wallclock secs (71.38 usr +  0.00 sys = 71.38 CPU) @  1.40/s (n=100)
         Ternary: 77 wallclock secs (74.15 usr +  0.00 sys = 74.15 CPU) @  1.35/s (n=100)
      
      Small dataset
      Benchmark: timing 10000 iterations of Grep, Max, Ternary...
            Grep: 73 wallclock secs (68.50 usr +  0.05 sys = 68.55 CPU) @ 145.88/s (n=10000)
             Max: 61 wallclock secs (57.77 usr +  0.05 sys = 57.82 CPU) @ 172.95/s (n=10000)
         Ternary: 64 wallclock secs (61.54 usr +  0.02 sys = 61.56 CPU) @ 162.44/s (n=10000)
      

      Can you post the rest of the Benchmark output?

      Paris Sinclair    |    4a75737420416e6f74686572
      pariss@efn.org    |    205065726c204861636b6572
      I wear my Geek Code on my finger.
      
        Here's the code I used to run my test:
        my @size=(100,1000,10000,100000); foreach(@size){ my $size=$_; print "size=$size\n"; my $now = 8; my %url = ( monday => { @{[map(($_,1), (1..$size))]} } ); timethese(0, { Grep => q{ $now = (sort grep {$_ <= $now} keys %{$url{'monday'} +})[-1]; }, Ternary => q{ $now = ($now < $_ && $_ < 8 ? $_ : $now) for keys + %{$url{'monday'}}; }, Max => q{ foreach ( keys %{$url{'monday'}} ) { $now = $_ if $_ +> $now }; } }); }
        I am awful suspicious of my tests taking almost the same amount of time to run no matter how many elements there were in the dataset.

        I think I am onto something... I don't think the code posted in teh original question really benchmarks what the user thinks it does... I'm crafing a new reply to the original question that should be up soon.

Re: Algorithm Efficiency vs. Specialized Hardware?
by ChuckularOne (Parson) on Jun 20, 2000 at 17:07 UTC
    Benchmark: timing 1000000 iterations of Grep, Max, Ternary...
    Grep: 7 wallclock secs ( 5.87 usr + 0.00 sys = 5.87 CPU)
    Max: 11 wallclock secs (10.87 usr + 0.00 sys = 10.87 CPU)
    Ternary: 11 wallclock secs (10.95 usr + 0.00 sys = 10.95 CPU)
    $ perl -v
    This is perl, version 5.005_03 built for sun4-solaris


    Your Humble Servant,
    -Chuck
(jcwren) RE: Algorithm Efficiency vs. Specialized Hardware?
by jcwren (Prior) on Jun 20, 2000 at 17:24 UTC
    Perl 5_005.63
    Pentium III 600b/256M running RH 6.2, Kernel 2.2.15 (1196 Bogomips reported)
    Benchmark: timing 1000000 iterations of Grep, Max, Ternary...<br> Grep: 2 wallclock secs ( 2.38 usr + 0.04 sys = 2.42 CPU) @ 41 +3223.14/s (n=1000000)<br> Max: 5 wallclock secs ( 5.16 usr + 0.02 sys = 5.18 CPU) @ 19 +3050.19/s (n=1000000)<br> Ternary: 5 wallclock secs ( 5.13 usr + 0.00 sys = 5.13 CPU) @ 19 +4931.77/s (n=1000000)
    --Chris
RE: Algorithm Efficiency vs. Specialized Hardware?
by Kozz (Friar) on Jun 20, 2000 at 17:53 UTC
    Slackware 3.5, Kernel 2.2.14 Benchmark: timing 1000000 iterations of Grep, Max, Ternary... Grep: 5 wallclock secs ( 5.98 usr + 0.00 sys = 5.98 CPU) Max: 13 wallclock secs (12.67 usr + 0.00 sys = 12.67 CPU) Ternary: 12 wallclock secs (12.77 usr + 0.00 sys = 12.77 CPU) This is perl, v5.6.0 built for i586-linux processor : 0 vendor_id : GenuineIntel cpu family : 5 model : 4 model name : Pentium MMX stepping : 3 cpu MHz : 233.033008 64MB RAM
RE: Algorithm Efficiency vs. Specialized Hardware?
by draconis (Scribe) on Jun 20, 2000 at 18:13 UTC
    My results were as follows:
    Benchmark Info ============== Benchmark: timing 1000000 iterations of Grep, Max, Ternary... Grep: 12 wallclock secs ( 9.85 usr + 0.01 sys = 9.86 CPU) Max: 40 wallclock secs (34.38 usr + 0.02 sys = 34.40 CPU) Ternary: 37 wallclock secs (34.76 usr + 0.02 sys = 34.78 CPU) Y Memory Info =========== total used free shared buffers cach +ed Mem: 144172 139260 4912 88820 31684 52 +680 -/+ buffers/cache: 54896 89276 Swap: 128484 704 127780 CPU Info ======== processor : 0 vendor_id : GenuineIntel cpu family : 5 model : 4 model name : Pentium MMX stepping : 3 cpu MHz : 233.322757 fdiv_bug : no hlt_bug : no sep_bug : no f00f_bug : yes coma_bug : no fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr mce cx8 mmx bogomips : 465.31 Perl Version Info ================= version 5.005_03 built for i386-linux OS Info ======= Red Hat Linux release 6.1 (Cartman) Kernel 2.2.12-20 on an i586
RE: Algorithm Efficiency vs. Specialized Hardware?
by tenatious (Beadle) on Jun 20, 2000 at 18:39 UTC

    This is perl, version 5.005_03 built for i386-linux

    results from the script:

    Benchmark: timing 1000000 iterations of Grep, Max, Ternary...

    Grep: 5 wallclock secs ( 3.90 usr + 0.00 sys = 3.90 CPU)

    Max: 11 wallclock secs ( 9.33 usr + 0.01 sys = 9.34 CPU)

    Ternary: 11 wallclock secs ( 9.48 usr + 0.00 sys = 9.48 CPU)

Re: Algorithm Efficiency vs. Specialized Hardware?
by Greener (Sexton) on Jun 20, 2000 at 19:57 UTC
    C:\WINDOWS\DESKTOP>perl -w benchmark.pl Benchmark: timing 1000000 iterations of Grep, Max, Ternary... Grep: 1 wallclock secs ( 2.30 usr + 0.00 sys = 2.30 CPU) Max: 7 wallclock secs ( 7.91 usr + 0.00 sys = 7.91 CPU) Ternary: 8 wallclock secs ( 8.08 usr + 0.00 sys = 8.08 CPU) C:\WINDOWS\DESKTOP>perl -v This is perl, version 5.005_03 built for MSWin32-x86-object
    Windows 95 B, Dell Optiplex GX1, PII 450 (447.6895) MHz, 128MB RAM
RE: Algorithm Efficiency vs. Specialized Hardware?
by husker (Chaplain) on Jun 20, 2000 at 20:10 UTC
    I did this on two machines: an old dog, and a new burner. System 1: HP 9000 Model H50 (PA-7100, 96 Mhz), 256 MB RAM, HP-UX 10.20
    $ perl -v This is perl, version 5.005_03 built for PA-RISC1.1 results: Benchmark: timing 1000000 iterations of Grep, Max, Ternary... Grep: 26 wallclock secs (21.98 usr + 0.17 sys = 22.15 CPU) Max: 33 wallclock secs (31.87 usr + 0.07 sys = 31.94 CPU) Ternary: 35 wallclock secs (33.29 usr + 0.13 sys = 33.42 CPU)
    System 2: HP9000 C3000 (PA-8500, 400 Mhz), 512 MB RAM, HP-UX 10.20
    Same version of perl results: Benchmark: timing 1000000 iterations of Grep, Max, Ternary... Grep: 5 wallclock secs ( 4.83 usr + 0.00 sys = 4.83 CPU) Max: 7 wallclock secs ( 6.77 usr + 0.01 sys = 6.78 CPU) Ternary: 7 wallclock secs ( 6.90 usr + 0.00 sys = 6.90 CPU)
    These are my answers, and I'm sticking to them!
Re: Algorithm Efficiency vs. Specialized Hardware?
by eduardo (Curate) on Jun 20, 2000 at 22:02 UTC
    [ed@deadbeef1 ed]$ perl russ.pl Benchmark: timing 1000000 iterations of Grep, Max, Ternary... Grep: 2 wallclock secs ( 2.01 usr + 0.00 sys = 2.01 CPU) @ 49 +6364.52/s (n=1000000) Max: 5 wallclock secs ( 3.96 usr + 0.02 sys = 3.97 CPU) @ 25 +1782.64/s (n=1000000) Ternary: 5 wallclock secs ( 2.32 usr + 0.00 sys = 2.32 CPU) @ 43 +0976.43/s (n=1000000) [ed@deadbeef1 ed]$ perl -v This is perl, v5.6.0 built for alpha-linux Copyright 1987-2000, Larry Wall Perl may be copied only under the terms of either the Artistic License + or the GNU General Public License, which may be found in the Perl 5.0 source +kit. Complete documentation for Perl, including FAQ lists, should be found +on this system using `man perl' or `perldoc perl'. If you have access to + the Internet, point your browser at http://www.perl.com/, the Perl Home Pa +ge. [ed@deadbeef1 ed]$
    damn... alpha's are fast...
Re: Algorithm Efficiency vs. Specialized Hardware?
by lhoward (Vicar) on Jun 21, 2000 at 00:02 UTC
    I believe the original benchamrk code as posted is broken. Thanks to Aighearach for questioning my earlier results and leading me along this path.

    Consider this code:

    my $now = 8; my %url = ( monday => { @{[map(($_,1), (1..1000))]} } ); timethese(0, { Grep => 'Grep();', Grep2 => q{(sort grep {$_ <= $now} keys %{$url{"monday"}})[-1]; +}, Grep3 => sub{(sort grep {$_ <= $now} keys %{$url{"monday"}})[-1 +];} }); sub Grep{ $now = (sort grep {$_ <= $now} keys %{$url{'monday'}})[-1]; }
    The output from that:
    Benchmark: running Grep, Grep2, Grep3, each for at least 3 CPU seconds...
          Grep:  4 wallclock secs ( 3.00 usr +  0.02 sys =  3.02 CPU) @ 219.21/s (n=662)
         Grep2:  4 wallclock secs ( 3.14 usr +  0.01 sys =  3.15 CPU) @ 356475.56/s (n=1122898)
         Grep3:  4 wallclock secs ( 3.30 usr +  0.04 sys =  3.34 CPU) @ 213.47/s (n=713)
    
    shows that something is definitely up, the "inline sub" and "called sub" take the same time, but the "inline quoted" version is unbelievably faster. I think it has something to do with the inline quoted version not really working the way the original poster intended (I think its some sort of quoting/scoping problem, but I'm not sure exactly what).

    I have rerun my benchmarks with all the routines as "inline subs" and the results follow:
    dataset sizeGrepMaxTernary
    1002440.20/s2511.96/s2495.22/s
    1000207.32/s222.81/s216.31/s
    1000015.31/s16.77/s16.61/s
    And here is the code I used to run the test:

    my %url; my $now = 8; my @size=(100,1000,10000); foreach(@size){ my $size=$_; print "size=$size\n"; %url = ( monday => { @{[map(($_,1), (1..$size))]} } ); timethese(0, { Grep => sub {(sort grep {$_ <= $now} keys %{$url{"mond +ay"}})[-1];}, Ternary => sub {($now < $_ && $_ < 8 ? $_ : $now) for key +s %{$url{"monday"}};}, Max => sub {foreach ( keys %{$url{"monday"}} ) { $now + = $_ if $_ > $now };} }); undef %url; }
      Wondering why the original benchmark didn't work as it should, lhoward said:
      > (I think its some sort of quoting/scoping problem, > but I'm not sure exactly what).
      It's a scoping problem. The quoted code (Grep2) isn't accessing the same %url hash as the other two pieces of code, which you could tell if you had warnings on. You get a warning about a "Useless use of list slice in void context" for that code.

      Why aren't the others in void context? Grep is accessing the lexical %url because it's lexical to the file scope, and the Grep subroutine is defined in this file. And Grep3 (the third one) is accessing the same lexical %url because the anonymous sub acts as a closure (I think).

      So Grep2 is looking at a different (not defined) %url, and is thus not actually a very good test in terms of benchmarking. :)

      It should be noted that most of my tests used the new our package global in place of my. Well, it's a lexical alias to a package global, anyhow. Or, something close to that... ;)

      Also, all my code used -w and use strict;. I wouldn't code without them... because I've made that mistake and spent all day looking for a typo bug.

      Paris Sinclair    |    4a75737420416e6f74686572
      pariss@efn.org    |    205065726c204861636b6572
      I wear my Geek Code on my finger.
      
RE: Algorithm Efficiency vs. Specialized Hardware?
by ddeyoung (Novice) on Jun 21, 2000 at 02:16 UTC
    This is perl, version 5.005_03 built for i386-linux

    Benchmark: timing 1000000 iterations of Grep, Max, Ternary...

    Grep: 4 wallclock secs ( 3.63 usr + 0.00 sys = 3.63 CPU)

    Max: 9 wallclock secs ( 8.36 usr + 0.03 sys = 8.39 CPU)

    Ternary: 9 wallclock secs ( 8.43 usr + -0.01 sys = 8.42 CPU)

    Linux 2.2.12-20

    HP Kayak 450MHz P3

RE: Algorithm Efficiency vs. Specialized Hardware?
by ddeyoung (Novice) on Jun 21, 2000 at 02:48 UTC
    ADDED Double sorry! I didn't realize I could go bask and modify my posts.

    Really cool!

    Sorry, didn't realize I needed tags...

    Benchmark: timing 1000000 iterations of Grep, Max, Ternary...

    Grep: 4 wallclock secs ( 3.61 usr + 0.01 sys = 3.62 CPU)

    Max: 9 wallclock secs ( 8.47 usr + -0.01 sys = 8.46 CPU)

    Ternary: 9 wallclock secs ( 8.51 usr + 0.02 sys = 8.53 CPU)

    This is perl, version 5.005_03 built for i386-linux

    Linux 2.2.12-20

    P3 450MHz

    HP Kayak

      Here ay go:

      Benchmark: timing 1000000 iterations of Grep, Max, Ternary... Grep: 6 wallclock secs (5.63 usr + 0.00 sys = 5.63 CPU) @ 177619.8 +9/s (n=1000000) Max: 9 wallclock secs (9.34 usr + 0.00 sys = 9.34 CPU) @ 107066.3 +8/s (n=1000000) Ternary: 10 wallclock secs (9.61 usr + 0.00 sys = 9.61 CPU) @ 104058.2 +7/s (n=1000000)

      SysInfo + Perl

      System Configuration: Sun Microsystems sun4u Sun Ultra 5/10 UPA/PCI +(UltraSPARC-IIi 400MHz) System clock frequency: 100 MHz Memory size: 256 Megabytes ========================= CPUs ========================= Run Ecache CPU CPU Brd CPU Module MHz MB Impl. Mask --- --- ------- ----- ------ ------ ---- 0 0 0 400 2.0 12 9.1 ========================= IO Cards ========================= Bus# Freq Brd Type MHz Slot Name Model --- ---- ---- ---- -------------------------------- ------------- +--------- 0 PCI-1 33 1 ebus 0 PCI-1 33 1 network-SUNW,hme 0 PCI-1 33 2 SUNW,m64B ATY,GT-C 0 PCI-1 33 3 ide-pci1095,646 0 PCI-2 33 2 scsi-glm Symbios,53C87 +5 0 PCI-2 33 2 scsi-glm Symbios,53C87 +5 0 PCI-2 33 3 pci108e,5043-pci108e,5043 No failures found in System =========================== ========================= HW Revisions ========================= ASIC Revisions: --------------- Cheerio: ebus Rev 1 System PROM revisions: ---------------------- OBP 3.25.3 2000/06/29 14:12 POST 3.1.0 2000/06/27 13:56 [jconner@kwan ~]$ perl -v This is perl, v5.6.0 built for sun4-solaris </clip> [jconner@kwan ~]$ uname -a SunOS kwan 5.7 Generic_106541-12 sun4u sparc SUNW,Ultra-5_10

      ----------
      - Jim

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://18944]
Approved by root
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others browsing the Monastery: (8)
As of 2014-08-30 21:41 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The best computer themed movie is:











    Results (294 votes), past polls