http://www.perlmonks.org?node_id=513406


in reply to Re: Speed/Efficiency tweaks for a fannkuch benchmark script? (300%)
in thread Speed/Efficiency tweaks for a fannkuch benchmark script?

That's great. I'm surprised it makes such a massive difference. I wonder whether the bottleneck in the original code was the @q = @$copy, because that's the only explanation I can think of for why this change gives such a big speed improvement.
  • Comment on Re^2: Speed/Efficiency tweaks for a fannkuch benchmark script? (300%)
  • Download Code

Replies are listed 'Best First'.
Re^3: Speed/Efficiency tweaks for a fannkuch benchmark script? (300%)
by BrowserUk (Patriarch) on Dec 01, 2005 at 23:47 UTC

    Far and away, by an order of magnitude, the most computationally expensive line is

    count wall-time cpu-time line# 22169434 157.7315 326.3050 26: @q[ 0 .. $k-1 ] = reverse @q[ 0 .. $k-1 ];

    Which is no surprise really since it is run an order of magnitude more times than any other line, but is also innocuously doing a lot of operations.

    1. Generating two lists;
    2. using those to slice across two arrays to produce two more lists (a & b);
    3. one of those lists (b) is then inverted to produce another list (c);
    4. and finally that list (c) is assigned the first list (a) (via aliases?) to complete the swap.

    Slices are a great notational convenience and what VHLLs are all about, but they do hide a deal of complexity.

    The other expensive lines in order of cost are:

    4500244 18.0823 53.2090 39: @copy[ $index - 1, $index ] = @copy[ $index, $index - 1 ] 4037913 10.6528 42.0950 22: if ( $copy[0] != 1 and $copy[-1] != @copy ) { 3265920 11.2764 37.4510 32: push @max_sequence, join '', @copy, "\n" 3265920 12.2326 37.4380 23: my @q = @copy;

    For comparison, here are the same lines from profiling the string version:

    22169434 96.3426 273.7840 28: substr( $q, 0, $k ) = reverse substr( $q, 0, $k ); 4500244 20.3415 55.0800 41: substr( $copy, $index - 1, 2 ) = reverse substr( $copy, $index + -1, 2 ); 4037913 11.3571 43.1890 22: if( ord( $copy ) != 1 3265920 8.6338 35.9450 25: my $q = $copy; 3265920 9.1315 34.8530 34: push @max_sequence, $copy

    They make it easy to see where the saving came from.

    I do love Devel::SmallProf. Line-by-line profiling is so much more useful that function-by-function. Of course, it does take an inordinately long time to run, hence the delay on my responding while I waited for the second profile to complete.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.