http://www.perlmonks.org?node_id=935922


in reply to Re^2: code optimization
in thread code optimization

Yes, but I think the C version would still be faster by a constant factor that doesn't depend on the input size. That's because Perl carries overhead from the data structures it uses, from its memory management (reference counting), and so forth. There's also the silly version of comparing a/b < c/d: multiply both sides by bd (for positive b and d) and you get ad < cb. Now, are two multiplications faster than a division, even if the multiplication is carried out with Karatsuba's algorithm (the article says that "Karatsuba is usually faster when the multiplicands are longer than 320–640 bits" and also gives its complexity) or with linear time multiplication? I'm just wondering what the cost of a normal division is relative to the cost of two multiplications.
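To make the cross-multiplication idea concrete, here is a minimal C sketch for machine-sized integers only. The function name frac_less is made up for illustration, and it assumes both denominators are positive (a negative denominator would flip the inequality). The products are widened to 64 bits so they cannot overflow for 32-bit inputs. It says nothing about big-number arithmetic, where the Karatsuba question actually matters.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if a/b < c/d, else 0.
   Assumes b > 0 and d > 0, so multiplying both sides by b*d
   keeps the direction of the inequality. */
int frac_less(int32_t a, int32_t b, int32_t c, int32_t d)
{
    assert(b > 0 && d > 0);

    /* Widen before multiplying: a 32-bit by 32-bit product fits in 64 bits. */
    int64_t lhs = (int64_t)a * d;   /* a*d */
    int64_t rhs = (int64_t)c * b;   /* c*b */

    return lhs < rhs;
}
```

For example, frac_less(1, 3, 1, 2) compares 1*2 against 1*3 and returns 1. No division is performed, and the comparison is exact, which a floating-point a/b < c/d is not guaranteed to be for large operands.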

Re^4: code optimization
by BrowserUk (Patriarch) on Nov 04, 2011 at 13:15 UTC

    I'm not 100% certain, and a quick search failed to confirm it, but I'm pretty sure that on Intel's recent processors (the last five or so years), both 32-bit and 64-bit, integer division and integer multiplication take the same number of clock cycles. I'll knock up a quick test to verify that, though.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      I tried the following test for division.
      perl -MTime::HiRes=gettimeofday,tv_interval -e '$istart=1000;$iend=5000; $t0=[gettimeofday]; for $x($istart..$iend){for $y($istart..$iend){ $x/$y }}; $t1=[gettimeofday]; print "lasted->".tv_interval($t0,$t1)."\n"'
      I did the same thing for multiplication. Over several runs for division I got these times:
      • 1.032998
      • 1.029043
      • 1.043523
      • 1.059354
      • 1.034561
      • 1.072301
      • 1.10864
      • 1.034843
      Over several runs for multiplication I got these times:
      • 1.075402
      • 1.093403
      • 1.089273
      • 1.077661
      • 1.074203
      • 1.091646
      • 1.080421

      These numbers are seconds.

      I couldn't draw any conclusions from this...

        I did a similar thing using C:

        #include <stdio.h>
        #include <conio.h>     /* getch() */
        #include <windows.h>   /* GetTickCount64() */

        #define ITERS 1000000000ul

        int main( int argc, char **argv ) {
            __int64 start;
            int i;
            double d;

            getch();   /* wait for a keypress before timing */

            if( argc > 1 ) {   /* any argument selects the division test */
                printf( "%lu integer divisions: ", ITERS );
                start = GetTickCount64();
                for( i = 1; i < ITERS; i++ ) d = 1 / i;
                printf( "Took %I64d ticks\n", GetTickCount64() - start );
            }
            else {             /* no argument: multiplication test */
                printf( "%lu integer multiplications: ", ITERS );
                start = GetTickCount64();
                for( i = 1; i < ITERS; i++ ) d = 1 * i;
                printf( "Took %I64d ticks\n", GetTickCount64() - start );
            }
            return 0;
        }

        And on my 64-bit processor, for 32-bit ints I got:

        C:\test>muldiv-b 1
        1000000000 integer divisions: Took 3432 ticks

        C:\test>muldiv-b
        1000000000 integer multiplications: Took 2917 ticks

        The numbers vary by about ±30 ticks between individual runs, but division is always ~10% slower than multiplication. I put this down to the subsequent promotion of the result to a double rather than to the opcode itself.

        Conversely, if I use 64-bit ints, division is almost 7x slower than multiplication:

        C:\test>muldiv-b
        1000000000 integer multiplications: Took 3011 ticks

        C:\test>muldiv-b 1
        1000000000 integer divisions: Took 20764 ticks

        This, I think, is due to the fact that two 64-bit registers are involved in the result.
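        One way to take the int-to-double promotion out of the picture would be to keep everything in integers and actually use the results, so the compiler cannot drop the loop. The following is only a sketch of that idea, not the code that produced the timings in this post: the names div_sum32, div_sum64 and seconds_since are invented for illustration, and it uses the portable clock() (CPU time at coarse resolution) rather than GetTickCount64().

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Sum of numerator / i for i = 1..n using 32-bit unsigned division.
   The volatile copy stops the compiler from folding the divisions away;
   the unsigned sum simply wraps around if it overflows. */
uint32_t div_sum32(uint32_t numerator, uint32_t n)
{
    assert(n >= 1);
    volatile uint32_t num = numerator;
    uint32_t sum = 0;
    for (uint32_t i = 1; i <= n; i++)
        sum += num / i;
    return sum;
}

/* The same loop with 64-bit operands, for comparison. */
uint64_t div_sum64(uint64_t numerator, uint64_t n)
{
    assert(n >= 1);
    volatile uint64_t num = numerator;
    uint64_t sum = 0;
    for (uint64_t i = 1; i <= n; i++)
        sum += num / i;
    return sum;
}

/* Seconds of CPU time elapsed since 'start', as reported by clock(). */
double seconds_since(clock_t start)
{
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

        To use it, take start = clock(), call one of the sum functions with a large n, print the returned sum together with seconds_since(start), then repeat for the other width. Printing the sum matters: it keeps the loop's result live, so an optimizing compiler has to perform the divisions.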

