Beefy Boxes and Bandwidth Generously Provided by pair Networks
laziness, impatience, and hubris

Re^5: Behold! The power of recursion.

by tachyon (Chancellor)
on Oct 18, 2004 at 22:19 UTC ( #400334=note: print w/replies, xml ) Need Help??

in reply to Re^4: Behold! The power of recursion.
in thread Behold! The power of recursion.

Benchmark cannot "know" what my hypothetical "good" compiler would do.

And? It shows what *any* non hypothetical, can compile code now, real world useful compiler can do. Out in the real world hypothetical compilers producing theoretical code are less that totally useful. The results presented were for CL.EXE /O2 on a Win2K AMD K7 system. FWIW GCC does a significantly worse job of optimising the recursive code.

By all means compile it with anything you like on any platform and present any evidence that supports your hypothesis.

[root@devel3 root]# ./ Benchmark: timing 1000000 iterations of iterative_c, iterative_p, recu +rsive_c, recursive_p... iterative_c: 1 wallclock secs ( 0.42 usr + 0.00 sys = 0.42 CPU) @ 2 +380952.38/s (n=1000000) iterative_p: 20 wallclock secs (18.34 usr + 0.00 sys = 18.34 CPU) @ 5 +4525.63/s (n=1000000) recursive_c: 1 wallclock secs ( 0.60 usr + 0.00 sys = 0.60 CPU) @ 1 +666666.67/s (n=1000000) recursive_p: 38 wallclock secs (37.75 usr + 0.00 sys = 37.75 CPU) @ 2 +6490.07/s (n=1000000) Rate recursive_p iterative_p recursive_c iterative_c recursive_p 26490/s -- -51% -98% -99% iterative_p 54526/s 106% -- -97% -98% recursive_c 1666667/s 6192% 2957% -- -30% iterative_c 2380952/s 8888% 4267% 43% -- [root@devel3 root]# gcc -v Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs gcc version 2.96 20000731 (Red Hat Linux 7.3 2.96-110)



Replies are listed 'Best First'.
Re^6: Behold! The power of recursion.
by jordanh (Chaplain) on Oct 19, 2004 at 01:03 UTC
    I don't have inline installed on the Perl on the machine I'm typing this on, so I took the liberty of doing a pure C benchmark. I had to admit that I was, at first, surprised at the result, since I had heard that gcc does tail call elimination, then I found this article that explained why gcc was failing to perform this optimization. Subsequently, I changed guess_rec_c to read:
    int guess_rec_c( int ans, int lower, int higher ) { int guess; guess = (int)((lower + higher)/2); if (ans == guess) return guess; if ( guess > ans ) { return guess_rec_c( ans, lower, guess -1 ); } else { return guess_rec_c( ans, guess +1, higher ); } }
    ...and using:
    gcc -v
    Reading specs from /usr/lib/gcc-lib/i586-suse-linux/3.3.3/specs
    Configured with: ../configure --enable-threads=posix 
    --prefix=/usr --with-local-prefix=/usr/local 
    --infodir=/usr/share/info --mandir=/usr/share/man 
    --disable-checking --libdir=/usr/lib --enable-libgcj 
    --with-gxx-include-dir=/usr/include/g++ --with-slibdir=/lib 
    --with-system-zlib --enable-shared --enable-__cxa_atexit 
    Thread model: posix
    gcc version 3.3.3 (SuSE Linux)
    with the compiler optimization -O3, the recursive version was slightly faster than the iterative version:
    jordan@linux-jordan:~> more guess.c int guess_it_c( int ans, int lower, int higher ) { int guess; for(;;) { guess = (int)((lower + higher)/2); if ( ans == guess ) break; if ( guess > ans ) higher = guess -1; else lower = guess +1; } return guess; } int guess_rec_c( int ans, int lower, int higher ) { int guess; guess = (int)((lower + higher)/2); if (ans == guess) return guess; if ( guess > ans ) { return guess_rec_c( ans, lower, guess -1 ); } else { return guess_rec_c( ans, guess +1, higher ); } } int main() { int n; for (n=0; n<1000000; n++) VERSION(4999,1,10000); } jordan@linux-jordan:~> gcc -Wall -DVERSION=guess_it_c -O3 -o guess_it +guess.c jordan@linux-jordan:~> gcc -Wall -DVERSION=guess_rec_c -O3 -o guess_re +c guess.c jordan@linux-jordan:~> time ./guess_it real 0m0.291s user 0m0.285s sys 0m0.003s jordan@linux-jordan:~> time ./guess_rec real 0m0.284s user 0m0.276s sys 0m0.004s
    I think this demonstrates that the recursive version of a program can be as efficient as an iterative solution. You might point to the fact that I changed the code, but the code is now arguably more "correct" as the -Wall check would complain about the routines declared to return int and reaching the end without a return statement. The fact that gcc did not recognize the equivalence of the routine call and the routine call preceded by the return is a quality of implementation issue. Just as I said, there's no reason why the recursive version couldn't be compiled to yield an equivalent program. I am disappointed that more compilers don't automatically offer this obvious and simple optimization.

    Updated: Some formatting

      The reason for the optimisation failure is interesting, thanks very much for the link.



Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://400334]
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others about the Monastery: (4)
As of 2018-01-20 14:17 GMT
Find Nodes?
    Voting Booth?
    How did you see in the new year?

    Results (226 votes). Check out past polls.