Testing a number for oddness

by Falkkin (Chaplain)
on Jan 23, 2001 at 03:29 UTC ( [id://53618] )

Falkkin has asked for the wisdom of the Perl Monks concerning the following question:

Earlier today, in the CB, I talked with arturo about how to find out whether a given integer is even or odd.

The obvious way to do this involves something like: $odd = $num % 2;.

On the other hand, my CS professor claims that's a slow way of finding out the oddness of a number, and suggests something along the lines of: $odd = $num & 1; instead. This makes sense, because one would assume that a logical AND takes up less CPU time than a DIV instruction.
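
(For the record, the two expressions do agree on ordinary non-negative integers; a quick throwaway sanity check, separate from the benchmark below:)

for my $n (0 .. 20, 12345, 2**20 + 1) {
    die "mismatch at $n" unless ($n % 2) == ($n & 1);
}
print "both tests agree on these values\n";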

I decided to test this hypothesis using perl's Benchmark module, and was surprised at the results:

Here's the code:

use Benchmark;

my $t1 = timeit(1_000_000, \&odd_1);
my $t2 = timeit(1_000_000, \&odd_2);
print "odd_1: ", timestr($t1), "\n";
print "odd_2: ", timestr($t2), "\n";

sub odd_1 {
    my $num = rand(2**31);
    return $num % 2;
}

sub odd_2 {
    my $num = rand(2**31);
    return $num & 1;
}
And here, the results:
[falkkin@shadow ~/perl] perl odd.pl
odd_1:  6 wallclock secs ( 6.29 usr + -0.00 sys =  6.29 CPU) @ 158982.51/s (n=1000000)
odd_2: 10 wallclock secs ( 9.36 usr + -0.01 sys =  9.35 CPU) @ 106951.87/s (n=1000000)
It appears that the "slow" method of computing this is actually faster, but I have no idea why. I was wondering if any monks here who know the internals of perl could provide any insight on this.

The only causes I can think of are:
- My CS teacher's claim is simply wrong; perhaps she didn't account for newer hardware that is optimized for multiplication and division (although this script is being run on a Pentium I with MMX, which is hardly cutting-edge any more...)
- Perhaps perl's compiler optimizes $num % 2 into "return the last bit of $num"? (A quick way to peek at this is sketched below.)
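
One rough way to check that second guess, using B::Terse (which ships with perl; exact op names and formatting may vary between versions) to dump the op tree for each expression:

perl -MO=Terse -e '$odd = $num % 2'   # is this still a "modulo" op, or has it become a bit op?
perl -MO=Terse -e '$odd = $num & 1'   # for comparison, this one should show a "bit_and" op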

Thoughts, anyone?

Re: Testing a number for oddness
by Adam (Vicar) on Jan 23, 2001 at 03:52 UTC
    Your results imply a significant savings using % instead of &, which made me wonder about them. Thank you for posting your code, I ran it myself (on a dual P3-650 running NT4 with Perl 5.6) and got:
odd_1: 2 wallclock secs ( 1.70 usr +  0.00 sys =  1.70 CPU) @ 586854.46/s (n=1000000)
odd_2: 3 wallclock secs ( 1.67 usr +  0.00 sys =  1.67 CPU) @ 598086.12/s (n=1000000)
    This implies that the two are roughly equivalent, which is what I expected. I ran a second test with this code (results follow):
#!perl -w
use strict;
use Benchmark;

timethese( 1_000_000, {
    odd_1 => sub { my $num = rand(2**31); return $num % 2 },
    odd_2 => sub { my $num = rand(2**31); return $num & 1 },
});

__END__
Benchmark: timing 1000000 iterations of odd_1, odd_2...
     odd_1:  3 wallclock secs ( 1.69 usr +  0.00 sys =  1.69 CPU) @ 592417.06/s (n=1000000)
     odd_2:  3 wallclock secs ( 1.69 usr +  0.00 sys =  1.69 CPU) @ 592417.06/s (n=1000000)
    Ok, I lied. I ran it 5 times, and that result only happened once. But the other 4 runs were also very close. Perhaps there is an optimization running, since %2 would look like "grab the bit shifted off when doing >>1". And I would be surprised if no one (from the authors of Perl on down to the folks at intel) took the time to optimize this rather common operation.

    Update:
    So I was thinking about this some more, and thought, "I wonder what the deparser has to say about this." Turns out, nothing. Sigh.

>perl -MO=Deparse -we "$n=1231234;print $n % 2"
$n = 1231234;
print $n % 2;
-e syntax OK
>perl -MO=Deparse -we "$n=1231234;print $n & 1"
$n = 1231234;
print $n & 1;
-e syntax OK
>perl -MO=Deparse,-p -we "$n=1231234;print $n % 2"
($n = 1231234);
print(($n % 2));
-e syntax OK
>perl -MO=Deparse,-p -we "$n=1231234;print $n & 1"
($n = 1231234);
print(($n & 1));
-e syntax OK
Re: Testing a number for oddness
by runrig (Abbot) on Jan 23, 2001 at 04:30 UTC
    Whenever running a benchmark, you should, as much as possible, take out things that you are not comparing and use functions/operations that are as low cost as possible compared to what you are comparing. That call to rand() is slowing things down a lot, and if you remove it, you get a better comparison:
#!/usr/local/bin/perl -w
use strict;
use Benchmark;

my $i = 0;
timethese(200000, {
    RAND_AND => \&and_rand,
    RAND_MOD => \&mod_rand,
    TWO_AND  => \&two_and,
    TWO_MOD  => \&two_mod,
});

sub and_rand { my $num = rand(2**31); return $num & 1; }
sub mod_rand { my $num = rand(2**31); return $num % 2; }
sub two_mod  { $i++ % 2; }
sub two_and  { $i++ & 1; }

~/tst>./tst
Benchmark: timing 200000 iterations of RAND_AND, RAND_MOD, TWO_AND, TWO_MOD...
RAND_AND:  1 wallclock secs ( 1.12 usr +  0.00 sys =  1.12 CPU) @ 178571.43/s (n=200000)
RAND_MOD:  0 wallclock secs ( 1.10 usr +  0.00 sys =  1.10 CPU) @ 181818.18/s (n=200000)
 TWO_AND:  0 wallclock secs ( 0.36 usr + -0.01 sys =  0.35 CPU) @ 571428.57/s (n=200000)
           (warning: too few iterations for a reliable count)
 TWO_MOD:  2 wallclock secs ( 0.42 usr + -0.00 sys =  0.42 CPU) @ 476190.48/s (n=200000)
~/tst>
    Update: Even the ++ in the other two subroutines is a sort of unwanted operator. If I take it out and set $i to a constant, the AND method is almost twice as fast as the MOD method (ok, on second checking, maybe that "twice as fast" was a fluke; it's more like 30-40% faster consistently).
      On constants, the AND may be getting optimized away, and in checking I find that it is...
# perl -MO=Deparse,-p -we "print 15 & 1"
print(1);
-e syntax OK
# perl -MO=Deparse,-p -we "print 16 & 1"
print(0);
-e syntax OK

      --
      $you = new YOU;
      honk() if $you->love(perl)

        It turns out that almost all operators in Perl get optimized away in this manner. 'Constant folding', as it's called, is a very common optimization.
% perl -MO=Deparse
print 15 + 1, 15 - 1, 15 * 1, 15 % 1, 15 & 1, 15 ^ 1, 15 << 1, !15,
      15 || $x, 15 . 1, 15 < 1, 15 <=> 1, 15 ? $x : $y;
__END__
print 16, 14, 15, 0, 1, 14, 30, 0, 15, '151', 0, 1, $x;
        In fact, the only operator I've found that could have this optimization but doesn't is x.
        That's why I set '$i' to a constant (my $i=15), then AND'ed it. I checked, and THAT won't get optimized away.
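
        A quick way to double-check that (a rough sketch; exact Deparse formatting varies a little between perl versions):

perl -MO=Deparse -e 'my $i = 15; print $i & 1'
# the deparsed output still contains "$i & 1" -- with a lexical in the
# expression there is no constant to fold, so the op survives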
Re: Testing a number for oddness
by Fastolfe (Vicar) on Jan 23, 2001 at 05:02 UTC
    Update: chipmunk notes that I was using % 1 instead of % 2, which was a stupid mistake on my part. I've updated the code and benchmarks, but the results do not differ significantly.

    Because I'm bored and in love with Inline:

use Benchmark 'timethese';
use Inline C => <<'EoF';
int c_and(int num) { return(num & 1); }
int c_mod(int num) { return(num % 2); }
EoF

my $num = 5;
sub perl_mod { $_[0] % 2 }
sub perl_and { $_[0] & 1 }

timethese(10_000_000, {
    perl_mod => sub { perl_mod($num) },
    perl_and => sub { perl_and($num) },
    c_mod    => sub { c_mod($num) },
    c_and    => sub { c_and($num) },
});

__END__
   c_and:  7 wallclock secs ( 7.09 usr +  0.05 sys =  7.14 CPU) @ 1400560.22/s (n=10000000)
   c_mod:  8 wallclock secs ( 7.44 usr + -0.01 sys =  7.43 CPU) @ 1345895.02/s (n=10000000)
perl_and: 17 wallclock secs (16.63 usr + -0.01 sys = 16.62 CPU) @ 601684.72/s (n=10000000)
perl_mod: 16 wallclock secs (16.65 usr +  0.01 sys = 16.66 CPU) @ 600240.10/s (n=10000000)
    For all intents and purposes, it seems like both methods are equally fast, at least as far as any Perl implementation is concerned, and at least on my architecture (i686-linux, P3/850). A few microseconds isn't going to make a difference in any real-world application, and Perl itself may have quirks that cause one implementation to be slower or faster than another. It may very well be that a 100% C solution to all of this will show the expected results (with & being marginally faster), but it doesn't look like you're going to get those results in Perl.

    Additionally, you guys are relying on Benchmark results with way too few iterations. If you want to compare and contrast algorithms and are getting Benchmark results in the 0-2 second range, you need to increase your repetitions by at least an order of magnitude.
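
    (One way to avoid guessing at a count, assuming your Benchmark.pm is recent enough to support it: pass a negative number and each snippet is run for at least that many CPU seconds.)

use Benchmark 'timethese';

my $num = 5;

# A negative count means "run each piece of code for at least 10 CPU seconds".
timethese(-10, {
    perl_mod => sub { $num % 2 },
    perl_and => sub { $num & 1 },
});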

(tye)Re: Testing a number for oddness
by tye (Sage) on Jan 23, 2001 at 04:46 UTC

    $x & 1 will probably have to convert the double-precision floating point value stored in $x into a "long int" before it can do the fast bit-wise "and". This conversion may well take longer than the simple floating-point division (which will have the aid of floating-point acceleration hardware) that $x % 2 must do.
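
    (A rough way to watch that conversion happen, using Devel::Peek, which comes with perl; the exact flags shown will differ between perl versions:)

use Devel::Peek;

my $x = rand(2**31);   # stored internally as a floating-point NV
Dump($x);              # flags show NOK; no integer slot in use yet
my $bit = $x & 1;      # the bitwise op has to derive an integer from the NV first
Dump($x);              # on most perls the scalar now also carries a cached integer value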

    This probably explains some of the benchmark results.

    To me, a much worse problem than the misguided "optimization" of such a simple operation is that $x & 1 will only work for integral values that fit within 32 bits (on most platforms). However, $x % 2 will work correctly on integral values of approximately 57 bits (on most platforms). Using % 2 will probably make your code handle a much larger range of values correctly.
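
    (A small illustration of the range point; the & result below depends on whether your perl has 64-bit integers:)

my $x = 2**40 + 2;     # an even integer that does not fit in 32 bits

print $x % 2, "\n";    # 0 -- the modulo is done in floating point, so it is exact here
print $x & 1, "\n";    # 0 on a perl with 64-bit integers, but on a plain 32-bit perl
                       # the value gets squeezed into 32 bits first and the answer
                       # comes out wrong (1 instead of 0)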

            - tye (but my friends call me "Tye")
Re: Testing a number for oddness
by Maclir (Curate) on Jan 23, 2001 at 04:31 UTC
    OK, I was struck by the significant difference Falkkin obtained. I ran the same script on my development box here (a Sun Ultra 2, with a 300MHz UltraSPARC CPU and 128 MB of RAM), and these were my results:
odd_1: 9 wallclock secs ( 8.55 usr +  0.00 sys =  8.55 CPU)
odd_2: 9 wallclock secs ( 9.46 usr +  0.00 sys =  9.46 CPU)
    The divide is about 10% faster than the AND. Why is this so? I personally would have thought a bitwise AND - a basic opcode in just about any instruction set - would be faster than a division.

    Update: I followed runrig's advice and took out the rand call. The results were different:

odd_1: 7 wallclock secs ( 6.41 usr +  0.00 sys =  6.41 CPU)
odd_2: 7 wallclock secs ( 6.00 usr +  0.00 sys =  6.00 CPU)
    Now the division takes longer - by about 7%. Strange stuff.
Re: Testing a number for oddness
by lhoward (Vicar) on Jan 23, 2001 at 04:45 UTC
    One thing you have to remember is that these operations are, at the perl layer, not being done on an integer (or anything else that is native to the underlying architecture) but on a perl scalar value. I bet that if you ran the same tests in C (or assembler, or any language that is "right on the hardware"), and you were sure that modulus wasn't part of the CPU's instruction set (or produced as a side-effect of the div operation), the results would be much more like what you expected.
Re: Testing a number for oddness
by mr.nick (Chaplain) on Jan 23, 2001 at 20:05 UTC
    It's kind of funny: runrig even mentioned that when benchmarking you should exclude operations that don't concern the thing you are trying to test, yet every single example calls its perl_and/perl_mod subroutine to perform a single operation. Don't you think the function call from timethese/timeit into perl_and/perl_mod is itself taking time?

    So I did it this way:

use Benchmark 'timethese';

my $num = 5;

sub perl_mod { for my $z (1 .. 10_000) { $_[0] % 2; } }
sub perl_and { for my $z (1 .. 10_000) { $_[0] & 1; } }

timethese(10_000, {
    perl_mod => sub { perl_mod($num) },
    perl_and => sub { perl_and($num) },
});

    and got the following results:

nicholas(/5)@neko [106 /<1>nicholas/tmp] > ./time
Benchmark: timing 10000 iterations of perl_and, perl_mod...
perl_and: 210 wallclock secs (209.32 usr + 0.79 sys = 210.11 CPU)
perl_mod: 260 wallclock secs (258.62 usr + 0.93 sys = 259.55 CPU)
nicholas(/5)@neko [107 /<1>nicholas/tmp] >

    Which shows the perl_and to be around 20% faster than perl_mod: about what I would expect.
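
    (As an aside, and only a rough sketch: Benchmark will also accept strings of code, which it evals straight into its timing loop, so there is no per-iteration sub call at all. The fully-qualified global is just so the eval'd strings can see the variable regardless of what package they are compiled in.)

use Benchmark 'timethese';

# A package global, fully qualified so the eval'd code strings can reach it.
$main::num = 12345;

# Code passed as strings is eval'd inside Benchmark's own timing loop,
# so no extra sub call is timed on each iteration.
timethese(1_000_000, {
    str_mod => '$main::num % 2',
    str_and => '$main::num & 1',
});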

    Then, I did a pure C test:

#include <stdio.h>
#include <stdlib.h>

void c_and(int i) {
    int a;
    for (a = 0; a < 10000; a++) {
        i & 1;
    }
}

void c_mod(int i) {
    int a;
    for (a = 0; a < 10000; a++) {
        i % 2;
    }
}

int main(void) {
    int a;

    /* the two loops were timed separately: one commented out per run */
    for (a = 0; a < 100000; a++) {
        c_and(a);
    }
    //for (a = 0; a < 100000; a++) {
    //    c_mod(a);
    //}

    exit(0);
}
    and ran it separately for each of the types (I was using /bin/time to test):

    For c_and: 29.800u 0.040s 0:29.84 100.0% 0+0k 0+0io 69pf+0w
    and
    for c_mod: 29.540u 0.390s 0:29.99 99.7% 0+0k 0+0io 69pf+0w

    Which again shows the AND to be faster, though nowhere near as much of a difference as with perl (I presume gcc is optimizing).

    test platform: dual PII-233 with a load avg of around 7 :)

      gcc is OBVIOUSLY optimized. % is always craploads slower than &. On a modern processor, a multiply, add, subtract, bitshift, etc., can be done in one cycle (assuming all operands are already loaded into registers), whereas a divide could take 10-50 or even more cycles.

      To get gcc to not optimize anything away do something like this:

#include <stdio.h>
#include <stdlib.h>

volatile int yb;
int foo;

void blah(int t) {
    foo = (t == 0) ? 1 : 2;
}

int main(int argc, char **argv) {
    int x = 0xdeadbeef;
    volatile int *y = &yb;
    int i;
    int t = atoi(argv[1]);

    blah(t);

    if (t == 0) {
        printf("Using the & method\n");
        for (i = 0; i < 100000000; ++i)
            *y = x & foo;
    } else {
        printf("Using the %% method\n");
        for (i = 0; i < 100000000; ++i)
            *y = x % foo;
    }
    return 0;
}
      NOTES: x is initialized with a non-zero value. On x86 I have noticed sometimes ALU/Mult operations on zero-valued inputs can be significantly faster than on non-zero inputs.

      y is a volatile pointer to prevent gcc from optimizing the loops away, if it was so inclined.

      'foo' is not a constant to prevent gcc from optimizing an odd/even test.

      'foo' is initialized in a non-static function to prevent gcc from (if it could/would do this) optimizing the loops into an odd/even test anyway.

      We do 100 million iterations, to drown out any noise.

      Running on an unloaded 933Mhz P-III:

% time ./foo 0
Using the & method
0.77user 0.01system 0:00.80elapsed 96%CPU (blah blah)
% time ./foo 1
Using the % method
4.29user 0.00system 0:04.43elapsed 96%CPU (blah blah)

      I ran the tests about 10 times each and the results are about the same.

      Now if you are just doing an odd/even test, gcc will likely be able to optimize it for you to the point that you can use whichever method you want and it will be fast. If you're programming on a DSP, however, or maybe in a super-high-speed driver or something, you'd want to make sure you use the fast version, because the compiler in that case likely can't help you (or you don't want it to).

      And in Perl anyway, you are operating many levels above the underlying machine architecture. % vs & is the least of your worries speed-wise.

      See the Benchmark docs: it factors out the null loop, so even looping within your own subroutine adds operations that you'd rather not time (for '&' and '%' anyway; if you are timing more expensive operations/algorithms then it is not significant). See this:
#!/usr/local/bin/perl -w
use strict;
use Benchmark;

my $i = 34;
timethese(3000000, {
    AND       => \&and_them,
    MOD       => \&mod_them,
    NULL_LOOP => \&null_loop,
});

sub null_loop { }
sub and_them  { $i & 1; }
sub mod_them  { $i % 2; }

[rover:DEV]~/tst>./tst
Benchmark: timing 3000000 iterations of AND, MOD, NULL_LOOP...
      AND:  1 wallclock secs ( 1.68 usr +  0.00 sys =  1.68 CPU) @ 1785714.29/s (n=3000000)
      MOD:  2 wallclock secs ( 2.33 usr +  0.00 sys =  2.33 CPU) @ 1287553.65/s (n=3000000)
NULL_LOOP: -1 wallclock secs (-0.07 usr +  0.00 sys = -0.07 CPU) @ -42857142.86/s (n=3000000)
           (warning: too few iterations for a reliable count)
      (I realize that in the end it really doesn't matter, I'd probably use '%' anyway, and this is all just for the sake of discussion/fun/curiosity.)
