Re: Can you spot the problem?
by hossman (Prior) on Mar 05, 2004 at 06:09 UTC
|
- I'm always suspicious of code that uses Bitwise operators
- I'm suprised no one has pointed out any false positives, ie: 999.0.0.222
| [reply] |
Re: Can you spot the problem?
by Abigail-II (Bishop) on Mar 05, 2004 at 01:47 UTC
|
| [reply] |
Re: Can you spot the problem? (golf)
by tye (Sage) on Mar 05, 2004 at 02:41 UTC
|
| [reply] |
|
Me too. Question is, which way is neatest. Can it be done with one char? In fact, I'm fairly convinced it can't. I can come up with several two-char changes which make it work, but no one-chars.
| [reply] |
|
| [reply] |
|
Hey, you ran it in your head. Bad tye. :)
| [reply] |
Re: Can you spot the problem?
by thelenm (Vicar) on Mar 05, 2004 at 03:23 UTC
|
| [reply] |
Re: Can you spot the problem? (solution)
by dws (Chancellor) on Mar 05, 2004 at 23:09 UTC
|
It's no secret at this point that the "problem" is the ORing of strings instead of numbers.
This code fragment derives from discussion about an interview question that asked for a code fragment for determining if a dotted quad IP address was valid. One straightforward
solution was basically
if ( ($a,$b,$c,$d) = m/^(\d+)\.(\d+)\.(\d+)\.(\d+)$/ ) {
if ( $a < 256 && $b < 256 && $c < 256 && $d < 256 ) {
return 1;
}
}
return 0;
which works, though there are better ways to parse IPv4 addresses.
In the course of "improving" the solution,
someone reached into their bag of tricks and pulled out one often used in assembly language (and occassionaly in C): OR the numbers
together and compare once, thus trading off 3 tests and branches against less expensive boolean operations.
Unfortunately, Perl's automagic string to number conversion then doesn't happen,
and the "improved" expression ORs strings instead of numbers. But, it looks like it should work! And, in
one of those amusing quirks that happen now and then, you might not notice
the problem if you happened to pick a simple set of test data, such as this
one, which tests boundaries and then probes into the middle of the range.
0.0.0.0: "0" | "0" | "0" | "0" eq "0" OK
255.255.255.255: "255" | "255" | "255" | "255" eq "255" OK
127.0.0.1: "127" | "0" | "0" | "1" eq "127" OK
My solution was to force the conversion via
return 1 if (0+$a|0+$b|0+$c|0+$d) < 256;
which tye hinted at above (and pointed out privately) could more simply be written as
return 1 if (0|$a|$b|$c|$d) < 256;
| [reply] [d/l] [select] |
Re: Can you spot the problem?
by graff (Chancellor) on Mar 05, 2004 at 03:10 UTC
|
(sorry... I had to run the code :P -- but in my own defense, I only had to run it a few times to see the pattern.)
No cigar for bmann, I'm afraid -- 192.168.0.0 thru 192.168.0.16 all pass, but 192.0.0.2 does not, nor does 192.20.0.0. (update: -- having seen bmann's update -- no, 192.168.255.255 will pass, but, e.g. 192.1.2.2 will fail.)
The key to the problem involves paying attention to the first digit in each quad, particularly in the cases where at least one quad has three digits and at least one other has fewer digits. It does make sense, and it's quite an amusing trap. Thanks.
(another update: different digit counts is not an essential property -- 100.100.100.200 also fails.) | [reply] |
|
So you ran the code and gave hints. Bad dog. No cookie for you.
| [reply] |
|
You are partially right and partially wrong. You found a partial correlation between the two, but 213.244.213.213 will fail, but 213.243.213.213 will pass. And that's NOT using the first digit of every set. :)
| [reply] |
|
Thank you -- I did have a sense that I must have missed something (as I often do with puzzles like this). Now I don't feel so bad about not getting the cookie.
| [reply] |
|
| [reply] |
Re: Can you spot the problem?
by little (Curate) on Mar 05, 2004 at 11:35 UTC
|
this
if (($a|$b|$c|$d) < 256 ) {
return 1;
}
should be
if (($a < 256) && ($b < 256) && ($c < 256) && ($d < 256) ) {
return 1;
}
Have a nice day
All decision is left to your taste
| [reply] [d/l] [select] |
|
if ( ($a,$b,$c,$d) = $ip =~ m/^(\d+)\.(\d+)\.(\d+)\.(\d+)$/ ) {
to
if ( ($a,$b,$c,$d) = map $_+0,$ip =~ m/^(\d+)\.(\d+)\.(\d+)\.(\d+)$/ )
+ {
and the code will work as expected.
antirice The first rule of Perl club is - use Perl The ith rule of Perl club is - follow rule i - 1 for i > 1
| [reply] [d/l] [select] |
|
Exactly the opening post attempts to be clever, but instead it does something that doesn't even make sense... ($a|$b|$c|$d) is not any() from Quantum::Superpositions, it's an or operation on a series of values and then a conditional. Horribly bad form and it should have never been written that way in the first place.
( For the vast masses downvoting this, BTW, please explain why the OP post is exemplifying good software design... think about it... not writing maintainable working code and falling for these kind of traps is what gives Perl programmers a bad name .. and this is *not* helping. I don't mind taking the hit here, go ahead, please stand up for your right to write broken code and drag down Perl in your workplace, etc. )
| [reply] |
|
... the opening post attempts to be clever, but instead it does something that doesn't even make sense...
It make sense if you understand a lower-level idiom. To be fair, I left off the thread of discussion that led up to something close to that particular snippet.
The problem, at the level of that statement, is to determine if any one of the octet numbers exceeds 255. You can compare the numbers one by one, or you can take advantage of knowing that 255 == 0xFF, and that if any one of the octet numbers is larger than that, the OR of all the numbers will have bits set above 0xFF, and will hence be greater than 255 (and conversly, a valid quad will OR to less than 256). At a raw instruction level, doing three ORs and one test/branch is faster than doing four individual test/branches if the majority of the IP addresses are going to be valid. This works well in assembler, and in C (which is a limited number of steps above assembler). If you haven't played at that level, the idiom may seem strange. And, unless performance really, really matters, it's not a technique that I'd use in production Perl. But for purposes of the puzzle, it's great, since it's so misleading.
| [reply] |
|
|
| [reply] |
|
|
|
|
| [reply] [d/l] |
|
For the vast masses downvoting this, BTW, please explain why the OP post is exemplifying good software design...
S'funny, but no matter how may times I re-read the OP, I can't find the reference to "good software design". From my reading, it appears to be holding the code up as an example of (at best) a gotchya, and probably bad design, but mostly, as a puzzle. Something fun to try a solve by looking rather than debugging.
Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
| [reply] [d/l] |
|
( For the vast masses downvoting this, BTW, please explain why the OP post is exemplifying good software design... think about it... not writing maintainable working code and falling for these kind of traps is what gives Perl programmers a bad name .. and this is *not* helping. I don't mind taking the hit here, go ahead, please stand up for your right to write broken code and drag down Perl in your workplace, etc. )
If you ever used the reason why it works for other things, it's fairly obvious. I remember seeing &sub($a,$b), sub( $a, $b ) and sub $a, $b, which are all the same for the most part, but are different formats. It was really unclear when people nested paren'-less subs. But as I got more experienced, as a programmer.. as a perl developer.. things got clearer. So the mistake in the program made me wonder, ok, this doesn't work, i know WHY it shouldn't work, lemme do some examples on paper using the logical error.
But I don't think you got downvoted solely for your opinion, 'cause to some degree, you are right, it could be clearer, but then again, $a||='bob' is too, eh?
| [reply] |
Re: Can you spot the problem?
by adrianh (Chancellor) on Mar 05, 2004 at 11:51 UTC
|
200.050.006.000 - hopefully this won't count as a hint :-)
Nice. I've been caught by this odd corner of Perl before.
| [reply] |
Re: Can you spot the problem?
by duff (Parson) on Mar 05, 2004 at 16:26 UTC
|
I was going to do something clever like post the smallest valid IP address that illustrates the problem, but I decided that I'd rather point out a few well-known sites that have "invalid" IPs :-)
thinkgeek 66.35.250.160
perl.com 208.201.239.56
google 64.233.161.104
theonion 66.216.104.235
I was really hoping that perlmonks.com itself would be "invalid", but alas it is not so.
Oh, I noticed some golfing for the smallest fix. I can't come up with anything smaller than 2 characters. Unless of course, you count removing the entire sub and replacing it with Regexp::Common or something. The absolute character count would be smaller then I think :)
| [reply] |
Re: Can you spot the problem?
by tachyon (Chancellor) on Mar 05, 2004 at 23:15 UTC
|
sub isValidIP {
#234567890123456789012345
use Socket;inet_aton(pop)
}
| [reply] [d/l] |
|
use Socket;*isValidIP=\&inet_aton;
| [reply] [d/l] |
Re: Can you spot the problem?
by The Mad Hatter (Priest) on Mar 05, 2004 at 02:21 UTC
|
Root node updated. Well, for one, that second if needs curly braces. This isn't C after all. ; )
Just to nitpick with it, $a and $b really shouldn't be used as they're used for sort.
| [reply] |
Re: Can you spot the problem?
by perlinux (Deacon) on Mar 05, 2004 at 13:25 UTC
|
This is my regex for Ip addresses:
(([01]?\d\d?|2[0-4]\d|25[0-5])\.){3}([01]?\d\d?|2[0-4]\d|25[0-5])
| [reply] [d/l] |
|
Don't parse IP addresses with regexen in real code. The OP makes an amusing puzzle (one which I have admittedly not figured out yet), but nothing more. There are far more valid formats for IPv4 than the traditional dotted-quad one. I don't even want to think about all the formats available for IPv6.
Look at Valid IP? for more info.
----
: () { :|:& };:
Note: All code is untested, unless otherwise stated
| [reply] [d/l] |
|
| [reply] [d/l] |
|
|
Re: Can you spot the problem?
by dragonchild (Archbishop) on Mar 05, 2004 at 18:36 UTC
|
One cute thing is that all the false negatives (or false positives) are actually solution spaces. So, Abigail-II's solution of 200.0.0.1 actually is the solution space containing the elements 200, 0, 0, and 1. So, 1.200.0.0 also is a false negative (for the same reason). (The symmetry is caused by the bit-wise OR, not by the issue at hand.)
As for fixes ... I would think that this should be something Perl5 should handle through dwimmery (and something it seems that Perl6 will be handling more intelligently, at least according to TheDamian's discussion on sorting ...)
(And, if I get dinged on providing too much of a hint, at least I got someone interested in the differences between Perl5 and Perl6's sorting ... *grins*)
------
We are the carpenters and bricklayers of the Information Age.
Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.
| [reply] |
|
Well, depends on how you decide to fix it in Perl 6. The code as written will do superpositional logic, but it
will only tell you if one of the numbers is
less than 256. You'd have to use & instead of | to do a conjunction.
If, on the other hand, you try to keep the bitwise-OR semantics, Perl 6 will force you to pick between ~| and +|,
and if you pick the obvious one, it numerifies the string and works as intended. So I think Perl 6 is improving things in this area...er, as long as you don't forget and try to use | or & for bit operations. Doubtless that will be a cultural issue for a while...
| [reply] |
Re: Can you spot the problem?
by Abigail-II (Bishop) on Mar 09, 2004 at 13:02 UTC
|
I've been bitten by a similar issue once:
my $avg = 0;
if (my ($x, $y) = $str =~ m!(\d+)/(\d+)!) {
$avg = $x / $y if $y;
}
That died with a 'division by zero' error. Can you spot why?
Abigail | [reply] [d/l] |
|
| [reply] |
|
Yep, and that's why chanting "don't compare to 0" when someone writes $y == 0 isn't always appropriate. My code would have run fine if the test had been unless $y == 0.
Abigail
| [reply] |
Re: Can you spot the problem?
by bart (Canon) on Mar 05, 2004 at 21:56 UTC
|
Time's almost up...
The reason is because ($a, $b, $c, $d) are all strings. That means "|" works on strings, bytewise: "189" | "2" yields "389". That is unlike 189 | 2 which returns 191 — where the operands are treated as integers.
Just converting one string into a number is enough to force "|" into numerical mode. | [reply] [d/l] [select] |
Re: Can you spot the problem?
by bmann (Priest) on Mar 05, 2004 at 02:11 UTC
|
Among others, the whole network 192.168.0.0/16...
Update In other words, any ip in the entire range from 192.168.0.0-192.168.255.255
Update2 Nope
| [reply] |
Re: Can you spot the problem?
by ambrus (Abbot) on Mar 05, 2004 at 16:38 UTC
|
| [reply] |
Re: Can you spot the problem?
by Nkuvu (Priest) on Mar 05, 2004 at 14:10 UTC
|
I have to admit that trying to solve puzzles like this pre-coffee are not the best idea. I looked at this for a while, and saw the problem. But I wasn't sure I saw the problem, so I started running the code.
Even knowing the problem, I couldn't find a good test case to make this fail until I looked at some of the other notes -- I'd become convinced that I was wrong about the problem. But my initial guess was correct.
So no cookie for me. But I'll settle for my coffee instead.
| [reply] |
Re: Can you spot the problem?
by diotalevi (Canon) on Mar 05, 2004 at 22:00 UTC
|
$\ = "\n";
print chr() . ", " . unpack( "B*", chr ) for ord( "0" ) .. ord( "9" );
print;
print chr() . ", " . unpack( "B*", chr ) for ord( "2" | "8" ) .. ord(
+"7" | "8" );
__DATA__
0, 00110000
1, 00110001
2, 00110010
3, 00110011
4, 00110100
5, 00110101
6, 00110110
7, 00110111
8, 00111000
9, 00111001
:, 00111010
;, 00111011
<, 00111100
=, 00111101
>, 00111110
?, 00111111
| [reply] [d/l] [select] |
Re: Can you spot the problem?
by ambrus (Abbot) on Mar 06, 2004 at 20:13 UTC
|
| [reply] [d/l] |
Re: Can you spot the problem?
by ambrus (Abbot) on Mar 05, 2004 at 18:59 UTC
|
I wonder weather this code could possibly work in perl6,
where the | operator will mean something
else (see Exegesis 6).
| [reply] |
|
Works if you change the all the |s to &s. (Plus you can omit the parens if you're golfing, since junctions now bind tighter than comparisons. Well, okay, you can omit the parens even if you're not golfing.)
| [reply] |
|
| [reply] |
|
|