I've just found an extremely obscure bug in perl. This is the tale of that bug - where it came from, and how it was found. I hope you find it entertaining.


Yesterday, blakem posted a question about while loops. A while loop in Perl looks like this:
while (CONDITION) BLOCK
or like this:
STATEMENT while CONDITION;

The body of the loop (the STATEMENT or BLOCK) is repeatedly executed as long as the condition is true. In Perl, a scalar is false if it is equal, as a string, to either "" or "0". Otherwise it's true. (The undefined value undef has a string value of "", so undef is false.)
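To make the truth rule concrete, here is a small sketch that runs a few sample values through a boolean test. The helper name is_true is made up for illustration; it is not a Perl builtin.

```perl
use strict;
use warnings;

# Hypothetical helper (not a builtin): 1 if Perl treats the value as true, else 0.
sub is_true {
    my ($value) = @_;
    return $value ? 1 : 0;
}

# Label / value pairs covering the cases discussed in the text.
my @samples = (
    [ 'undef'          => undef ],
    [ 'empty string'   => ""    ],
    [ 'string "0"'     => "0"   ],
    [ 'string "0\n"'   => "0\n" ],
    [ 'string "\n"'    => "\n"  ],
    [ 'string "00"'    => "00"  ],
    [ 'string "0.0"'   => "0.0" ],
);

for my $sample (@samples) {
    my ($label, $value) = @$sample;
    printf "%-14s is %s\n", $label, is_true($value) ? 'true' : 'false';
}
```

Only undef, "" and "0" come out false here; strings such as "00" or "0.0" are true because the comparison is made on the string, not on its numeric value.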

But this being Perl, that's not the end of it. Overloading causes one complication, but that's a story for another day. When you're processing a text file, you will often use a loop like this:

while (my $line = <FILE>) {
    # Do something with the $line
}

The problem might come if the file has a line which is blank, or just "0". Ordinarily that wouldn't matter, because the line would be terminated with "\n". So the value of $line would be "\n" or "0\n", which are both true. But what if it's the last line of the file, and it isn't terminated with a newline? Or what if you used the -l command-line switch to perl, which tells it to remove the line terminator automatically? We have to think about these things. Then $line could be "" or "0", the loop condition would be false, and the loop would terminate before the file had been completely read.

Or will it? Actually, no. Perl is one step ahead, as usual. If Perl sees a loop like the one above, then it will secretly transform it into:

while (defined(my $line = <FILE>)) {
    # Do something with the $line
}
and all is well. Your program works the way that you intended, and peace reigns in the Monastery. You can see this transformation by using the B::Deparse module:

$ perl -MO=Deparse -e 'while(my $line=<FILE>) { print; }'
while (defined(my $line = <FILE>)) {
    print $_;
}
-e syntax OK
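To see why the inserted defined(...) matters, the sketch below reads the same data twice: once with a hand-written truth test that perl has no reason to rewrite, and once with the ordinary while loop. The data is an in-memory file whose last line is "0" with no trailing newline. The sub names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: count lines using a plain truth test on each line.
sub count_lines_truth {
    my ($text) = @_;
    open my $fh, '<', \$text or die "open: $!";
    my $count = 0;
    for (;;) {
        my $line = <$fh>;
        last unless $line;      # stops early on a line that is just "0"
        $count++;
    }
    return $count;
}

# Hypothetical helper: count lines with the usual idiom, which perl
# compiles as while (defined(my $line = <$fh>)).
sub count_lines_magic {
    my ($text) = @_;
    open my $fh, '<', \$text or die "open: $!";
    my $count = 0;
    while (my $line = <$fh>) {
        $count++;
    }
    return $count;
}

my $data = "first\nsecond\n0";  # last line is "0", no newline
print "truth test: ", count_lines_truth($data), " lines\n";
print "while loop: ", count_lines_magic($data), " lines\n";
```

The truth-test version misses the final line, while the while loop sees all three.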

There is one other kind of loop which gets the same magic treatment, and that's if you use the glob operator. The glob operator returns the names of all the files which match a particular wildcard pattern: for example, you can print the names of all the Perl modules in the current directory like this:

while (my $module = glob("*.pm")) {
    print "Found a module: $module\n";
}
Now, you might have a file called "0", which would cause the same problem as before. So Perl uses the magic defined test here as well.
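The file-called-"0" case can be tried out safely in a throwaway directory. This is only a sketch: the helper name names_matching and the file names are made up for illustration, and it relies on the core modules File::Temp and Cwd.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Hypothetical helper: collect the names matching $pattern inside $dir,
# using the scalar-context glob loop from the text.
sub names_matching {
    my ($dir, $pattern) = @_;
    my $old = getcwd();
    chdir $dir or die "chdir $dir: $!";
    my @found;
    while (my $name = glob($pattern)) {   # compiled with a defined(...) test
        push @found, $name;
    }
    chdir $old or die "chdir $old: $!";
    return @found;
}

# A scratch directory holding a file literally named "0".
my $scratch = tempdir(CLEANUP => 1);
for my $name ('0', 'a.pm') {
    open my $fh, '>', "$scratch/$name" or die "create $name: $!";
    close $fh;
}

print "Found: $_\n" for names_matching($scratch, '*');
```

The chdir is there so that glob returns the bare name "0" rather than a full path, which would be true anyway.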

The old bug

What blakem noticed yesterday is that this magic wasn't working properly in a while loop of the second kind (STATEMENT while CONDITION;). You can test it easily enough by running this code:
my $ok = 0;
$ok = 1 while my $zero = glob("0");
print $ok ? "ok\n" : "not ok\n";
I looked into the source code for perl, and I found a mistake in the code which was testing for that kind of condition. (The code was wrongly checking for a NULL opcode rather than a GLOB opcode.) I wrote up the correction as a patch, and sent it to the perl development mailing list, perl5-porters.

Hugo, one of the porters, replied to my message, saying:

Interesting - is there any situation in which that test could have succeeded, and so tested definedness when it should have tested truth?
This is starting to get confusing... Let me try and sum up the story up to this point:
  1. In some situations, perl magically puts a defined(...) test around the condition in a while loop.
  2. But the magic wasn't always working properly.
  3. The reason it wasn't working is that there was a mistake in the perl source code...
  4. ... meaning that a different condition was being checked for. (Instead of checking for a GLOB opcode, it was checking for a NULL opcode)
What Hugo asked was whether that condition could ever arise. The question surprised me: I hadn't even considered the possibility of it. So I thought a bit, and poked around a bit, and I realised that the answer was "yes". It can happen!

I won't go too deep into the details, because I don't want this post to turn into a book on the internals of perl. But I'll tell you what I found.

The new bug

I found that the broken condition is triggered by an assignment statement which has a logical operator on its right-hand side. For example,
$foo = $bar || $baz

If a statement like that is used as the condition of a while loop, then perl will (wrongly) insert a defined(...) test. For example,

my $x = 1;
die("Oh dear!") while my $foo = $x && 0;
will be silently turned into
my $x = 1;
die("Oh dear!") while defined(my $foo = $x && 0);
and so it will die, even though the condition is false! ($x && 0 evaluates to 0, which is false but still defined.)
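If you want to check a particular perl for this behaviour without it dying or looping forever, you can count how often the loop body runs. This is a minimal sketch rather than the test from my patch, and the sub name is made up for illustration. It declares $foo beforehand instead of inside the condition, which I assume makes no difference here. The body clears $x, so the loop stops after one pass even if a defined(...) test has been wrongly inserted.

```perl
use strict;
use warnings;

# Hypothetical probe: how many times does the body run when the
# condition is an assignment whose right-hand side is $x && 0?
sub iterations_for_false_assignment {
    my $x     = 1;
    my $count = 0;
    my $foo;
    # First test: $x && 0 is 0, which is false but defined.
    #   Correct perl: the body never runs.
    #   Buggy perl:   defined(0) is true, so the body runs once and clears $x;
    #                 the next test gives undef, which ends the loop.
    ($count++, $x = undef) while $foo = $x && 0;
    return $count;
}

my $count = iterations_for_false_assignment();
print $count == 0
    ? "ok: the condition was tested for truth\n"
    : "not ok: the condition was tested for definedness\n";
```

On a perl that includes the fix I would expect the count to be 0.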

I wonder if this is the most obscure bug found in perl to date.

Edit Masem 2001-12-18 - Fixed amp; entity to real & in last code blocks.

In reply to A most obscure bug by robin
