PerlMonks  

notabug quiz

by ysth (Canon)
on Dec 12, 2003 at 06:46 UTC ( [id://314238]=perlmeditation )

This is a quiz. Below are 6 examples of perl "bugs" that have been reported to perl5-porters in the last 3 weeks and determined to be not bugs. One reporter, (I believe) the only perlmonk among them, had already figured out the difficulty and was providing a documentation patch to clarify things. Of the others, four are somewhat documented.

Your mission, should you choose to accept it, is to find the documentation (if any) that supports perl's actual behaviour. Or at least figure out why perl is doing what it is doing. Or at least have some fun and maybe learn something new.

use strict;
use warnings;
use vars '$x';

sub ok ($$) {
    my ($testnum, $check) = @_;
    if ($check) { print "ok $testnum\n"; }
    else { print "not ok $testnum\n" }
}

$x=0;
{ no strict 'refs'; my $x=42; my $y = 'x'; ok 1, ${$y} == 42; }

ok 2, eval 'no warnings; sub Foo::INIT { 42 } &Foo::INIT();';

$x="ad";
for ($x) { /a/gc; /\Gb?/gc; ok 3, /\Gc?/gc; }

ok 4, eval ' "(R)" =~ m(\(?r\)?)i ';

$x=1;
{ my $x=2; sub x {eval '$x'} }
{ my $x=3; ok 5, x; }

ok 6, 17.98 == 17.99 - .01;
Update: reordered the tests (before any replies)
Update: verbosify ok() sub

Replies are listed 'Best First'.
Re: notabug quiz
by liz (Monsignor) on Dec 12, 2003 at 08:42 UTC
    sub Foo::INIT { print "Hello world\n" }
    &Foo::INIT;

    I think I'm responsible for this one, more or less. For a while I was thinking that BEGIN, CHECK, INIT and END were actually subroutines. Well, they aren't. They're "magic" code blocks that just happen to allow a "sub" prefix to confuse the hell out of everybody. That is what the obfuscation "In the BEGINning" is based on.

    Note that you can actually call subroutines named BEGIN, CHECK, INIT or END, but you need a special action to get them defined:

    *Foo::INIT = sub { print "Hello world\n" };
    &Foo::INIT;
    __END__
    Hello world

    Liz

        Those magical codeblocks are detectable via caller EXPR though. This is a point in favour of subs in the "codeblocks vs. subs" dilemma.
        --kap
Re: notabug quiz
by davido (Cardinal) on Dec 12, 2003 at 08:40 UTC
    Here is a spoiler for #4:

    ok 4, eval ' "(R)" =~ m(\(?r\)?)i ';

    First, capture the error that's occurring by printing the error contained in $@:

    Sequence (?r...) not recognized in regex; marked by <-- HERE in m/(?r <-- HERE /

    From that we see what perl (the executable) finds to be a problem: (?....) is a trigger mechanism for extended parenthesis patterns. For example, (?:...) represents a non-capturing paren. (?=...) represents a zero-width lookahead assertion. But there is no such thing as (?r...). perl doesn't know how to compile that, and generates a compile-time error. You can verify that fact by moving the regexp in question outside the eval block. The code never gets past the compile stage if you do.

    But why is it that the "\(" escaped parens are not staying escaped? Ah, that's the mystery. If you change the RE from m(.....) to m/...../, the problem goes away. Double mystery? Or clue perhaps? At first I dug into perlre but quickly realized that the answer had something to do with the choice of parentheses as quote-like delimiters, and re-focused the investigation on the Quote and Quote-Like Operators section of perlop.

    There we can read the following:

    Gory details of parsing quoted constructs

    Finding the end
    The first pass is finding the end of the quoted construct, whether it be a multicharacter delimiter "\nEOF\n" in the <<EOF construct, a / that terminates a qq// construct, a ] which terminates a qq[] construct, or a > which terminates a fileglob started with <.

    When searching for single-character non-pairing delimiters such as /, combinations of \\ and \/ are skipped. However, when searching for a single-character pairing delimiter like [, combinations of \\, \], and \[ are skipped, and nested [, ] are skipped as well.
    .....
    Removal of backslashes before delimiters
    During the second pass, text between the starting and ending delimiters is copied to a safe location, and the \ is removed from combinations consisting of \ and delimiter--or delimiters, meaning both starting and ending delimiters, should these differ. This removal does not happen for multi-character delimiters. Note that the combination \\ is left intact, just as it was.

    So what is actually happening is that when using m(...) (pairing delimiters) the \( and \) are, in the first pass, skipped as you would expect, and the actual beginning and end of the quoted material (the regex) are found properly.

    But in the second pass, the \ backslash is stripped away from the inner parens, as documented. The result is that the inner parens are no longer escaped. When passed to the regular expression compiler, what the compiler receives is: m/(?r)?/i. And in particular, as we saw before, (?r) is a syntax that the compiler has no idea what to do with. Thus endeth the story.
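A minimal sketch of how one might check this by hand, assuming only core Perl. The variable names are illustrative and not from the quiz. The paren-delimited result is only reported, not assumed, because the outcome depends on how a given perl handles backslashed bracketing delimiters; the slash and brace forms should match either way, since there \( and \) are not "backslash plus delimiter".

```perl
use strict;
use warnings;

# Test 4's pattern, compiled at run time inside a string eval so that a
# regex compile error lands in $@ instead of aborting the script.
my $paren_result = eval q{ "(R)" =~ m(\(?r\)?)i };
my $paren_error  = $@;

# Same pattern with delimiters that are not parentheses: the backslashes
# in front of ( and ) are left alone by the quote-parsing passes.
my $slash_result = "(R)" =~ m/\(?r\)?/i;
my $brace_result = "(R)" =~ m{\(?r\)?}i;

if ($paren_error) {
    print "m(...) : error: $paren_error";
}
else {
    print "m(...) : ", ($paren_result ? "matched" : "no match"), "\n";
}
print "m/.../ : ", ($slash_result ? "matched" : "no match"), "\n";
print "m{...} : ", ($brace_result ? "matched" : "no match"), "\n";
```

Picking a delimiter that does not appear in the pattern sidesteps the question entirely, which is probably the most practical takeaway.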


    Dave

Re: notabug quiz
by Zaxo (Archbishop) on Dec 12, 2003 at 08:59 UTC

    #6 is the common and unavoidable "binary does decimal" floating point error. Documented in perlfaq4.
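A small sketch of the usual workaround, assuming a tolerance-based comparison is acceptable for the values in question. The helper name nearly_equal and its default tolerance are made up for illustration; they are not from perlfaq4. Whether the exact == comparison succeeds depends on the platform, so the sketch prints it rather than assuming a result.

```perl
use strict;
use warnings;

# Hypothetical helper: treat two numbers as equal when they differ by
# less than a tolerance (the default here is an arbitrary choice).
sub nearly_equal {
    my ($left, $right, $tolerance) = @_;
    $tolerance = 1e-9 unless defined $tolerance;
    return abs($left - $right) < $tolerance;
}

my $difference = 17.99 - .01;

# %.20g exposes digits that the default print formatting hides.
printf "17.98       = %.20g\n", 17.98;
printf "17.99 - .01 = %.20g\n", $difference;

print "exact ==     : ", (17.98 == $difference ? "equal" : "not equal"), "\n";
print "nearly_equal : ", (nearly_equal(17.98, $difference) ? "equal" : "not equal"), "\n";
```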

    After Compline,
    Zaxo

      Curiously, perl 5.8.2 on RH9 passes this test. For me, 17.98 == 17.99 - 0.01 :-)

      ------
      We are the carpenters and bricklayers of the Information Age.

      Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

Re: notabug quiz
by davido (Cardinal) on Dec 12, 2003 at 09:17 UTC
    Here is the spoiler for #1:

    $x=0;
    { no strict 'refs'; my $x=42; my $y = 'x'; ok 1, ${$y} == 42; }

    This is an example of using symbolic references (which is why you had to put the no strict 'refs' part in.)

    And to keep things straight, remember that the $x=0 piece of code is referring to a package global, whereas the my $x=42; piece of code is creating a lexical scalar. The lexical is a different entity than the package global. Within the {...} block, the global $x is pretty much hidden by the lexical $x of the same name. And when everyone plays nice and avoids symbolic references, that's usually the behavior we see and expect; lexical scoping masking variables of the same name from broader-scoped blocks, and the broader-scoped lexicals (or globals) being protected from whatever the more narrowly scoped entities of the same name are doing.

    But when you play with symbolic refs, you have to remember that the symbolic reference always refers to the package global, which in the example, equals 0, not 42.

    This behavior is documented in perlref:

    Only package variables (globals, even if localized) are visible to symbolic references. Lexical variables (declared with my()) aren't in a symbol table, and thus are invisible to this mechanism.

    And the POD even provides a similar example:

    local $value = 10;
    $ref = "value";
    {
        my $value = 20;
        print $$ref;
    }
    __OUTPUT__
    10
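The same point as a strict-clean sketch that puts both lookups side by side. It uses our where the quiz uses use vars, and the names $via_symref and $via_lexical are mine.

```perl
use strict;
use warnings;

our $x = 0;    # package variable $main::x, lives in the symbol table

my ($via_symref, $via_lexical);
{
    no strict 'refs';
    my $x    = 42;    # lexical; not in any symbol table
    my $name = 'x';

    $via_symref  = ${$name};    # looked up by name, so finds $main::x
    $via_lexical = $x;          # ordinary access, finds the lexical
}

print "symbolic reference sees: $via_symref\n";     # 0
print "lexical access sees:     $via_lexical\n";    # 42
```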


    Dave

      There is also a quip about this in perlfaq7: "How can I access a dynamic variable while a similarly named lexical is in scope?"

      -enlil

      And while more elaborate, the same applies, of course, to test #5. Ugh, I misread the test - too used to Test::More semantics.

      Makeshifts last the longest.

Re: notabug quiz
by davido (Cardinal) on Dec 12, 2003 at 09:56 UTC
    Here is the spoiler for #5:

    $x=1;
    { my $x=2; sub x {eval '$x'} }
    { my $x=3; ok 5, x }

    In the POD for eval we see that expressions (contained in quotes) are compiled and evaluated at run time, whereas code blocks are put to bed at compile time.

    So walk through what's going on here....

    The "my $x=2" is a lexically scoped scalar, contained in the same lexical block as the eval. eval '$x'; refers to that $x, because the sub in which the eval exists is defined inside the same lexical scope as that particular $x. If we weren't using 'eval', it would be a no-brainer that the $x that the sub sees is the one that is equal to 2.

    The my $x=3 exists within a different lexical block, and has nothing to do with x(), because x() was compiled in a lexical block where $x=3 couldn't be seen.

    What gets confusing is the issue of reference counting, and runtime compilation. When the block in which $x=2 is defined ends, the eval '$x' hasn't yet been evaluated. Thus, when the block ends, the reference count for that $x drops to zero, it falls out of scope, and disappears. Then later on, x() is called, and the eval is executed. It tries to access the $x that has already disappeared from existence, and ends up with an undefined value.

    At first I had to wonder why the package global version of $x wasn't being seen from within the eval instead. But that's easy; it was masked by the lexical $x=2 at the time the sub was compiled.

    This code essentially represents a closure, but as ysth later commented, "Perl doesn't realize it's a closure until it's too late." (thanks for the puzzles, ysth).
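A sketch of one way to make perl "realize" in time, under the assumption that simply mentioning the variable in ordinary code is enough for a named sub to capture it. The sub names x_late and x_captured and the throwaway $keep are hypothetical. Newer perls may also emit a closure warning along the lines of "Variable is not available" for the first sub.

```perl
use strict;
use warnings;

{
    my $x = 2;

    # Nothing in this body mentions $x at compile time, so the sub does
    # not capture it; the string is only compiled when x_late() runs,
    # by which point the block's $x is gone.
    sub x_late { return eval '$x' }

    # Mentioning $x in ordinary code makes the sub capture it at compile
    # time, so the later string eval finds the captured variable.
    sub x_captured { my $keep = $x; return eval '$x' }
}

my $late     = x_late();
my $captured = x_captured();

print "x_late:     ", (defined $late     ? $late     : 'undef'), "\n";
print "x_captured: ", (defined $captured ? $captured : 'undef'), "\n";
```

The only difference between the two subs is whether $x is visible to the compiler when the sub body is compiled, which is the distinction the explanation above turns on.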


    Dave

Re: notabug quiz (Spoiler for #3)
by Enlil (Parson) on Dec 12, 2003 at 10:05 UTC

Node Type: perlmeditation [id://314238]
Approved by Enlil
Front-paged by davido