PerlMonks  

Odd...

by Ovid (Cardinal)
on Sep 19, 2000 at 23:23 UTC (#33183, perlmeditation)

From the "learn something new every day" department: Going through some of my company's source code, I stumbled upon a weird method of using a regex:
    #!/usr/bin/perl -w
    use strict;
    my $test = 'aba';
    print "Good\n" if $test =~ ab;
Yup, it prints "Good" with no warnings, no problem with a bareword (ActiveState 5.005 and 5.6). Where the heck is this documented? I've never seen a "bare" regex like that. I wonder if there are any problems with this, so long as it's a simple regex ($test =~ ^aba$; fails, for example)?

Cheers,
Ovid

Update: After a fair amount of discussion in the chatterbox, it was agreed that this is definitely a bug for two reasons:

  • The behavior is inconsistent amongst different versions of Perl and on different operating systems.
  • strict should catch an error on the bareword.
Because of this, and because no one could find this documented either as a feature or a bug, a bug report was submitted.

Join the Perlmonks Setiathome Group or just go to the link and check out our stats.

RE: Odd...
by merlyn (Sage) on Sep 19, 2000 at 23:25 UTC
    It's just a normal bareword:
    $x = 'ab'; $x = ab;
    Oh, I see what you're saying. If expecting a regex, you can use a bareword and it gets promoted first to a string, then to a regex, all without tripping use strict! I don't think that's documented.

    -- Randal L. Schwartz, Perl hacker
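
The second half of that promotion (string to regex) is ordinary, documented behavior: perlop says that when the right-hand side of =~ is an expression rather than a pattern, it is interpreted as a search pattern at run time. A minimal sketch of just that step, using a quoted string instead of a bareword; the helper name matches_string_pattern is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: returns 1 if $text matches the
# pattern stored in the plain scalar $pattern, otherwise 0.
sub matches_string_pattern {
    my ($text, $pattern) = @_;
    # The right-hand side is an ordinary scalar, so Perl treats its
    # contents as a regex when the match runs.
    return $text =~ $pattern ? 1 : 0;
}

print "Good\n" if matches_string_pattern('aba', 'ab');
print "Anchored pattern did not match\n"
    unless matches_string_pattern('aba', '^ab$');
```

This says nothing about why the bareword slips past strict; it only shows that once a string is in hand, the match itself behaves as documented.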

      Can't you just use any character (here the space) to delimit a regexp?

        This code actually does not pass -w and use strict on my Solaris/Perl 5.6 combo.

        Are you sure ab is not pre-compiled or some similar trickery?

        I think you can only use alternative delimiters if you use the complete form of the regex like:  $a =~ m AreA;

        And then spaces aren't allowed anymore since 5.002 or somesuch. In fact in the above the space is REQUIRED between the "m" and the "AreA" portions or Ovid's bug kicks in and it doesn't parse correctly!

        use strict;
        my $a = "freak";
        my $b = "freakmAreA";
        print "Yay!\n" if $a=~m AreA;
        print "Boo!\n" if $a=~mAreA;
        print "BUG!\n" if $b=~mAreA;

        The above prints "Yay!" and "BUG!" in 5.6 under Linux. The "bare" word promotion is happening even when it shouldn't IMHO. Although reading `perldoc perlop` leads me to believe that it should not have printed "Yay!" either:

        If "/" is the delimiter then the initial m is optional. With the m you can use any pair of non-alphanumeric, non-whitespace characters as delimiters. This is particularly useful for matching path names that contain "/", to avoid LTS (leaning toothpick syndrome). If "?" is the delimiter, then the match-only-once rule of ?PATTERN? applies. If "'" is the delimiter, no interpolation is performed on the PATTERN.

        ... since really ...

        $c =~ s eieioeio;
        #or
        $d =~ s ei$oeio$ioeieo;

        ... radiates pure evil. =)

        --
        $you = new YOU;
        honk() if $you->love(perl)
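
For contrast with the letter-delimited forms above, here is a small sketch of the delimiter choices the quoted perlop passage does describe. The path and strings are made up for illustration:

```perl
use strict;
use warnings;

my $path = '/usr/local/bin/perl';

# The same match three ways: the default slashes need escaping,
# while the m forms with other delimiters do not.
my $slashes = $path =~ /\/usr\/local\// ? 1 : 0;
my $braces  = $path =~ m{/usr/local/}   ? 1 : 0;
my $hashes  = $path =~ m#/usr/local/#   ? 1 : 0;

# With ' as the delimiter nothing is interpolated, so the @ below is
# matched literally instead of being read as an array.
my $literal = 'user@example' =~ m'@example' ? 1 : 0;

print "slashes=$slashes braces=$braces hashes=$hashes literal=$literal\n";
```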

(Ovid - benchmarking a bare regex) RE: Odd...
by Ovid (Cardinal) on Sep 20, 2000 at 00:23 UTC
    mirod said:
    Can't you just use any character (here the space) to delimit a regexp?
    That's not happening here:
    print "Good\n" if $test =~ ab;
    There is no space to the right of the regex. If anything, it would be a word boundary as the delimiter (\b in a regex). This appears to be, as merlyn mentioned, a bareword being promoted to a string and then to a regex.

    mirod said:

    This code actually does not pass -w and use strict on my Solaris/Perl 5.6 combo.

    Are you sure ab is not pre-compiled or some similar trickery?

    Nope, it's not precompiled. I just got home (I'm home early due to being sick as a dog. Don't lick my posts, I wouldn't want you to get ill) and tried it on my Win98 box and again, it runs fine.

    Naturally, I'm curious as to the performance aspects, so I decided to benchmark this.

    #!/usr/bin/perl -w
    use strict;
    use Benchmark;

    my $test;
    #print "Good\n" if $test =~ ab;

    timethese(-15, {
        bare      => '$test = "aba"; $test =~ ab',
        delimited => '$test = "aba"; $test =~ /ab/'
    });
    Whups! All of a sudden, I get a whole slew of messages like the following:
    Unquoted string "ab" may clash with future reserved word at (eval 1) line 1.
    Somehow, eval is catching the problem. By removing the -w switch, the code ran fine (but still works with strict, go figure).

    Incidentally, the benchmark results showed no significant performance impact:

    Benchmark: running bare, delimited, each for at least 15 CPU seconds...
          bare: 13 wallclock secs (15.01 usr + 0.00 sys = 15.01 CPU) @ 457834.84/s (n=6872101)
     delimited: 16 wallclock secs (15.00 usr + 0.00 sys = 15.00 CPU) @ 458124.53/s (n=6871868)
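
Benchmark also accepts code refs instead of strings, which keeps the timed code compiled under the script's own strict and warnings settings rather than inside an eval. A rough sketch of the same comparison in that style; it uses a quoted string pattern in place of the bareword (an assumption, since the bareword form is the suspected bug) and a fixed iteration count chosen only to keep the run short:

```perl
use strict;
use warnings;
use Benchmark qw(timethese timestr);

my $test = 'aba';

# Code refs are compiled with the rest of this file, so strict and
# warnings apply to them exactly as they do to the surrounding code.
my $results = timethese(100_000, {
    string_pattern => sub { $test =~ 'ab' },
    delimited      => sub { $test =~ /ab/ },
}, 'none');    # 'none' suppresses Benchmark's own report

# Print one timing line per case.
for my $name (sort keys %$results) {
    printf "%-15s %s\n", $name, timestr($results->{$name});
}
```

The numbers from a run this short are too noisy to compare; a negative count such as the -15 used above gives Benchmark a minimum CPU time instead.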
    I want to explore this more and see just how complicated of a regex I can make here, but I need to get some sleep.

    Cheers,
    Ovid

