PerlMonks  

curious regex result for perl 5.8.8

by erodrig (Initiate)
on Jun 20, 2016 at 21:56 UTC

erodrig has asked for the wisdom of the Perl Monks concerning the following question:

Consider the following program which just does a regex on two array entries:

@files=("zzz.21.yy.ccc", "zzz.220.ccc" );
foreach $name (@files) {
    chomp $name;
    $match="no ";
    $match="yes" if ( $name =~ /(^[a-z]{3})\.(\d{2,3})\..*\.ccc/) ;
    print "$match, $name, match1: $1, match2: $2\n";
}

When I run this on Linux perl 5.8.8 $2 does not seem correct for the second name:

Since the regex match fails for the second name, I would expect $1 and $2 to keep the values from the first match, but $2 mysteriously becomes "21." (with a trailing dot). Can this be a problem with perl 5.8.8?

See below:

perl issueWithRegex.pl
yes, zzz.21.yy.ccc, match1: zzz, match2: 21
no , zzz.220.ccc, match1: zzz, match2: 21.

With perl 5.10.1 this seems to run as I would expect it to:

perl issueWithRegex.pl
yes, zzz.21.yy.ccc, match1: zzz, match2: 21
no , zzz.220.ccc, match1: zzz, match2: 21

Thanks for any comments.

Replies are listed 'Best First'.
Re: curious regex result for perl 5.8.8
by GrandFather (Saint) on Jun 20, 2016 at 22:53 UTC

    If there is no match the content of the capture variables is bogus. The fact that their content is a different value of bogus between different versions of Perl is not so important as the fact that the value is not defined when there is no match. Your code would be better written:

    use strict;
    use warnings;

    my @files = ("zzz.21.yy.ccc", "zzz.220.ccc");

    foreach my $name (@files) {
        chomp $name;

        if ($name =~ /(^[a-z]{3})\.(\d{2,3})\..*\.ccc/) {
            print "yes, $name, match1: $1, match2: $2\n";
        }
        else {
            print "no, $name\n";
        }
    }

    Prints:

    yes, zzz.21.yy.ccc, match1: zzz, match2: 21
    no, zzz.220.ccc
    Premature optimization is the root of all job security
      "If there is no match the content of the capture variables is bogus"
      That's not strictly accurate. The capture variables reflect the last successful match, where that match is dynamically scoped. The 5.8.x behaviour was a bug, fixed in 5.10.0 by commit c74340f9cdee.

      Dave.
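The scoping rule described above can be sketched with a small script. This is only an illustration under assumptions: the sample strings and the `@seen` array are made up for the example and do not come from the thread, and the behaviour shown is what perlvar documents for current perls, not the buggy 5.8.x behaviour.

```perl
use strict;
use warnings;

my @seen;    # hypothetical collector, so the results can be inspected

if ("zzz.21" =~ /^([a-z]{3})\.(\d+)/) {
    push @seen, "outer before: $1 $2";

    {
        # A successful match inside an inner block sets $1,
        # but only until that block exits.
        "abc" =~ /(b)/ or die "inner match unexpectedly failed";
        push @seen, "inner: $1";
    }

    # Back in the outer block, $1 and $2 refer to the outer match again.
    push @seen, "outer after: $1 $2";

    # A failed match does not reset the capture variables.
    my $ok = "no digits here" =~ /(\d+)/;
    push @seen, "after failed match: $1 $2" unless $ok;
}

print "$_\n" for @seen;
```

The last line pushed is the case the original question ran into: the match fails, yet `$1` and `$2` still hold the values from the last successful match in the enclosing scope.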

        "That's not strictly accurate."

        Oh yes it is, probably! If there was no match then it's easy to argue that there should be no capture. If there was no capture the contents of the capture variables should be undef. The current "last capture" behavior then can easily and accurately be described as bogus. :-)

        I can understand that there are compelling (probably historical compatibility) reasons for the current behavior. In a big picture sense that doesn't make the behavior less bogus.

        Besides, bogus is a fun word so I like to use it - what's bogus with that?

        Premature optimization is the root of all job security
Re: curious regex result for perl 5.8.8
by Cristoforo (Curate) on Jun 20, 2016 at 22:52 UTC
    From the docs for perlre
    NOTE: Failed matches in Perl do not reset the match variables . . .

    You can see this more clearly with a slightly modified version of your code where the leading characters are 'xxx' instead of 'zzz'.

    my @files=("zzz.21.yy.ccc", "xxx.220.ccc" );
    foreach my $name (@files) {
        my $match="no ";
        $match="yes" if ( $name =~ /(^[a-z]{3})\.(\d{2,3})\..*\.ccc/) ;
        print "$match, $name, match1: $1, match2: $2\n";
    }
    The variables $1 and $2 retain the values they had on the previous successful match.
    yes, zzz.21.yy.ccc, match1: zzz, match2: 21
    no , xxx.220.ccc, match1: zzz, match2: 21
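One way to avoid reading stale values at all is to take the captures from the match's return value in list context instead of from $1 and $2. The following is a minimal sketch, not something proposed in the thread: the names `$prefix`, `$number` and `@report` are made up for illustration, and the `^` anchor is moved outside the first group, which does not change what the pattern matches.

```perl
use strict;
use warnings;

my @files = ("zzz.21.yy.ccc", "xxx.220.ccc");
my @report;    # hypothetical collector for the output lines

foreach my $name (@files) {
    # In list context a successful match returns the captured groups;
    # a failed match returns an empty list, so both variables stay undef.
    my ($prefix, $number) = $name =~ /^([a-z]{3})\.(\d{2,3})\..*\.ccc/;

    if (defined $prefix) {
        push @report, "yes, $name, match1: $prefix, match2: $number";
    }
    else {
        push @report, "no, $name";
    }
}

print "$_\n" for @report;
```

Because the lexicals are declared fresh on every iteration, nothing from an earlier successful match can leak into a later failed one.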
Re: curious regex result for perl 5.8.8
by Anonymous Monk on Jun 20, 2016 at 22:38 UTC

    What happens when you explicitly specify /s or /m (as in m//s or m//m)?

    What do you observe as the difference between those two old perl versions under use re 'debug'; or rxrx?

    Can this be a problem with perl 5.8.8?

    If you have to ask the answer is no :)
