Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Ah, can someone please explain why the hell only "trigger1" triggers in the code below? It's driving me nuts. It's the exact same code, for God's sake!
my $title = q(the 2000);
if ($title =~ /^the ( *?)\d\d\d\d$/ig) {
    print "\ntrigger1";    # PRINTS
}
if ($title =~ /^the ( *?)\d\d\d\d$/ig) {
    print "\ntrigger2";    # NO PRINT
}


Replies are listed 'Best First'.
Re: regexp mystery no xxx
by Zaxo (Archbishop) on Aug 18, 2003 at 21:18 UTC

    The /g switch is causing pos to be set to the end of the $title string. Omit /g on the first test and this behaves as you think it should.
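
    A minimal sketch of what pos is doing here, assuming the same $title and pattern as in the question. The variable names ($first, $pos_after_1, and so on) are only for illustration:

```perl
use strict;
use warnings;

my $title = q(the 2000);

# First /g match in scalar context succeeds and records where it stopped.
my $first       = ($title =~ /^the ( *?)\d\d\d\d$/ig) ? 1 : 0;
my $pos_after_1 = pos($title);    # offset just past the match

# Second /g match resumes from that offset, so ^ can no longer match.
my $second      = ($title =~ /^the ( *?)\d\d\d\d$/ig) ? 1 : 0;
my $pos_after_2 = pos($title);    # undef: a failed /g match resets pos

# With pos reset, a third attempt starts at the beginning and succeeds again.
my $third       = ($title =~ /^the ( *?)\d\d\d\d$/ig) ? 1 : 0;

printf "first=%d pos=%s\n",  $first,  defined $pos_after_1 ? $pos_after_1 : 'undef';
printf "second=%d pos=%s\n", $second, defined $pos_after_2 ? $pos_after_2 : 'undef';
printf "third=%d\n",         $third;
```

    So with /g on every test, the matches alternate between succeeding and failing, which is why only "trigger1" fired.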

    After Compline,

      Ah, yeah... hehe. Thanks a million, Zaxo, for the quick reply. (Argh, bug hunting makes me tired and frustrated.)

Re: regexp mystery no xxx
by davido (Cardinal) on Aug 18, 2003 at 21:33 UTC
    It is the /g modifier.

    From the Camel book:

    "In a scalar context, m//g iterates through the string, returning true for each time it matches, and false when it eventually runs out of matches. (In other words, it remembers where it left off last time and restarts the search at that point...)... ...If you modify the string in any way, the match position is reset to the beginning."
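
    To illustrate the quoted behavior, here is a small sketch of the case where scalar-context m//g is actually useful: walking through several matches in one string. The sample string and the names $text, @seen, $before, $after and $next are made up for the example:

```perl
use strict;
use warnings;

my $text = "2000 2001 2002";
my @seen;

# Each pass through the loop resumes where the previous match left off.
while ($text =~ /(\d{4})/g) {
    push @seen, $1 . '@' . pos($text);
}
print "@seen\n";    # 2000@4 2001@9 2002@14

# Match once more, then assign to the string: the saved position is discarded.
$text =~ /\d{4}/g;
my $before = pos($text);           # 4
$text = "1999 " . $text;           # assignment resets pos
my $after = pos($text);            # undef

# The next /g match therefore starts from the beginning again.
my $next;
$next = $1 if $text =~ /(\d{4})/g;
print "next=$next\n";              # next=1999
```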

    Take off the /g modifier and it will work properly. Let me suggest a couple of other improvements to the regexp you're using, by illustrating what I feel is a cleaner example of what you have (minus the /g, of course):

    my $title = q(the 2000);
    if ($title =~ /^the\s(\s*?)\d{4}$/i) {
        print "\ntrigger1\n";
    }
    if ($title =~ /^the\s(\s*?)\d{4}$/i) {
        print "\ntrigger2\n";
    }

    Note that in the example I put a \n at the end of each print, because on some systems the final line of output might not show up if it isn't terminated with a "\n". To get back to your original functionality, just remove those two trailing \n's.

    Next, getting down to brass tacks. In regexps, I really prefer using something like \s when I mean whitespace, rather than leaving literal whitespace in the expression. Literal spaces can be difficult to count when revising the code later, especially if several are in a row. Keep in mind that \s also matches tabs and newlines; if that isn't the desired behavior, specify the character explicitly with \x20, or whatever.

    The other issue is avoiding \d\d\d\d. It works but it's not as elegant as \d{4}.
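
    As a rough comparison of the three spellings discussed here (literal space with \d\d\d\d, \s with \d{4}, and \x20 with \d{4}), the sketch below runs each against a few made-up titles. The hash names %pattern and %result and the test strings are assumptions for illustration only:

```perl
use strict;
use warnings;

my %pattern = (
    literal => qr/^the ( *?)\d\d\d\d$/i,        # original: literal spaces
    class   => qr/^the\s(\s*?)\d{4}$/i,         # \s: any whitespace
    x20     => qr/^the\x20(\x20*?)\d{4}$/i,     # \x20: a space, spelled out
);

my @titles = ("the 2000", "the  2000", "the\t2000", "the 200");

my %result;
for my $title (@titles) {
    for my $name (keys %pattern) {
        $result{$name}{$title} = ($title =~ $pattern{$name}) ? 1 : 0;
    }
    (my $shown = $title) =~ s/\t/\\t/;          # make the tab visible
    printf "%-10s literal=%d class=%d x20=%d\n",
        $shown, map { $result{$_}{$title} } qw(literal class x20);
}
```

    The only row where the three disagree is the tab-separated title: \s accepts it, while the literal space and \x20 do not. \d{4} and \d\d\d\d behave identically.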

    Those are just opinions. The fact is that all you HAD to do to make your example work was remove the /g modifiers.

    "If I had my life to do over again, I'd be a plumber." -- Albert Einstein
Re: regexp mystery no xxx
by ides (Deacon) on Aug 18, 2003 at 21:22 UTC

    WOW! I have no idea why, but if you remove the 'g' option on the second regex it works as expected. There really isn't a need for the 'g' in these examples because you're binding on ^ and $ which makes the 'g' useless.
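
    For what it's worth, a short sketch of why dropping 'g' from only the second test is enough: a match without /g does not resume from the position saved by the earlier /g match, it starts from the beginning of the string. The names $hit1 and $hit2 are just for this example:

```perl
use strict;
use warnings;

my $title = q(the 2000);

# First test keeps /g, so it leaves pos($title) at the end of the string.
my $hit1 = ($title =~ /^the ( *?)\d\d\d\d$/ig) ? 1 : 0;

# Second test has no /g, so it starts from offset 0 regardless of pos.
my $hit2 = ($title =~ /^the ( *?)\d\d\d\d$/i) ? 1 : 0;

print "trigger1\n" if $hit1;
print "trigger2\n" if $hit2;
```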

    Frank Wiles