PerlMonks  

Re: Regex help

by pKai (Priest)
on Jul 25, 2009 at 22:58 UTC ( [id://783236]=note )


in reply to Regex help

E:\Temp>perl -Mstrict -we "$_=7;die qq(matched '$&'\n) if '1234567_.'=~/[$_]/"
matched '7'

It seems that punctuation variables are interpolated inside regex character classes.

This is a surprise (to me at least, and apparently to ww above as well).

Is that documented behaviour? Where?


Update: Fixed the attribution of the surprise; I had named the wrong person (graff) when citing a post by ww.

Re^2: Regex help
by graff (Chancellor) on Jul 26, 2009 at 02:50 UTC
    Is that documented behaviour? Where?

    Yes, in perlre, as follows:

    An unescaped "$" or "@" interpolates the corresponding variable, while escaping will cause the literal string "\$" to be matched.

    (Though in the version of the perlre man page I have installed, for perl 5.8.8, this sentence comes second in a paragraph that begins with:

    You cannot include a literal "$" or "@" within a "\Q" sequence.

    I can understand that some might consider this obscure.)

      Those remarks in perlre are not specific to character classes, and one tends to think of character classes as a special case with rules of their own.

      An explicit mention of $ being special in character classes is found in perlretut#Using-character-classes:

      …The special characters for a character class are -]\^$ (and the pattern delimiter, whatever it is). ] is special because it denotes the end of a character class. $ is special because it denotes a scalar variable.…

      So indeed, not only punctuation variables are expanded:

      E:\Temp>perl -Mstrict -we "my $foo=7;die qq(matched '$&'\n) if '1234567rab_.'=~/[${foo}bar]+/"
      matched '7rab'
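The one-liners above can also be written as a short script, which sidesteps the Windows shell quoting. This is only a sketch: the sample string 'cost: $5' and the variable names $hit and $literal are made up for illustration. The second match shows the escaping that the quoted perlre passage describes.

```perl
use strict;
use warnings;

my $foo = 7;

# $foo is interpolated inside the class, so the class is effectively [7bar]
my ($hit) = '1234567rab_.' =~ /([${foo}bar]+)/;
print "interpolated class matched '$hit'\n";

# Escaping the dollar sign puts a literal '$' into the class instead
my ($literal) = 'cost: $5' =~ /([\$]+)/;
print "escaped class matched '$literal'\n";
```

Capturing into a lexical avoids relying on $&, which the one-liners use only for brevity.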
Re^2: Regex help
by Anonymous Monk on Jul 26, 2009 at 03:05 UTC
    perlre also says:

    Because patterns are processed as double quoted strings, the following also work:

    \t          tab                   (HT, TAB)
    \n          newline               (LF, NL)
    \r          return                (CR)
    \f          form feed             (FF)
    \a          alarm (bell)          (BEL)
    \e          escape (think troff)  (ESC)
    \033        octal char            (example: ESC)
    \x1B        hex char              (example: ESC)
    \x{263a}    long hex char         (example: Unicode SMILEY)
    \cK         control char          (example: VT)
    \N{name}    named Unicode character
    \l          lowercase next char (think vi)
    \u          uppercase next char (think vi)
    \L          lowercase till \E (think vi)
    \U          uppercase till \E (think vi)
    \E          end case modification (think vi)
    \Q          quote (disable) pattern metacharacters till \E

    So for no interpolation, you can use qr'', m'', s'''.

    my $f = 2;
    print qr/$f/,"\n"; # (?-xism:2)
    print qr'$f',"\n"; # (?-xism:$f)
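As a rough illustration of the single-quote delimiter inside a character class, the sketch below compares m/[$f]/ with m'[$f]'. The test strings and the variable names other than $f are invented for this example. It prints match results rather than the stringified qr objects, since that stringification differs between Perl versions.

```perl
use strict;
use warnings;

my $f = 2;

# Ordinary delimiters: $f is interpolated, so the class is [2]
my $with = ('2' =~ m/[$f]/) ? 1 : 0;

# Single-quote delimiters: no interpolation, so the class holds '$' and 'f'
my $without = ('2'   =~ m'[$f]') ? 1 : 0;
my $dollar  = ('a$b' =~ m'[$f]') ? 1 : 0;

print "m/[\$f]/ against '2':   $with\n";
print "m'[\$f]' against '2':   $without\n";
print "m'[\$f]' against 'a\$b': $dollar\n";
```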
