Re^2: regex not matching special char

by mnooning (Beadle)
on Dec 14, 2012 at 00:48 UTC


in reply to Re: regex not matching special char
in thread regex not matching special char

Interesting. A backslash is a non-word character, but the escaped dollar sign is a non-word character too, so the only word boundary nearby falls between the "$" and the "AVG" that follows it. Rats!

I need to distinguish between strings such as "$AVG", "A$AVG", and "A$AVGA". Hence my attempt to do it using \b$AVG\b.

Am I asking too much of Perl regex?


Re^3: regex not matching special char
by AnomalousMonk (Abbot) on Dec 14, 2012 at 05:43 UTC

    In addition to the documentation linked by kennethk, have a look at perlretut, in particular the section titled 'Looking ahead and looking behind'. You can also get some insight into the zero-width assertions \b (word boundary) and \B (non-boundary) by splitting one of the OP's example strings on each assertion:

    >perl -wMstrict -le
    "my $line2 = 'I:\$AVG\hello.log';
     printf qq{'$_' } for split /\b/, $line2;
     print qq{\n};
     printf qq{'$_' } for split /\B/, $line2;
    "
    'I' ':\$' 'AVG' '\' 'hello' '.' 'log'

    'I:' '\' '$A' 'V' 'G\h' 'e' 'l' 'l' 'o.l' 'o' 'g'
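
The lookaround section of perlretut suggests one way to get a "boundary" that still works next to a non-word character such as "$". What follows is only a minimal sketch, not something posted in this thread. It assumes that "standalone" means "no word character directly before the $ or directly after the G", and the helper names has_standalone_avg and has_avg_with_b are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $AVG appears with no word character
# directly before the '$' and none directly after the 'G'.
sub has_standalone_avg {
    my ($str) = @_;
    return $str =~ /(?<!\w)\$AVG(?!\w)/ ? 1 : 0;
}

# For comparison: \b before \$ needs a word character in front of
# the '$', so it fails after a backslash or at the start of a string.
sub has_avg_with_b {
    my ($str) = @_;
    return $str =~ /\b\$AVG\b/ ? 1 : 0;
}

for my $s ('I:\$AVG\hello.log', '$AVG', 'A$AVG', 'A$AVGA') {
    printf "%-20s lookaround=%d  \\b=%d\n",
        $s, has_standalone_avg($s), has_avg_with_b($s);
}
```

Under these assumptions the lookaround version accepts the path and the bare "$AVG" and rejects "A$AVG" and "A$AVGA", while the \b version accepts only "A$AVG".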
Re^3: regex not matching special char
by Anonymous Monk on Dec 14, 2012 at 08:50 UTC

    Am I asking too much of Perl regex?

    ;) No, is Perl regex asking too much by asking you to know what you're asking for? *zing*

    perlrequick

    Luckily YAPE::Regex::Explain handles these

    use YAPE::Regex::Explain;
    print YAPE::Regex::Explain->new( '\b\$AVG' )->explain;
    __END__
    The regular expression:

    (?-imsx:\b\$AVG)

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      \b                       the boundary between a word char (\w) and
                               something that is not a word char
    ----------------------------------------------------------------------
      \$                       '$'
    ----------------------------------------------------------------------
      AVG                      'AVG'
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------

    use YAPE::Regex::Explain;
    print YAPE::Regex::Explain->new( '\w?\$AVG\b' )->explain;
    __END__
    The regular expression:

    (?-imsx:\w?\$AVG\b)

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      \w?                      word characters (a-z, A-Z, 0-9, _)
                               (optional (matching the most amount
                               possible))
    ----------------------------------------------------------------------
      \$                       '$'
    ----------------------------------------------------------------------
      AVG                      'AVG'
    ----------------------------------------------------------------------
      \b                       the boundary between a word char (\w) and
                               something that is not a word char
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------

      Thanks for the tip on YAPE::Regex::Explain.

      As kennethk pointed out (reading between the lines), the key is that an escaped special character such as \$ is not a word character. \b only matches between a word character and a non-word character, so it cannot match between the backslash and the dollar sign, which are both non-word characters. That is why the simple \b does not work.

      Armed with a definitive answer as to what is happening, a simple workaround can be constructed. In my case the strings I am looking for are actual directory names, so I can split on the directory separator.
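
A rough sketch of that workaround, under a few assumptions: the paths use "\" or "/" as separators, a directory counts only when a whole path component equals the name, and the helper name path_has_dir is invented for this example. The split is done by hand because File::Spec->splitdir follows the conventions of the platform it runs on.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the number of path components that
# are exactly equal to $name (0 means no such directory).
sub path_has_dir {
    my ($path, $name) = @_;
    # The character class [\\/] matches a backslash or a forward slash.
    my @parts = split m{[\\/]}, $path;
    return scalar grep { $_ eq $name } @parts;
}

for my $p ('I:\$AVG\hello.log', 'I:\A$AVG\hello.log', 'I:\A$AVGA\hello.log') {
    printf "%-24s %s\n", $p, path_has_dir($p, '$AVG') ? 'match' : 'no match';
}
```

Comparing whole components with eq sidesteps the word-boundary question entirely, since no regex has to decide where "$AVG" starts or ends.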

      Thanks again.
Re^3: regex not matching special char
by muba (Priest) on Dec 14, 2012 at 04:24 UTC

    What if you change \b to \w?
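
A quick way to try that suggestion, as a sketch only: the pattern \w\$AVG\b and the helper name matches_with_w are assumptions made for this example, not code from the thread. Unlike \b, \w is not zero-width: it has to consume one word character in front of the "$".

```perl
use strict;
use warnings;

# \w must consume exactly one word character before the '$'.
my $re = qr/\w\$AVG\b/;

# Hypothetical helper wrapping the match for easy checking.
sub matches_with_w {
    my ($str) = @_;
    return $str =~ $re ? 1 : 0;
}

for my $s ('$AVG', 'I:\$AVG\hello.log', 'A$AVG', 'A$AVGA') {
    printf "%-20s %s\n", $s, matches_with_w($s) ? 'match' : 'no match';
}
```

With this pattern only "A$AVG" matches, so it finds the prefixed form rather than the bare "$AVG" directory name.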
