PerlMonks  

Re^2: about word boundary in RE (use re 'debug')

by anaconda_wly (Scribe)
on Apr 02, 2013 at 07:53 UTC (#1026613)


in reply to Re: about word boundary in RE (use re 'debug')
in thread about word boundary in RE

Good but seems not easily readable to me. If (.+\b) already matches, why do I need the \s?


Re^3: about word boundary in RE (use re 'debug')
by hdb (Prior) on Apr 02, 2013 at 08:00 UTC

    \b is a zero-width assertion. It needs the space to recognize a word boundary, but it does not consume it. Therefore you need to add a space (or \s) to your pattern.
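A minimal sketch of that point, assuming the pattern under discussion looks something like /(.+\b)\s\1/ (the sample string and variable names here are made up for illustration). Since \b consumes nothing, the space is still waiting to be matched when \1 is tried:

```perl
use strict;
use warnings;

my $text = "hello hello";   # hypothetical sample string

# \b only asserts a position. The space after the first word is not consumed,
# so \1 is tried against " hello" and cannot match.
my $without_space = $text =~ /(.+\b)\1/ ? 1 : 0;

# Consuming the space explicitly with \s lets \1 line up with the second word.
my $with_space = $text =~ /(.+\b)\s\1/ ? 1 : 0;

print "without \\s: $without_space\n";
print "with \\s:    $with_space\n";
```

On this sample the first test should report 0 and the second 1.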

      Sorry, my mistake. I didn't understand that \1 repeats what the group matched; I just neglected it. It's clear to me now. The example is from a website. Thanks!

Re^3: about word boundary in RE (use re 'debug')
by Anonymous Monk on Apr 02, 2013 at 08:21 UTC

    Good but seems not easily readable to me.

    In that case, use a shorter string and match the numbers under "Final program" against those on the right-hand side of the trace, like 1: OPEN1 (3).

    $ perl -Mre=debug -le " q/a a/ =~ /(.\b)\1/ "
    Compiling REx "(.\b)\1"
    Final program:
       1: OPEN1 (3)
       3:   REG_ANY (4)
       4:   BOUND (5)
       5: CLOSE1 (7)
       7: REF1 (9)
       9: END (0)
    minlen 1
    Matching REx "(.\b)\1" against "a a"
       0 <> <a a>     |  1:OPEN1(3)
       0 <> <a a>     |  3:REG_ANY(4)
       1 <a> < a>     |  4:BOUND(5)
       1 <a> < a>     |  5:CLOSE1(7)
       1 <a> < a>     |  7:REF1(9)
                         failed...
       1 <a> < a>     |  1:OPEN1(3)
       1 <a> < a>     |  3:REG_ANY(4)
       2 <a > <a>     |  4:BOUND(5)
       2 <a > <a>     |  5:CLOSE1(7)
       2 <a > <a>     |  7:REF1(9)
                         failed...
       2 <a > <a>     |  1:OPEN1(3)
       2 <a > <a>     |  3:REG_ANY(4)
       3 <a a> <>     |  4:BOUND(5)
       3 <a a> <>     |  5:CLOSE1(7)
       3 <a a> <>     |  7:REF1(9)
                         failed...
       3 <a a> <>     |  1:OPEN1(3)
       3 <a a> <>     |  3:REG_ANY(4)
                         failed...
    Match failed
    Freeing REx: "(.\b)\1"

    Compare against a simpler pattern like

    $ perl -Mre=debug -le " q/aa/ =~ /a\b/ "
    Compiling REx "a\b"
    Final program:
       1: EXACT <a> (3)
       3: BOUND (4)
       4: END (0)
    anchored "a" at 0 (checking anchored) minlen 1
    Guessing start of match in sv for REx "a\b" against "aa"
    Found anchored substr "a" at offset 0...
    Guessed: match at offset 0
    Matching REx "a\b" against "aa"
       0 <> <aa>      |  1:EXACT <a>(3)
       1 <a> <a>      |  3:BOUND(4)
                         failed...
       1 <a> <a>      |  1:EXACT <a>(3)
       2 <aa> <>      |  3:BOUND(4)
       2 <aa> <>      |  4:END(0)
    Match successful!
    Freeing REx: "a\b"

    Then check the definition of \b in perlre#Assertions, perlrequick

    Perl defines the following zero-width assertions: The word anchor \b matches a boundary between a word character and a non-word character \w\W or \W\w
    $x = "Housecat catenates house and cat";
    $x =~ /\bcat/;    # matches cat in 'catenates'
    $x =~ /cat\b/;    # matches cat in 'housecat'
    $x =~ /\bcat\b/;  # matches 'cat' at end of string
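To see where each of those three patterns actually lands, one option is to print the start offset of the match using the built-in @- array. This is only a sketch; first_match_offset is a hypothetical helper name, not something from perlrequick.

```perl
use strict;
use warnings;

my $x = "Housecat catenates house and cat";

# Hypothetical helper: start offset of the first match, or undef if none.
sub first_match_offset {
    my ($string, $re) = @_;
    return $string =~ $re ? $-[0] : undef;
}

my $leading  = first_match_offset($x, qr/\bcat/);    # cat in 'catenates'
my $trailing = first_match_offset($x, qr/cat\b/);    # cat in 'Housecat'
my $both     = first_match_offset($x, qr/\bcat\b/);  # the final 'cat'

printf "%-10s starts at offset %d\n", '\bcat',   $leading;
printf "%-10s starts at offset %d\n", 'cat\b',   $trailing;
printf "%-10s starts at offset %d\n", '\bcat\b', $both;
```

The offsets are counted from 0, so the "cat" inside "Housecat" starts at offset 5.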

    Basically your pattern can never match, just like this:

    $ perl -Mre=debug -le " q/aa/ =~ /a\ba/ "

    There can never be a word boundary within a word, by definition.
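A rough way to check this yourself is to list every offset at which \b succeeds. In the sketch below, boundary_offsets and the sample strings are invented for illustration. The same script also tries (.\b)\1 against a few inputs: since the character before the boundary and the character after it would have to be identical, none of them should match.

```perl
use strict;
use warnings;

# Hypothetical helper: every offset at which the zero-width \b succeeds.
sub boundary_offsets {
    my ($string) = @_;
    my @offsets;
    while ($string =~ /\b/g) {
        push @offsets, $-[0];
    }
    return @offsets;
}

my @inside_word = boundary_offsets("aa");    # only the two ends of the word
my @two_words   = boundary_offsets("a a");   # both edges of each word

# (.\b)\1 asks for the same character on both sides of a word boundary.
my @samples = ("aa", "a a", "a  a", "  ");
my @matched = grep { /(.\b)\1/ } @samples;

print "boundaries in 'aa':  @inside_word\n";
print "boundaries in 'a a': @two_words\n";
print "samples matching (.\\b)\\1: ", scalar(@matched), "\n";
```

For "aa" the only boundaries are at offsets 0 and 2, so nothing falls between the two letters.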

      Thanks for your patience. I think I need a little more time to understand these lines.
