PerlMonks  

Re^3: Using Look-ahead and Look-behind

by AnomalousMonk (Abbot)
on Jun 25, 2011 at 19:51 UTC (#911397)


in reply to Re^2: Using Look-ahead and Look-behind
in thread Using Look-ahead and Look-behind

Here's a solution that exactly matches the phrases specified in AnonyMonk's Re: Using Look-ahead and Look-behind post (which the code of Re^2: Using Look-ahead and Look-behind does not quite do), and also shows how to use the newfangled backtracking control verbs of 5.10 to emulate variable-width negative look-behind. Variable-width positive look-behind is emulated by 5.10's  \K assertion.

Explanation:

  • Any 'equity' that is preceded by
    • either a character that is not a comma or whitespace, or
    • by the 'private' phrase
    FAILS and is skipped over (this test has first precedence);
  • Otherwise, any 'equity' that is not followed by a comma that is then followed by any non-whitespace SUCCEEDS.

>perl -wMstrict -le
"use Test::More 'no_plan';
 ;;
 for my $ar_vector (
   [ YES => 'equity, private equity', ],
   [ YES => 'equity',                 ],
   [ no  => 'private equity',         ],
   [ YES => 'private equity,equity',  ],
   [ YES => 'private equity, equity', ],
   [ no  => 'equity,private equity',  ],
   [ no  => 'private equity',         ],
   [ no  => 'mutual funds',           ],
   [ no  => 'cds'                     ],
   ) {
   my ($expected, $string) = @$ar_vector;
   is match($string), $expected, qq{'$string'};
   }
 ;;
 sub match {
   my ($string) = @_;
   ;;
   my $char_not_comma_or_space = qr{ [^,\s] }xms;
   my $private = qr{ private \s+ }xms;
   return 'YES' if $string =~ m{
       (?: $char_not_comma_or_space | $private) equity (*SKIP)(*FAIL)
       |
       equity (?! , \S)
       }xms;
   return 'no',
   }
 "
ok 1 - 'equity, private equity'
ok 2 - 'equity'
ok 3 - 'private equity'
ok 4 - 'private equity,equity'
ok 5 - 'private equity, equity'
ok 6 - 'equity,private equity'
ok 7 - 'private equity'
ok 8 - 'mutual funds'
ok 9 - 'cds'
1..9
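The post mentions \K for positive look-behind, but the test script only exercises the (*SKIP)(*FAIL) branch. Here is a minimal sketch of the \K side, assuming a made-up task (upper-casing 'equity' only when it follows 'private' plus any amount of whitespace). The sample string and the variable names are illustrative and not from the thread.

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical sample text, chosen only to show the effect.
my $text = "private   equity, public equity";

# An unbounded look-behind such as (?<= private \s+ ) is rejected by Perl.
# \K instead drops everything matched so far from the reported match,
# so only 'equity' is replaced while 'private' and the spaces are kept.
(my $copy = $text) =~ s{ private \s+ \K equity }{EQUITY}xmsg;

print "$copy\n";    # private   EQUITY, public equity
```

Unlike a true look-behind, the text before \K is still consumed by the match, so successive matches cannot overlap it.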


Re^4: Using Look-ahead and Look-behind
by JohnN (Initiate) on Oct 15, 2012 at 15:09 UTC

    I have a dumb question.

    This code works well (THANKS Roy!) when looking for DNA string matches within a genome sequence but not when the * is changed to {50,100}

    e.g.
    /CCGG              # Match starting at DNA sequence CCGG
      (
        (?:
          (?!CCGG)     # make sure we're not finding duplicates mid-stream
          .            # accept any character
        )*?            # any number of times BUT not greedily <====
      )
      AATT             # and ending at AATT
    /x;

    versus

    /CCGG
      (
        (?:
          (?!CCGG)
          .
        ){50,100}?     # <====
      )
      AATT             # and ending at AATT
    /x;

    This latter one does not have dupes of CCGG but does have dupes of AATT. The previous snippet has no dupes of either CCGG or AATT.

    A follow-up: The following code snippet fixes my problem, and I have NO idea why! I tried it out of desperation

    /CCGG
      (
        (?:
          (?!AATT|CCGG)  # <=============
          .              #
        ){50,100}?       # Here the "?" is not required but I'm anal
      )                  #
      AATT               #
    /x;
      When * is changed to ^, it does not work either. Why are you changing it at all?

      But jokes aside: The *? stops matching at the first occurrence of AATT, so there are no dupes. The {50,100} must match at least 50 times, so if there is an AATT after, say, the 25th character, it cannot stop there and must match a longer string.
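A scaled-down sketch of that difference, assuming a made-up 20-character string and {6,12} standing in for {50,100}. Neither the string nor the smaller bounds come from the original question.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a genome string: the first AATT comes only
# 4 characters after CCGG, i.e. before the minimum count is reached.
my $seq = 'CCGGTTTTAATTGGGGAATT';

# Lazy star: free to stop as soon as AATT can match.
my ($lazy_star) = $seq =~ / CCGG ( (?: (?!CCGG) . )*?      ) AATT /x;

# Lazy bounded quantifier: must consume at least 6 characters first,
# so it runs through the first AATT and ends at the second one.
my ($lazy_min6) = $seq =~ / CCGG ( (?: (?!CCGG) . ){6,12}? ) AATT /x;

print "star:    $lazy_star\n";    # TTTT
print "bounded: $lazy_min6\n";    # TTTTAATTGGGG
```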

      Use YAPE::Regex::Explain to see what your regular expressions mean.

      Moreover, you are replying to a node that is not related to your question.

      لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

      Great, please take it to Seekers Of Perl Wisdom, see Re^2: Using Look-ahead and Look-behind

      You forgot to include sample input, no matter, here are clues, run these and compare

      perl -Mre=debug -le " $_ = q/foobarfoodrinkAATT/; /foo((?:(?!bar).){1,5}?)AATT/; "

      perl -Mre=debug -le " $_ = q/foobarfoodrinkAATTAATT/; /foo((?:(?!bar).){6,10}?)AATT/; "

      {50,100} means match at minimum 50 times but no more than 100

      .* means match at least zero times

      in my short example, the first AATT appears before the minimum of 6 characters has been matched, so it is included in the match
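This also suggests why adding AATT to the negative look-ahead fixed the dupes: the repeated group can then never step onto an AATT, so the capture cannot contain one. A small sketch, again with hypothetical short strings and {6,12} in place of {50,100}:

```perl
use strict;
use warnings;

# Same shape as the fixed pattern from the question, with smaller bounds.
my $tempered = qr/ CCGG ( (?: (?!AATT|CCGG) . ){6,12}? ) AATT /x;

my @results;
for my $seq ('CCGGTTTTAATTGGGGAATT',    # first AATT only 4 chars after CCGG
             'CCGGTTTTTTTTAATT') {      # first AATT 8 chars after CCGG
    my $result = $seq =~ $tempered ? "captured $1" : 'no match';
    push @results, $result;
    print "$seq: $result\n";
}
# First string:  no match (the gap is shorter than the minimum of 6)
# Second string: captured TTTTTTTT
```

The trade-off is that a CCGG followed too soon by AATT now yields no match at all, where the earlier pattern would have run on to a later AATT.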
