Beefy Boxes and Bandwidth Generously Provided by pair Networks
"be consistent"

Comment on

( #3333=superdoc: print w/replies, xml ) Need Help??
You have problems to understand backtracking, which is essentially a branch-and-bound recursion on partial regexes.

Partial matches of a regex are anchoring the start for recursive attempts to match the rest of the regex.

Quantifiers - greedy or non-greedy - only work from left to right. They won't force the left side to shrink as long as they do match to the right!

In the following examples the regex anchors at the semicolon before the quantifier /;(.)/ and tries to match further.

The starting point for each following match is after the last match.

But if the whole regex can't match, the backtracking forces the regex to try the next semicolon as an anchor.

DB<119> $x => "junk ;a;not;b;not;c;; junk ;d;not;e;; junk " DB<120> $x =~ /;(.*);;/g # greedy => 1 match first anch +oring ";" => "a;not;b;not;c;; junk ;d;not;e" DB<121> $x =~ /;(.*?);;/g # non-greedy => 2 matches on first + anchors => ("a;not;b;not;c", "d;not;e") DB<122> $x =~ /;([^;]*?);;/g # non-greedy, trying multiple anch +ors. => ("c", "e")

BTW: Did I mention that Friedl's book is a good read? =)

Cheers Rolf

( addicted to the Perl Programming Language)


from perlretut

The process of trying one alternative, seeing if it matches, and moving on to the next alternative, while going back in the string from where the previous alternative was tried, if it doesn't, is called backtracking. The term backtracking comes from the idea that matching a regexp is like a walk in the woods. Successfully matching a regexp is like arriving at a destination. There are many possible trailheads, one for each string position, and each one is tried in order, left to right. From each trailhead there may be many paths, some of which get you there, and some which are dead ends. When you walk along a trail and hit a dead end, you have to backtrack along the trail to an earlier point to try another trail. If you hit your destination, you stop immediately and forget about trying all the other trails. You are persistent, and only if you have tried all the trails from all the trailheads and not arrived at your destination, do you declare failure. To be concrete, here is a step-by-step analysis of what Perl does when it tries to match the regexp

In reply to Re^4: Selecting HL7 Transactions (backtracking and partial anchors) by LanX
in thread Selecting HL7 Transactions by BillDowns

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?

    What's my password?
    Create A New User
    What's the matter? Cat got your tongue?...

    How do I use this? | Other CB clients
    Other Users?
    Others taking refuge in the Monastery: (4)
    As of 2018-03-20 00:03 GMT
    Find Nodes?
      Voting Booth?
      When I think of a mole I think of:

      Results (246 votes). Check out past polls.