... how to parse a string which has repetitive data ...

As choroba pointed out, every (pattern) pair of parentheses in a regex captures something (possibly undef) to its corresponding capture variable. One way to parse a string using nested regexes is to avoid a gazillion capturing groups: use the non-capturing (?:pattern) for grouping instead. See perlre, perlrequick, perlretut. In the IP example (though this should generalize to any repetitive data you wish to extract):

>perl -wMstrict -le
"my $decimal_octet = qr{ 2 (?: [0-4] \d | 5 [0-5]) | [01]? \d? \d }xms;
 my $ip = qr{ (?<! \d) $decimal_octet (?: \. $decimal_octet){3} (?! \d) }xms;
 print $ip;
 ;;
 my $s = '123.45.6.234 xx yyy zz 000.12.34.255';
 my @ips = $s =~ m{ $ip }xmsg;
 printf qq{'$_' } for @ips;
 "
(?^msx: (?<! \d) (?^msx: 2 (?: [0-4] \d | 5 [0-5]) | [01]? \d? \d ) (?: \. (?^msx: 2 (?: [0-4] \d | 5 [0-5]) | [01]? \d? \d )){3} (?! \d) )
'123.45.6.234' '000.12.34.255'

Note that neither (?:pattern) nor the look-around assertions (?<!pattern) and (?!pattern) capture. Indeed, nothing captures (to a capture variable): because the pattern contains no capturing groups, the m{ $ip }xmsg match in list context returns each complete match directly to the array.
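To see why the non-capturing groups matter here, it may help to compare the pattern with a variant that differs only in using capturing parentheses for the repeated group. This is a minimal sketch as a standalone script rather than a one-liner; the names $octet, $ip_nocap and $ip_cap are made up for the illustration and are not from the thread.

```perl
use strict;
use warnings;

# Same octet pattern as in the post, non-capturing throughout.
my $octet = qr{ 2 (?: [0-4] \d | 5 [0-5] ) | [01]? \d? \d }xms;

# $ip_nocap follows the post. $ip_cap is a hypothetical variant for
# comparison: identical except that the repeated group captures.
my $ip_nocap = qr{ (?<! \d) $octet (?: \. $octet ){3} (?! \d) }xms;
my $ip_cap   = qr{ (?<! \d) $octet (   \. $octet ){3} (?! \d) }xms;

my $s = '123.45.6.234 xx yyy zz 000.12.34.255';

# No capturing groups: list-context m//g returns each whole match.
my @whole = $s =~ m{ $ip_nocap }xmsg;

# One capturing group: list-context m//g returns only what the group
# captured, and a quantified group keeps just its last repetition.
my @groups = $s =~ m{ $ip_cap }xmsg;

print "non-capturing: @whole\n";
print "capturing:     @groups\n";
```

With the capturing variant the array ends up holding only the last dotted octet of each address ('.234' and '.255'), not the addresses themselves, which is the kind of surprise the all-(?:pattern) approach sidesteps.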


In reply to Re^3: Using variable to hold regex expression by AnomalousMonk
in thread Using variable to hold regex expression by salatconed
