PerlMonks  

Re: A regex that only matches at offset that are multiples of a given N?

by smls (Friar)
on Feb 13, 2013 at 23:35 UTC (#1018660)


in reply to A regex that only matches at offset that are multiples of a given N?

Alright, here is a revised version of johngg's solution that avoids the redundant double matching while keeping the matching logic self-contained within the regex.

Instead of moving pos() forward manually (as I suggested in the discussion thread for johngg's solution), it lets the regex engine do this implicitly by having it gobble up to $n characters after matching the zero-width look-ahead that contains the capture group:

    use feature 'say';

    #      0    5    10   15   20   25   30   35
    #      '    '    '    '    '    '    '    '
    $_ = q{.....fred1..fred2...fred3....fred4..};
    #      ----++++----||||----||||----++++----    $n = 4
    #      -----||||+-----+++++||||-+++++-----+    $n = 5

    my $capture = qr([0-9]\.+);  # the (....) in the OP's specification

    for my $n ( 4, 5 ) {
        say "\$n = $n";
        while ( m[\G(?:.{$n})*?(?=fred($capture)).{0,$n}]g ) {
            say "  matched 'fred$1' at pos @{[pos($_)-$n]} (gobbled '$&')";
        }
    }

Output:

    $n = 4
      matched 'fred2...' at pos 12 (gobbled '.....fred1..fred')
      matched 'fred3....' at pos 20 (gobbled '2...fred')
    $n = 5
      matched 'fred1..' at pos 5 (gobbled '.....fred1')
      matched 'fred3....' at pos 20 (gobbled '..fred2...fred3')
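One detail about the reported position: pos($_)-$n assumes that exactly $n characters were gobbled, which need not hold for a match near the end of the string when $n is larger than what remains. A minimal sketch of one way around that, assuming the literal prefix is always "fred": derive the offset from $-[1], the start of capture group 1. The sub name aligned_offsets is hypothetical and not part of the code above.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: same regex as above, but the offset is computed
# from the capture group instead of from pos() and $n.
sub aligned_offsets {
    my ( $str, $n, $capture ) = @_;
    my @offsets;
    while ( $str =~ m[\G(?:.{$n})*?(?=fred($capture)).{0,$n}]g ) {
        # $-[1] is where group 1 starts, i.e. just after the literal "fred".
        push @offsets, $-[1] - length 'fred';
    }
    return @offsets;
}

my $str = q{.....fred1..fred2...fred3....fred4..};
for my $n ( 4, 5 ) {
    say "\$n = $n: ", join ' ', aligned_offsets( $str, $n, qr([0-9]\.+) );
}
```

For the sample string this yields the same offsets as the output above (12 and 20 for $n = 4, 5 and 20 for $n = 5). The two calculations only differ when fewer than $n characters are left to gobble.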

Note that if length("fred$1") > $n, the regex will start looking for the next "fred" while still within the part matched by $1. If this must be avoided, I guess manual pos()-incrementing is still the best bet.
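For comparison, here is a rough sketch of what the manual pos()-incrementing variant could look like. This is not the code from the other thread, just one possible reading of the idea: the sub name aligned_matches and the rounding rule (resume at the first multiple of $n at or after the end of the previous hit) are my assumptions.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper, not code from the thread: returns [offset, text] pairs
# for "fred$capture" matches that start at a multiple of $n and never begin
# inside an earlier match.
sub aligned_matches {
    my ( $str, $n, $capture ) = @_;
    my @hits;
    pos($str) = 0;
    # Look-ahead only, so pos() stops at the start of the hit.
    while ( $str =~ m[\G(?:.{$n})*?(?=(fred$capture))]g ) {
        my $start = pos($str);
        push @hits, [ $start, $1 ];
        # Resume at the first multiple of $n at or after the end of the hit.
        my $end  = $start + length $1;
        my $next = $end + ( $n - $end % $n ) % $n;
        last if $next > length $str;
        pos($str) = $next;    # assigning to pos() also resets the zero-length-match flag
    }
    return @hits;
}

# "fredfred" at offset 0 overlaps a second candidate at offset 4.
say "offset $_->[0]: $_->[1]" for aligned_matches( 'fredfredfred', 4, qr/.{4}/ );
```

With 'fredfredfred', $n = 4 and a capture of qr/.{4}/, this reports only the hit at offset 0, whereas the gobbling regex would also report the overlapping one at offset 4. The price is that the skipping logic lives outside the regex again.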


Re^2: A regex that only matches at offset that are multiples of a given N?
by BrowserUk (Pope) on Feb 14, 2013 at 04:31 UTC

    That is really very clever. Thank you. (I'll get around to trying it out in my real application later and let you know how I get on.)

    I also really like your test methodology. Mixing the freds at different multiple boundaries within the same string is a very neat way of testing.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Mixing the freds at different multiple boundaries within the same string is a very neat way of testing.
      I can't take credit for that; it was copied from johngg's answer. I just made it a little more readable.
