PerlMonks  

Re: RegEx + vs. {1,}

by ELISHEVA (Prior)
on Oct 10, 2012 at 14:16 UTC (#998232)


in reply to RegEx + vs. {1,}

If you want a list of all two-letter patterns that appear at least twice somewhere in your string, you need to make three changes to your regex:

  1. Make (\w{2,}) non-greedy by adding a "?" to the end of the quantifier, e.g. (\w{2,}?).
  2. Wrap what comes after (\w{2,}?) in a zero-width lookahead group. Otherwise the match consumes everything up to the second occurrence of "ab", and you will miss all the matches between the first and second occurrence.
  3. Handle repetitions of your regex slightly differently. Instead of /( mumblefoo )+/ you need /mumblefoo/g. Using a + the way you did will only get you the last match found, because each time the + causes the regex to repeat, the new capture replaces the previous one.
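The third change is perhaps easiest to see on a small example. The sketch below uses a made-up string and a made-up pattern (both are assumptions for illustration, not the OP's regex) to show that a capture group inside a repeated group keeps only its last iteration, while /g in list context returns one capture per match.

```perl
use strict;
use warnings;

# Hypothetical input: letter pairs, each followed by two digits.
my $s = "ab12cd34ef56";

# Capture group inside a repeated group: each pass through the +
# overwrites $1, so only the final iteration survives.
my @last_only = $s =~ /^(?:([a-z]{2})\d\d)+$/;

# Same sub-pattern with /g in list context: one capture per match.
my @every = $s =~ /([a-z]{2})\d\d/g;

print "quantified group: <@last_only>\n";   # <ef>
print "with /g:          <@every>\n";       # <ab cd ef>
```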

Taken together, these changes will make your regex look like this: /(\w{2,}?)(?=.*?\1)/g:

    print $x = "abcdefgxxabcdefgzzabcdsjfhkdfab", "\n";
    print "<" . join('|', $x =~ /(\w{2,}?)(?=.*?\1)/g), ">\n";
    # outputs: <ab|cd|ef|ab|cd|ab>

You can find more info on zero-width lookaheads in the Extended Patterns section of the perlre manpage on perldoc.
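To see what the first two changes contribute, here is a rough comparison on the same string. It runs the pattern once with a greedy quantifier, once without the lookahead, and once with both fixes. The variable names are only for illustration.

```perl
use strict;
use warnings;

my $x = "abcdefgxxabcdefgzzabcdsjfhkdfab";

# Greedy quantifier: each match grabs the longest run that recurs later.
my @greedy = $x =~ /(\w{2,})(?=.*?\1)/g;

# No lookahead: the match consumes everything up to and including the
# second occurrence, so the pairs in between are skipped.
my @consuming = $x =~ /(\w{2,}?).*?\1/g;

# Non-greedy quantifier plus lookahead, as in the regex above.
my @both = $x =~ /(\w{2,}?)(?=.*?\1)/g;

print "greedy:       <", join('|', @greedy),    ">\n";   # <abcdefg|abcd|ab>
print "no lookahead: <", join('|', @consuming), ">\n";   # <ab|cd>
print "both fixes:   <", join('|', @both),      ">\n";   # <ab|cd|ef|ab|cd|ab>
```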

Replies are listed 'Best First'.
Re^2: RegEx + vs. {1,}
by choroba (Chancellor) on Oct 10, 2012 at 14:25 UTC
    Or, just remove the comma from the quantifier: \w{2}
    لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
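A quick check of the \w{2} suggestion, using the string from the parent post. The non-greedy \w{2,}? never needs to grow past two characters here, because whenever a longer run recurs later, its two-character prefix recurs too and is tried first. The sketch assumes nothing beyond the two patterns already quoted in the thread.

```perl
use strict;
use warnings;

my $x = "abcdefgxxabcdefgzzabcdsjfhkdfab";

# Non-greedy open-ended quantifier, as in the parent post.
my @lazy  = $x =~ /(\w{2,}?)(?=.*?\1)/g;

# Fixed-width quantifier, as suggested here.
my @fixed = $x =~ /(\w{2})(?=.*?\1)/g;

print "<", join('|', @fixed), ">\n";   # <ab|cd|ef|ab|cd|ab>
print "same result as the non-greedy version\n" if "@lazy" eq "@fixed";
```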

      Indeed. There can never be a three-character sequence in your string which occurs more frequently than a two-character sequence, because the three-character sequence contains two two-character sequences.

      perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
        But what should be returned for a string like 'abcabcabcdef', where a three-character sequence and a two-character sequence occur the same number of times? 'ab' or 'abc'? I assume the OP wants the longer one.
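For what it's worth, the tie in that example can be checked with a small counting sketch. count_occurrences is a hypothetical helper written for this illustration, not something from the thread. It counts overlapping occurrences by matching a zero-width lookahead at every start position.

```perl
use strict;
use warnings;

# Count occurrences of $needle in $haystack, overlaps included.
# count_occurrences is a hypothetical helper, not from the thread.
sub count_occurrences {
    my ($haystack, $needle) = @_;
    # The lookahead matches at each start position without consuming text;
    # assigning through the empty list yields the number of matches.
    my $count = () = $haystack =~ /(?=\Q$needle\E)/g;
    return $count;
}

my $s = 'abcabcabcdef';
printf "%-3s occurs %d times\n", $_, count_occurrences($s, $_) for qw(ab bc abc);
```

All three counts come out as 3 for this string, so frequency alone cannot pick between 'ab' and 'abc'. Which one to return is a tie-breaking decision that has to be made separately.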
