PerlMonks  

Re: RegEx + vs. {1,}

by grizzley (Chaplain)
on Oct 10, 2012 at 12:01 UTC (#998210)


in reply to RegEx + vs. {1,}

Because you are trying to match a string that occurs twice, and 'abcdefg' is the first candidate that qualifies. The only approach I can think of for your problem is this:

$ perl -le 'print $x = "abcdefgxxabcdefgzzabcdsjfhkdfab"; print $1 if $x =~ /(\w{2,})(.*?\1){2}/;'
abcdefgxxabcdefgzzabcdsjfhkdfab
abcd
$ perl -le 'print $x = "abcdefgxxabcdefgzzabcdsjfhkdfab"; print $1 if $x =~ /(\w{2,})(.*?\1){3}/;'
abcdefgxxabcdefgzzabcdsjfhkdfab
ab
$ perl -le 'print $x = "abcdefgxxabcdefgzzabcdsjfhkdfab"; print $1 if $x =~ /(\w{2,})(.*?\1){4}/;'
abcdefgxxabcdefgzzabcdsjfhkdfab

I.e. the problem is that you have to say exactly how many occurrences you want. There is no 'greediness' in this case: you can't say "I want as many occurrences as possible"; only the lowest possible number of occurrences will be chosen.
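
To see the trade-off between length and count side by side, here is a minimal sketch that runs the same pattern for several repeat counts. The helper name longest_repeated is made up for illustration. Note that {n} counts the repeats after the first occurrence, so the captured substring appears n + 1 times in total.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the captured substring for a given number
# of repeats after the first occurrence, or undef if nothing matches.
sub longest_repeated {
    my ($str, $repeats) = @_;
    return $str =~ /(\w{2,})(?:.*?\1){$repeats}/ ? $1 : undef;
}

my $x = "abcdefgxxabcdefgzzabcdsjfhkdfab";
for my $n (1 .. 4) {
    my $found = longest_repeated($x, $n);
    printf "%d repeat(s): %s\n", $n, defined $found ? $found : "(no match)";
}
```

For the sample string this should report abcdefg, abcd, ab and then no match, which lines up with the one-liners above: the capture gets shorter as the required count goes up.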


Re^2: RegEx + vs. {1,}
by Sewi (Friar) on Oct 10, 2012 at 13:01 UTC
      So, if that's acceptable for you, use a while loop to determine the maximum number of occurrences. There can be no more than length / 2 occurrences, so start with this maximum and decrease it until the match succeeds:
      $x = "abcdefgxxabcdefgzzabcdsjfhkdfab";
      $len = int(length($x)/2);
      while ($x !~ /(\w{2,})(.*?\1){$len}/) { $len-- };
      $x =~ /(\w{2,})(.*?\1){$len}/; # 'strange line'
      print $1
      (to self: I do not know why I have to add the 'strange line'; without it nothing is printed, although $len is set correctly: it ends at 3, i.e. 4 occurrences in total)
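
A likely explanation for the 'strange line', as far as I understand Perl's scoping: capture variables such as $1 are dynamically scoped to the enclosing block, and the loop counts as one, so the value set by the successful match in the while condition is gone once the loop exits. Below is a minimal sketch that avoids the second match by copying $1 while it is still in scope. The names $best and $count are mine, not from the post, and the $len > 0 guard is an addition so the loop stops when nothing repeats.

```perl
use strict;
use warnings;

my $x   = "abcdefgxxabcdefgzzabcdsjfhkdfab";
my $len = int(length($x) / 2);   # upper bound: each occurrence is at least 2 chars
my $best;                        # copy of $1 that survives the loop

while ($len > 0) {
    if ($x =~ /(\w{2,})(?:.*?\1){$len}/) {
        $best = $1;              # save the capture before the loop scope ends
        last;
    }
    $len--;
}

if (defined $best) {
    my $count = $len + 1;        # {$len} repeats plus the first occurrence
    print "$best occurs at least $count times\n";
} else {
    print "no repeated substring found\n";
}
```

For the sample string the loop should stop at $len == 3 with "ab" saved in $best.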

      I tried to generate the list and include it in one regexp:

      $ perl -le '$x = "abcdefgxxabcdefgzzabcdsjfhkdfab"; $len=int(length($x)/2); $restring = join"|", map {"(?:.*?\\1){$_}"} reverse(1..$len); print $restring; print $1 if $x =~ /(\w{2,})($restring)/;'
      (?:.*?\1){15}|(?:.*?\1){14}|(?:.*?\1){13}|(?:.*?\1){12}|(?:.*?\1){11}|(?:.*?\1){10}|(?:.*?\1){9}|(?:.*?\1){8}|(?:.*?\1){7}|(?:.*?\1){6}|(?:.*?\1){5}|(?:.*?\1){4}|(?:.*?\1){3}|(?:.*?\1){2}|(?:.*?\1){1}
      abcdefg
      but it does not work as expected (probably some stupid mistake, maybe someone else can tell what's wrong with it).
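
One possible reading of why the combined pattern returns abcdefg: the greedy (\w{2,}) is tried first, and the engine runs through all the alternatives for each candidate capture before it shortens the capture. For abcdefg the alternatives {15} down to {2} fail, but {1} succeeds, so the match ends there. The order of the alternatives only ranks counts for one fixed capture. The sketch below rebuilds the alternation and compares it with trying the counts in an outer loop; the sub names via_alternation and via_outer_loop are hypothetical.

```perl
use strict;
use warnings;

my $x   = "abcdefgxxabcdefgzzabcdsjfhkdfab";
my $max = int(length($x) / 2);

# Same alternation as in the post: highest repeat count first.
my $alternation = join '|', map('(?:.*?\1){' . $_ . '}', reverse 1 .. $max);

# One regex: the capture length is decided before the repeat count.
sub via_alternation {
    my ($str) = @_;
    return $str =~ /(\w{2,})(?:$alternation)/ ? $1 : undef;
}

# Outer loop: the repeat count is decided before the capture length.
sub via_outer_loop {
    my ($str) = @_;
    for my $n (reverse 1 .. $max) {
        return $1 if $str =~ /(\w{2,})(?:.*?\1){$n}/;
    }
    return;
}

print "alternation: ", via_alternation($x), "\n";
print "outer loop:  ", via_outer_loop($x), "\n";
```

With the sample string the first sub should give abcdefg and the second ab, which suggests the count has to drive the search from outside the regex if it is meant to take priority over length.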
