PerlMonks  

Re: RegEx + vs. {1,}

by grizzley (Chaplain)
on Oct 10, 2012 at 12:01 UTC (#998210)


in reply to RegEx + vs. {1,}

Because you are trying to match a string that occurs twice, and 'abcdefg' is the first correct candidate. The only approach I can think of for your problem is this:

$ perl -le 'print $x = "abcdefgxxabcdefgzzabcdsjfhkdfab"; print $1 if $x =~ /(\w{2,})(.*?\1){2}/;'
abcdefgxxabcdefgzzabcdsjfhkdfab
abcd
$ perl -le 'print $x = "abcdefgxxabcdefgzzabcdsjfhkdfab"; print $1 if $x =~ /(\w{2,})(.*?\1){3}/;'
abcdefgxxabcdefgzzabcdsjfhkdfab
ab
$ perl -le 'print $x = "abcdefgxxabcdefgzzabcdsjfhkdfab"; print $1 if $x =~ /(\w{2,})(.*?\1){4}/;'
abcdefgxxabcdefgzzabcdsjfhkdfab

I.e. the problem is that you have to say exactly how many occurrences you want. There is no 'greediness' in this case: you can't say "I want as many occurrences as possible"; only the lowest possible number of occurrences will be chosen.
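To make that concrete, here is a minimal sketch (variable names are mine, not from the original question) comparing an open-ended count with a fixed one on the same string. It only illustrates the behaviour described above; it is not a fix.

```perl
use strict;
use warnings;

my $x = "abcdefgxxabcdefgzzabcdsjfhkdfab";

# Open-ended count: "one or more further occurrences".
# The engine settles on the first (longest) candidate that repeats at all.
my ($open) = $x =~ /(\w{2,})(?:.*?\1){1,}/;

# Fixed count: exactly three further occurrences are required,
# which forces the engine down to a shorter candidate.
my ($fixed) = $x =~ /(\w{2,})(?:.*?\1){3}/;

print "open-ended: $open\n";   # abcdefg
print "fixed {3}:  $fixed\n";  # ab
```

So the quantifier on the group is greedy only for whatever candidate has already been captured; it does not make the engine look for a different candidate with more repeats.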


Re^2: RegEx + vs. {1,}
by Sewi (Friar) on Oct 10, 2012 at 13:01 UTC
      So if that's acceptable for you, use a while loop to determine the maximum number of occurrences. There can be no more than length / 2 occurrences, so start with this maximum value and decrease it while trying to match:

      $x = "abcdefgxxabcdefgzzabcdsjfhkdfab";
      $len = int(length($x)/2);
      while ($x !~ /(\w{2,})(.*?\1){$len}/) { $len-- };
      $x =~ /(\w{2,})(.*?\1){$len}/; # 'strange line'
      print $1

      (to self: I do not know why I have to add the 'strange line'; without it nothing is printed, but $len is correctly set to 4)
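One way to sidestep the second match is to take the capture from a list-context match inside the loop, so nothing depends on $1 surviving after the loop ends. This is only a sketch: `most_repeated` is a hypothetical helper name of mine, and the count it returns is the number of further occurrences after the first one (the value in the braces). My understanding is that captures set in a while condition are restored when the loop exits, which would explain the 'strange line', but treat that as an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: try the highest repeat count first and work downwards.
# Returns (string, count) for the first count that matches, or an empty list.
sub most_repeated {
    my ($str) = @_;
    for (my $n = int(length($str) / 2); $n >= 1; $n--) {
        # A list-context match returns the capture directly,
        # so $1 is never read outside the scope that set it.
        if (my ($found) = $str =~ /(\w{2,})(?:.*?\1){$n}/) {
            return ($found, $n);
        }
    }
    return;
}

my $x = "abcdefgxxabcdefgzzabcdsjfhkdfab";
my ($found, $n) = most_repeated($x);
print "'$found' is followed by $n further occurrence(s)\n" if defined $found;
```

The inner group is non-capturing here so that the match returns exactly one value.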

      I tried to generate the list and include it in one regexp:

      $ perl -le '$x = "abcdefgxxabcdefgzzabcdsjfhkdfab"; $len=int(length($x)/2); $restring = join"|", map {"(?:.*?\\1){$_}"} reverse(1..$len); print $restring; print $1 if $x =~ /(\w{2,})($restring)/;'
      (?:.*?\1){15}|(?:.*?\1){14}|(?:.*?\1){13}|(?:.*?\1){12}|(?:.*?\1){11}|(?:.*?\1){10}|(?:.*?\1){9}|(?:.*?\1){8}|(?:.*?\1){7}|(?:.*?\1){6}|(?:.*?\1){5}|(?:.*?\1){4}|(?:.*?\1){3}|(?:.*?\1){2}|(?:.*?\1){1}
      abcdefg
      but it does not work as expected (probably some stupid mistake, maybe someone else can tell what's wrong with it).
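A possible explanation, offered as a sketch rather than a definitive answer: as far as I can tell, the alternation is only consulted after the engine has already picked a candidate for \1, and every alternative is tried for that candidate before a shorter one is considered. Since 'abcdefg' satisfies the {1} branch, the match succeeds there no matter how the branches are ordered. The snippet below (variable names are mine) builds the list in both orders to show that the result is the same.

```perl
use strict;
use warnings;

my $x   = "abcdefgxxabcdefgzzabcdsjfhkdfab";
my $max = int(length($x) / 2);

# Same alternatives as in the one-liner, highest count first ...
my $desc = join "|", map { "(?:.*?\\1){$_}" } reverse 1 .. $max;
# ... and lowest count first, for comparison.
my $asc  = join "|", map { "(?:.*?\\1){$_}" } 1 .. $max;

my @results;
for my $alt ($desc, $asc) {
    # All branches are tried for the current capture before it is shortened.
    my ($got) = $x =~ /(\w{2,})(?:$alt)/;
    push @results, $got;
}

print "$_\n" for @results;   # abcdefg both times
```

If that reading is right, the count has to be driven from outside the pattern, as in the while loop, rather than from inside it.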
