Re^5: How to match more than 32766 times in regex?

by rsFalse (Chaplain)
on Dec 01, 2015 at 22:26 UTC


in reply to Re^4: How to match more than 32766 times in regex?
in thread How to match more than 32766 times in regex?

Ah yes, in this node, Re^2: Complex regular subexpression recursion limit, I didn't get an answer :/ .
Today I was solving another problem (and ran into the same limitation). The full problem was: given a string (up to 1e5 characters long) consisting of '0' and '1', find the length of the longest alternating subsequence if you may choose one substring and invert it. For example, given the string '100111', I can invert the substring from the 3rd to the 4th character ( substr $line, 2, 2, (substr $line, 2, 2) =~ y/01/10/r ), and then the string becomes '101011' and has an alternating subsequence of length 5 (indexes: 0,1,2,3,4 or 0,1,2,3,5).
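The example above can be checked with a short sketch. The sub names invert_substr and longest_alternating are made up for illustration, and the greedy scan is just one straightforward way to measure the subsequence, not necessarily how the original solution did it.

```perl
use strict;
use warnings;

# Hypothetical helper: flip every '0' to '1' and vice versa in the
# substring starting at $offset with length $length; returns a new string.
sub invert_substr {
    my ($str, $offset, $length) = @_;
    my $piece = substr $str, $offset, $length;
    $piece =~ tr/01/10/;
    substr $str, $offset, $length, $piece;   # 4-arg substr replaces in the local copy
    return $str;
}

# Hypothetical helper: length of the longest alternating subsequence.
# Greedily take every character that differs from the last one taken.
sub longest_alternating {
    my ($str) = @_;
    my ($len, $prev) = (0, '');
    for my $ch (split //, $str) {
        next if $ch eq $prev;
        $len++;
        $prev = $ch;
    }
    return $len;
}

my $line     = '100111';
my $inverted = invert_substr($line, 2, 2);
print "$line -> $inverted\n";                 # 100111 -> 101011
print longest_alternating($line), " -> ",
      longest_alternating($inverted), "\n";   # 3 -> 5
```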
I wanted to solve that problem with regexes (I knew that I could solve it another way), so I tried to count the matches of /1+/ and /0+/ (that count is the length of the longest alternating subsequence if no inversion is made). I thought that I could do:
$line =~ y/1/,/; $len = split /\b/, $line;
, but I decided to stay with zeroes and ones, and wrote  () = $line =~ /(.)\1*/g (as I showed). Later I added to $len:  /(.)\1\1|(.)\2.*(.)\3/ + /(.)\1/, because each regex, if it succeeds, adds 1 to the possible length of the subsequence after one inversion.
I often try to solve problems from competitive programming sites or sites like projecteuler.net, and I practise doing it in Perl.
Then I computed the whole sum in one expression:
$len = + (() = /(.)\1*\1*\1*\1*/g) + /(.)\1\1|(.)\2.*.*.*.*(.)\3/ + /(.)\1/
- it consumed too much time on the input line '01' x 5e4.
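For comparison, here is a minimal sketch of the regex-free route. It assumes (as far as I can tell from the reasoning above) that the sum is equivalent to min(length, runs + 2): a run of three or more equal characters, or two separate pairs, is worth +2, a single pair is worth +1, and the result can never exceed the string length. The sub name best_after_one_inversion is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: best alternating-subsequence length after
# inverting one substring, computed with a single linear scan.
sub best_after_one_inversion {
    my ($str) = @_;
    my $n = length $str;
    return 0 if $n == 0;

    # Count runs of equal characters: one run, plus one per position
    # where the character differs from its left neighbour.
    my $runs = 1;
    for my $i (1 .. $n - 1) {
        $runs++ if substr($str, $i, 1) ne substr($str, $i - 1, 1);
    }

    # Assumption: one inversion gains at most 2, capped by the length.
    my $best = $runs + 2;
    return $best < $n ? $best : $n;
}

print best_after_one_inversion('100111'), "\n";          # 5
print best_after_one_inversion('01' x 5e4), "\n";        # 100000
print best_after_one_inversion('11' . '01' x 3), "\n";   # 8
```

The scan touches each character once, so an input like '01' x 5e4 involves no backtracking at all.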

upd: the example was wrong (it used reversion); now fixed to use inversion.

Replies are listed 'Best First'.
Re^6: How to match more than 32766 times in regex?
by Anonymous Monk on Dec 02, 2015 at 02:12 UTC
    $len = + (() = /(.)\1*\1*\1*\1*/g) + /(.)\1\1|(.)\2.*.*.*.*(.)\3/ + /(.)\1/
    You know, that doesn't make any sense whatsoever.
      After some pondering... Is this what you tried to do:
      use strict;
      use warnings;

      my @strs = (
          '010111',
          '0' x 1_000_000,
          '01' x 1_000_000,
          '011' x 1_000_000,
          ( '01' x 1_000_000 ) . '111',
      );

      for my $str (@strs) {
          my $len = ( () = $str =~ m{ 0+ | 1+ }xg )
                  + ( $str =~ m{ 000 | 111 | (.)\1 .* (.)\2 }x ? 2 : 0 );
          print $len, "\n";
      }
      (Perl's regex optimizer is pretty smart about the second regex, btw!)
        or, rather
        for my $str (@strs) {
            my $len = ( () = $str =~ m{ 0+ | 1+ }xg )
                    + (   $str =~ m{ 000 | 111 | (.)\1 .* (.)\2 }x ? 2
                        : $str =~ m{ (?: ^ (.)\1 | (.)\2 $) }x     ? 1
                        :                                            0 );
            print $len, "\n";
        }
        ...anyway, I'm now too bored to think whether that's the correct solution and I'll leave the rest of that little programming exercise to you.

        Your problem is that you're trying to write overly compact code even though you don't know Perl very well. And, naturally, writing the retarded equivalent of JAPHs is not a good way to learn Perl - or any programming language, for that matter. Just saying.

        I recommended that you learn Forth a while ago, and I still do. BUK suggested Python, but it's significantly more verbose than Perl and you won't like it.

        1. I can't understand how your first regex can match more than 32766 times against the string '0' x 1_000_000. And it gives the correct answer, 3. It has a '+' quantifier, which is {1,32766}, true?

        2. Your solution doesn't cope with inputs that have only one pair of consecutive equal chars :P . If the input is '11' . '01' x N , the answer is not 2*N+1 but 2*N+2. (upd: Later I read the newer post with the correct code).
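The claim in point 2 can be checked on small inputs with a brute-force sketch that tries every non-empty substring. It assumes exactly one substring must be inverted; the sub names alternating_len and brute_force are made up for illustration, and the cubic running time makes this suitable only for short strings.

```perl
use strict;
use warnings;

# Hypothetical helper: longest alternating subsequence length,
# counted as the number of runs of equal characters.
sub alternating_len {
    my ($s) = @_;
    my $runs = () = $s =~ /0+|1+/g;
    return $runs;
}

# Hypothetical helper: try inverting every non-empty substring and
# keep the best result. Only for small inputs.
sub brute_force {
    my ($s) = @_;
    my $best = 0;
    for my $start (0 .. length($s) - 1) {
        for my $len (1 .. length($s) - $start) {
            my $piece = substr $s, $start, $len;
            $piece =~ tr/01/10/;
            my $t = $s;
            substr $t, $start, $len, $piece;
            my $cur = alternating_len($t);
            $best = $cur if $cur > $best;
        }
    }
    return $best;
}

for my $n (1 .. 4) {
    my $input = '11' . '01' x $n;
    printf "N=%d: %d (2*N+2 = %d)\n", $n, brute_force($input), 2 * $n + 2;
}
```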
