PerlMonks  

How to match more than 32766 times in regex?

by rsFalse (Pilgrim)
on Dec 01, 2015 at 18:21 UTC ( #1149052=perlquestion )
rsFalse has asked for the wisdom of the Perl Monks concerning the following question:

upd: Thanks for answers below.
upd: I posted the full problem later :/ in this reply - Re^5: How to match more than 32766 times in regex?

Replies are listed 'Best First'.
Match twice
by choroba (Bishop) on Dec 01, 2015 at 18:23 UTC
    Update: The whole question is in the title. So is the answer.
    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
Re: How to match more than 32766 times in regex?
by BrowserUk (Pope) on Dec 01, 2015 at 18:24 UTC

    Before I've even read your question; my suggestion is that you take up Python.

    Going through a bunch of known limitations, and raising questions about them as if you've newly discovered them, is a sad strategy.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
    In the absence of evidence, opinion is indistinguishable from prejudice.
      I've read about the limitation, but is there a way to compose a regex without a big time penalty? I tried something like /$regex*$regex*$regex*/ when I wanted to match up to 96000 times, but the regex takes a long time to finish.

        To make a regexp faster, search from the start or the end of the string, using ^ or $ to anchor it to that point. But can you explain what you want to do? I am sure there is a better way.


        As for code, look at the repetition operator x 3, which concatenates the pattern string 3 times. We then use qr to compile it into a regular expression, match with it, capture the results in @R, and print them. Hope this gives you ideas. (The expression is duplicated so that it captures 3 times.)

        $ perl -e '$s="(\\d\\w)" x 3; $X="a1b2c3d4e5"; $m=qr/$s/; @R=$X=~$m; print join(";",@R)."\n"'
        1b;2c;3d

        Another way could be divide and conquer: match one piece at a time and, paying a penalty, use $' (the rest of the string that has not matched yet) for the next iteration. Another idea is to use index.
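For what it's worth, here is a minimal sketch of that "one piece at a time" idea. It swaps $' for \G with /gc, which resumes at the previous match position without copying the rest of the string, so this is a related technique rather than exactly what was described above. The names count_consecutive and count_fixed are made up for this example, and it assumes you only need the number of repetitions.

```perl
use strict;
use warnings;

# Count how many times $re matches back to back from the start of $text.
# \G anchors each attempt at the end of the previous match; /c keeps
# pos() in place when the final attempt fails, and the loop then ends.
sub count_consecutive {
    my ($text, $re) = @_;
    my $count = 0;
    pos($text) = 0;
    $count++ while $text =~ /\G$re/gc;
    return $count;
}

# Count non-overlapping occurrences of a fixed string with index(),
# no regex engine involved at all.
sub count_fixed {
    my ($text, $needle) = @_;
    return 0 unless length $needle;    # avoid looping on an empty needle
    my ($count, $from) = (0, 0);
    while ((my $at = index($text, $needle, $from)) >= 0) {
        $count++;
        $from = $at + length $needle;
    }
    return $count;
}

my $text = 'a1b2c3d4e5' x 10_000;
print count_consecutive($text, qr/\w\d/), "\n";
print count_fixed($text, 'c3'), "\n";
```

Neither loop uses a {m,n} quantifier, so the quantifier limit never comes into play; the count is bounded only by the length of the string.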

        if I wanted match up to 96000 times,

        What's your application?


        "... if I wanted match up to 96000 times ..."

        You're fired.

Re: How to match more than 32766 times in regex?
by Anonymous Monk on Dec 01, 2015 at 18:54 UTC
    (shrug) Admittedly at this point BrowserUk's suggestion makes sense to me. But, anyway... use a non-backtracking engine. Or change the REG_INFTY value in regcomp.h and recompile perl (I have no idea whether it will work or not).
      use strict;
      use warnings;
      my $X = "a1b2c3d4e5"; # or use File::Slurp
      my $s = "(\\w\\d)"; # my pattern match $s
      my $m = qr/$s/; # compiled to a regular expression $m
      my $counter = 0;
      while($X=~s/$m//){
          ++$counter;
          next unless $counter > 32766; # wait for it...
          print "this is the $counter iteration, got $1 \n";
      }

        No need to go to those lengths:

        $s = '0123456789' x 100000;;
        ( $m ) = $s =~ m[((?:(?:0123456789){32000}){3})];;
        print length $m;;
        960000

        But for any given application there's almost certainly a better way of tackling the problem.
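The nested-quantifier trick above can be wrapped in a small helper. This is only a sketch: repeat_exactly and its default chunk size of 32000 are made up for this example, and it assumes a simple sub-pattern (no backreferences or numbered groups) and a count that fits in two levels of nesting.

```perl
use strict;
use warnings;

# Build a regex matching $pat exactly $n times without writing any single
# quantifier larger than $chunk, i.e. (?:(?:pat){chunk}){q}(?:pat){r}
# where $n == q * chunk + r.
sub repeat_exactly {
    my ($pat, $n, $chunk) = @_;
    $chunk ||= 32_000;    # hypothetical default, kept below the 32766 limit
    die "count must be positive\n" if $n < 1;
    my ($outer, $rest) = (int($n / $chunk), $n % $chunk);
    die "count too large for two levels of nesting\n" if $outer > $chunk;
    my $src = '';
    $src .= sprintf('(?:(?:%s){%d}){%d}', $pat, $chunk, $outer) if $outer;
    $src .= sprintf('(?:%s){%d}', $pat, $rest) if $rest;
    return qr/$src/;
}

my $s  = '0123456789' x 100_000;
my $re = repeat_exactly('0123456789', 96_000);
my ($m) = $s =~ /^($re)/;
print length($m), "\n";
```

For 96000 repetitions this builds the same pattern as the one-liner above, (?:(?:0123456789){32000}){3}. A count that is not a multiple of the chunk size gets one extra group for the remainder.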


        Hmmm, I thought the OP had problems with 'complex regex recursion limit exceeded'. If he just wanted to match something like (\w\d){32767}, sure.

Node Type: perlquestion [id://1149052]
Approved by Corion