PerlMonks  

Re^4: A regex that only matches at offset that are multiples of a given N?

by BrowserUk (Pope)
on Feb 16, 2013 at 00:29 UTC (#1018992)


in reply to Re^3: A regex that only matches at offset that are multiples of a given N?
in thread A regex that only matches at offset that are multiples of a given N?

Update: with the hints about pos % 4 below, I've managed to get (?(cond)yes-expr|no-expr) to work. Not sure how well it goes performance-wise in practice,

Hm. I cannot make it work for my application; but on the basis of the failure, I think my previous conclusion stands: putting code in regexes is always going to be dog slow.

C:\test>junk39 -N=1

#! perl -slw
use strict;
use Digest::MD5 qw[ md5 ];
use Benchmark qw[ cmpthese ];

our $data = pack '(Va16)*', map{ $_, md5( $_ ) } 1 .. 1000; ## 1000 data items

our $N //= -1;

cmpthese $N, {
    a => q[
        my $c = 0;
        for my $i ( 1 .. 2000 ) { ## half should pass; half fail.
            my $iBin = pack( 'V', $i );
            my $md5 = md5( $i );
            my $p = 0;
            while( $p = 1+index $data, $iBin, $p ) {
                next if ( $p - 1 ) % 20;
                ++$c, last if substr( $data, $p+3, 16 ) eq $md5;
            }
        }
        print "a: $c" if $N == 1;
    ],
    b => q[
        my $c = 0;
        for my $i ( 1 .. 2000 ) { ## Finds 10??
            my $iBin = pack( 'V', $i );
            my $md5 = md5( $i );
            while( $data =~ m{\G(?:.{20})*?(?=\Q$iBin\E(.{16}))}g ) {
                pos( $data ) += 20;
                ++$c, last if $1 eq $md5;
            }
        }
        print "b: $c" if $N == 1;
    ],
    c => q[
        use re 'eval';
        my $c = 0;
        for my $i ( 1 .. 2000 ) { ## finds none??
            my $iBin = pack( 'V', $i );
            my $md5 = md5( $i );
            while( $data =~ m[(?(?{ pos( $data ) % 20 })(*F)|\Q%iBin\E(.{16}))]g ) {
                ++$c, last if $1 eq $md5;
            }
        }
        print "c: $c" if $N == 1;
    ],
};
__END__
C:\test>junk39 -N=1
a: 1000
b: 10
c: 0
     Rate      c      a      b
c 0.111/s     --   -99%  -100%
a  21.3/s 19117%     --   -68%
b  66.7/s 60113%   213%     --

I can't get smls' version to work for this either.
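One possible explanation for variant b's low count, offered as a guess rather than anything confirmed in the thread: without the /s modifier, . does not match "\n", and the packed data contains 0x0A bytes (pack 'V', 10 begins with one, for instance). If that is the cause, .{20} cannot step over such a record and every later record becomes unreachable. The sketch below compares the same \G pattern with and without /s. The helper count_hits, its $re_for argument, and the variables $without_s and $with_s are hypothetical names introduced here. Unlike variant b, it resets pos before every search.

```perl
use strict;
use warnings;
use Digest::MD5 qw[ md5 ];

# Same layout as the benchmark: 1000 records of 4-byte id + 16-byte MD5.
my $data = pack '(Va16)*', map { $_, md5( $_ ) } 1 .. 1000;

# count_hits() is a hypothetical helper, not code from the thread.
# $re_for is a code ref that builds the search regex for one packed id.
sub count_hits {
    my( $re_for ) = @_;
    my $c = 0;
    for my $i ( 1 .. 2000 ) {       ## half should pass; half fail
        my $re  = $re_for->( pack 'V', $i );
        my $md5 = md5( $i );
        pos( $data ) = 0;           ## assumption: start each search at record 0
        while( $data =~ /$re/g ) {
            pos( $data ) += 20;     ## step past the record just examined
            ++$c, last if $1 eq $md5;
        }
    }
    return $c;
}

# Identical patterns; only the /s modifier differs.
my $without_s = count_hits( sub { my $id = shift; qr{\G(?:.{20})*?(?=\Q$id\E(.{16}))}  } );
my $with_s    = count_hits( sub { my $id = shift; qr{\G(?:.{20})*?(?=\Q$id\E(.{16}))}s } );

print "without /s: $without_s\n";
print "with /s:    $with_s\n";
```

Separately, variant c's pattern contains \Q%iBin\E. Perl does not interpolate hashes into patterns, so this searches for the literal text %iBin, which would be consistent with a count of 0. Per perlre, inside a (?{ ... }) block $_ is aliased to the target string and pos() with no argument gives the current match position, so pos() may be a safer choice there than pos( $data ); I have not tested that against this benchmark.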


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

