Re: A regex that only matches at offset that are multiples of a given N?

in reply to A regex that only matches at offset that are multiples of a given N?

I'd put `.` in place of `????` and then a `*` after the first set of parentheses...

use v5.12;

my @should_match = (
    q[foo],
    q[WXYZfoo],
    q[WXYZWXYZfoo],
    q[WXYZ    WXYZfoo],
);

my @should_not_match = (
    q[ foo],
    q[ABCfoo],
    q[VWXYZfoo],
    q[WXYZWXYZWXYZAfoo],
);

my $regexp = qr{^(?:.{4})*foo};

say /$regexp/ ? "ok" : "not ok" for @should_match;
say /$regexp/ ? "not ok" : "ok" for @should_not_match;
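
The same idea extends from 4 to any N. A minimal sketch, assuming the sub-pattern is passed in as a compiled regex; `aligned_regex` is a hypothetical helper name, not something from the thread, and the `/s` flag is an addition so that `.` also counts newlines:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): build an anchored regex that
# matches $pattern only at offsets that are a multiple of $n.
sub aligned_regex {
    my ($n, $pattern) = @_;
    # ^ pins the start of the string; each (?:.{$n}) skips exactly $n characters.
    # /s (an assumption of this sketch) lets . match newlines too.
    return qr/^(?:.{$n})*$pattern/s;
}

my $re = aligned_regex(4, qr/foo/);
print "WXYZfoo"  =~ $re ? "match\n" : "no match\n";   # prints "match"
print "VWXYZfoo" =~ $re ? "match\n" : "no match\n";   # prints "no match"
```

Because of the `^` anchor, this only answers "is there an aligned match?"; it is not meant for iterating over every aligned match with `/g`.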

package Cow { use Moo; has name => (is => 'lazy', default => sub { 'Mooington' }) } say Cow->new->name


Replies are listed 'Best First'.
Re^2: A regex that only matches at offset that are multiples of a given N? by BrowserUk (Patriarch) on Feb 13, 2013 at 16:24 UTC
Wish it were so simple:

    $a = join'',map{ ('a'..'z')[rand 26] } 1 .. 1000;;

    print "$-[0]: $1" while $a =~ m[(?:.{4})(?=(aa..))]g;;
    0: aawx
    404: aawx
    405: aadz
    481: aadz

    print "$-[0]: $1" while $a =~ m[(?:.{4})(?=(gg..))]g;;
    0: gghn
    208: gghn
    211: ggyj
    955: ggyj

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Re^3: A regex that only matches at offset that are multiples of a given N? by tobyink (Canon) on Feb 13, 2013 at 17:11 UTC
Ah, you're using `/g`... in that case I'd just...

    use v5.12;

    my $a = join '', map{ ('a'..'j')[rand 10] } 1 .. 1000;

    while ($a =~ m[(aa..)]g) {
        next if pos($a) % 4;
        say "Match at ", pos($a), ": ", $1;
    }

It's possible that `(?(cond)yes-expr|no-expr)` might be able to do what you want, but I've not had much luck with that.

Update: with the hints about `pos % 4` below, I've managed to get `(?(cond)yes-expr|no-expr)` to work. Not sure how well it goes performance-wise in practice, but:

    use v5.12;

    my $a = join '', map{ ('a'..'j')[rand 10] } 1 .. 1000;;

    while ($a =~ m[(?(?{ pos() % 4 })(*F)|(aa..))]g) {
        say "Match at ", pos($a), ": ", $1;
    }
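
One caveat with the consuming form `m[(aa..)]g`: an unaligned hit can swallow the first characters of an aligned one, which is then never seen. A minimal sketch of a variant that avoids this, assuming a lookahead is acceptable and testing the start offset `$-[0]` rather than `pos`; `aligned_hits` is a hypothetical name used only for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return [offset, text] pairs for
# every hit of $pattern that starts at a multiple of $n.
sub aligned_hits {
    my ($str, $n, $pattern) = @_;
    my @hits;
    # The lookahead captures without consuming, so candidates at every
    # offset are visited, including overlapping ones; after each
    # zero-length match Perl moves on to the next offset.
    while ($str =~ /(?=($pattern))/g) {
        next if $-[0] % $n;          # skip hits that start off the grid
        push @hits, [ $-[0], $1 ];
    }
    return @hits;
}

# "aaaa" at offset 2 overlaps the aligned "aaxy" at offset 4.
my @hits = aligned_hits("xxaaaaxy", 4, qr/aa../);
print "$_->[0]: $_->[1]\n" for @hits;   # prints "4: aaxy"
```

With the consuming pattern, the hit at offset 2 would move `pos` to 6 and the aligned hit at offset 4 would be missed. The price of the lookahead is that every offset is still tried, so the filtering happens outside the regex engine.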
Re^4: A regex that only matches at offset that are multiples of a given N? by BrowserUk (Patriarch) on Feb 13, 2013 at 17:18 UTC
"Ah, you're using /g..."

Yes. Sorry about failing to mention that. It was late and I was tired {excuse, excuse, excuse...}

"in that case I'd just.."

That is essentially what I was doing (except with index) in Re: Store larg hashes more efficiently (10e6 md5s in 260MB at 4µs per lookup), because I couldn't work out how to get a regex to do it for me. Hence the question.

I'd rather push the check inside the regex engine if I could, and it seems like it should be a reasonable ask, but it has me totally stumped. Even johngg's solution is flawed. :(
Re^4: A regex that only matches at offset that are multiples of a given N? by BrowserUk (Patriarch) on Feb 16, 2013 at 00:29 UTC
"Update: with the hints about pos % 4 below, I've managed to get (?(cond)yes-expr|no-expr) to work. Not sure how well it goes performance-wise in practice,"

Hm. I cannot make it work for my application; but on the basis of the failure, I think my previous conclusion stands: putting code in regexes is always going to be dog slow:

    C:\test>junk39 -N=1
    #! perl -slw
    use strict;
    use Digest::MD5 qw[ md5 ];
    use Benchmark qw[ cmpthese ];

    our $data = pack '(Va16)*', map{ $_, md5( $_ ) } 1 .. 1000; ## 1000 data items
    our $N //= -1;

    cmpthese $N, {
        a => q[
            my $c = 0;
            for my $i ( 1 .. 2000 ) { ## half should pass; half fail.
                my $iBin = pack( 'V', $i );
                my $md5 = md5( $i );
                my $p = 0;
                while( $p = 1+index $data, $iBin, $p ) {
                    next if ( $p - 1 ) % 20;
                    ++$c, last if substr( $data, $p+3, 16 ) eq $md5;
                }
            }
            print "a: $c" if $N == 1;
        ],
        b => q[
            my $c = 0;
            for my $i ( 1 .. 2000 ) { ## Finds 10??
                my $iBin = pack( 'V', $i );
                my $md5 = md5( $i );
                while( $data =~ m{\G(?:.{20})?(?=\Q$iBin\E(.{16}))}g ) {
                    pos( $data ) += 20;
                    ++$c, last if $1 eq $md5;
                }
            }
            print "b: $c" if $N == 1;
        ],
        c => q[
            use re 'eval';
            my $c = 0;
            for my $i ( 1 .. 2000 ) { ## finds none??
                my $iBin = pack( 'V', $i );
                my $md5 = md5( $i );
                while( $data =~ m[(?(?{ pos( $data ) % 20 })(*F)|\Q%iBin\E(.{16}))]g ) {
                    ++$c, last if $1 eq $md5;
                }
            }
            print "c: $c" if $N == 1;
        ],
    };
    __END__
    C:\test>junk39 -N=1
    a: 1000
    b: 10
    c: 0
           Rate      c      a      b
    c 0.111/s     --   -99%  -100%
    a  21.3/s 19117%     --   -68%
    b  66.7/s 60113%   213%     --

I can't get smls' version to work for this either.
Re^3: A regex that only matches at offset that are multiples of a given N? by choroba (Cardinal) on Feb 13, 2013 at 16:31 UTC
Why have you removed the `^` anchor?

لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re^4: A regex that only matches at offset that are multiples of a given N? by BrowserUk (Patriarch) on Feb 13, 2013 at 16:36 UTC
Because with it, it only matches the first match, not any second or subsequent properly aligned matches:

    print "$-[0]: $1" while $a =~ m[(?:.{4})(?=(aa..))]g;;
    0: aawx
    404: aawx
    405: aadz
    481: aadz

    print "$-[0]: $1" while $a =~ m[^(?:.{4})(?=(aa..))]g;;
    0: aawx

    print "$-[0]: $1" while $a =~ m[(?:.{4})(?=(gg..))]g;;
    0: gghn
    208: gghn
    211: ggyj
    955: ggyj

    print "$-[0]: $1" while $a =~ m[^(?:.{4})(?=(gg..))]g;;
    0: gghn
Re^3: A regex that only matches at offset that are multiples of a given N? by ikegami (Patriarch) on Feb 14, 2013 at 07:50 UTC
You're just missing a leading `\G`.
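
A minimal sketch of what a leading `\G` can look like with `/g`. This is one reading of the hint, not necessarily the exact pattern ikegami had in mind: it uses a lazy `(?:.{$n})*?` and consumes the hit instead of using a lookahead, and it assumes the hit is exactly `$n` characters long so that `pos` stays on the grid. `aligned_hits_g` is a hypothetical name:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return [offset, text] pairs for
# every hit of $pattern that starts at a multiple of $n, scanning from 0.
sub aligned_hits_g {
    my ($str, $n, $pattern) = @_;
    my @hits;
    # \G pins each attempt to where the previous match ended, so the engine
    # can only move forward in whole $n-character steps via (?:.{$n})*?.
    # Assumption: $pattern matches exactly $n characters, so pos() remains
    # a multiple of $n after each hit is consumed.
    while ($str =~ /\G(?:.{$n})*?($pattern)/gs) {
        push @hits, [ $-[1], $1 ];   # $-[1] is the start offset of the hit
    }
    return @hits;
}

# Records: "xaax" "aabc" "zzaa" "bcde" "aaxy"; unaligned "aa" at 1 and 10.
my @hits = aligned_hits_g("xaaxaabczzaabcdeaaxy", 4, qr/aa../);
print "$_->[0]: $_->[1]\n" for @hits;   # prints "4: aabc" then "16: aaxy"
```

Without `\G`, a failed attempt at offset 0 lets the engine retry from offset 1, 2, 3 and so on, which is how unaligned offsets such as 405 and 211 show up in the output earlier in the thread.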

In Section Seekers of Perl Wisdom