I think smls might be on the right track with moving pos. This benchmark seems to show that there is, if anything, a performance gain: moving pos came out roughly 44% to 48% faster than my other two versions. I also tried modifying my first attempt by removing the duplicate matches via a hash, but that only showed a marginal improvement (about 3%).
Then again, I am not very good at benchmarks so I could have cocked it up :-/
use strict;
use warnings;
use 5.014;

use Benchmark qw{ cmpthese };

# Only matches at offsets that are a multiple of $n are wanted.
my $n = 4;
my $str = q{x} x 50;
substr $str, $_, 4, q{fred} for 4, 9, 20, 24, 31, 40;
say qq{String: $str\n};

# Step pos forward by $n after each match so the same offset is
# never matched a second time.
my $rcMovePos = sub {
    my $raMatches;
    while ( $str =~ m{\G(?:.{$n})*?(?=(fred.*))}g )
    {
        push @{ $raMatches }, [ pos( $str ), $1 ];
        pos $str += $n;
    }
    return $raMatches;
};

# Let the duplicates happen but collapse them by keying a hash on pos.
my $rcNoDups = sub {
    my $rhMatches;
    $rhMatches->{ pos( $str ) } = $1 while
        $str =~ m{\G(?:.{$n})*?(?=(fred.*))}g;
    return $rhMatches;
};

# First attempt: each offset is reported twice.
my $rcWithDups = sub {
    my $raMatches;
    push @{ $raMatches }, [ pos( $str ), $1 ] while
        $str =~ m{\G(?:.{$n})*?(?=(fred.*))}g;
    return $raMatches;
};

# Show what each version finds in the short test string.
my $raRes = $rcMovePos->();
say q{Using $rcMovePos};
say qq{ Matched $_->[ 1 ] at position $_->[ 0 ]} for @{ $raRes };

my $rhRes = $rcNoDups->();
say q{Using $rcNoDups};
say qq{ Matched $rhRes->{ $_ } at position $_} for
    sort { $a <=> $b } keys %{ $rhRes };

$raRes = $rcWithDups->();
say q{Using $rcWithDups};
say qq{ Matched $_->[ 1 ] at position $_->[ 0 ]} for @{ $raRes };

# Benchmark on a longer string with "fred" at 50 pseudo-random offsets
# (fixed seed so runs are repeatable).
srand 1234567890;
$str = q{x} x 10000;
substr $str, int rand 9997, 4, q{fred} for 1 .. 50;
say q{};

cmpthese(
    -5,
    {
        movePos  => $rcMovePos,
        noDups   => $rcNoDups,
        withDups => $rcWithDups,
    } );
String: xxxxfredxfredxxxxxxxfredfredxxxfredxxxxxfredxxxxxx
Using $rcMovePos
Matched fredxfredxxxxxxxfredfredxxxfredxxxxxfredxxxxxx at position 4
Matched fredfredxxxfredxxxxxfredxxxxxx at position 20
Matched fredxxxfredxxxxxfredxxxxxx at position 24
Matched fredxxxxxx at position 40
Using $rcNoDups
Matched fredxfredxxxxxxxfredfredxxxfredxxxxxfredxxxxxx at position 4
Matched fredfredxxxfredxxxxxfredxxxxxx at position 20
Matched fredxxxfredxxxxxfredxxxxxx at position 24
Matched fredxxxxxx at position 40
Using $rcWithDups
Matched fredxfredxxxxxxxfredfredxxxfredxxxxxfredxxxxxx at position 4
Matched fredxfredxxxxxxxfredfredxxxfredxxxxxfredxxxxxx at position 4
Matched fredfredxxxfredxxxxxfredxxxxxx at position 20
Matched fredfredxxxfredxxxxxfredxxxxxx at position 20
Matched fredxxxfredxxxxxfredxxxxxx at position 24
Matched fredxxxfredxxxxxfredxxxxxx at position 24
Matched fredxxxxxx at position 40
Matched fredxxxxxx at position 40
           Rate withDups   noDups  movePos
withDups 2321/s       --      -3%     -33%
noDups   2394/s       3%       --     -31%
movePos  3445/s      48%      44%       --
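For anyone who wants to try the pos-moving idea outside the benchmark, here is a minimal sketch of the same loop wrapped in a sub. The name aligned_offsets and its parameters are my own invention for illustration and are not part of the code above. It returns only the offsets, and quotes the search string with \Q...\E on the assumption that it should be treated literally.

```perl
use strict;
use warnings;

# Hypothetical helper: return every offset that is a multiple of $n
# at which $needle occurs in $str.
sub aligned_offsets {
    my ( $str, $needle, $n ) = @_;
    my @offsets;

    # \G anchors each attempt at the current pos; the lazy group steps
    # forward $n characters at a time until the lookahead succeeds.
    while ( $str =~ m{\G(?:.{$n})*?(?=\Q$needle\E)}gs ) {
        push @offsets, pos $str;

        # Step past this offset so it is not matched a second time.
        pos( $str ) += $n;
    }
    return @offsets;
}

my $str = q{x} x 50;
substr $str, $_, 4, q{fred} for 4, 9, 20, 24, 31, 40;
print join( q{ }, aligned_offsets( $str, q{fred}, 4 ) ), "\n";
```

With the same 50-character test string as above this should print 4 20 24 40, i.e. the same positions $rcMovePos reports; the "fred" at 9 and 31 are skipped because they do not sit on a multiple of 4.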
I hope this is of interest.