This offloads the work to the regex engine and is *much* faster:
sh-3.1$ perl benchmark.pl
          Rate  split1 substr1   subst
split1  5.05/s      --    -82%   -100%
substr1 27.7/s    449%      --    -99%
subst   3551/s  70273%  12719%      --
ok 1
ok 2
1..2
Here is the function. It could also work in place, which would be even faster:
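sub subst
{
    my ($s1, $s2) = @_;
    my $s3 = $s1;
    {
        use bytes;
        # $+[0] is the offset just past the match, so $+[0]-1 is the
        # position of the NUL that is being replaced.
        $s3 =~ s/(\0)/substr $s2, $+[0]-1, 1/eg;
    }
    $s3;
}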
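A minimal sketch of the in-place variant mentioned above, assuming the caller is happy to have its first argument overwritten. The name subst_inplace is hypothetical and is not part of the benchmark file; it writes through the @_ alias instead of copying into $s3, and it uses $-[0] (the start of the match) where the original uses $+[0]-1. It has not been benchmarked here.

```perl
use strict;
use warnings;

# Hypothetical in-place variant: modifies the caller's first argument
# through the @_ alias, so no copy of the string is made.
sub subst_inplace {
    use bytes;    # as in the original: treat the strings as bytes
    # $-[0] is the offset at which the current match starts.
    $_[0] =~ s/\0/substr( $_[1], $-[0], 1 )/eg;
    return;
}

my $s1 = "ab\0d\0f";
my $s2 = "ABCDEF";
subst_inplace( $s1, $s2 );
print $s1 eq "abCdEf" ? "ok\n" : "not ok\n";
```

Because the sub changes $s1 itself, a benchmark of this version would need a fresh copy of the input on every iteration, otherwise only the first call does any replacing.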
The complete benchmark file (now with tests):
#!/usr/bin/perl
use 5.6.0;
use strict;
use warnings FATAL => 'all';
use Benchmark qw( cmpthese );
use Test::More 'no_plan';

# Build each string one random character at a time. Writing
# do_rand(1) x 100_000 would call do_rand once and repeat that
# single character 100_000 times.
my $s1 = join '', map { do_rand(1) } 1 .. 100_000;
my $s2 = join '', map { do_rand(0) } 1 .. 100_000;

cmpthese( -2, {
    'split1'  => sub { my $s3 = split1( $s1, $s2 ) },
    'substr1' => sub { my $s3 = substr1( $s1, $s2 ) },
    'subst'   => sub { my $s3 = subst( $s1, $s2 ) },
});

# All three implementations must agree.
my $s30 = split1( $s1, $s2 );
my $s31 = substr1( $s1, $s2 );
my $s32 = subst( $s1, $s2 );
is($s30, $s31);
is($s31, $s32);

sub split1 {
    my ($s1, $s2) = @_;
    my @s1 = split //, $s1;
    my @s2 = split //, $s2;
    foreach my $idx ( 0 .. $#s1 ) {
        if ( $s1[$idx] eq chr(0) ) {
            $s1[$idx] = $s2[$idx];
        }
    }
    return join '', @s1;
}

sub substr1 {
    my ($s1, $s2) = @_;
    # The last valid index is length - 1.
    for my $idx ( 0 .. length($s1) - 1 ) {
        if ( substr($s1,$idx,1) eq chr(0) ) {
            substr($s1, $idx, 1) = substr($s2, $idx, 1);
        }
    }
    return $s1;
}

sub subst
{
    my ($s1, $s2) = @_;
    my $s3 = $s1;
    {
        use bytes;
        $s3 =~ s/(\0)/substr $s2, $+[0]-1, 1/eg;
    }
    $s3;
}

# do_rand(1) can return chr(0); do_rand(0) never does. So $s1 can
# contain chr(0)'s and $s2 cannot.
sub do_rand {
    my $n = (shift) ? int(rand(255)) : int(rand(254)) + 1;
    return chr( $n );
}