
I have two very long (>64k) strings of equal length, $s1 and $s2. They are strings of bytes, meaning that any value from chr(0) to chr(255) is legal. $s2, however, will never contain chr(0); $s1 may or may not. What I need to do is look at each byte in $s1 and, if it is chr(0), replace it with the corresponding byte in $s2. So, something like the following code:

sub foo {
    my ($s1, $s2) = @_;

    my @s1 = split //, $s1;
    my @s2 = split //, $s2;

    foreach my $idx ( 0 .. $#s1 ) {
        if ( $s1[$idx] eq chr(0) ) {
            $s1[$idx] = $s2[$idx];
        }
    }

    return join '', @s1;
}

foo() could return the resulting string or it could modify $s1 in place. If foo() returns $s1, I'm going to be doing $s1 = foo( $s1, $s2 ); in all cases.
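For comparison, here is a minimal sketch of one other way to express the same operation, using Perl's string bitwise operators instead of a per-byte loop. The name mask_fill is hypothetical and not part of the original post, and the sketch assumes both arguments are plain byte strings of equal length; it makes no claim about how it would place in the benchmark.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): fill the chr(0)
# bytes of $s1 from $s2 using whole-string bitwise operators.
sub mask_fill {
    my ($s1, $s2) = @_;

    # Build a mask that is "\xFF" wherever $s1 has chr(0) and "\0" elsewhere.
    # tr/// pads a short replacement list with its last character, so every
    # byte in \x01-\xFF maps to "\0".
    ( my $mask = $s1 ) =~ tr/\0\x01-\xFF/\xFF\0/;

    # $s2 & $mask keeps only the bytes of $s2 that line up with a chr(0)
    # in $s1; OR-ing them into $s1 leaves every other byte untouched.
    return $s1 | ( $s2 & $mask );
}

my $filled = mask_fill( "a\0b\0", "wxyz" );
print $filled eq "axbz" ? "ok\n" : "not ok\n";
```

This relies on | and & acting byte-wise when both operands are strings, so it should not be combined with the "bitwise" feature, which makes those operators numeric.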

Here's what I've got so far, including a Benchmark harness. Whoever comes up with the fastest version earns a meter of beer from me whenever we see each other.

#!/usr/bin/perl

use 5.6.0;

use strict;
use warnings FATAL => 'all';

use Benchmark qw( cmpthese );

my $s1 = join '', (do_rand(1) x 100_000);
my $s2 = join '', (do_rand(0) x 100_000);

cmpthese( -2, {
    'split1' => sub { my $s3 = split1( $s1, $s2 ) },
    'substr1' => sub { my $s3 = substr1( $s1, $s2 ) },
});

sub split1 {
  my ($s1, $s2) = @_;

  my @s1 = split //, $s1;
  my @s2 = split //, $s2;

  foreach my $idx ( 0 .. $#s1 ) { 
    if ( $s1[$idx] eq chr(0) ) { 
      $s1[$idx] = $s2[$idx];
    } 
  } 

  return join '', @s1;
}

sub substr1 {
  my ($s1, $s2) = @_;

  for my $idx ( 0 .. length($s1) - 1 ) {  # last valid offset is length - 1
    if ( substr($s1,$idx,1) eq chr(0) ) {
      substr($s1, $idx, 1) = substr($s2, $idx, 1);
    }
  } 

  return $s1;
} 

# This makes sure that $s1 has chr(0)'s in it and $s2 does not.
sub do_rand {
  my $n = (shift) ? int(rand(255)) : int(rand(254)) + 1;
  return chr( $n );
}

__END__

Update: It looks like there is a 2-way tie between avar and moritz. I went ahead and wrote an in-place version of moritz's code. Thanks to SuicideJunkie for fixing my stupidity in the test data. The script now looks like:

#!/usr/bin/perl

use 5.6.0;

use strict;
use warnings FATAL => 'all';

#use Test::More no_plan => 1;
use Benchmark qw( cmpthese );

my $s1 = do_rand(0, 100_000);
my $s2 = do_rand(1, 100_000);
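# split1() modifies its first argument in place, so build the expected
# result from a copy and leave $s1 (with its chr(0) bytes) intact.
my $expected = $s1;
split1( \$expected, \$s2 );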

cmpthese( -3, {
  'avar2' => sub {
    my $s3 = $s1; avar2( \$s3, \$s2 );
#    is( $s3, $expected, "avar2" );
  },
  'moritz' => sub {
    my $s3 = $s1; moritz( \$s3, \$s2 );
#    is( $s3, $expected, "moritz" );
  },
});

sub split1 {
  my ($s1, $s2) = @_;

  my @s1 = split //, $$s1;
  my @s2 = split //, $$s2;

  foreach my $idx ( 0 .. $#s1 ) {
    if ( $s1[$idx] eq chr(0) ) {
      $s1[$idx] = $s2[$idx];
    }
  }

  $$s1 = join '', @s1;
}

sub avar2 {
  my ($s1, $s2) = @_;
  use bytes;
  $$s1 =~ s/\0/substr $$s2, pos($$s1), 1/eg;
}

sub moritz {
  my ($s1, $s2) = @_;

  my $pos = 0;
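  # index() returns -1 when nothing is found; comparing against -1 (not 0)
  # keeps a chr(0) at offset 0 from ending the loop early.
  while ( -1 < ( $pos = index $$s1, "\000", $pos ) ) {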
    substr( $$s1, $pos, 1 ) = substr( $$s2, $pos, 1 );
  }
}

sub do_rand {
  my ($min, $len) = @_;
  my $n = "";
  for (1 .. $len) {
    # rand(256-$min) so that chr(255) can also be generated
    $n .= chr( int( rand(256-$min) ) + $min );
  }
  return $n;
}

__END__

I'm going to keep it open until 24 hours have passed from the initial posting of this node. If no-one gets any faster, both moritz and avar have a meter of beer from me.
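Since the benchmark script leaves its Test::More checks commented out, here is a small sketch of how a candidate could be checked against the split-based version before its timing is trusted. The names reference_fill and index_fill and the test inputs are illustrative assumptions, not part of the original script; index_fill follows the style of moritz's index()-based entry.

```perl
use strict;
use warnings;

# Reference implementation: the straightforward split-based version.
sub reference_fill {
    my ($s1, $s2) = @_;
    my @s1 = split //, $s1;
    my @s2 = split //, $s2;
    for my $idx ( 0 .. $#s1 ) {
        $s1[$idx] = $s2[$idx] if $s1[$idx] eq "\0";
    }
    return join '', @s1;
}

# Candidate: an index()-based in-place fill in the style of moritz's entry.
# The loop tests for >= 0 so that a chr(0) at offset 0 is not skipped.
sub index_fill {
    my ($s1_ref, $s2_ref) = @_;
    my $pos = 0;
    while ( ( $pos = index( $$s1_ref, "\0", $pos ) ) >= 0 ) {
        substr( $$s1_ref, $pos, 1 ) = substr( $$s2_ref, $pos, 1 );
    }
    return;
}

# Hand-picked inputs (illustrative only): leading, trailing, adjacent,
# and absent chr(0) bytes, plus the empty string.
my @cases = (
    [ "\0abc",    "wxyz" ],
    [ "abc\0",    "wxyz" ],
    [ "\0\0\0\0", "wxyz" ],
    [ "abcd",     "wxyz" ],
    [ "",         ""     ],
);

my $failures = 0;
for my $case (@cases) {
    my ($s1, $s2) = @$case;
    my $expected = reference_fill( $s1, $s2 );
    my $got = $s1;                 # work on a copy; index_fill edits in place
    index_fill( \$got, \$s2 );
    $failures++ if $got ne $expected;
}
print $failures ? "FAILED: $failures case(s)\n" : "all cases agree\n";
```

Running the check on a copy matters because the in-place versions change the string they are handed; reusing the same buffer would leave later runs with no chr(0) bytes to replace.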

My criteria for good software:

Does it work?
Can someone else come in, make a change, and be reasonably certain no bugs were introduced?

In reply to Challenge: CPU-optimized byte-wise or-equals (for a meter of beer) by dragonchild
