
If you are going to use subroutines in lieu of one-liners, here's a non-bit-twiddling version. You've already got a solution, so please forgive the redundancy; this is a nice excuse for me to practice implementing streams à la HOP (Higher-Order Perl, now available for free download).

If you, like me, feel leery of using bit-wise operations on strings that might contain Unicode, this may be a more comforting approach. Your other solutions may well work 100% of the time with Unicode strings too, but that is too much thinking and worry for me, so I just punt to Perl's standard string manipulation and comparison functions.
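To make the "punt to the standard functions" point concrete, here is a minimal sketch of what splitting and comparing look like on strings with wide characters. The sample strings are made up for illustration, and this only shows per-code-point behaviour; it says nothing about combining sequences, which would still come out as separate elements.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

binmode STDOUT, ':encoding(UTF-8)';    # so the wide characters print cleanly

# Hypothetical sample data: two strings differing in one wide character.
my ($s1, $s2) = ("ab\x{263A}d", "ab\x{263B}d");

# split // yields one element per character (code point), not per byte.
my @c1 = split //, $s1;
my @c2 = split //, $s2;

printf "%d chars vs %d chars\n", scalar @c1, scalar @c2;

# cmp compares the characters themselves; no bit operations needed.
printf "position 2: %s cmp %s = %d\n", $c1[2], $c2[2], $c1[2] cmp $c2[2];
```

So as long as the strings are already decoded character strings, split and cmp work a character at a time, which is all the stream code relies on.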

The code doesn't have any error checking, but the compare stream tries to do the right thing (per my tastes) for the boundary cases. Obviously you can change stream output behavior to taste. I left the characters in the output to better demonstrate the results. (And I generalized it a little to allow use of streams of characters in addition to just strings, so the YAGNI line tax is 1, leaving aside the YAGNI maintenance tax ;-)

I'm usually late to the party, so I don't expect many to see this, but if anyone has suggestions for improvement I would be interested in hearing them.

This code

#!/usr/bin/env perl

use Modern::Perl;
use Data::Dump qw(pp);

my ($foo, $bar) = ("abcdef", "abdfec");

sub pop_char {
    my @chars;
    { no warnings; @chars = map { split '' } @_; }
    return sub { return shift @chars; }
}

sub cmp_str {
    my ($i, $s1, $s2) = (0, @_);
    $s1 = pop_char $s1 if not ref $s1;
    $s2 = pop_char $s2 if not ref $s2;
    return sub {
        no warnings;
        my ($c1, $c2) = ($s1->(), $s2->());
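        # Test definedness, not truth, so a pair of "0" characters doesn't end the stream early.
        return (defined $c1 or defined $c2) ? [ $i++, $c1 cmp $c2, $c1, $c2 ] : undef;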
    }
}

my ($cmp_foo_bar, $cmp_char);

say "\nNull case: both strings empty";
$cmp_foo_bar = cmp_str;
say pp $cmp_char while defined ($cmp_char = $cmp_foo_bar->());

say "\nSecond string null";
$cmp_foo_bar = cmp_str $foo;
say pp $cmp_char while defined ($cmp_char = $cmp_foo_bar->());

say "\nFirst string null";
$cmp_foo_bar = cmp_str '', $bar;
say pp $cmp_char while defined ($cmp_char = $cmp_foo_bar->());

$cmp_foo_bar = cmp_str $foo, $bar;
say "\nBoth strings same length";
say pp $cmp_char while defined ($cmp_char = $cmp_foo_bar->());

$cmp_foo_bar = cmp_str $foo.$bar, $bar;
say "\nFirst string longer";
say pp $cmp_char while defined ($cmp_char = $cmp_foo_bar->());

exit;    # stop here: the case below never terminates, since the second stream is infinite

$cmp_foo_bar = cmp_str $foo.$bar, sub { return 'A'; };
say "\nBoth strings against infinite A's";
say pp $cmp_char while defined ($cmp_char = $cmp_foo_bar->());

Produces


Null case: both strings empty

Second string null
[0, 1, "a", undef]
[1, 1, "b", undef]
[2, 1, "c", undef]
[3, 1, "d", undef]
[4, 1, "e", undef]
[5, 1, "f", undef]

First string null
[0, -1, undef, "a"]
[1, -1, undef, "b"]
[2, -1, undef, "d"]
[3, -1, undef, "f"]
[4, -1, undef, "e"]
[5, -1, undef, "c"]

Both strings same length
[0, 0, "a", "a"]
[1, 0, "b", "b"]
[2, -1, "c", "d"]
[3, -1, "d", "f"]
[4, 0, "e", "e"]
[5, 1, "f", "c"]

First string longer
[0, 0, "a", "a"]
[1, 0, "b", "b"]
[2, -1, "c", "d"]
[3, -1, "d", "f"]
[4, 0, "e", "e"]
[5, 1, "f", "c"]
[6, 1, "a", undef]
[7, 1, "b", undef]
[8, 1, "d", undef]
[9, 1, "f", undef]
[10, 1, "e", undef]
[11, 1, "c", undef]
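Since the thread's question was the location of the first difference, here is a rough sketch of how one might pull just that out of the comparison stream. The `first_diff` helper is hypothetical (it isn't in the code above), and `pop_char`/`cmp_str` are restated in plain core Perl, testing definedness rather than truth, so the block runs on its own.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Stream of characters, along the lines of pop_char above.
sub pop_char {
    my @chars = map { defined $_ ? split(//, $_) : () } @_;
    return sub { return shift @chars };
}

# Comparison stream, along the lines of cmp_str above. It tests
# definedness so that a literal "0" character does not end the stream.
sub cmp_str {
    my ($s1, $s2) = @_;
    my $i = 0;
    $s1 = pop_char($s1) unless ref $s1;
    $s2 = pop_char($s2) unless ref $s2;
    return sub {
        my ($c1, $c2) = ($s1->(), $s2->());
        return undef unless defined $c1 or defined $c2;
        my ($l, $r) = (defined $c1 ? $c1 : '', defined $c2 ? $c2 : '');
        return [ $i++, $l cmp $r, $c1, $c2 ];
    };
}

# Hypothetical helper: index of the first differing position,
# or undef when the two inputs are identical.
sub first_diff {
    my $stream = cmp_str(@_);
    while (defined(my $rec = $stream->())) {
        return $rec->[0] if $rec->[1] != 0;
    }
    return undef;
}

printf "%s\n", first_diff("abcdef", "abdfec");       # 2
printf "%s\n", first_diff("AAAB", sub { 'A' });      # 3
printf "%s\n", defined(first_diff("same", "same")) ? "differ" : "identical";
```

Because it returns at the first nonzero comparison, this also terminates against the infinite stream of A's, provided the other input is finite: the finite side eventually runs out and the positions then differ.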

In reply to Re: If there a way to find the location of the first difference between two strings? by jaredor
in thread If there a way to find the location of the first difference between two strings? by flexvault
