Note that hippo's addition of the g modifier brings up the issue that I raised earlier about having more than one character difference between two words. Here's what happens when I add a couple more examples to hippo's version:
#!/usr/bin/perl
use strict;
use warnings;
my @words = <DATA>;
chomp @words;
while ( @words >= 2 ) {
    my $model = my $regex = shift @words;
    # replace every "a" or "b" in the word with the class [ab]
    if ( $regex =~ s/(.*?)[ab](.*?)/$1\[ab\]$2/g ) {
        my @hits = grep /^$regex$/, @words;
        if ( @hits ) {
            print join( " ", $model, "matches", @hits, "using", $regex, "\n" );
        }
    }
}
__DATA__
lama
lamb
able
bale
Output:
lama matches lamb using l[ab]m[ab]
able matches bale using [ab][ab]le
The output shows how the g modifier affects the creation of the regex to be used for searching the array; without it, the first regex would be l[ab]ma (which would not match "lamb"), and the next would be l[ab]mb (which would not match "lama" if it were to show up later in the list).
But when using the g modifier, the search patterns for "able" and "bale" come out the same, and they match each other, because the regex [ab][ab]le allows up to two characters to differ.
To solve that, you could compare the current "model" word against each of the matches from the array, using the tr/// operator as described in previous replies, to see how many characters are different in each paired set of words, and keep only those matches that differ by a single character.
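Here's a minimal sketch of that filtering step. The previous replies aren't reproduced here, so the counting idiom (XOR the two strings, then count the non-NUL bytes with tr///) is my assumption about what was meant, and the names diff_count and single_diff_hits are made up for illustration. It also assumes plain single-byte (ASCII) words.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count the positions at which two equal-length strings differ.
# String XOR yields a NUL byte wherever the two characters agree,
# so counting the non-NUL bytes gives the number of differences.
# (Assumption: single-byte characters; hypothetical helper name.)
sub diff_count {
    my ( $x, $y ) = @_;
    my $xor = $x ^ $y;
    return $xor =~ tr/\0//c;
}

# Keep only the candidates that have the same length as $model
# and differ from it in exactly one position. (Hypothetical helper name.)
sub single_diff_hits {
    my ( $model, @candidates ) = @_;
    return grep {
        length($_) == length($model) && diff_count( $model, $_ ) == 1
    } @candidates;
}

print join( " ", "lama:", single_diff_hits( "lama", "lamb" ) ), "\n";
print join( " ", "able:", single_diff_hits( "able", "bale" ) ), "\n";
```

With the sample words, "lamb" is kept for "lama" (one difference), while "bale" is dropped for "able" (two differences). In the script above, you would apply something like single_diff_hits( $model, @hits ) to the grep results before printing.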
(UPDATE: It's also worth noting that using g this way is effectively equivalent to using "split", "map" and "join" to build the multi-match regex, as I showed in this previous reply, which just goes to show that "there's more than one way to do it.")
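For comparison, here's a rough sketch of what a split/map/join version could look like. This is not copied from the earlier reply; the sub name build_regex is hypothetical, and the quotemeta call is my own addition to keep any regex metacharacters in a word literal.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the search pattern one character at a time: every "a" or "b"
# becomes the class [ab], and every other character is kept as a literal.
# (Hypothetical helper; quotemeta is an added safeguard.)
sub build_regex {
    my ($word) = @_;
    return join '',
        map { $_ =~ /[ab]/ ? '[ab]' : quotemeta($_) }
        split //, $word;
}

print build_regex($_), "\n" for qw( lama able );
```

For "lama" and "able" this produces l[ab]m[ab] and [ab][ab]le, the same patterns that the s///g version prints above.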