in reply to
Using special characters in left part of a regex match?
So, I'm assuming 3 should actually be Gallia ... since it would otherwise be inconsistent with all other options and would not map to the first string. Without the ability to anchor at the front and back, this problem seems unsolvable to me.
If you have a full string, as in case 0, you can trivially use regular expressions. Otherwise, only your leading and trailing words are actually constraining. This strikes me more as a case where you would split your strings on ... and then compare substrings -- an approach that does not really fit regular expressions. My version, which uses index to do the fragment checking:
#!/usr/bin/perl
use strict;
use warnings;
my @var;
$var[0] = "Gallia est omnis divisa in partes tres";
$var[1] = "Gallia est omnis divisa in ...";
$var[2] = "Gallia est omnis ...";
$var[3] = "Gallia ...";
$var[4] = "... omnis divisa in ...";
$var[5] = "Gallia est ... tres";
$var[6] = "Gallia ... partes tres";
$var[7] = "Gallia est ... partes tres";
$var[8] = "Gallia ... divisa ... tres";
$var[9] = "... tres";
$var[10] = "quattuor";
for my $i (0 .. $#var) {
    for my $j ($i+1 .. $#var) {
        print "$i - $j DO NOT MATCH!\n" unless compare($var[$i], $var[$j]);
    }
}
# Returns true if the two templates could describe the same string.
sub compare {
    # Split on the literal "..."; the -1 limit keeps trailing empty fields
    my @str1 = split /\Q...\E/, shift, -1;
    my @str2 = split /\Q...\E/, shift, -1;
    if (@str1 == 1) { # Regex is possible
        local $" = ".+";
        return $str1[0] =~ /^@str2$/;
    } elsif (@str2 == 1) { # Regex is still possible
        local $" = ".+";
        return $str2[0] =~ /^@str1$/;
    } else { # Fragment matching
        # Openings must be consistent
        if (length $str1[0] > length $str2[0]) {
            return if index($str1[0], $str2[0]) != 0;
        } else {
            return if index($str2[0], $str1[0]) != 0;
        }
        # Closings must be consistent, start search from end
        # (index imposes scalar context, so reverse() reverses the characters)
        if (length $str1[-1] > length $str2[-1]) {
            return if index(reverse($str1[-1]), reverse($str2[-1])) != 0;
        } else {
            return if index(reverse($str2[-1]), reverse($str1[-1])) != 0;
        }
    }
    return 1;
}
which outputs
0 - 10 DO NOT MATCH!
1 - 10 DO NOT MATCH!
2 - 10 DO NOT MATCH!
3 - 10 DO NOT MATCH!
4 - 10 DO NOT MATCH!
5 - 10 DO NOT MATCH!
6 - 10 DO NOT MATCH!
7 - 10 DO NOT MATCH!
8 - 10 DO NOT MATCH!
9 - 10 DO NOT MATCH!
If instead you meant case 10 to be ... quattuor ..., you get
0 - 10 DO NOT MATCH!
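One caveat, given the thread title: the regex branches of compare interpolate the fragments into the pattern as-is, so parentheses, dots or a dollar sign in a fragment would be treated as regex syntax. For the sample strings this makes no difference. If your real data can contain such characters, a rough sketch of how the pattern could be built with quotemeta follows. The helper name template_to_regex and the test strings are made up for illustration and are not part of the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): turn a template
# into an anchored pattern, treating "..." as a wildcard and every
# other character as literal text.
sub template_to_regex {
    my ($template) = @_;
    # Split on the literal "...", escape each fragment, rejoin with .+
    my $body = join '.+', map { quotemeta($_) } split /\Q...\E/, $template, -1;
    return qr/\A$body\z/;
}

# Made-up strings containing regex metacharacters
my $re = template_to_regex('Cost (USD) ... $5.00');
print "literal match\n"  if     'Cost (USD) is $5.00' =~ $re;
print "no false match\n" unless 'Cost USD is $5x00'   =~ $re;
```

I kept .+ as the joiner so it behaves like the regex branches in compare; whether ... should be allowed to stand for nothing at all (.* instead) depends on what you intend.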
#11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.