Hmm, I was sufficiently surprised by this behaviour (which I'd not heard of before) that I went looking. First off, your code fragment is not much use, as it does not define what $R2 contains. So I went and looked at the source, and ripped the following out of its guts:
use strict;
use warnings;
my @word = qw(
    constituci\xf3n contribuci\xf3n destituci\xf3n devoluci\xf3n disminuci\xf3n
    constituciones contribuciones destituciones devoluciones disminuciones
    foo
);
my $vowels = 'aeiou\xe1\xe9\xed\xf3\xfa\xfc';
my $consonants = 'bcdfghjklmn\xf1pqrstvwxyz';
my $revowel = qr/[$vowels]/;
my $reconsonants = qr/[$consonants]/;
my $R2;
my $suffix;
for my $word (@word) {
    ($R2) = $word =~ /^.*?$revowel$reconsonants.*?$revowel$reconsonants(.*)$/;
    $R2 ||= '';
    if ( ($suffix) = $R2 =~ /(uciones|uci\xf3n)$/ ) {
        # uci\xf3n uciones
        # replace with u if in R2
        $word =~ s/$suffix$/u/;
        print "Step 1 case 4: $word\n";
    }
}
(Those \xnn escapes really are Latin-1 characters in my source; the escaped form is just an artifact of cutting and pasting from my shell.)
And that runs just fine here, all the way up to "perl, v5.11.0 DEVEL33323 built for i386-freebsd-64int". So there's something else going on. Both "ución" and "uciones" match just fine. Perhaps the tester platforms are running in a different locale. To play it safe, I suggest you encode your program in UTF-8 and slap a use utf8 at the top and be done with it. At least I think that's the correct best practice. Thinking about encoding makes my head explode.
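For what it's worth, here is a rough sketch of what the UTF-8 version might look like. It assumes the file itself is saved as UTF-8, and the sub name step1_case4 is something I made up for illustration; it is not taken from the module's source. The binmode line is my own addition too, so that the output comes out as UTF-8 rather than whatever the default happens to be.

```perl
use strict;
use warnings;
use utf8;                             # the literals below are UTF-8 in this file
binmode STDOUT, ':encoding(UTF-8)';   # assumption: we want UTF-8 on output

my $vowel     = qr/[aeiouáéíóúü]/;
my $consonant = qr/[bcdfghjklmnñpqrstvwxyz]/;

# Hypothetical helper (not from the original source): returns the word with
# the suffix replaced by "u" if the suffix lies in R2, otherwise undef.
sub step1_case4 {
    my ($word) = @_;

    # R2 is whatever follows the second vowel-consonant pair
    my ($R2) = $word =~ /^.*?$vowel$consonant.*?$vowel$consonant(.*)$/;
    $R2 = '' unless defined $R2;

    return unless $R2 =~ /(uciones|ución)$/;
    my $suffix = $1;

    # work on a copy so the caller's word is left alone
    (my $stem = $word) =~ s/\Q$suffix\E$/u/;
    return $stem;
}

for my $word (qw(constitución contribuciones foo)) {
    my $stem = step1_case4($word);
    print "Step 1 case 4: $stem\n" if defined $stem;
}
```

With use utf8 in effect the accented letters are single characters in both the words and the patterns, so the result should no longer depend on the locale of the machine running it.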
• another intruder with the mooring in the heart of the Perl