http://www.perlmonks.org?node_id=935400

McA has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I have the following little script that demonstrates a case (case 1 in the output below) which I can't explain. I hope someone can explain it to me or point me in the right direction.
    #!/usr/bin/perl -CO
    use strict;
    use warnings;
    use Encode;
    use utf8;

    my $a = 'ä';
    print "UTF8-Flag: ", utf8::is_utf8($a) ? "Yes" : "No";
    print " matches word: ", $a =~ /\w/ ? "Yes\n" : "No\n";

    my $b = encode("ISO-8859-1", $a);
    print "UTF8-Flag: ", utf8::is_utf8($b) ? "Yes" : "No";
    print " matches word: ", $b =~ /\w/ ? "Yes\n" : "No\n";

    use locale;

    $a = 'ä';
    print "UTF8-Flag: ", utf8::is_utf8($a) ? "Yes" : "No";
    print " matches word: ", $a =~ /\w/ ? "Yes\n" : "No\n";

    $b = encode("ISO-8859-1", $a);
    print "UTF8-Flag: ", utf8::is_utf8($b) ? "Yes" : "No";
    print " matches word: ", $b =~ /\w/ ? "Yes" : "No";
    print "\n";
The output on a Linux box with locale de_DE.UTF-8 and the Perl source code encoded in UTF-8 is:
    UTF8-Flag: Yes matches word: Yes
    UTF8-Flag: No matches word: No
    UTF8-Flag: Yes matches word: No
    UTF8-Flag: No matches word: No
It's the very first case that I can't explain. Why does a Unicode-flagged 'ä' match \w when locale is not set explicitly?

Thanks in advance
Andreas