>perl -le"use open ':std', ':encoding(cp850)'; use if $ARGV[0], 'locale'; print sort 'a', chr(0xE1), 'b'" 0
abá
>perl -le"use open ':std', ':encoding(cp850)'; use if $ARGV[0], 'locale'; print sort 'a', chr(0xE1), 'b'" 1
aáb
(Replace cp850 with the proper encoding for your console.)
You have a few problems:
- "á" (English: "a with acute accent", French: "a avec accent aigu") is Unicode character E1 (U+00E1), not 9E.
- You wrote use local; instead of use locale;
- /a..zA..Z/ means
  - "a"
  - followed by a character other than "\n"
  - followed by a character other than "\n"
  - followed by "z"
  - followed by "A"
  - followed by a character other than "\n"
  - followed by a character other than "\n"
  - followed by "Z"
  You meant /[a-zA-Z]/
- Perl ranges and character class ranges don't use alphabetical (collation) order. A character class range such as a-z covers code points 0x61 through 0x7A, so "á" (0xE1) falls outside it.
- French doesn't use "á".
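The pattern and ordering problems above can be checked with a short script. This is only a minimal sketch: the test strings and variable names are made up for illustration, and no locale is in effect.

```perl
use strict;
use warnings;

# chr(0xE1) is "á" (U+00E1); building it from its code point keeps the
# script independent of the source file's encoding.
my $a_acute = chr(0xE1);

# /a..zA..Z/ is a literal 8-character pattern, not a letter test.
my @dot_hits = grep { /a..zA..Z/ } 'm', 'axxzAxxZ';

# /[a-zA-Z]/ matches any string containing an ASCII letter.
my @class_hits = grep { /[a-zA-Z]/ } 'm', 'axxzAxxZ', '42';

# Character class ranges go by code point: 0xE1 lies outside 0x61-0x7A.
my $in_range = $a_acute =~ /[a-zA-Z]/ ? 1 : 0;

# Without "use locale", sort compares code points too, so "á" lands after "b".
my @sorted     = sort 'a', $a_acute, 'b';
my $sorted_hex = join ' ', map { sprintf '%02X', ord } @sorted;

print "dot pattern matched: @dot_hits\n";
print "class matched:       @class_hits\n";
print "a-acute in [a-zA-Z]: $in_range\n";
print "sort order (hex):    $sorted_hex\n";
```

The sort order is printed as hex code points so that the result does not depend on the console encoding.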
The solution is to use POSIX character classes or Unicode properties.
>perl -le"my $s = chr(0xE1); print $s =~ /\p{Alpha}/ ?1:0"
1
>perl -le"my $s = chr(0xE1); print $s =~ /[\p{Alpha}]/ ?1:0"
1
>perl -le"my $s = chr(0xE1); utf8::upgrade($s); print $s =~ /[[:alpha:]]/ ?1:0"
1
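The same checks can be written as a script instead of one-liners. This is a minimal sketch: is_alpha_prop and is_alpha_posix are hypothetical helper names, not part of any module. The un-upgraded string is passed to the POSIX class only for comparison, since that result depends on which features are in effect.

```perl
use strict;
use warnings;

# Hypothetical helpers: return 1 if the string starts with an alphabetic
# character according to the given construct, else 0.
sub is_alpha_prop  { $_[0] =~ /^\p{Alpha}/   ? 1 : 0 }
sub is_alpha_posix { $_[0] =~ /^[[:alpha:]]/ ? 1 : 0 }

my $native   = chr(0xE1);      # "á" as a native 8-bit string
my $upgraded = chr(0xE1);
utf8::upgrade($upgraded);      # same character, stored internally as UTF-8

# \p{Alpha} applies Unicode rules whatever the internal storage.
print "prop,  native:   ", is_alpha_prop($native),    "\n";
print "prop,  upgraded: ", is_alpha_prop($upgraded),  "\n";

# [[:alpha:]] sees "á" as a letter once the string is upgraded.
print "posix, upgraded: ", is_alpha_posix($upgraded), "\n";

# For comparison only; depends on the features in effect.
print "posix, native:   ", is_alpha_posix($native),   "\n";
```

Note that utf8::upgrade changes only the internal representation: $native and $upgraded still compare equal with eq.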
From the documentation, it seems to me the following should also work, but they don't:
>perl -le"use feature 'unicode_strings'; my $s = chr(0xE1); print $s =~ /[[:alpha:]]/ ?1:0"
0
>perl -le"use 5.012; my $s = chr(0xE1); print $s =~ /[[:alpha:]]/ ?1:0"
0
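The results above reflect the Perl version they were run on. On newer Perls the regex engine can be told explicitly to use Unicode rules. The sketch below assumes Perl 5.14 or later, where the /u modifier exists (on older versions it is a syntax error), so treat it as something to try rather than a guaranteed fix for your setup.

```perl
use strict;
use warnings;

my $s = chr(0xE1);    # "á" as a native 8-bit string, not upgraded

# Assumption: Perl 5.14+. The /u modifier requests Unicode rules for
# this match even though $s is not stored internally as UTF-8.
my $with_u = $s =~ /[[:alpha:]]/u ? 1 : 0;

print "with /u: $with_u\n";
```

This avoids the utf8::upgrade call, at the cost of requiring a newer Perl.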
Update: Fixed c&p mistake in char class.
Update: Inserted Problem #1.
Update: Oops, I seem to have forgotten to include the solution. Added.