Re: perllocale weirdness, bug, or...?
by Corion (Patriarch) on Oct 20, 2010 at 14:04 UTC
|
See locale. Using locale changes your sort order to whatever is considered "natural" for the locale you have set up. I would avoid it. If you still want to use it, find out which locale is active, and set the locale to 'C' wherever you want Perl's "usual" sort/string comparison behaviour.
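A minimal sketch of the "switch to 'C' for a while" idea, assuming the POSIX module's setlocale is available on your platform. The helper name sort_in_c_locale is made up for illustration; it is not part of any module.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_COLLATE);

# Hypothetical helper: sort a list with LC_COLLATE temporarily set to 'C',
# then put the previous collation locale back.
sub sort_in_c_locale {
    my @items = @_;
    use locale;                               # make sort honour LC_COLLATE
    my $previous = setlocale(LC_COLLATE);     # one argument: query only
    setlocale(LC_COLLATE, 'C');
    my @sorted = sort @items;
    setlocale(LC_COLLATE, $previous) if defined $previous;
    return @sorted;
}

print join(' ', sort_in_c_locale(qw(a_2 a2 a))), "\n";
```

If the whole comparison lives in one block, a plain "no locale;" in that block is the simpler way to get the same byte-wise ordering.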
|
I've checked perllocale, but I just can't make sense of it:
how is it possible to have a > b _and_ at the same time a@yahoo.com < b@yahoo.com...?
|
Oh - I hadn't seen that contradiction, which runs counter to the intuition that string comparison should proceed from the left. I'm not sure what locale you actually run under. Maybe somebody with actual working experience with locales can tell from $ENV{LC_ALL}, $ENV{LC_COLLATE} or $ENV{LANG} (see perllocale) which locale is active on your system and how it affects sorting.
I would still avoid locales, exactly because they introduce hard-to-track-down behaviour.
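As a rough sketch of how to check this yourself: for collation, LC_ALL takes precedence over LC_COLLATE, which takes precedence over LANG. The helper collation_source below is a made-up name, and what setlocale reports depends on your platform.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_COLLATE);

# Hypothetical helper: return the name and value of the first environment
# variable that decides collation, in order of precedence.
sub collation_source {
    my (%env) = @_;
    for my $name (qw(LC_ALL LC_COLLATE LANG)) {
        my $value = $env{$name};
        return ($name, $value) if defined $value && length $value;
    }
    return;    # none of the three variables is set
}

my ($name, $value) = collation_source(%ENV);
print defined $name
    ? "Collation is taken from $name=$value\n"
    : "No locale variables set\n";

# Ask the C library which collation locale the process is using.
my $active = setlocale(LC_COLLATE);
print "setlocale(LC_COLLATE) reports: ",
    (defined $active ? $active : '(unknown)'), "\n";
```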
|
Well, what is your locale?
|
Re: perllocale weirdness, bug, or...?
by thundergnat (Deacon) on Oct 20, 2010 at 18:23 UTC
|
It is almost definitely a locale weirdness thing. Your local(e) sort order is probably not what you might expect. To get an idea of it, try running the following:
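#!/usr/bin/perl
use strict;
use warnings;

# Print every character 0..255 that matches \w, sorted first without and
# then with locale rules, so the two orderings can be compared.
{
    no locale;    # plain code point comparison
    print "\nNO Locale:\n\n";
    print +(join ' ', sort grep /\w/, map { chr } 0..255), "\n";
}
{
    use locale;   # sort (and \w) now follow the LC_* settings
    print "\nWith 'use locale;':\n\n";
    print +(join ' ', sort grep /\w/, map { chr } 0..255), "\n";
}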
Re: perllocale weirdness, bug, or...?
by Krambambuli (Curate) on Oct 20, 2010 at 21:31 UTC
|
Thanks - I've done this already, but it doesn't explain the seemingly nonsensical ordering I see.
I've made some progress in the meantime, however - it seems to be a problem with how exactly collation is done when LC_COLLATE = en_US.UTF-8, and not a Perl problem. But I still have to understand how a sort with this collation gives
_
2
a
a2
a_2
a_2.
a2.
instead of what I would consider 'logical':
_
2
a
a_2
a_2.
a2
a2.
---
(Update): sorry, I misplaced this answer; it should have been a reply to thundergnat's note.
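To compare the two orderings on exactly the seven strings listed above, something like the following sketch could be used. The helper names sort_bytewise and sort_by_locale are made up for illustration. The locale-aware result depends entirely on the LC_COLLATE in effect, so no output is claimed for it here.

```perl
use strict;
use warnings;

# Hypothetical helpers: the same sort, with and without locale collation.
sub sort_bytewise {
    no locale;                # compare strings by code point
    return sort @_;
}

sub sort_by_locale {
    use locale;               # compare strings using LC_COLLATE rules
    return sort @_;
}

my @strings = ('_', '2', 'a', 'a2', 'a_2', 'a_2.', 'a2.');

print "no locale:  ", join(' ', sort_bytewise(@strings)),  "\n";
print "use locale: ", join(' ', sort_by_locale(@strings)), "\n";
```

Note that the plain code point order puts '2' before '_' and 'a2' before 'a_2', because the digit has the lower character code, so neither listing above matches it exactly.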