Perl, Gtk2 and locale — a bit of a mess

by Ralesk (Pilgrim)
on Jul 11, 2013 at 12:46 UTC ( #1043714=perlquestion )
Ralesk has asked for the wisdom of the Perl Monks concerning the following question:

Fellow monks,

We have run into a very strange issue with locales that use a comma as the decimal separator. We are not sure where exactly the problem lies, but it seems that even if we say no locale; to be exempt from locale rules in a block (which we need so that numbers in JSON remain, well, numbers), we still get comma-separated floats as long as Gtk2 has been initialised.

~$ locale
< ... all set to hu_HU.UTF-8 ... >
[001] ~$ perl -e 'printf(q{%f}, 123.456);'
123.456000
[002] ~$ perl -e 'use locale; printf(q{%f}, 123.456);'
123.456000
[003] ~$ perl -e 'use locale; use POSIX; POSIX::setlocale("LC_ALL", "hu_HU.UTF-8"); printf q{%f}, 123.456;'
123.456000
[004] ~$ perl -e 'use Gtk2 -init; printf(q{%f}, 123.456);'
123,456000

I might not know my way around locale, but I thought the line in 003 would result in a comma... In any case, the moment Gtk2 enters the game, like in 004, it turns into a comma. Then we experimented further:

[005] ~$ perl -e 'use Gtk2 -init; print unpack("H*", 123.789) . "\n";'
3132332c373839
[006] ~$ perl -e 'use locale; print unpack("H*", 123.789) . "\n";'
3132332e373839
[007] ~$ perl -e 'print unpack("H*", 123.789) . "\n";'
3132332e373839
[008] ~$ perl -e 'use Gtk2 -init; print unpack("H*", "123.789") . "\n";'
3132332e373839
[009] ~$ perl -e 'use Gtk2 -init; print unpack("H*", 123.789 . "") . "\n";'
3132332c373839

Notice how even unpack only ever sees the comma'd string once Gtk2 is in play. We haven’t found a way to stringify the number (in our real-world use case, the result of a time() call) without the locale mangling it.

So... any ideas? What are we doing wrong? Is there a way around this without forcing the locale to C (or something “proper”) before Gtk2 gets initialised, i.e. without actually breaking the entire locale thing?

Replies are listed 'Best First'.
Re: Perl, Gtk2 and locale — a bit of a mess
by choroba (Chancellor) on Jul 11, 2013 at 13:26 UTC
    The documentation of POSIX uses a different way to refer to locale subtypes: constants.
    perl -we 'use locale; use POSIX; POSIX::setlocale(POSIX::LC_NUMERIC, "hu_HU.UTF-8"); printf qq{%f\n}, 123.456;'

    Note: Tested with cs_CZ locale, as hu_HU is not installed here.

    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

      Yeah, sorry, just noticed that; if anything it doesn't change a thing.

      More precisely, this is what happens in the code now:

      use POSIX qw/strftime setlocale/;
      BEGIN {
          ## To hopefully make it happen way before Gtk2 init
          setlocale(POSIX::LC_ALL(), "en_US.UTF-8");
      }
      use Glib qw/FALSE TRUE/;
      use Gtk2 -init;

      On my coworker’s computer (Hungarian locale to begin with, Fedora, GTK 3 desktop) we get commas; on my computer (English locale, KDE desktop, but with the shell locale set to Hungarian) we get periods.

      Except this is only with the software we’re making. The example one-liners above work identically on both machines.

Re: Perl, Gtk2 and locale — a bit of a mess
by Ralesk (Pilgrim) on Jul 11, 2013 at 15:55 UTC

    After a talk with Nei on #gtk-perl I’ve come to a few conclusions. For one, I’m really bad at this whole locale thing. For two, this whole locale thing is pretty much broken by design.

    There are a few things working together that make this so bad:

    • GTK calls setlocale(LC_ALL, "") when it starts, so we were mistaken about where to put the Perl call to POSIX::setlocale: it should go after the Gtk init, so that it actually overrides whatever Gtk loaded from the environment.
    • C’s locale support is pretty much broken: there’s apparently no way to distinguish “this is something user-facing, please present it as appropriate” from “this is something that must remain exactly the way I’m saying it”.
    • Perl inherits this behaviour, and unless the libraries dealing with numbers call setlocale(LC_NUMERIC, "C"), they end up producing localised numbers the instant they turn a number into a string. Which many do.
    • The JSON encoder gets confused as a result, it appears, and produces an invalid JSON string like { cmd: "something", ts: 1373556417,044533, data: { ... } }
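
For the timestamp case in the last bullet, one way to sidestep the decimal separator altogether is to never stringify a float at all. The following is only a minimal sketch, not what the code in this thread does: timestamp_string and timestamp_micros are hypothetical helper names, and the integer variant assumes a perl built with 64-bit integers. Both assemble the value from the two integers that Time::HiRes::gettimeofday returns in list context.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday);

# Hypothetical helper: "seconds.microseconds" assembled from two integers,
# so no floating-point value is ever turned into a string.
sub timestamp_string {
    my ($sec, $usec) = @_ ? @_ : gettimeofday();
    return sprintf('%d.%06d', $sec, $usec);
}

# Hypothetical helper: whole microseconds since the epoch as one integer.
# Assumption: this perl has 64-bit integers, so the product stays an integer.
sub timestamp_micros {
    my ($sec, $usec) = @_ ? @_ : gettimeofday();
    return $sec * 1_000_000 + $usec;
}

# Fixed inputs so the result is reproducible.
print timestamp_string(1373556417, 44533), "\n";   # 1373556417.044533
print timestamp_micros(1373556417, 44533), "\n";   # 1373556417044533
```

A JSON encoder would most likely emit the string form as a quoted string, so the integer form is the one to consider if the field has to stay a JSON number. That does change the unit of the field to microseconds, which both ends would have to agree on.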

    So, for me, the solution is turning locales off on numerals. For others, it would require calling setlocale back and forth. Here is another example of this issue cropping up all of a sudden.
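
The "back and forth" approach could be wrapped in a small helper so the switch is always undone. This is a sketch under assumptions: with_c_numeric is a hypothetical name, it only uses POSIX::setlocale as discussed above, and how much a given perl version honours LC_NUMERIC outside of "use locale" may differ from what this thread observed.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_NUMERIC);

# Hypothetical helper: run $code with LC_NUMERIC forced to "C", then
# restore whatever numeric locale was active before the call.
sub with_c_numeric {
    my ($code) = @_;
    my $saved = setlocale(LC_NUMERIC);     # one-argument form only queries
    setlocale(LC_NUMERIC, 'C');
    my $result = eval { $code->() };       # scalar context; trap errors
    my $error  = $@;
    setlocale(LC_NUMERIC, $saved) if defined $saved;
    die $error if $error;                  # re-throw after restoring
    return $result;
}

# Use a variable, not a literal, so the stringification happens at run
# time inside the helper rather than being folded at compile time.
my $number = 123.456;
print with_c_numeric(sub { "$number" }), "\n";
```

Anything that must be machine-readable (JSON encoding, protocol fields) would go inside the code reference; user-facing formatting stays outside and keeps following the environment locale.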

Re: Perl, Gtk2 and locale — a bit of a mess
by Ralesk (Pilgrim) on Jul 12, 2013 at 11:41 UTC

    Okay, a little update. Just what is going on in Perl’s mind here?

    $ perl -E 'use POSIX qw(setlocale LC_NUMERIC); say 3.14; setlocale(LC_NUMERIC, ""); say 3.14; setlocale(LC_NUMERIC, "C"); say 3.14; setlocale(LC_NUMERIC, ""); say 3.14; say 3.14 . "";'
    3.14
    3,14
    3.14
    3,14
    3.14

    Switching back and forth seems to work. When the environment locale is loaded, we get a comma, when the C locale is loaded, we get a period. Excellent.

    Except for the last two. If you concatenate anything to the float (print 3.14; print "\n"; print 3.14 . "\n"; would have had the same effect), turning it into a string much like print or say does, it won’t follow the locale rules. I’d been battling the exact opposite until now!

    And when we also use Gtk which will call setlocale on the C level:

    ~$ perl -E 'use Gtk2 -init; use POSIX qw(setlocale LC_NUMERIC); say 3.14; setlocale(LC_NUMERIC, ""); say 3.14; setlocale(LC_NUMERIC, "C"); say 3.14; setlocale(LC_NUMERIC, ""); say 3.14; say 3.14 . ""; '
    3,14
    3,14
    3.14
    3,14
    3,14

    The first turns into a comma, that’s good, Gtk2 set the locale to Hungarian. The second remains a comma, as expected. The third becomes a period, because of C locale. Fourth is a comma, rightfully, because of the environment locale. And the last one behaves as it should now, remaining a comma, appropriate for the environment locale being used.

      Once again IRC helped, so for the sake of completeness I’m answering myself: the reason for the awkward difference is compile-time optimisation. In the first case, that last concatenation and stringification of the constant already happened at compile time, while the locale was still the initial, unset one.

      Note how the last two prints work as expected if one replaces the constant with a subroutine call.

      ~$ perl -E 'sub x { 3.14 }; use POSIX qw(setlocale LC_NUMERIC); say x; setlocale(LC_NUMERIC, ""); say x; setlocale(LC_NUMERIC, "C"); say x; setlocale(LC_NUMERIC, ""); say x; say x . ""; '
      3.14
      3,14
      3.14
      3,14
      3,14
Re: Perl, Gtk2 and locale — a bit of a mess
by Ralesk (Pilgrim) on Sep 23, 2013 at 06:16 UTC
Re: Perl, Gtk2 and locale — a bit of a mess
by Khen1950fx (Canon) on Jul 11, 2013 at 23:20 UTC
    I don't see it as a problem with locales. For example, try this:
    #!/usr/bin/perl -l
    use strict;
    use warnings;
    my $num = 123.456;
    str($num);
    sub str {
        my ($want_num, $width) = @_;
        $width = '000';
        $want_num =~ tr/./,/;
        $want_num = print "$want_num$width";
    }
    Does that work for you?

      I'm not even sure what you're trying to achieve here.

      • Call str with one argument, but assign @_ to two arguments, thus rendering the second one undef
      • Then you assign to the second var anyway. Fine, I guess, but I would have done something like
        sub str { my ($want_num) = @_; my $width = '000'; # ...
      • Then you replace periods in the string representation of 123.456 (which could be "123,456" or even "١٢٣٫٤٥٦" by this time) with commas, resulting in not much.
      • And then print whatever the replaced result is (let’s say "123,456") concatenated with "000" (why is that even called ‘width’?), resulting in an output of 123,456000.
      • Then you put the return value of that print in $want_num for not much reason, as print returns true on a successful print, and the value doesn’t escape the subroutine anyway.
      • str and thus the whole program then returns true after printing the above value.

      Pray tell, what did you want to achieve?

        (I am not at all surprised that this never received further comments :3)

Re: Perl, Gtk2 and locale — a bit of a mess
by Ralesk (Pilgrim) on Jul 15, 2013 at 09:45 UTC

    I think I don’t understand anything anymore. Look:


    Why is everything set except for LC_NUMERIC? Why is anything set? I never asked for this...

    Update: C does the following:

    #include <stdio.h>
    #include <locale.h>

    int main() {
        printf("%s\n", setlocale(LC_ALL, NULL));
        setlocale(LC_ALL, "");
        printf("%s\n", setlocale(LC_ALL, NULL));
        return 0;
    }

    ~$ ./localetest
    C
    hu_HU.UTF-8

      What does the POSIX standard say should happen? What does your system say setlocale should do?

      The setlocale link talks about POSIX::setlocale( LC_ALL, "" ) setting stuff like you're seeing, but in Perl it's the default: Usage: POSIX::setlocale(category, locale = 0)

      To query you use NULL; Perl's equivalent is

      $ perl -e " use POSIX qw/ setlocale LC_ALL /; print setlocale( LC_ALL, undef ); "
      English_United States.1252

      So I don't think I'm seeing a bug here; it looks like it's working as designed.

        POSIX(3pm) says that the Perl equivalent of C’s setlocale(cat, NULL) is setlocale($cat) (i.e. one argument). I’m not explicitly setting the locale to the environment-given locale (setlocale($cat, "") or in C setlocale(cat, "")) until later in the code, so it should default to C locale.

        Also, setlocale(3) says the following: “If locale {the second param} is NULL, the current locale is only queried, not modified. On startup of the main program, the portable "C" locale is selected as default. A program may be made portable to all locales by calling setlocale(LC_ALL, "" ) after program initialization”
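
To compare what perl reports at startup with what it reports after explicitly adopting the environment locale, a per-category query along these lines may help. It is a minimal sketch: locale_snapshot and print_snapshot are hypothetical helper names, only the one-argument query form of POSIX::setlocale quoted above is used for reading, and the values printed will depend on the perl version and the environment.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_ALL LC_COLLATE LC_CTYPE LC_MONETARY LC_NUMERIC LC_TIME);

# Category names mapped to the POSIX constants used for the queries.
my %CATEGORY = (
    LC_COLLATE  => LC_COLLATE,
    LC_CTYPE    => LC_CTYPE,
    LC_MONETARY => LC_MONETARY,
    LC_NUMERIC  => LC_NUMERIC,
    LC_TIME     => LC_TIME,
);

# Hypothetical helper: query (never set) the current locale per category.
sub locale_snapshot {
    return map { ($_ => setlocale($CATEGORY{$_})) } keys %CATEGORY;
}

# Hypothetical helper: print one snapshot under a label.
sub print_snapshot {
    my ($label, %snapshot) = @_;
    print "$label\n";
    printf "  %-12s %s\n", $_, $snapshot{$_} // '(undef)'
        for sort keys %snapshot;
}

print_snapshot('at startup:', locale_snapshot());
setlocale(LC_ALL, '');    # explicitly adopt the environment's locale
print_snapshot('after setlocale(LC_ALL, ""):', locale_snapshot());
```

Running it with LC_ALL or LANG set to a comma-decimal locale such as hu_HU.UTF-8 should show which categories are already non-C before the program asks for anything.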

Node Type: perlquestion [id://1043714]
Approved by Corion