PerlMonks  

POSIX::strftime encoding

by squentin (Sexton)
on Aug 24, 2010 at 20:10 UTC ( #857018=perlquestion )
squentin has asked for the wisdom of the Perl Monks concerning the following question:

How do I get a utf8 time string using POSIX::strftime?

When using a utf8 locale, for example fr_FR.utf8, the output of POSIX::strftime is encoded in utf8 but without the utf8 flag on, i.e. it returns a byte string and not a character string. So when utf8::upgrade is applied to it (which is what the gtk2 bindings do), or when it is printed to a file opened with ">:utf8", the non-ASCII characters become garbage.

And of course, when using a non-utf8 locale such as fr_FR, the return value of POSIX::strftime is encoded in a locale specific encoding.
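
To make the byte-string versus character-string distinction concrete, here is a minimal sketch that needs no particular locale. It assumes a sample value, the UTF-8 bytes of "Août", standing in for what strftime would return under fr_FR.utf8; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumed sample: "Août" as raw UTF-8 bytes, as a utf8 locale would produce.
my $bytes = "Ao\xC3\xBBt";

# The utf8 flag is off, so Perl counts 5 bytes rather than 4 characters.
my $flag_before = utf8::is_utf8($bytes) ? 1 : 0;
my $len_before  = length($bytes);

# utf8::upgrade treats every byte as one Latin-1 character, so the two
# bytes of "û" turn into two separate characters.
my $upgraded = $bytes;
utf8::upgrade($upgraded);

# Encoding the upgraded string is what a ">:utf8" handle would write.
my $mangled = encode('UTF-8', $upgraded);

# Decoding first yields a real 4-character string that encodes back
# to the original bytes.
my $chars      = decode('UTF-8', $bytes);
my $round_trip = encode('UTF-8', $chars);

printf "flag=%d bytes=%d chars=%d\n", $flag_before, $len_before, length($chars);
printf "mangled=%d bytes, round trip=%d bytes\n",
    length($mangled), length($round_trip);
```

The mangled output is longer than the input because each of the two bytes of "û" has been encoded again as a character of its own.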

So what is the best way to get a proper utf8 string? Do I really have to look at the locale value myself to know how to convert the string?

Shouldn't that behavior be considered a bug? (It would probably be hard to fix without breaking some existing programs.) It should at least be mentioned in the documentation.

example code:

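use POSIX "strftime";
my $s = strftime("%c", localtime);
# plain handle: the bytes are written as they are
open my($fh), ">", "without_utf8";
print $fh $s;
# :utf8 handle: the bytes are treated as characters and encoded again
open my($fh2), ">:utf8", "with_utf8";
print $fh2 $s;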
Result with locale fr_FR.utf8:
  • "without_utf8" contains the correct utf8 date
  • "with_utf8" contains a date with a garbled "Août" (French for August)
Result with locale fr_FR:
  • "without_utf8" contains the date encoded in ISO-8859
  • "with_utf8" contains the correct utf8 date

(tested with perl v5.10.1 and v5.12.1)

Re: POSIX::strftime encoding
by Khen1950fx (Canon) on Aug 25, 2010 at 06:20 UTC
    "without_utf8" contains the correct utf8 date

    That is actually correct. With the "fr_FR.utf8" locale, the string returned by strftime is already utf8-encoded, so you don't need to ask for utf8 on the filehandle. If you do add a utf8 layer anyway, the bytes get encoded a second time and you get some weird stuff in return.

    Try this with your locale fr_FR. Calling binmode should work for you:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Time::Piece;

    open STDOUT, '>', 'time.log';
    my $t = localtime;
    print $t->strftime("%c"), "\n";
    my $mt = localtime;
    binmode STDOUT, ":utf8";
    print $mt->strftime("%c"), "\n";

      Yes, I understand what is happening, but that's not what I'm asking. Sorry if I wasn't clear; let me rephrase.

      What I want is to use the return value of POSIX::strftime in gtk2, whose bindings call utf8::upgrade on every string passed to gtk2 functions.

      The question is how to make sure the string isn't mangled by that. Do I:

      a) assume the locale is utf8, and use utf8::decode on the string to turn on its utf8 flag?

      b) use some unknown function that uses the locale to correctly decode the string from whatever encoding the locale is using, and turns it into a valid utf8 string?

      c) implement the unknown function from b) myself?

      And also two related questions:
      - Is it a bug?
      - Shouldn't this be documented in the man page for POSIX::strftime?
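
For illustration, option a) could be sketched roughly as follows. The helper name bytes_to_chars_guess is hypothetical, and the approach is only a guess at the encoding: it tries UTF-8 first and otherwise leaves the bytes alone, so that Perl reads them as Latin-1. It is not a general solution for other locale encodings.

```perl
use strict;
use warnings;

# Hypothetical helper: turn strftime-style bytes into a character string,
# guessing UTF-8 first and falling back to the untouched bytes (which Perl
# will interpret as Latin-1 when they are upgraded).
sub bytes_to_chars_guess {
    my ($str) = @_;

    # Already a character string: nothing to do.
    return $str if utf8::is_utf8($str);

    # utf8::decode works in place and returns false when the bytes are
    # not well-formed UTF-8, so work on a copy and keep the original
    # as the fallback.
    my $copy = $str;
    return utf8::decode($copy) ? $copy : $str;
}

# Both spellings of "Août" end up as the same 4-character string.
my $from_utf8   = bytes_to_chars_guess("Ao\xC3\xBBt");   # UTF-8 bytes
my $from_latin1 = bytes_to_chars_guess("Ao\xFBt");       # ISO-8859-1 bytes
print $from_utf8 eq $from_latin1 ? "same\n" : "different\n";
```

The fallback is only right for encodings that coincide with Latin-1 on the characters in use; a locale using some other single-byte encoding would still come out wrong.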

Re: POSIX::strftime encoding
by squentin (Sexton) on Sep 01, 2010 at 22:07 UTC

    For what it's worth, I did the following:

    use POSIX qw/setlocale LC_TIME strftime/;
    use Encode;
    my ($strftime_encoding) = setlocale(LC_TIME) =~ m#\.([^@]+)#;

    sub strftime2    # try to return a utf8 value from strftime
    {
        $strftime_encoding
            ? Encode::decode($strftime_encoding, &strftime)
            : &strftime;
    }

    It works if the encoding is specified in the locale name, which seems to be usually the case when using utf8.

    If the encoding is not in the locale name, I keep the returned string as is. That won't work with all locales, but at least it works with fr_FR, whose bytes are read correctly when treated as Latin-1.

    It's not perfect, but it's better than before, and simple.
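
A possible variation, for locales whose name carries no encoding suffix, is to ask the C library for the codeset instead of parsing the locale name. The sketch below uses langinfo(CODESET) from the core I18N::Langinfo module. The helper names decode_from_codeset and strftime_chars are hypothetical, and the sketch assumes that LC_TIME and LC_CTYPE use the same encoding, since CODESET reports the character-type category rather than the time category.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use Encode ();
use I18N::Langinfo qw(langinfo CODESET);

# Hypothetical helper: decode $bytes from the named codeset. The input is
# returned unchanged when it is already a character string, or when Encode
# does not recognise the codeset name.
sub decode_from_codeset {
    my ($bytes, $codeset) = @_;
    return $bytes if utf8::is_utf8($bytes);
    return $bytes unless defined($codeset) && length($codeset);

    my $enc = Encode::find_encoding($codeset);
    return $enc ? $enc->decode($bytes) : $bytes;
}

# Hypothetical wrapper in the spirit of strftime2 above.
# Assumption: LC_TIME and LC_CTYPE share one encoding.
sub strftime_chars {
    return decode_from_codeset(strftime(@_), langinfo(CODESET));
}

my $now = strftime_chars("%c", localtime);
binmode STDOUT, ':encoding(UTF-8)';
print "$now\n";
```

Checking utf8::is_utf8 first keeps the wrapper from decoding a second time if strftime itself ever returns a character string.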

Node Type: perlquestion [id://857018]
Approved by biohisham
Front-paged by tye