PerlMonks  

Re: "ISO-8859-1 0x80-0xFF" and chr()

by choroba (Cardinal)
on Mar 23, 2012 at 12:28 UTC


in reply to "ISO-8859-1 0x80-0xFF" and chr()

Where does your problematic string come from? If it comes from the code,
use utf8;
If the source file is saved in a different encoding, you can replace utf8 with encoding('iso-8859-2') etc.
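
As a minimal sketch of what use utf8 changes, assuming the script file itself is saved as UTF-8 (the variable names are only illustrative): with the pragma in effect, the literal below is one character rather than two bytes.

```perl
use strict;
use warnings;
use utf8;    # tells perl the source code itself is UTF-8 (assumes the file is saved that way)

my $char = 'é';             # a single character, U+00E9, because of "use utf8"
my $len  = length $char;    # length counts characters here, not bytes

# Encode characters back to UTF-8 bytes when printing.
binmode STDOUT, ':encoding(UTF-8)';
print "length: $len\n";
```

Without use utf8, the same literal would be read as the two bytes 0xC3 0xA9 and length would report 2.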

If the string comes from a filehandle,

open my $FH, '<:utf8', ...
If the encoding is different, you can replace utf8 with encoding(iso-8859-2) etc.
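
A small sketch of the filehandle case. To stay self-contained it reads from an in-memory string of bytes instead of a real file, which is an assumption made for illustration only; the stricter :encoding(UTF-8) layer is used in place of :utf8.

```perl
use strict;
use warnings;

# The UTF-8 bytes for "é" (0xC3 0xA9), standing in for a file's contents.
my $bytes = "\xC3\xA9\n";

# The encoding layer decodes bytes into characters as they are read.
open my $FH, '<:encoding(UTF-8)', \$bytes or die "Cannot open: $!";
my $line = <$FH>;
close $FH;
chomp $line;

# Encode characters back to UTF-8 bytes on output.
binmode STDOUT, ':encoding(UTF-8)';
printf "characters: %d, code point: U+%04X\n", length($line), ord($line);
```

For data in another encoding, the layer name changes accordingly, for example '<:encoding(iso-8859-2)'.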

If the string comes from DBI, your driver might support an encoding option (for example, the Postgres driver's connect supports the pg_enable_utf8 attribute).

And so on.

Replies are listed 'Best First'.
Re^2: "ISO-8859-1 0x80-0xFF" and chr()
by remiah (Hermit) on Mar 24, 2012 at 03:07 UTC

    Thanks for the reply.

    The problematic string came from chr(). I didn't know I could paste 'é' at PerlMonks, so I tried to create it with chr(hex()), and I stumbled.

    The OP of the thread "Bug in Template?" said he decodes with the database driver and prints the result in Template like this:

        my $t = Template->new();
        $t->process("his.tmpl", {lines=>\@vars}, "output.html")
            or die $t->error();

    Template wants encoded bytes, not decoded characters. This prints "#�#".

        #!/usr/bin/perl
        use strict;
        use warnings;
        use Encode qw(decode encode);
        use Template;

        my($a,$decoded);
        #input bytes to $a
        $a=`perl -CS -e "use utf8;print 'é'"`;
        #decode it to character
        $decoded=decode('UTF-8', $a);

        #this will print replacement character to test_out1.html
        my $t=Template->new();
        $t->process("test.tmpl",{a=>$decoded},"test_out1.html");

    And below is the template for that.

        <html>
        <head>
        <meta HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
        </head>
        <body>
        #[% a %]#
        </body>
        </html>

    Encoding the decoded string back to bytes works.

        #!/usr/bin/perl
        use strict;
        use warnings;
        use Encode qw(decode encode);
        use Template;

        my($a,$decoded,$encoded);
        #input bytes to $a
        $a=`perl -CS -e "use utf8;print 'é'"`;
        #decode it to character
        $decoded=decode('UTF-8', $a);
        $encoded=encode('UTF-8', $decoded);

        #this is good
        my $t=Template->new();
        $t->process("test.tmpl",{a=>$encoded},"test_out2.html");

    There seems to be huge confusion around Template Toolkit's encoding problem here in Japan. My conclusion so far: "pass encoded bytes to Template, not decoded characters".

      Hmm
      $ perldoc template |grep -i utf8 -C2
          Alternately, the "binmode" argument can specify a particular IO layer
          such as ":utf8".

              $tt->process($infile, $vars, $outfile, binmode => ':utf8')
                  || die $tt->error(), "\n";

        oh! I changed my conclusion now...

        $t->process("test.tmpl",{a=>$decoded},"test_out4.html",binmode=>":encoding(UTF-8)");

Re^2: "ISO-8859-1 0x80-0xFF" and chr()
by Anonymous Monk on Mar 23, 2012 at 17:29 UTC
