PerlMonks  

Re: "ISO-8859-1 0x80-0xFF" and chr()

by moritz (Cardinal)
on Mar 23, 2012 at 12:45 UTC (#961202)


in reply to "ISO-8859-1 0x80-0xFF" and chr()

1. chr() returns characters, not bytes. (silly me)

While "bytes" versus "characters" is a useful mental model, it's not always accurate: the operation defines the context. For example, uc interprets a string as text no matter what, whereas print interprets a string as bytes (if it can).

The real problem is that the byte 0xe9 cannot be decoded as UTF-8, because on its own it isn't valid UTF-8. Either do nothing with it (which works on sufficiently modern perls), or decode it as Latin-1, because Latin-1 (aka ISO-8859-1) maps each byte to the codepoint with exactly the same number.
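To make the Latin-1 point concrete, here is a minimal sketch using the core Encode module. The variable names are only illustrative, and FB_CROAK is used so that the failed UTF-8 decode surfaces as an exception instead of a silent substitution character:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

my $byte = "\xE9";    # a single byte, 0xE9

# Decoding as Latin-1 maps the byte to the codepoint with the same number.
my $char = decode('ISO-8859-1', $byte);
printf "codepoint: U+%04X\n", ord $char;

# A lone 0xE9 is not well-formed UTF-8; with FB_CROAK the decode dies.
# LEAVE_SRC keeps decode() from modifying $byte in place.
my $ok = eval {
    decode('UTF-8', $byte, Encode::FB_CROAK() | Encode::LEAVE_SRC());
    1;
};
print "UTF-8 decode ", ($ok ? "succeeded" : "failed"), "\n";

# Encoding the character as UTF-8 gives a two-byte sequence.
my $utf8 = encode('UTF-8', $char);
printf "UTF-8 bytes: %s\n",
    join ' ', map { sprintf '%02X', ord } split //, $utf8;
```

Run as a script, this should report U+00E9 for the codepoint, a failed UTF-8 decode, and the bytes C3 A9 for the encoded form.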

Note that instead of calling encode() on each output string, you can also set an I/O layer that does it automatically:

binmode STDOUT, ':encoding(UTF-8)';

Or on the command line, you can set that up with the -C option:

$ perl -CS -wE 'say chr hex "E9"'
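What an output encoding layer does can be inspected with a small sketch like the following. It assumes an in-memory filehandle standing in for STDOUT, purely so that the resulting bytes can be examined; the names $buffer and $out are made up for the illustration:

```perl
use strict;
use warnings;

my $char = chr hex 'E9';    # the character U+00E9

# Print through a UTF-8 encoding layer into an in-memory buffer,
# the same kind of layer that the binmode call puts on STDOUT.
my $buffer = '';
open my $out, '>:encoding(UTF-8)', \$buffer
    or die "cannot open in-memory handle: $!";
print {$out} $char;
close $out or die "close failed: $!";

printf "%d character in, %d bytes out: %s\n",
    length $char, length $buffer,
    join ' ', map { sprintf '%02X', ord } split //, $buffer;
```

The one character goes in and two bytes (C3 A9) come out, with no explicit encode() call anywhere in the code.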


Re^2: "ISO-8859-1 0x80-0xFF" and chr()
by remiah (Hermit) on Mar 24, 2012 at 04:16 UTC

    Thanks for the reply, moritz.

    I was careless about the difference between "utf8" and "UTF-8" before I read that document. moritz seems to be a careful person. And the -CS option is very useful.
