PerlMonks  

Re: "ISO-8859-1 0x80-0xFF" and chr()

by moritz (Cardinal)
on Mar 23, 2012 at 12:45 UTC (#961202)


in reply to "ISO-8859-1 0x80-0xFF" and chr()

1. chr() returns characters, not bytes. (silly me)

While thinking in terms of "bytes" and "characters" is a useful mental model, it's not always accurate: the operation defines the context. For example, uc interprets a string as text no matter what, whereas print interprets a string as bytes (if it can).

The real problem is that the byte 0xe9 cannot be decoded as UTF-8, because on its own it isn't valid UTF-8. Either do nothing with it (which works on sufficiently modern perls), or decode it as Latin-1, because Latin-1 (aka ISO-8859-1) maps each byte to the codepoint with the same number.
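To make that concrete, here is a minimal sketch using the core Encode module. The variable names are only for illustration, and it assumes Encode's strict 'UTF-8' decoder together with the FB_CROAK check, so that invalid input dies instead of being silently replaced:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

my $byte = "\xE9";                        # one Latin-1 byte (e with acute accent)

# Latin-1 maps each byte to the codepoint with the same number.
my $char = decode('ISO-8859-1', $byte);
printf "U+%04X\n", ord $char;

# The same lone byte is not valid UTF-8. With FB_CROAK, decode() dies,
# and it may modify its input, so work on a copy.
my $copy = $byte;
my $ok   = eval { decode('UTF-8', $copy, Encode::FB_CROAK); 1 };
print $ok ? "decoded as UTF-8\n" : "not valid UTF-8\n";

# Encoding the character as UTF-8 yields two bytes.
my $utf8 = encode('UTF-8', $char);
print join(' ', map { sprintf '%02X', ord } split //, $utf8), "\n";
```

The point of the last step is that the character U+00E9 and its UTF-8 encoding are different things: one codepoint, but two bytes on the wire.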

Note that instead of calling encode() on each output string, you can also set an IO layer which does it automatically:

binmode STDOUT, ':encoding(UTF-8)';

Or on the command line, you can set that up with the -C option:

$ perl -CS -wE 'say chr hex "E9"'
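If you want to see what such an output layer does without depending on your terminal, one way (a sketch, not the only way) is to attach the same layer to an in-memory filehandle and inspect the bytes that arrive. The $buffer and $out names are made up for this example:

```perl
use strict;
use warnings;

my $char = chr hex 'E9';                  # the character U+00E9

# Open an in-memory handle with the same layer one would set on STDOUT.
my $buffer = '';
open my $out, '>:encoding(UTF-8)', \$buffer
    or die "cannot open in-memory handle: $!";
print {$out} $char;
close $out or die "close failed: $!";

# The layer encoded the character on output; show the resulting bytes in hex.
my $hex = join ' ', map { sprintf '%02X', ord } split //, $buffer;
print "$hex\n";
```

No explicit encode() call is needed here, because the layer does the encoding each time something is printed to the handle.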


Re^2: "ISO-8859-1 0x80-0xFF" and chr()
by remiah (Hermit) on Mar 24, 2012 at 04:16 UTC

    Thanks for the reply, moritz.

    I was careless about the difference between "utf8" and "UTF-8" before I read that document. moritz seems to be a careful person. And the -CS option is very useful.
