
Re: minimalist perl-utf8 question

by mbethke (Hermit)
on Feb 02, 2013 at 06:54 UTC (#1016657)

in reply to minimalist perl-utf8 question

This is one of the nasty interactions you can get when so many components (editor, programming language, terminal) work together to interpret and re-interpret octet sequences :) quester has already explained how to fix it, so here is why it happens:

In the first program Perl doesn't interpret much about the embedded byte string. It sees that it's 10 bytes long and writes the raw bytes to STDOUT, where the terminal picks them up, recognizes that they're valid UTF-8, and displays them as such. use utf8, on the other hand, tells Perl to interpret the 10 bytes as UTF-8 characters, so it realizes the string is actually only 5 characters long. But when printing them, Perl forces them back to Latin-1 unless you set an encoding layer on STDOUT with binmode(), and the terminal, expecting UTF-8, cannot parse the Latin-1 bytes correctly.
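A minimal sketch of the byte-versus-character difference described above. The five-é string is a hypothetical stand-in for the one in the original question, and decode() from the core Encode module is used here only to imitate, roughly, what use utf8 does to a literal in the source file:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical stand-in for the original string: five "é", written as raw
# UTF-8 octets so the example does not depend on how this file is saved.
my $bytes = "\xC3\xA9" x 5;

# Without "use utf8", Perl sees ten separate bytes.
my $byte_len = length $bytes;

# After decoding, Perl sees five characters instead.
my $chars    = decode('UTF-8', $bytes);
my $char_len = length $chars;

# What a handle without an encoding layer would emit for $chars:
# one Latin-1 byte (0xE9) per character, which is not valid UTF-8.
my $latin1 = encode('ISO-8859-1', $chars);

# With an encoding layer, the characters go out as UTF-8 again.
binmode(STDOUT, ':encoding(UTF-8)');
print "$byte_len bytes, $char_len characters: $chars\n";
```

The binmode() call is the part that matters in practice: it tells Perl which encoding to use when the characters leave the program.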

Edit: you were unlucky in that é is a valid character in Latin-1, so it can be "downgraded" to a 1-byte encoding without complaint. Had your string contained a character outside that range (\x{123} works fine), you'd have gotten a "Wide character in print" warning.
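To see the difference between the two cases, something like the following should do. It is only a sketch: print_raw is a made-up helper, and it prints to an in-memory handle (no encoding layer) rather than the real STDOUT so the warnings can be collected:

```perl
use strict;
use warnings;

# Hypothetical helper: print $text to an in-memory handle that has no
# encoding layer, and return the bytes written plus any warnings raised.
sub print_raw {
    my ($text) = @_;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    open my $fh, '>', \my $buffer or die "Cannot open in-memory handle: $!";
    print {$fh} $text;
    close $fh;
    return ($buffer, @warnings);
}

my ($latin, @latin_warnings) = print_raw("\x{E9}");    # é fits in Latin-1
my ($wide,  @wide_warnings)  = print_raw("\x{123}");   # this one does not

printf "U+00E9: %d warning(s)\n", scalar @latin_warnings;
printf "U+0123: %d warning(s)\n", scalar @wide_warnings;
print  "        $wide_warnings[0]" if @wide_warnings;
```

The first call goes through silently as a single byte, which is exactly why the original problem was so hard to spot.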
