minimalist perl-utf8 question
by didess (Sexton) on Feb 02, 2013 at 06:14 UTC
didess has asked for the
wisdom of the Perl Monks concerning the following question:
Hi to all !
I thought I'd understood a little about character encoding ... but now I'm completely lost:
Given these 2 minimal "scripts" (you should see 5 e-acutes in the strings):
When I run the first one, I see 5 splendid e-acutes, but the length is reported as 10.
When I run the second one, I see 5 question marks, but the length is 5.
All this is done on a MacBook Pro with perl 5.14.2 (locale on the next lines), in the Terminal window. The Terminal preferences are set to "UTF-8" encoding (that's why cat gives good results).
Next line: a hexadecimal dump of the "print" line. One clearly sees it's UTF-8 encoded (0xa9c3 is e-acute: the bytes 0xC3 0xA9 displayed as one little-endian 16-bit word).
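To illustrate why a dump can show "a9c3" although e-acute is stored in UTF-8 as the bytes C3 A9, here is a small sketch. It assumes the dump tool groups bytes into little-endian 16-bit words, which is a guess about the tool used; the variable names are only illustrative.

```perl
use strict;
use warnings;

# UTF-8 encoding of e-acute (U+00E9): two bytes, 0xC3 then 0xA9.
my $e_acute_bytes = "\xC3\xA9";

# Read those two bytes as one little-endian 16-bit word, the way a
# word-oriented hex dump is assumed to display them.
my $word = sprintf '%04x', unpack('v', $e_acute_bytes);

# Show the same two bytes one at a time, in file order.
my $byte_order = join ' ', map { sprintf '%02x', ord } split //, $e_acute_bytes;

print "as 16-bit word: $word\n";
print "byte by byte:   $byte_order\n";
```

So the order of the bytes in the file is C3 A9; only the word-wise display swaps them.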
Any idea or explanation is welcome !!
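One possible explanation, sketched under assumptions because the two scripts are not reproduced above: the symptoms match a first script without `use utf8` (the literal is seen as 10 bytes) and a second script with `use utf8` but no encoding on output (5 characters, each printed as the single byte 0xE9, which a UTF-8 terminal cannot display). The sketch below builds both kinds of string from escapes, so it does not depend on how the file itself is saved; the variable names are hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Five e-acutes as raw UTF-8 bytes (0xC3 0xA9 each). This is what Perl
# sees in a literal when the source is UTF-8 and `use utf8` is absent.
my $bytes = "\xC3\xA9" x 5;

# The same text decoded into five characters (U+00E9 each). This is what
# Perl sees in a literal when `use utf8` is in effect.
my $chars = decode('UTF-8', $bytes);

my $byte_len = length $bytes;   # length of the undecoded string
my $char_len = length $chars;   # length of the decoded string

# Encode explicitly before printing so a UTF-8 terminal gets valid UTF-8.
my $out = encode('UTF-8', $chars);

print "length of byte string:      $byte_len\n";
print "length of character string: $char_len\n";
print $out, "\n";
```

Under these assumptions, getting both the right length and the right display means decoding on input (or `use utf8` for literals) and encoding on output, for example with `binmode(STDOUT, ':encoding(UTF-8)')` instead of the explicit `encode` call.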