PerlMonks  


Sorry, I realize I wasn't specific enough:
I've read about Encode and used it successfully in a previous project. I also know that streams need to be decoded on input and encoded on output. However, it seemed to me that Perl does some of this job itself: when I tried to explicitly decode data from standard input or from command-line arguments, I got strange results.

So, is there some place where it is explicitly stated what perl converts transparently, and what it doesn't?
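For context, here is a minimal sketch of the byte/character distinction behind that question. It assumes the default behaviour (no -C switch, no PERL_UNICODE, no "use open"), in which command-line arguments and data read from STDIN arrive as undecoded bytes. The variable $raw is a stand-in for such input, not real program input.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Assumed input, for illustration only: the two raw UTF-8 bytes of "é",
# as they would sit in $ARGV[0] or in a line read from STDIN by default.
my $raw = "\xC3\xA9";

my $raw_length     = length $raw;            # counts bytes: 2
my $decoded        = decode('UTF-8', $raw);  # bytes -> characters
my $decoded_length = length $decoded;        # counts characters: 1

printf "raw: length %d, decoded: length %d, code point U+%04X\n",
    $raw_length, $decoded_length, ord $decoded;
# prints: raw: length 2, decoded: length 1, code point U+00E9
```

If a layer such as binmode(STDIN, ':encoding(UTF-8)') or the -CA / -CS switches are already in effect, the data is decoded once by Perl, and decoding it a second time by hand is one plausible source of strange results.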

Furthermore, even though I didn't encode or decode the streams, shouldn't it "just work" if the scalar value is specified in UTF-8 (because the file is encoded as such), Perl is AWARE that it is UTF-8 (because of 'use utf8;'), Perl stores it internally in UTF-8, and the expected output format is UTF-8 too?

I'm pretty sure there is a catch I haven't figured out here, but pointing it out to me, even if it's obvious, would help. Thanks!

EDIT: I've run a short test, using a Latin-1 terminal (this test script is fully encoded in UTF-8):

#!/usr/bin/perl
use utf8;
use Encode;
$\ = "\n";
my $unicodeScalar = "Je suis une chaîne accentuée là où il faut.";
print '['.Encode::is_utf8($unicodeScalar).'] '.$unicodeScalar;

Using my Latin-1 terminal, I displayed the source file and, sure enough, the contents were garbled (two strange bytes for each accented character, which confirmed to me that the file was truly UTF-8). Then I ran the script, and I got a perfect display.

So, does Perl assume by default, even in a UTF-8 environment, that it should output everything in Latin-1?
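The observation can be reproduced without a terminal. The sketch below prints a character string to an in-memory filehandle twice, once with no encoding layer and once with an explicit :encoding(UTF-8) layer, and shows the bytes that come out. The helper bytes_written and the sample word are made up for this illustration. It relies on the string containing only characters below U+0100; with wider characters and no layer, Perl warns "Wide character in print" and the output differs.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;    # the literal below is decoded into characters at compile time

my $text = "chaîne";    # 6 characters; î is U+00EE

# Hypothetical helper: print $string through the given PerlIO layer into an
# in-memory buffer and return the resulting bytes as hex.
sub bytes_written {
    my ($layer, $string) = @_;
    my $buffer = '';
    open(my $fh, ">$layer", \$buffer) or die "open failed: $!";
    print {$fh} $string;
    close($fh);
    return join ' ', map { sprintf '%02X', ord } split //, $buffer;
}

my $no_layer   = bytes_written('', $text);
my $utf8_layer = bytes_written(':encoding(UTF-8)', $text);

print "no layer:    $no_layer\n";      # 63 68 61 EE 6E 65    (î as one Latin-1 byte)
print "UTF-8 layer: $utf8_layer\n";    # 63 68 61 C3 AE 6E 65 (î as two UTF-8 bytes)
```

So with no layer on the handle, each character below U+0100 is written as a single byte, which a Latin-1 terminal displays correctly. Adding binmode(STDOUT, ':encoding(UTF-8)') to the test script would be one way to get UTF-8 output instead.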


In reply to Re^2: Default encoding rules leave me puzzled... (use open qw/ :std :locale /; by kzwix
in thread Default encoding rules leave me puzzled... by kzwix
