<?xml version="1.0" encoding="windows-1252"?>
<node id="929113" title="Re^3: Why Doesn't Text::CSV_XS Print Valid UTF-8 Text When Used With the open Pragma?" created="2011-10-02 02:59:15" updated="2011-10-02 02:59:15">
<type id="11">
note</type>
<author id="171588">
BrowserUk</author>
<data>
<field name="doctext">

&lt;p&gt;Really? It should:
&lt;code&gt;
-C [number/list]

The -C flag controls some of the Perl Unicode features.

As of 5.8.1, the -C can be followed either by a number or a list of option letters. The letters, their numeric values, and effects are as follows; listing the letters is equal to summing the numbers.
    I     1   STDIN is assumed to be in UTF-8
    O     2   STDOUT will be in UTF-8
    E     4   STDERR will be in UTF-8
    S     7   I + O + E
    i     8   UTF-8 is the default PerlIO layer for input streams
    o    16   UTF-8 is the default PerlIO layer for output streams
    D    24   i + o
    A    32   the @ARGV elements are expected to be strings encoded
              in UTF-8
    L    64   normally the "IOEioA" are unconditional,
              the L makes them conditional on the locale environment
              variables (the LC_ALL, LC_CTYPE, and LANG, in the order
              of decreasing precedence) -- if the variables indicate
              UTF-8, then the selected "IOEioA" are in effect
    a   256   Set ${^UTF8CACHE} to -1, to run the UTF-8 caching code in
              debugging mode.

For example, -COE and -C6 will both turn on UTF-8-ness on both STDOUT and STDERR. Repeating letters is just redundant, not cumulative nor toggling.

The io options mean that any subsequent open() (or similar I/O operations) will have the :utf8 PerlIO layer implicitly applied to them, in other words, UTF-8 is expected from any input stream, and UTF-8 is produced to any output stream. This is just the default, with explicit layers in open() and with binmode() one can manipulate streams as usual.

-C on its own (not followed by any number or option list), or the empty string "" for the PERL_UNICODE environment variable, has the same effect as -CSDL. In other words, the standard I/O handles and the default open() layer are UTF-8-fied but only if the locale environment variables indicate a UTF-8 locale. This behaviour follows the implicit (and problematic) UTF-8 behaviour of Perl 5.8.0.

&lt;/code&gt;
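&lt;p&gt;To make "listing the letters is equal to summing the numbers" concrete, here is a minimal sketch that turns a letter list into its number. The helper name c_flags_to_number is my own invention, not part of perl; the values are simply copied from the table above. It ORs the bits rather than adding them, which is why overlapping or repeated letters (say, -CSO) stay redundant instead of accumulating:
&lt;code&gt;
use strict;
use warnings;

# Bit values of the -C option letters, copied from the perlrun table.
my %flag = qw(I 1 O 2 E 4 S 7 i 8 o 16 D 24 A 32 L 64 a 256);

# Hypothetical helper (not a perl builtin): letter list to number.
sub c_flags_to_number {
    my ($letters) = @_;
    my $n = 0;
    for my $l (split //, $letters) {
        die "unknown -C letter '$l'\n" unless exists $flag{$l};
        $n |= 0 + $flag{$l};    # bitwise OR: repeats change nothing
    }
    return $n;
}

print "-C$_ = -C", c_flags_to_number($_), "\n" for qw(OE SD SDL SO);
&lt;/code&gt;
&lt;p&gt;That should print 6 for OE, 31 for SD, 95 for SDL and 7 for SO.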


&lt;p&gt;If it doesn't work that way for you, maybe you should report the error with perlbug.
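&lt;p&gt;Before filing a report, it may be worth checking what perl itself thinks -C was set to. A minimal sketch, assuming you save it as check_c.pl (a made-up name) and run it the same way as the failing script, e.g. perl -CSD check_c.pl:
&lt;code&gt;
use strict;
use warnings;

# ${^UNICODE} reflects the -C setting perl was started with.
print "-C value: ", ${^UNICODE}, "\n";

# PerlIO::get_layers lists the layers on a handle; with -CS you would
# expect utf8 to show up on all three standard handles.
for my $h (['STDIN', \*STDIN], ['STDOUT', \*STDOUT], ['STDERR', \*STDERR]) {
    my ($name, $fh) = @$h;
    print $name, ': ', join(' ', PerlIO::get_layers($fh)), "\n";
}
&lt;/code&gt;
&lt;p&gt;If the value printed is 0, or utf8 is missing from the layers, the switch never took effect, and that output is worth including in the report.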

&lt;div class="pmsig"&gt;&lt;div class="pmsig-171588"&gt;
&lt;hr /&gt;
&lt;font size=1 &gt;
&lt;div&gt;Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.&lt;/div&gt;
&lt;div&gt;"Science is about questioning the status quo. Questioning authority". &lt;/div&gt;
&lt;div&gt;In the absence of evidence, opinion is indistinguishable from prejudice.&lt;/div&gt;
&lt;/font&gt;

&lt;/div&gt;&lt;/div&gt;</field>
<field name="root_node">
929103</field>
<field name="parent_node">
929110</field>
</data>
</node>
