<?xml version="1.0" encoding="windows-1252"?>
<node id="1008017" title="Peculiar Reference To U+00FE In Text::CSV_XS Documentation" created="2012-12-09 20:50:27" updated="2012-12-09 20:50:27">
<type id="115">
perlquestion</type>
<author id="546548">
Jim</author>
<data>
<field name="doctext">
&lt;p&gt;In the documentation of [href://http://search.cpan.org/~hmbrand/Text-CSV_XS-0.94/CSV_XS.pm#SPECIFICATION|Text::CSV_XS], there's a peculiar reference to what seems like a very special case:&lt;/p&gt;

&lt;blockquote&gt;The separation-, escape- &amp;#91;&lt;i&gt;sic&lt;/i&gt;&amp;#93;, and escape- characters can be any ASCII character in the range from 0x20 (space) to 0x7E (tilde). Characters outside this range may or may not work as expected. &amp;hellip; If you use perl-5.8.2 or higher, these three attributes are utf8-decoded, to increase the likelihood of success. &lt;b&gt;This way U+00FE will be allowed as a quote character.&lt;/b&gt; &amp;#91;My emphasis.&amp;#93;&lt;/blockquote&gt;

&lt;p&gt;Why is this particular Unicode character, &lt;tt&gt;LATIN SMALL LETTER THORN&lt;/tt&gt;, singled out for special mention in the documentation? And why does it state that "&amp;#91;c&amp;#93;haracters outside &amp;#91;the range from 0x20 through 0x7E&amp;#93; may or may not work as expected"? When &lt;i&gt;might&lt;/i&gt; they work?&lt;/p&gt;

&lt;p&gt;The explicit mention of &lt;tt&gt;U+00FE&lt;/tt&gt; in the documentation implies that [mod://Text::CSV_XS] can be used to parse CSV records in Unicode Concordance DAT files. If so, I want to learn how to do it. (See my earlier post titled [1007942|Best Way To Parse Concordance DAT File Using Modern Perl?].)&lt;/p&gt;

&lt;p&gt;Jim&lt;/p&gt;
</field>
</data>
</node>
