http://www.perlmonks.org?node_id=1061973


in reply to Text::CSV and Unicode

This has come up before, and as of Text::CSV_XS version 1.00 the behavior is consistent. It does not, however, meet your current needs. I uploaded version 1.02 a minute ago; it has a new attribute, decode_utf8, that lets you disable the default behavior (which has proven to be what most people want and expect).

decode_utf8

    This attribute defaults to TRUE.

    While parsing, fields that are valid UTF-8 are automatically set to be UTF-8, so that

        $csv->parse ("\xC4\xA8\n");

    results in

        PV("\304\250"\0) [UTF8 "\x{128}"]

    Sometimes this might not be the desired action. To prevent those upgrades, set this attribute to false, and the result will be

        PV("\304\250"\0)
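To make the difference concrete, here is a minimal sketch that parses the same two bytes with and without decode_utf8. It assumes Text::CSV_XS 1.02 or later is installed; the variable names and the use of binary => 1 are my own choices for illustration, not part of the documentation quoted above.

```perl
use strict;
use warnings;
use Text::CSV_XS 1.02;    # decode_utf8 is new in 1.02

# The bytes 0xC4 0xA8 are the UTF-8 encoding of U+0128.
my $line = "\xC4\xA8\n";

# Default (decode_utf8 is true): a valid UTF-8 field gets the UTF8 flag.
my $csv_default = Text::CSV_XS->new ({ binary => 1 });
$csv_default->parse ($line) or die "parse failed";
my ($decoded) = $csv_default->fields;

# decode_utf8 => 0: the field is left as raw bytes.
my $csv_raw = Text::CSV_XS->new ({ binary => 1, decode_utf8 => 0 });
$csv_raw->parse ($line) or die "parse failed";
my ($raw) = $csv_raw->fields;

# Report the UTF8 flag and the length as Perl sees each field.
printf "default: utf8=%d length=%d\n",
    (utf8::is_utf8 ($decoded) ? 1 : 0), length $decoded;
printf "raw:     utf8=%d length=%d\n",
    (utf8::is_utf8 ($raw) ? 1 : 0), length $raw;
```

With the flag set, Perl treats the field as a single character; without it, the field stays a two-byte string, which is what you want when the CSV carries binary data.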

I realize that "most people" is not "all people", and I cannot choose a default that makes everyone happy. That is also why I waited with 1.02. I asked many users what the default should be, checked the historical entries in RT and my mail, and came to the conclusion that nowadays the majority works with UTF8 CSV more than with binary CSV. The change in 1.00 was not to enable or disable UTF-8; it was to make it work more consistently.


Enjoy, Have FUN! H.Merijn

Re^2: Text::CSV and Unicode
by vsespb (Chaplain) on Nov 11, 2013 at 11:00 UTC
    OK, thank you! I will try the new version.

    "conclusion that nowadays the majority works with UTF8 CSV"

    That's no problem; I just wanted this behaviour to be clearly documented (I did not want to rely on something undocumented).
Re^2: Text::CSV and Unicode
by Jim (Curate) on Nov 11, 2013 at 17:59 UTC
    Oddly, my similar issues with Text::CSV were resolved when I simply installed Text::CSV_XS. There must be a shared library that gets updated or something...