Tux, thanks for the code. It looks like it will work, and I'll try it out tomorrow. I've developed with Unicode in many programming languages, and what confuses me is the term "binary". I'm not sure whether it is unique to Text::CSV_XS or whether it applies to all Perl modules, but it is a strange term to use when speaking of Unicode characters. So I looked up what is really meant by binary. According to the Text::CSV_XS documentation, a field is binary when any byte in the range
\x00-\x08,\x10-\x1F,\x7F-\xFF is found, but the range you specified in the trigger is (c >= 0x7f && c <= 0xa0). The documentation goes on to say: "If a string is marked UTF8, binary will be turned on automatically when binary characters other than CR or NL are encountered. Note that a simple string like "\x{00a0}" might still be binary, but not marked UTF8, so setting { binary => 1 } is still a wise option." I was using binary => 1 along with quote_space => 0, so from a Unicode development standpoint it is strange that the results differ for a simple CSV file with just one row and one field, such as
This is X test
If X is a printable character in the range 0x0000 to 0x00FF, quote_space => 0 works as expected and the field has no double quotes in the output, but if X is a printable character above 0x00FF, the field has double quotes around it. Unicode developers expect quote_space => 0 to behave the same way for all printable characters. To Unicode people, a letter is a letter regardless of which script (Hangul, Han, Cyrillic, Hebrew, etc.) it comes from. It is even stranger that quote_space => 0 works for characters in the Basic Latin and Latin-1 Supplement blocks but starts to act differently once you reach the Latin Extended-A block.
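For anyone who wants to try this, here is a minimal sketch of the test I am describing. It assumes Text::CSV_XS is installed; the helper name csv_line and the sample code points are only my choices for illustration, and whether the non-ASCII rows come out quoted will depend on the module version.

```perl
use strict;
use warnings;
use Text::CSV_XS;

binmode STDOUT, ":encoding(UTF-8)";

# Same settings as in my script: binary on, no quoting just because of spaces.
my $csv = Text::CSV_XS->new({ binary => 1, quote_space => 0 })
    or die "Cannot create Text::CSV_XS object: " . Text::CSV_XS->error_diag;

# csv_line is a hypothetical helper: it builds the one-field row
# "This is X test" for a given character X and returns the CSV output.
sub csv_line {
    my ($x) = @_;
    $csv->combine("This is $x test") or die "combine failed";
    return $csv->string;
}

# One sample per Unicode block mentioned above (illustrative picks).
my @samples = (
    [ "U+0041 Basic Latin",        "\x{0041}" ],
    [ "U+00E9 Latin-1 Supplement", "\x{00E9}" ],
    [ "U+0100 Latin Extended-A",   "\x{0100}" ],
    [ "U+05D0 Hebrew",             "\x{05D0}" ],
);

for my $sample (@samples) {
    my ($label, $char) = @$sample;
    printf "%-28s %s\n", $label, csv_line($char);
}
```

Comparing the rows shows at a glance which code points cause the field to be wrapped in double quotes.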
Again, thanks for going the extra mile and providing the new constructor.