I don't plan to spend time investigating this particular mystery.
If you're someone who has a deep understanding of how the Perl programming language works and the know-how to help fix it when it's broken—and I suspect you are—then perhaps you should spend time investigating this particular mystery. If you help make Perl more intuitive to use ("DWIM"), then you improve the language, which benefits the Perl community.
Posting snarky, condescending responses to the earnest inquiries of casual Perl programmers on PerlMonks doesn't improve the language or help the Perl community, and so isn't the best use of a Perl expert's time. It's especially unhelpful if the obtuse point one is trying to make turns out to be wrong.
I doubt the original poster's expressed desire for ignorance will lead to success when dealing with UTF-8 streams. Unfortunately, UTF-8 was defined in a way and supported by Unix (and Perl) in ways that make handling it correctly very often require significant diving into a lot of details.
You're right that grappling with Unicode in Perl is too often unduly tricky and obscure. But in most cases, as in this case, simple, ordinary tasks should be more straightforward. After all, "Easy things should be easy and hard things should be possible." Reading and writing trivial CSV records encoded in UTF-8 is most assuredly an "easy thing," not a "hard thing," isn't it?
(I'm the original poster, and I'm using Windows, not Unix. I made this clear in my original post. Also, UTF-8 is an ingenious encoding scheme that accomplishes its multiple objectives brilliantly. Nothing in its definition makes handling it correctly, with modern programming languages and software libraries, inevitably harder than handling text in any other character encoding. You can't blame Unicode here.)