The problem isn't one of characters versus bytes. The problem is the definition of character in the context of Unicode text. The scalar reverse function and other built-in string functions operate on Unicode text using a naïve and inadequate definition of character. Pointing this out and offering a workaround is the raison d'être of moritz's 2008 tutorial.
The issue of what reverse does when fed, say, the bytes of a JPEG image is utterly irrelevant to this discussion, which is about Unicode text. I don't understand ikegami's insistence on trying to fold unrelated contexts into this discussion. Your reply dramatizes how ikegami's contrarian non sequitur needlessly confused the simple and self-evident conclusion I made in my post.
Here's what I wrote:
The documentation of Perl's reverse function states: "In scalar context, [the reverse function] ... returns a string value with all characters in the opposite order." But it doesn't, at least not for a sufficiently modern, multilingual, Unicode-conformant definition of "character." It reverses Unicode code points, not characters in the usual, well-understood sense of the word.
One or the other is wrong: the behavior of the reverse function or the reverse function's documentation.
If I understand the design principles of Perl correctly, the reverse function should properly reverse extended grapheme clusters when the thing being reversed is Unicode text (and Perl understands it is Unicode text), and it should reverse bytes otherwise.
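To make the distinction concrete, here is a minimal sketch of the two behaviors. The sample string and the helper name reverse_graphemes are mine, chosen only for illustration; the helper relies on the \X regex escape, which matches one extended grapheme cluster, and I'm assuming a Perl recent enough (5.12 or later) for \X to follow the Unicode definition.

```perl
use strict;
use warnings;

# Sample text (an assumption for illustration): "e" followed by
# U+0301 COMBINING ACUTE ACCENT, then "x". A reader sees two
# characters, but the string holds three code points.
my $text = "e\x{301}x";

# Built-in scalar reverse works code point by code point, so the
# combining accent ends up attached to the "x" instead of the "e".
my $by_code_point = scalar reverse $text;

# Hypothetical helper: split the string into extended grapheme
# clusters with \X, reverse the list, and join it back together.
sub reverse_graphemes {
    my ($str) = @_;
    return join '', reverse $str =~ /\X/g;
}

my $by_grapheme = reverse_graphemes($text);

# Print the code points of each result so the difference is visible
# without depending on the terminal's encoding.
for my $pair ( [ 'code points', $by_code_point ], [ 'graphemes', $by_grapheme ] ) {
    my ( $label, $str ) = @$pair;
    printf "%-12s %s\n", $label,
        join ' ', map { sprintf 'U+%04X', ord } split //, $str;
}
```

The first line of output lists U+0078 U+0301 U+0065, the second U+0078 U+0065 U+0301. Only the second keeps the accent on the "e", which is what I would expect "characters in the opposite order" to mean.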