The problem is the definition of character in the context of Unicode text.

No, I fully agree with you on the definition of character in the context of Unicode text.

At issue is that reverse cannot recognise the presence of Unicode text. How do you think reverse can tell the difference between chr(113).chr(101).chr(769) and "qe\N{COMBINING ACUTE ACCENT}"?
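To make the question concrete, here is a minimal sketch (the variable names $samples and $text are only illustrative) showing that the two expressions build the same three-character string, so nothing in the value itself says whether it holds Unicode text or three unrelated numbers:

```perl
use strict;
use warnings;
use charnames qw( :full );

# Two ways of building what is, to Perl, the very same string.
my $samples = chr(113) . chr(101) . chr(769);      # three "samples"
my $text    = "qe\N{COMBINING ACUTE ACCENT}";      # Unicode text

# Both comparisons treat the strings as identical.
printf "equal:  %s\n", $samples eq $text ? 'yes' : 'no';
printf "length: %d and %d\n", length($samples), length($text);

# List the code points of each string (as numbers, so no wide characters are printed).
printf "%s\n", join ' ', map { sprintf 'U+%04X', ord } split //, $samples;
printf "%s\n", join ' ', map { sprintf 'U+%04X', ord } split //, $text;
```

Any function that receives either value sees the same sequence of code points; it has no flag to consult that would tell it which interpretation the caller intended.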

It can either always treat the string as Unicode text, or never. Currently, it never does. To change that is backwards incompatible, so you'd have to demonstrate a bug in order to change that behaviour.

use strict;
use warnings;
use charnames qw( :full );

# What reverse does today.
sub current_reverse {
    return reverse(@_);
}

# Reverses code point by code point.
sub string_reverse {
    return reverse(@_) if wantarray;
    my @chars = join('', @_) =~ /./sg;
    return join '', @chars[ reverse 0..$#chars ];
}

# Reverses by extended grapheme cluster (\X).
sub unicode_reverse {
    return reverse(@_) if wantarray;
    my @chars = join('', @_) =~ /\X/g;
    return join '', @chars[ reverse 0..$#chars ];
}

printf("%-7s %-7s %-7s\n", "", "samples", "text");
printf("%-7s %-7s %-7s\n", "", "-------", "-------");

for (qw( current string unicode )) {
    my $reverser = do { no strict 'refs'; \&{ $_."_reverse" } };

    # A string that is not text: three independent numeric samples.
    my $water_samples = join '', map chr, 113, 101, 769;
    $water_samples = $reverser->($water_samples);
    my $last_sample = substr($water_samples, 0, 1);

    # A string that is Unicode text.
    my $text = "Cafe\N{COMBINING ACUTE ACCENT}";
    $text = $reverser->($text);
    my ($last_char) = $text =~ /^(\X)/;

    printf("%-7s %-7s %-7s\n",
        $_,
        ord($last_sample) == 769 ? 'ok' : 'not ok',
        $last_char eq "e\N{COMBINING ACUTE ACCENT}" ? 'ok' : 'not ok',
    );
}
        samples text
        ------- -------
current ok      not ok
string  ok      not ok
unicode not ok  ok

Your whole argument for the presence of a bug is that the word "character" in reverse's documentation could be confused with Unicode's definition of the word.

One or the other is wrong: the behavior of the reverse function or the reverse function's documentation.

Those are the only two options if and only if reverse's documentation uses the same definition of "character" as the Unicode standard.

Update: Added code.

In reply to Re^9: Repurposing reverse by ikegami
in thread How to reverse a (Unicode) string by moritz
