PerlMonks  

Re^9: Repurposing reverse

by ikegami (Pope)
on Jan 31, 2011 at 19:01 UTC (#885332)


in reply to Re^8: How to reverse a (Unicode) string
in thread How to reverse a (Unicode) string

The problem is the definition of character in the context of Unicode text.

No, I fully agree with you on the definition of character in the context of Unicode text.

At issue is that reverse cannot recognise the presence of Unicode text. How do you think reverse can tell the difference between chr(113).chr(101).chr(769) and "qe\N{COMBINING ACUTE ACCENT}"?
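To make that concrete, here is a minimal sketch (the variable names are mine, chosen only for illustration) showing that the two strings are indistinguishable once they exist. Nothing in the string records whether it was built from numbers or written as text, so reverse has nothing to base a decision on.

```perl
use strict;
use warnings;
use charnames qw( :full );

# Three code points built numerically, e.g. three unrelated measurements.
my $samples = chr(113) . chr(101) . chr(769);

# The same three code points written as text: "q", "e", combining acute.
my $text = "qe\N{COMBINING ACUTE ACCENT}";

# Perl sees identical strings; the origin of the data is not recorded.
print "equal\n" if $samples eq $text;
printf "length: %d and %d\n", length($samples), length($text);

# reverse in scalar context therefore treats both the same way.
print "same reversal\n"
   if scalar(reverse $samples) eq scalar(reverse $text);
```

Any rule that reverse applied to one of these strings would necessarily apply to the other as well.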

It can either always treat the string as Unicode text, or never. Currently, it never does. To change that is backwards incompatible, so you'd have to demonstrate a bug in order to change that behaviour.

use strict;
use warnings;
use charnames qw( :full );

# Perl's built-in behaviour.
sub current_reverse {
   return reverse(@_);
}

# Reverses one code point at a time.
sub string_reverse {
   return reverse(@_) if wantarray;
   my @chars = join('', @_) =~ /./sg;
   return join '', @chars[ reverse 0..$#chars ];
}

# Reverses one extended grapheme cluster (\X) at a time.
sub unicode_reverse {
   return reverse(@_) if wantarray;
   my @chars = join('', @_) =~ /\X/g;
   return join '', @chars[ reverse 0..$#chars ];
}

printf("%-7s %-7s %-7s\n", "", "samples", "text");
printf("%-7s %-7s %-7s\n", "", "-------", "-------");

for (qw( current string unicode )) {
   my $reverser = do { no strict 'refs'; \&{ $_."_reverse" } };

   # Numeric data that happens to be stored in a string.
   my $water_samples = join '', map chr, 113, 101, 769;
   $water_samples = $reverser->($water_samples);
   my $last_sample = substr($water_samples, 0, 1);

   # Unicode text ending in a base letter plus a combining mark.
   my $text = "Cafe\N{COMBINING ACUTE ACCENT}";
   $text = $reverser->($text);
   my ($last_char) = $text =~ /^(\X)/;

   printf("%-7s %-7s %-7s\n",
      $_,
      ord($last_sample) == 769 ? 'ok' : 'not ok',
      $last_char eq "e\N{COMBINING ACUTE ACCENT}" ? 'ok' : 'not ok',
   );
}
        samples text
        ------- -------
current ok      not ok
string  ok      not ok
unicode not ok  ok

Your whole argument for the presence of a bug is that the word "character", as used in reverse's documentation, could be confused with Unicode's definition of the word.

One or the other is wrong: the behavior of the reverse function or the reverse function's documentation.

Those are the only two options if and only if reverse's documentation uses the same definition of "character" as the Unicode standard.

Update: Added code.

