Re: The “real length” of UTF8 strings

by gone2015 (Deacon)
on Sep 23, 2008 at 20:30 UTC ( [id://713303]=note )


in reply to The “real length” of UTF8 strings

Can you use a regex to identify the characters which are double length? Something like:

print xlen("(\x{5fcd}\x{65e0}\x{53ef}\x{5fcd})"), "\n" ;

sub xlen {
    my ($s) = @_ ;
    my $l = length($s) ;
    # add one extra column for each character in the double-width range
    while ($s =~ m/[\x{5000}-\x{6FFF}]/g) { $l++ ; }
    return $l ;
}

perhaps?

Or:

print ylen("(\x{5fcd}\x{65e0}\x{53ef}\x{5fcd})"), "\n" ;

sub ylen {
    my ($s) = @_ ;
    # tr takes a bare character list, so no [] here (they would be counted as literal brackets)
    return length($s) + ($s =~ tr/\x{5000}-\x{6FFF}//) ;
}

which avoids running a while loop and may or may not be faster.
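
Whether the tr version really is faster is easy enough to measure. Below is a minimal sketch using the core Benchmark module. The sub names xlen_loop and ylen_tr and the repeated sample string are my own choices for illustration, and the U+5000 to U+6FFF range is still only the rough guess from above, not a complete list of double-width characters.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Regex-loop version: one extra column per character in the assumed wide range.
sub xlen_loop {
    my ($s) = @_;
    my $l = length($s);
    $l++ while $s =~ m/[\x{5000}-\x{6FFF}]/g;
    return $l;
}

# tr version: with an empty replacement list and no /d, tr only counts matches.
# tr takes a bare character list; square brackets would be literal characters.
sub ylen_tr {
    my ($s) = @_;
    return length($s) + ($s =~ tr/\x{5000}-\x{6FFF}//);
}

# Hypothetical test input: the sample string repeated to give the timers some work.
my $sample = "(\x{5fcd}\x{65e0}\x{53ef}\x{5fcd})" x 50;

# Sanity check before timing: both versions must agree.
die "mismatch" unless xlen_loop($sample) == ylen_tr($sample);

# Run each version for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    loop => sub { xlen_loop($sample) },
    tr   => sub { ylen_tr($sample) },
});
```

The table that cmpthese prints will vary with the machine, the perl version and the input, so treat the result as a hint rather than a verdict.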
