PerlMonks  

Re: Trying to determine the output length of a Unicode string

by ikegami (Pope)
on Sep 26, 2011 at 00:04 UTC (#927777)


in reply to Trying to determine the output length of a Unicode string

To get the "visual size" (my term, don't know if there's an official one) of a string, you need two pieces of information:

  • The number of graphemes.
  • The visual size of each of those graphemes.

(And that's assuming your input has no control characters such as a newline.)

The first is actually pretty easy:

my @graphemes = $text =~ /\X/g;
my $count = () = $text =~ /\X/g;
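As a minimal sketch of the difference between code points and graphemes, the snippet below counts both for a string containing a combining mark. The helper name count_graphemes and the sample string are only illustrative, and it assumes $text is already a decoded character string rather than raw bytes.

```perl
use strict;
use warnings;

# Count graphemes by matching \X in list context and
# assigning the resulting list to a scalar via () = ...
sub count_graphemes {
    my ($str) = @_;
    my $count = () = $str =~ /\X/g;
    return $count;
}

# "cafe" followed by U+0301 COMBINING ACUTE ACCENT:
# five code points, but the final "e" plus accent is one grapheme.
my $text = "cafe\x{301}";

printf "code points: %d\n", length($text);            # 5
printf "graphemes:   %d\n", count_graphemes($text);   # 4
```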

Normalizing to NFC and then counting characters is definitely not the way to go, as a precomposed form doesn't exist for every character-mark combination.
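To illustrate the NFC point, here is a small sketch using the core Unicode::Normalize module. The two sample strings are my own choice: "e" plus a combining acute has a precomposed form (U+00E9), while "q" plus a combining acute does not, so NFC leaves it as two code points even though it is still a single grapheme.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC);

my $e_acute = "e\x{301}";   # composes to U+00E9 under NFC
my $q_acute = "q\x{301}";   # no precomposed form exists

# length() after NFC only matches the grapheme count when a
# precomposed character happens to exist.
printf "NFC e+acute: %d code point(s)\n", length(NFC($e_acute));   # 1
printf "NFC q+acute: %d code point(s)\n", length(NFC($q_acute));   # 2

# Counting \X matches gives 1 for both.
for my $str ($e_acute, $q_acute) {
    my $graphemes = () = $str =~ /\X/g;
    printf "graphemes: %d\n", $graphemes;                          # 1
}
```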

The catch is knowing the width of each grapheme. Some characters are zero-width, and others are double-width. For that, you really need the help of a module. Unicode::GCString is such a module.

use Unicode::GCString;
my $size = Unicode::GCString->new($text)->columns();

