PerlMonks  

Re: Trying to determine the output length of a Unicode string

by ikegami (Pope)
on Sep 26, 2011 at 00:04 UTC


in reply to Trying to determine the output length of a Unicode string

To get the "visual size" (my term, don't know if there's an official one) of a string, you need two pieces of information:

  • The number of graphemes.
  • The visual size of each of those graphemes.

(And that's assuming your input has no control characters such as a newline.)

The first is actually pretty easy:

my @graphemes = $text =~ /\X/g;      # the graphemes themselves
my $count = () = $text =~ /\X/g;     # or just how many there are

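NFC is definitely not the way to go. It only merges a base character and its marks into a single code point when a precomposed character exists, so it doesn't work for every character-mark combination.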
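To make the difference concrete, here is a minimal sketch using only core modules. The helper name grapheme_count and the two sample strings are mine, picked for illustration: "e" plus a combining acute has a precomposed form (U+00E9), while "x" plus a combining acute, as far as I know, does not.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC);

# Hypothetical helper: count extended grapheme clusters with \X.
sub grapheme_count {
    my ($text) = @_;
    my $count = () = $text =~ /\X/g;
    return $count;
}

# Both strings are one base letter followed by U+0301 COMBINING ACUTE ACCENT.
my $e_acute = "e\x{301}";   # NFC can compose this into U+00E9
my $x_acute = "x\x{301}";   # assumed to have no precomposed form, so NFC leaves it alone

for my $text ($e_acute, $x_acute) {
    printf "code points: %d, after NFC: %d, graphemes: %d\n",
        length($text), length(NFC($text)), grapheme_count($text);
}
```

Both strings should report one grapheme via \X, but only the first shrinks to a single code point under NFC, so length(NFC($text)) isn't a reliable grapheme count.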

The catch is knowing the width of each grapheme. Some characters are zero-width, and others are double-width. For that, you really need the help of a module. Unicode::GCString is such a module.

use Unicode::GCString;
my $size = Unicode::GCString->new($text)->columns();

