http://www.perlmonks.org?node_id=1199304


in reply to Counting text with ligatures

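I assume what you want to count is "graphemes" (see also perluniintro). You should use Perl v5.12 or later; here are a couple of ways (see \X and \b{gcb}, as well as my post here). Note that length() counts code points, not graphemes, so it overcounts whenever the string contains combining marks: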

my $string = "k\x{0301}u\x{032D}o\x{0304}\x{0301}n";
print "length: ",length($string),"\n"; # wrong way
my $len = () = $string=~/\X/g;
print "len: $len\n";
my @graphs = split /\X\K(?=\X)/, $string;
print "graphs: ", 0+@graphs, "\n";
# in Perl v5.22+:
my @graphs2 = split /\b{gcb}/, $string;
print "graphs2: ", 0+@graphs2, "\n";
__END__
length: 8
len: 4
graphs: 4
graphs2: 4
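If you need the count in more than one place, the \X approach can be wrapped in a small helper. The following is only a sketch: the name count_graphemes is made up for illustration and is not part of any module. It also uses the core Unicode::Normalize module to show that the grapheme count stays the same across the composed and decomposed forms of a string, while length() does not.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC NFD);

# Hypothetical helper (not from any module): count grapheme clusters
# by counting the matches of \X in list context.
sub count_graphemes {
    my ($text) = @_;
    my $count = () = $text =~ /\X/g;
    return $count;
}

my $composed   = NFC("\x{E9}");   # one code point: U+00E9
my $decomposed = NFD($composed);  # two code points: "e" plus U+0301

printf "NFC: length %d, graphemes %d\n",
    length($composed), count_graphemes($composed);
printf "NFD: length %d, graphemes %d\n",
    length($decomposed), count_graphemes($decomposed);
```

This should report a length of 1 for the NFC form and 2 for the NFD form, with a grapheme count of 1 in both cases.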