PerlMonks  

Re^2: Trying to determine the output length of a Unicode string

by ikegami (Pope)
on Sep 26, 2011 at 08:08 UTC (#927811)


in reply to Re: Trying to determine the output length of a Unicode string
in thread Trying to determine the output length of a Unicode string

sub length_in_grapheme_clusters { my $length; $length++ while $_[0] =~ m/\X/g; return $length; }

As previously mentioned, this can be written as:

sub length_in_grapheme_clusters { my $length = () = $_[0] =~ /\X/g; return $length; }
or
sub length_in_grapheme_clusters { return 0+( () = $_[0] =~ /\X/g ); }

You must roll your own.

As previously mentioned, he does not need to roll his own as there's already an existing solution. (It was also mentioned that length_in_grapheme_clusters is not sufficient.)

Update: Fixed length_in_grapheme_clusters so it can be called in list context.
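For anyone comparing the three forms above, here is a minimal sketch that runs them side by side. The sub names (count_loop, count_assign, count_compact) and the sample string are made up for illustration, and the loop form starts its counter at 0, which the quoted version does not do.

```perl
use strict;
use warnings;

# Loop form: bump a counter once per \X (grapheme cluster) match.
sub count_loop {
    my $length = 0;    # starting at 0 is an addition here; it avoids undef for ""
    $length++ while $_[0] =~ m/\X/g;
    return $length;
}

# Countof form: the empty-list assignment, evaluated in scalar context,
# yields the number of matches.
sub count_assign {
    my $length = () = $_[0] =~ /\X/g;
    return $length;
}

# Compact form: 0+ forces scalar context on the list assignment.
sub count_compact {
    return 0 + ( () = $_[0] =~ /\X/g );
}

# "e" followed by COMBINING ACUTE ACCENT, then "x":
# three code points, two grapheme clusters.
my $string = "e\x{301}x";

printf "%d code points\n", length $string;
printf "%d %d %d grapheme clusters\n",
    count_loop($string), count_assign($string), count_compact($string);
```

All three subs should report 2 for this string, while length reports 3.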


Re^3: Trying to determine the output length of a Unicode string
by Jim (Curate) on Sep 26, 2011 at 19:19 UTC

    What "existing solution"? And why isn't length_in_grapheme_clusters() sufficient?

    Once again, it's obvious you're making some point, the thrust of which is undoubtedly that I'm wrong about something I wrote, but you're making it too laconically for me to get it. I have no idea what you're saying.

    By the way, the version of length_in_grapheme_clusters() I used in my Perl script is attributable to Tom Christiansen. I borrowed it from a PerlMonks post of his. To me, it's better because it makes the operation plainly clear. Your version is tricky and obfuscated, and seemingly weirdly dependent on context. To be honest, I don't understand how it works. To learn how it works, I'd have to read the Perl documentation.

      What "existing solution"?

      See Re: Trying to determine the output length of a Unicode string

      And why isn't length_in_grapheme_clusters() sufficient?

      See Re: Trying to determine the output length of a Unicode string

      I used in my Perl script is attributable to Tom Christiansen.

      Then you should find his comments about Text::Wrap as they are pertinent here. Maybe it was on the Perl5 Porters mailing list (which is archived).

      To be honest, I don't understand how it works

      Most people will say the same about Perl, map, etc., but that's a stupid reason not to use Perl, map, etc., especially where performance matters, which is likely for this function.

      What I used: ()= returns the length of the list returned by the expression that follows (when used in scalar context).

      How it works: List assignment in scalar context returns the number of elements to which the RHS evaluated.
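A small sketch of that rule outside the grapheme-counting sub may help. The variable names and sample data below are invented for illustration only.

```perl
use strict;
use warnings;

my @fields = split /,/, "a,b,c,d";

# A list assignment evaluated in scalar context returns the number of
# elements on its right-hand side, not the number of variables on the left.
my $count_two_vars = ( my ( $first, $second ) = @fields );    # only two variables are filled
my $count_no_vars  = () = @fields;                            # nothing is stored at all

# The same idiom applied to a global match: count the matches without keeping them.
my $vowels = () = "grapheme clusters" =~ /[aeiou]/g;

print "first two fields: $first $second\n";
print "RHS element counts: $count_two_vars $count_no_vars\n";
print "vowels: $vowels\n";
```

Both counts should come out as 4 and the vowel count as 5, because in each case the count is taken from the right-hand side.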

      Your version is tricky and obfuscated

      It's actually very straightforward. There's nothing hidden, it uses well-known idioms, and it requires only the lowest mental load (you only need to remember one value at a time).

      I'd have to read the Perl documentation.

      Really? I use list assignment in scalar context countless times a day. More often than the match operator, I dare say.

      Your implication that someone needs to read the docs for that, but not for \X and capture-less m/.../g is unconvincing.

        You just changed your version of length_in_grapheme_clusters() to be more like Tom Christiansen's. I told you I thought your version was "weirdly dependent on context." It was. Now it's not because you changed it to be more like the one you disparaged earlier.

        I had tested your original, uncorrected version of length_in_grapheme_clusters() in my Perl script and got results I didn't understand when I called the function in list context instead of scalar context.
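The context dependence described here can be reproduced with a sketch like the one below. Note that count_context_dependent is a hypothetical reconstruction, assuming the uncorrected version returned the list assignment directly; the text of that earlier version is not shown in this thread. count_fixed mirrors the corrected 0+ form.

```perl
use strict;
use warnings;

# Hypothetical reconstruction (an assumption, not the original post's text):
# the list assignment is returned as is, so it inherits the caller's context.
sub count_context_dependent {
    return ( () = $_[0] =~ /\X/g );
}

# Corrected form: 0+ forces scalar context inside the sub.
sub count_fixed {
    return 0 + ( () = $_[0] =~ /\X/g );
}

my $string = "abc";

my $scalar_call = count_context_dependent($string);    # scalar context: the match count
my @list_call   = count_context_dependent($string);    # list context: the (empty) left-hand list
my @fixed_call  = count_fixed($string);                # list context: a one-element list

printf "scalar call: %d\n", $scalar_call;
printf "list call returned %d element(s)\n", scalar @list_call;
printf "fixed list call returned %d element(s): %d\n",
    scalar @fixed_call, $fixed_call[0];
```

In list context a list assignment returns its left-hand side, which here is the empty list, so the caller gets nothing back instead of a count. Forcing scalar context inside the sub removes that dependence.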
