PerlMonks  

Re: limiting length of utf8 string in bytes

by RMGir (Prior)
on Dec 14, 2009 at 12:44 UTC


in reply to limiting length of utf8 string in bytes

On a utf8 string, chop appears to do 'the right thing', i.e. remove one trailing utf8 character, regardless of how many bytes it is.

I guess you could keep chop-ping your string while its length in bytes (not in characters) is > $threshold, but that's O(excess characters), which might get painful.
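A rough sketch of that chop loop, assuming the string is already decoded into characters. The sub name truncate_by_chop is made up for illustration, and the byte length is measured by encoding with the core Encode module on each pass, which is an extra cost on top of the chop-ping itself:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper (not from the original post): chop whole characters
# off a decoded string until its UTF-8 encoding fits in $max_bytes.
sub truncate_by_chop {
    my ($str, $max_bytes) = @_;

    # length() on the encoded octets counts bytes, not characters.
    chop $str while length(encode('UTF-8', $str)) > $max_bytes;

    return $str;
}

# "a", e-acute (2 bytes), euro sign (3 bytes), "b": 7 bytes in UTF-8.
my $short = truncate_by_chop("a\x{e9}\x{20ac}b", 5);
```

With a limit of 5 bytes the euro sign doesn't fit, so both it and the trailing "b" get chopped.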

The other alternative is to proceed by inspection - under 'use bytes', examine the byte at the $threshold+1 position (the first one that would be cut off), and, working your way backwards for as long as you're looking at a utf8 continuation byte (10xxxxxx), "substr" just before the first valid utf8 start byte you find.

That would require at most 4 loop iterations for valid utf8, I think, since a utf8 character is never longer than 4 bytes.
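Here's a minimal sketch of the inspection approach. It's an assumption-laden version: instead of 'use bytes' it works on an explicitly encoded copy of the string (via the core Encode module), the sub name truncate_utf8_bytes is made up, and it assumes the input encodes to valid utf8:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper (not from the original post): cut a decoded string
# so that its UTF-8 encoding is at most $max_bytes long, never splitting
# a character.
sub truncate_utf8_bytes {
    my ($str, $max_bytes) = @_;

    my $octets = encode('UTF-8', $str);
    return $str if length($octets) <= $max_bytes;

    # Offset $max_bytes is the first byte that would be cut off.
    # Continuation bytes look like 10xxxxxx; back up past them until
    # we are sitting on a start byte, then cut just before it.
    my $cut = $max_bytes;
    $cut-- while $cut > 0
        && (ord(substr($octets, $cut, 1)) & 0xC0) == 0x80;

    return decode('UTF-8', substr($octets, 0, $cut));
}

# "a", e-acute (2 bytes), euro sign (3 bytes), "b": 7 bytes in UTF-8.
my $short = truncate_utf8_bytes("a\x{e9}\x{20ac}b", 5);
```

The loop only ever steps back over continuation bytes, so for valid input it stops after a handful of iterations no matter how long the string is.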


Mike

