PerlMonks  

Re: limiting length of utf8 string in bytes

by RMGir (Prior)
on Dec 14, 2009 at 12:44 UTC (#812699)


in reply to limiting length of utf8 string in bytes

On a utf8 string (that is, a decoded character string), chop appears to do 'the right thing', i.e. it removes one trailing character, regardless of how many bytes that character occupies in utf8.

I guess you could keep chop-ping your string while its length in bytes (not in characters) is greater than $threshold, but that's O(excess characters), which might get painful.
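A minimal sketch of that chop loop, for illustration only. The name chop_to_bytes is made up, and it assumes $str is a decoded character string whose byte length is measured via Encode rather than 'use bytes'. The byte count is computed once and then reduced by the size of each chopped character, so the loop itself stays O(excess characters).

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: chop trailing characters until the UTF-8
# encoding of $str fits within $threshold bytes.
sub chop_to_bytes {
    my ($str, $threshold) = @_;

    # Byte length of the whole string, computed once up front.
    my $bytes = length encode('UTF-8', $str);

    # chop returns the removed character, so we can subtract its
    # encoded size instead of re-encoding the whole string each time.
    while ($bytes > $threshold && length $str) {
        my $removed = chop $str;
        $bytes -= length encode('UTF-8', $removed);
    }
    return $str;
}
```

With "h\x{e9}llo \x{20ac}" (10 bytes in utf8) and a threshold of 9, the 3-byte euro sign is chopped whole and "h\x{e9}llo " is returned.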

The other alternative is to proceed by inspection: under 'use bytes', examine the byte at position $threshold+1 (the first byte that would be cut off) and, working your way backwards over any utf8 continuation bytes, "substr" the string just before the first byte you find that is a valid utf8 start byte.

That would require at most 4 loop iterations for valid utf8, I think, since a utf8 character is never longer than 4 bytes.
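Here is a rough sketch of the inspection approach, under a couple of assumptions. The name truncate_utf8_bytes is hypothetical, and instead of 'use bytes' it works on an explicitly encoded byte string from Encode, which exposes the same bytes. It relies on the fact that utf8 continuation bytes always have the bit pattern 10xxxxxx, and it assumes the input encodes to valid utf8.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: truncate $str so that its UTF-8 encoding is at
# most $threshold bytes, without splitting a multi-byte character.
sub truncate_utf8_bytes {
    my ($str, $threshold) = @_;

    my $octets = encode('UTF-8', $str);
    return $str if length($octets) <= $threshold;
    return ''   if $threshold <= 0;

    # Offset $threshold is the first byte that would be cut off.
    # If it is a continuation byte (10xxxxxx), step backwards until we
    # reach the start byte of that character, and cut before it.
    my $cut = $threshold;
    $cut-- while $cut > 0
        && (ord(substr($octets, $cut, 1)) & 0xC0) == 0x80;

    my $head = substr($octets, 0, $cut);
    return decode('UTF-8', $head);
}
```

For valid utf8 the backwards scan inspects at most 4 bytes, however long the string is, which is the main advantage over the chop loop.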


Mike

