Re: limiting length of utf8 string in bytes
by RMGir (Prior) on Dec 14, 2009 at 12:44 UTC
On a utf8 string, chop appears to do 'the right thing', i.e. remove one trailing utf8 character, regardless of how many bytes it is.
I guess you could keep chop-ping your string while its length in bytes is greater than $threshold, but that's O(excess characters), which might get painful.
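A rough sketch of that chop loop, for what it's worth. The sub name truncate_by_chop and the $max_bytes parameter are made up for illustration, and I'm assuming the input is a decoded character string and that the byte count should be measured via Encode's encode_utf8. The byte count is computed once and then decremented by the size of each chopped character, so the loop really is O(excess characters) rather than re-encoding the whole string every time.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Hypothetical helper: chop trailing characters off $str until its
# UTF-8 encoding fits in $max_bytes bytes.
sub truncate_by_chop {
    my ($str, $max_bytes) = @_;

    # Byte length of the UTF-8 encoding, computed once up front.
    my $bytes = length(encode_utf8($str));

    while ($bytes > $max_bytes && length $str) {
        # chop returns the removed character; subtract its encoded size.
        my $removed = chop $str;
        $bytes -= length(encode_utf8($removed));
    }
    return $str;
}
```

For example, truncate_by_chop("h\x{e9}llo", 2) chops back to "h", because the two-byte "\x{e9}" would not fit in the remaining byte.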
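The other alternative is to proceed by inspection: under 'use bytes', examine the byte at the $threshold+1 position (the first byte that would be cut off) and work your way backwards for as long as you are looking at a utf8 continuation byte (bit pattern 10xxxxxx). Once you reach a valid utf8 start byte, "substr" everything before it.
That would require at most 4 loop iterations for valid utf8, I think, since a utf8 character is at most 4 bytes long.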
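Here is a minimal sketch of the backwards scan. Note that it swaps in a different mechanism from the one described above: instead of 'use bytes' (which perldoc discourages for anything but debugging), it encodes the string to octets with Encode and inspects those. The sub name truncate_utf8_bytes and the $max_bytes parameter are hypothetical, and the input is assumed to be a decoded character string.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8 decode_utf8);

# Hypothetical helper: return the longest prefix of $str whose UTF-8
# encoding is at most $max_bytes bytes, without splitting a character.
sub truncate_utf8_bytes {
    my ($str, $max_bytes) = @_;

    my $octets = encode_utf8($str);
    return $str if length($octets) <= $max_bytes;
    return ''   if $max_bytes <= 0;

    # $cut is the offset of the first byte to drop. If that byte is a
    # continuation byte (10xxxxxx), we are in the middle of a character,
    # so back up until we sit on a start byte. For valid UTF-8 this
    # takes at most 3 steps back.
    my $cut = $max_bytes;
    $cut-- while $cut > 0
        && (ord(substr($octets, $cut, 1)) & 0xC0) == 0x80;

    return decode_utf8(substr($octets, 0, $cut));
}
```

With "h\x{e9}llo" and a limit of 2, the byte at offset 2 is the continuation byte of "\x{e9}", so the cut backs up to offset 1 and the result is "h". The cost is independent of how much excess there is, apart from the one-time encode and decode.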