in reply to limiting length of utf8 string in bytes
On a utf8 string, chop appears to do 'the right thing', i.e. remove one trailing utf8 character, regardless of how many bytes it is.
Mike
I guess you could keep chop-ping your string while its length in bytes is greater than $threshold, but that's O(excess characters), which might get painful if the string is much longer than the limit.
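A minimal sketch of the chop loop, assuming the string is a decoded character string and that the limit applies to its UTF-8 encoding. The name chop_to_bytes is made up for illustration, and Encode's encode_utf8 is used here only to count bytes:

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Hypothetical helper: chop characters off the end until the UTF-8
# encoding of the string fits in $max_bytes.
sub chop_to_bytes {
    my ($str, $max_bytes) = @_;

    # Byte length of the UTF-8 encoding, computed once up front.
    my $bytes = length(encode_utf8($str));

    # One iteration per excess character.
    while ($bytes > $max_bytes && length $str) {
        my $removed = chop $str;                    # chop returns the removed character
        $bytes -= length(encode_utf8($removed));    # subtract that character's byte count
    }
    return $str;
}
```

Counting the bytes of each removed character avoids re-encoding the whole string on every pass, but the loop still runs once per excess character.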
The other alternative is to proceed by inspection: under 'use bytes', examine the byte at the $threshold+1 position (the first one past the limit) and work your way backwards until you reach a valid utf8 start byte, i.e. anything that isn't a 10xxxxxx continuation byte, then "substr" off everything from that byte on.
That would require at most 4 loop iterations for valid utf8, I think, since a utf8 character is never longer than 4 bytes.
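Here is a rough sketch of the backwards scan. It works on the encoded octets via Encode rather than under 'use bytes' (that swap is my assumption, not a requirement of the approach), and truncate_utf8_bytes is a made-up name:

```perl
use strict;
use warnings;
use Encode qw(encode_utf8 decode_utf8);

# Hypothetical helper: cut a character string to at most $max_bytes of
# UTF-8 without splitting a multi-byte character.
sub truncate_utf8_bytes {
    my ($str, $max_bytes) = @_;
    $max_bytes = 0 if $max_bytes < 0;

    my $octets = encode_utf8($str);                 # inspect the raw bytes
    return $str if length($octets) <= $max_bytes;   # already short enough

    # $cut is the offset of the first byte to drop.
    my $cut = $max_bytes;

    # Continuation bytes match 10xxxxxx. Step back until $cut lands on a
    # start byte, so the cut falls on a character boundary.
    $cut-- while $cut > 0
        && (ord(substr($octets, $cut, 1)) & 0xC0) == 0x80;

    return decode_utf8(substr($octets, 0, $cut));
}
```

For well-formed utf8 the loop looks at no more than 4 bytes, however long the string is.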
Mike