Re: limiting length of utf8 string in bytes

by RMGir (Prior)
on Dec 14, 2009 at 12:44 UTC


in reply to limiting length of utf8 string in bytes

On a utf8 string, chop appears to do 'the right thing', i.e. remove one trailing utf8 character, regardless of how many bytes it is.

I guess you could keep chop-ping your string while its length in bytes is greater than $threshold, but that's O(excess characters), which might get painful.
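A minimal sketch of that chop loop, assuming the input is a decoded character string and that the limit applies to its UTF-8 encoding. The name truncate_by_chop and the use of Encode to count bytes are my own choices for illustration, not something from the original question.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: drop trailing characters until the UTF-8
# encoding of $str fits in $max_bytes. One iteration per excess character.
sub truncate_by_chop {
    my ($str, $max_bytes) = @_;

    # Byte length of the whole string, computed once.
    my $bytes = length(encode('UTF-8', $str));

    while ($bytes > $max_bytes) {
        my $last = chop $str;                       # removes one whole character
        $bytes -= length(encode('UTF-8', $last));   # subtract that character's byte count
    }
    return $str;
}
```

Tracking the running byte count avoids re-encoding the whole string on every pass, but the loop still runs once per character removed.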

The other alternative is to proceed by inspection: under 'use bytes', examine the byte at the $threshold+1 position and, working your way backwards, "substr" just before the first byte you find that is a valid utf8 start byte (i.e. not a continuation byte of the form 10xxxxxx).

That would require at most 4 loop iterations for valid utf8, I think, since a character takes at most 4 bytes.
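Here is a rough sketch of the inspection approach. Note that it swaps 'use bytes' for an explicit encode/decode round trip through Encode, so the byte positions are unambiguous; the name truncate_by_inspection is hypothetical, and the code assumes the input is a decoded character string.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: cut the UTF-8 encoding of $str at or before
# $max_bytes without splitting a multi-byte character.
sub truncate_by_inspection {
    my ($str, $max_bytes) = @_;

    my $octets = encode('UTF-8', $str);
    return $str if length($octets) <= $max_bytes;

    # $cut is the 0-based offset of the first byte to be dropped,
    # i.e. the byte at the $max_bytes+1 position.
    my $cut = $max_bytes;

    # Back up while that byte is a continuation byte (10xxxxxx).
    # For valid UTF-8 this inspects at most 4 bytes.
    $cut-- while $cut > 0
        && (ord(substr($octets, $cut, 1)) & 0xC0) == 0x80;

    return decode('UTF-8', substr($octets, 0, $cut));
}
```

The cost here depends only on the width of one character, not on how far over the limit the string is.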


Mike