What would be a nice way to limit a string to a certain length in bytes without chopping it in the middle of a multibyte Unicode character? Something like this works:
use utf8; # the literal strings are in utf8
binmode(STDOUT, ":utf8");
my $maxbytes = 5;
my $a= "יטא"; # length: 3 chars, 6 bytes
print $a, "\n";
{
    use bytes;
    # cut at the byte limit; this may split the last character
    $a = substr($a, 0, $maxbytes) if length($a) > $maxbytes;
}
use Encode;
$a = decode_utf8($a,Encode::FB_QUIET);
print $a, "\n"; # 2 chars, 4 bytes now
But I feel there should be something simpler...
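
One possible simplification, offered as a sketch rather than a definitive answer: encode the string to UTF-8 octets explicitly, cut the octets at the limit, and let `decode_utf8` with `Encode::FB_QUIET` drop any incomplete trailing sequence. This avoids `use bytes` but relies on the same `FB_QUIET` behaviour as the snippet above. The helper name `truncate_utf8` is hypothetical, not something provided by Encode.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8 decode_utf8);

# truncate_utf8 is a hypothetical helper name (not part of Encode).
# It returns the longest prefix of $str whose UTF-8 encoding
# fits within $maxbytes octets.
sub truncate_utf8 {
    my ($str, $maxbytes) = @_;

    my $octets = encode_utf8($str);    # character string -> UTF-8 octets
    return $str if length($octets) <= $maxbytes;

    # Cut at the byte limit; the last character may now be incomplete.
    my $cut = substr($octets, 0, $maxbytes);

    # FB_QUIET makes decoding stop quietly at the first malformed
    # sequence and return only what was decoded up to that point.
    return decode_utf8($cut, Encode::FB_QUIET);
}
```

With the example above, `truncate_utf8($a, $maxbytes)` should give the same two-character result. The cut octets are copied into a separate variable because `FB_QUIET` overwrites its input with the undecoded remainder.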