$a is 5 bytes long and pack("v") is 4 bytes long, so $binarystring should hold 9 bytes. length($binarystring) confirms the length, and utf8::downgrade would confirm that they are bytes.
$b is 1000 bytes long and pack("v") is 4 bytes long, so $binarystring2 should hold 1004 bytes. length($binarystring2) confirms the length, and utf8::downgrade would confirm that they are bytes.
And what's great about this bug is that you only see it when the original string has multi-byte characters or when it is long enough. :)
I don't see the problem. Are you expecting something other than 9 and 1004? Yes, the length of the internal representation is different (as reported by bytes::length), but why are you mucking with the internals?
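For anyone following along, here is a minimal sketch of the distinction between the two lengths. The variable names, the sample string, and the pack("vv") template are my own stand-ins, since the original data is not reproduced here; the only assumption that matters is a 5-character string glued to a 4-byte packed header.

```perl
use strict;
use warnings;
use bytes ();   # loads bytes::length without enabling byte semantics

# Hypothetical stand-ins: the thread's real data and pack template are not shown here.
my $payload = "caf\xE9!";           # 5 characters, one of them above 0x7F
utf8::upgrade($payload);            # force the UTF-8 internal representation
my $header  = pack("vv", 1, 5);     # two 16-bit little-endian fields, 4 bytes
my $binary  = $header . $payload;   # concatenating with an upgraded string upgrades the result

my $chars    = length($binary);          # logical length of the string
my $internal = bytes::length($binary);   # size of the internal buffer

# With a true second argument, utf8::downgrade returns false instead of dying
# when the string holds characters above 0xFF.
my $copy     = $binary;
my $is_bytes = utf8::downgrade($copy, 1);
my $after    = bytes::length($copy);

print "length: $chars, internal: $internal, ",
      "downgradable: ", ($is_bytes ? "yes" : "no"),
      ", after downgrade: $after\n";
```

This should print a length of 9, an internal size of 10, a successful downgrade, and an internal size of 9 afterwards. The string holds the same 9 byte values throughout; only the storage format changes, which is why length is the number to trust.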
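Speaking of mucking with internals, utf8::decode should normally be used instead of Encode::_utf8_on, because it checks that the octets are well-formed UTF-8 rather than just flipping the flag.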
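A quick sketch of what utf8::decode does, using made-up sample strings rather than anything from the original post: it decodes in place and reports failure on malformed input.

```perl
use strict;
use warnings;

# Hypothetical sample data: "cafe" with an acute accent, encoded as UTF-8 octets.
my $octets = "caf\xC3\xA9";              # 5 octets

my $text    = $octets;
my $decoded = utf8::decode($text);       # decodes in place, true on success
my $len     = length($text);             # 4 characters after decoding

my $broken   = "caf\xE9";                # a lone 0xE9 is not well-formed UTF-8
my $rejected = !utf8::decode($broken);   # decode reports failure

print "decoded: ", ($decoded ? 1 : 0),
      ", length: $len",
      ", rejected: ", ($rejected ? 1 : 0), "\n";
```

The first string decodes to 4 characters and the second is rejected, which is the check you lose when you set the flag by hand.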
so returning a length in utf8 characters is strange.
It's a bit odd, but only because it's a bit inefficient.