My gut feel is that as there are so many different "unicode standard" encodings out there in the wild, the chances of getting false positives from undetected transmission errors using sum-the-ordinals values are far higher than using sum-the-bytes values.

I don't understand this. I can't think of any issue that would affect

sum unpack 'W*', decode 'UTF-8', $utf8

that wouldn't also affect

sum unpack 'C*', $utf8
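
To make the comparison concrete, here is a minimal sketch that computes both sums from the same received bytes. The sample string, the single-byte corruption, and the helper name `sums` are my own assumptions for illustration; this shows one case, not a proof about every possible error.

```perl
use strict;
use warnings;
use List::Util qw(sum);
use Encode qw(decode encode);

# Hypothetical helper: returns (sum of ordinals, sum of bytes)
# for a string of UTF-8 encoded bytes.
sub sums {
    my ($utf8) = @_;
    my $ord_sum  = sum unpack 'W*', decode('UTF-8', $utf8);
    my $byte_sum = sum unpack 'C*', $utf8;
    return ($ord_sum, $byte_sum);
}

# Assumed sample data: U+00E9 followed by "A", i.e. bytes C3 A9 41.
my $good = encode('UTF-8', "\x{E9}A");

# Simulate one transmission error: the second byte arrives as A8
# instead of A9, which still decodes as valid UTF-8 (U+00E8).
my $bad = $good;
substr($bad, 1, 1) = "\xA8";

my ($good_ord, $good_bytes) = sums($good);
my ($bad_ord,  $bad_bytes)  = sums($bad);

printf "good: ordinals=%d bytes=%d\n", $good_ord, $good_bytes;
printf "bad:  ordinals=%d bytes=%d\n", $bad_ord,  $bad_bytes;
```

In this one case both sums change, so both checks would catch the error. Both are derived from the same bytes; the decode step is just an extra transformation applied to them before summing.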