http://www.perlmonks.org?node_id=669391


in reply to UTF8 Validity

It is valid UTF-16, so perhaps that's what you're dealing with. A good resource for this is the fileformat.info character reference page.
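
If you want to check which encoding a run of bytes fits, one rough way is to ask Encode to decode it strictly under each candidate and see which attempts die. This is only a sketch: the sample bytes are a made-up example (not the data from the original question), and decodes_as is a helper name invented here.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: true if $bytes decodes cleanly under $encoding.
sub decodes_as {
    my ($encoding, $bytes) = @_;
    # FB_CROAK makes decode() die on malformed input; LEAVE_SRC stops
    # decode() from modifying the byte string it was handed.
    my $check = Encode::FB_CROAK() | Encode::LEAVE_SRC();
    return eval { decode($encoding, $bytes, $check); 1 } ? 1 : 0;
}

# Made-up sample: U+10000 written as a UTF-16BE surrogate pair.
my $sample = "\xD8\x00\xDC\x00";

for my $encoding (qw(UTF-8 UTF-16BE)) {
    printf "%-8s %s\n", $encoding,
        decodes_as($encoding, $sample) ? 'valid' : 'invalid';
}
```

Keep in mind that decoding cleanly only shows the bytes are well-formed in that encoding, not that the encoding is the one the data was written in, so treat the result as a hint.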