Converting UTF-16 files to UTF-8
by demerphq (Chancellor) on May 16, 2007 at 14:48 UTC
demerphq has asked for the
wisdom of the Perl Monks concerning the following question:
I would have thought the following (quick hack) script would work:
Or even the more elegant one-liner:
But it doesn't work. If I use an input file with a few (three) Ĕ in it (0x0114), saved as UTF-16 by UltraEdit on Win2k, the input file contains the octets FF FE 14 01 14 01 14 01. After conversion the output file has the octets EF BB BF C2 BE 00 14 00 01 00 14 00 01 00 14 00 01, which is just wrong. Can anybody spot what the problem is, or is Perl's UTF-16 support borked?
Note that this was with Perl 5.8.6 from ActiveState.
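For comparison, here is a minimal sketch of what a correct conversion of those input octets should produce. It is not the script from the post: it assumes the core Encode module and works on an in-memory string rather than a file. The helper name hex_octets is made up for illustration, and the explicit UTF-8 BOM is an assumption, added only to match the EF BB BF prefix reported above.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: render a byte string as space-separated hex octets.
sub hex_octets {
    my ($octets) = @_;
    return join ' ', map { sprintf('%02X', $_) } unpack('C*', $octets);
}

# The input octets quoted in the question: a little-endian BOM (FF FE)
# followed by U+0114 three times (14 01 each).
my $utf16_octets = "\xFF\xFE\x14\x01\x14\x01\x14\x01";

# 'UTF-16' without an LE/BE suffix uses the BOM to pick the byte order
# and drops the BOM from the decoded string.
my $text = decode('UTF-16', $utf16_octets);

# Encode does not add a BOM for UTF-8, so prepend U+FEFF explicitly
# (assumption: the output is wanted with a BOM, as in the reported bytes).
my $utf8_octets = encode('UTF-8', "\x{FEFF}" . $text);

print hex_octets($utf8_octets), "\n";   # EF BB BF C4 94 C4 94 C4 94
```

U+0114 encodes to the two octets C4 94 in UTF-8, so a correct output file for three Ĕ with a BOM is nine octets long.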
Update: Turns out that this was all down to a display bug in UltraEdit. Thanks for the help, and sorry for wasting anybody's time.
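Since the culprit was the editor's display, one way to rule that out is to look at the bytes on disk without any editor in the loop. The following is a minimal sketch of that check, assuming only core Perl; hex_dump_file is a hypothetical helper name, and the files to inspect are taken from the command line.

```perl
use strict;
use warnings;

# Hypothetical helper: return a file's contents as space-separated hex octets.
sub hex_dump_file {
    my ($path) = @_;
    # ':raw' disables any encoding or CRLF translation, which matters on Windows.
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/;                          # slurp the whole file in one read
    my $octets = <$fh>;
    close $fh;
    return join ' ', map { sprintf('%02X', $_) } unpack('C*', $octets);
}

# Dump each file named on the command line.
print "$_: ", hex_dump_file($_), "\n" for @ARGV;
```

Running this against both the input and the output file shows what the conversion actually wrote, independent of how any editor chooses to render it.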