PerlMonks  

Re^2: decoding a UTF-16B string found in an email subject

by runrig (Abbot)
on Oct 30, 2013 at 20:18 UTC (#1060438)


in reply to Re: decoding a UTF-16B string found in an email subject
in thread decoding a UTF-16B string found in an email subject

This should be the correct answer, but I don't think the string is correctly encoded. This:

    use Encode qw(decode);
    my $str = 'username, A Ne=?UTF-16?B?dwAgAEMAcgBlAGQAaQB0ACAAQwBhAHIAZAAgAEMAbwB1AGwAZAAgAEIAZQAgAEgAZQBhAGQAZQBkACAAWQBvAHUAcgAgAFcAYQB5AA==?=';
    my $chr = decode('MIME-Header', $str);
    print "$chr\n";
Gets me:
UTF-16:Unrecognised BOM 7700 at /.../Encode/MIME/Header.pm line 81.
While this:
    use MIME::Base64;
    my $cstr = 'dwAgAEMAcgBlAGQAaQB0ACAAQwBhAHIAZAAgAEMAbwB1AGwAZAAgAEIAZQAgAEgAZQBhAGQAZQBkACAAWQBvAHUAcgAgAFcAYQB5AA';
    my $chk = decode_base64($cstr);
    print "$chk\n";
Gets me:
w Credit Card Could Be Headed Your Way
So the part that is supposed to be UTF-16 appears to be just base64 encoded.
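Printing the decoded string only shows what the terminal chooses to display, so a hex dump of the raw bytes is a useful cross-check. The following is a minimal sketch using the same payload as above, with the "=" padding from the original header restored; the variable names are only illustrative.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);

# Same base64 payload as in the post (wrap markers removed, padding restored).
my $b64 = 'dwAgAEMAcgBlAGQAaQB0ACAAQwBhAHIAZAAgAEMAbwB1AGwAZAAgAEIAZQAgAEgAZQBhAGQAZQBkACAAWQBvAHUAcgAgAFcAYQB5AA==';

my $bytes = decode_base64($b64);

# Hex dump of the first eight bytes.
my $hex = join ' ', map { sprintf '%02x', ord } split //, substr($bytes, 0, 8);
print "$hex\n";    # 77 00 20 00 43 00 72 00

# Count the NUL bytes, which most terminals do not display.
my $nuls = ($bytes =~ tr/\0//);
print "total bytes: ", length($bytes), ", NUL bytes: $nuls\n";
```

Every visible character is followed by a 00 byte, which is also where the "Unrecognised BOM 7700" in the error message comes from: the first two bytes, 77 00, are being read as a byte order mark.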

UPDATE: And if you change 'UTF-16' in the first part to 'UTF-8', then it is correctly decoded without error.


Re^3: decoding a UTF-16B string found in an email subject
by Anonymous Monk on Mar 18, 2014 at 12:53 UTC
    According to all the docs I've found, a BOM is not required, and when no BOM is present, big-endian is assumed. However, the string you give appears to be little-endian (as was the case in the problem that brought me to this page...). If you s/UTF-16/UTF-16LE/, your string is decoded correctly.
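The little-endian reading can also be checked without going through MIME-Header, by base64-decoding the payload and then decoding the bytes explicitly as UTF-16LE. This is only a sketch: the helper name decode_utf16le_b64 is hypothetical, while decode_base64, decode and encode are the standard MIME::Base64 and Encode functions.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use MIME::Base64 qw(decode_base64);

# Hypothetical helper: base64-decode a payload, then interpret the
# resulting bytes as little-endian UTF-16 (no BOM expected).
sub decode_utf16le_b64 {
    my ($b64) = @_;
    my $bytes = decode_base64($b64);
    return decode('UTF-16LE', $bytes);
}

# Payload from the post above, with its "=" padding.
my $payload = 'dwAgAEMAcgBlAGQAaQB0ACAAQwBhAHIAZAAgAEMAbwB1AGwAZAAgAEIAZQAgAEgAZQBhAGQAZQBkACAAWQBvAHUAcgAgAFcAYQB5AA==';

my $text = decode_utf16le_b64($payload);
print encode('UTF-8', $text), "\n";

# For comparison: plain 'UTF-16' needs a BOM to pick the byte order,
# so decoding the same bytes with it is expected to die.
my $ok = eval { decode('UTF-16', decode_base64($payload)); 1 };
print "plain UTF-16 failed: $@" unless $ok;
```

Naming the byte order explicitly (UTF-16LE or UTF-16BE) sidesteps the BOM check, which is why the substitution works on this string.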
