I decoded a MIME-encoded message with MIME::Tools. That converted the body from quoted-printable to bytes, which are accessible via the MIME::Body methods. To get Unicode characters, I needed one more step: decoding the message body from bytes to characters with the Encode module.
Applied to your example:
use strict;
use warnings;
use MIME::Decoder;
use Encode 'decode';
# only for this particular case I will decode QP manually
my $d = MIME::Decoder->new('quoted-printable');
# usual way of obtaining bytes decoded from
# QP/Base64/7bit/other content-transfer-encodings
# is to use MIME::Body methods
# encode Unicode characters to UTF-8 on printing
binmode STDOUT, ":encoding(UTF-8)";
# open an in-memory filehandle
# since MIME::Decoder only supports filehandles
open my $fh, ">", \(my $bytes) or die "Cannot open in-memory filehandle: $!";
# decode the quoted-printable
$d->decode(\*DATA, $fh);
close $fh;
# decode the bytes
my $characters = decode 'utf-8' => $bytes;
# prove that we have 1 character, not 4 bytes
while ($characters =~ /(.)/g) {
    printf "%s is unicode character %x\n", $1, ord($1);
}
__DATA__
=F0=9F=98=B3
Output:
� is unicode character 1f633
(My terminal font doesn't have emoji, so it showed � instead.)
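The comments above mention that bytes are usually obtained through MIME::Body methods rather than by driving MIME::Decoder by hand. A minimal sketch of that route follows, assuming MIME::Tools is installed. The message in `$raw` is made up for illustration, and falling back to UTF-8 when no charset is declared is my assumption, not something the original message guarantees.

```perl
use strict;
use warnings;
use MIME::Parser;
use Encode qw(decode);

# Hypothetical minimal message, invented for this illustration
my $raw = join "\n",
    'MIME-Version: 1.0',
    'Content-Type: text/plain; charset=UTF-8',
    'Content-Transfer-Encoding: quoted-printable',
    '',
    '=F0=9F=98=B3',
    '';

my $parser = MIME::Parser->new;
$parser->output_to_core(1);    # keep decoded bodies in memory, not in files
my $entity = $parser->parse_data($raw);

# Step 1: MIME::Body returns bytes with the transfer encoding already removed
my $bytes = $entity->bodyhandle->as_string;

# Step 2: Encode turns those bytes into characters, using the declared charset
# (assumption: fall back to UTF-8 if the header declares none)
my $charset = $entity->head->mime_attr('content-type.charset') || 'UTF-8';
my $characters = decode($charset, $bytes);
$characters =~ s/\s+\z//;      # drop the trailing line ending

binmode STDOUT, ':encoding(UTF-8)';
printf "%d character(s), first is U+%04X\n",
    length($characters), ord($characters);
```

Reading the charset from the Content-Type header avoids hard-coding UTF-8, which matters when messages arrive in other encodings.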
More info at perlopen, Encode, perlunitut, perluniintro, perlunifaq.