http://www.perlmonks.org?node_id=770824


in reply to Mime::Parser utf-8 issue

I don't think that MIME::Parser touches the encoded headers. You should be able to decode them with something like the following:
    use Encode;
    my $val = $header->get($_);
    $val = Encode::decode('MIME-Header', $val);
    $val = Encode::encode('utf8', $val);
(I'm sure you could even go further and combine my two lines into one, using Encode::from_to or something like that.)
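
For what it's worth, here is a minimal sketch of that one-call variant next to the two-step version. It assumes `Encode::from_to` accepts 'MIME-Header' as its source encoding, which it should, since `from_to` is a decode followed by an encode. The `$raw` value is a hypothetical stand-in for whatever `$header->get(...)` returns.

```perl
use strict;
use warnings;
use Encode qw(decode encode from_to);

# Hypothetical raw header value, standing in for $header->get('Subject').
my $raw = '=?UTF-8?B?6aKY?=';

# Two-step version: MIME encoded-word -> Perl characters -> UTF-8 bytes.
my $two_step = encode('UTF-8', decode('MIME-Header', $raw));

# One-call version: from_to() converts its first argument in place,
# so work on a copy to keep the raw value around.
my $one_step = $raw;
from_to($one_step, 'MIME-Header', 'UTF-8');

print $two_step eq $one_step ? "same bytes\n" : "different bytes\n";
```

Note that `from_to` overwrites the variable you pass it and returns a length rather than the converted string, so it is not a drop-in replacement inside a larger expression.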

Joe

P.S. Here's an obscure tip that you'll probably never need to worry about: the "Remove Trash..." block of your code should technically come after the decoding that I described above, just in case there is a comma in the encoded data which would be significant to your splitting of the From/To/Cc headers.

Re^2: Mime::Parser utf-8 issue
by mhearse (Chaplain) on Jun 12, 2009 at 19:02 UTC
    Thanks for your reply. It works great. I have only one small problem. When running the following code, I end up with some trash at the beginning of the subject, such as: ת\xB7\xA2. Any suggestions?
    my $entity = $parser->parse_data($message) or die $!;
    my $header = $entity->head() or die $!;
    my $utf8 = decode('MIME-Header', $header);
    $header = encode('MIME-Header', $utf8);
      Hmmm... I'm not sure. This code works for me, with the header value from your original post.
      use Encode;
      my $header = '=?UTF8?B?5LuO5Y2a5a6i5paH56ug5Lit5p+l5om+5oKo5oSf5YW06Laj55qE5Li7?==?UTF-8?B?6aKY?=';
      my $utf8 = decode('MIME-Header', $header);
      print "utf8: $utf8\n";
      output is:
      utf8: 从博客文章中查找您感兴趣的主题

      Of course, I can't read any Chinese, so I have no idea if those are the right characters or just gibberish.
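
One way to check the result without reading Chinese is to decode a piece by hand and compare code points. The sketch below takes only the last encoded-word of the header above (`=?UTF-8?B?6aKY?=`), decodes its base64 payload with MIME::Base64, and compares that with what Encode produces. Splitting the payload out by hand is my own shortcut for illustration, not something MIME::Parser does for you.

```perl
use strict;
use warnings;
use Encode qw(decode);
use MIME::Base64 qw(decode_base64);

# Last encoded-word of the sample header: charset UTF-8, "B" (base64) encoding.
my $word    = '=?UTF-8?B?6aKY?=';
my $payload = '6aKY';    # the part between "?B?" and "?="

my $via_encode = decode('MIME-Header', $word);
my $by_hand    = decode('UTF-8', decode_base64($payload));

# Print code points rather than glyphs so the terminal's encoding doesn't matter.
my @code_points = map { sprintf('U+%04X', ord($_)) } split //, $via_encode;
print "code points: @code_points\n";
print "decoders agree\n" if $via_encode eq $by_hand;
```

This prints U+9898, which is the final character (题) of the output shown above, so at least that part of the header decodes to what the bytes say it should.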