PerlMonks  

Re: MIME::Parser parse_data

by Corion (Pope)
on Jan 17, 2013 at 17:20 UTC


in reply to MIME::Parser parse_data

I think the content encoding after base64 decoding is in the Content-Encoding header. Or at least, it should be, if the sending MUA adds it. Otherwise, you have to ass-u-me some default encoding.


Re^2: MIME::Parser parse_data
by boosth (Initiate) on Jan 17, 2013 at 18:19 UTC
    Here is an example that works with option A. Even though the raw data is clearly base64 encoded, this call returns a human-readable string:
    $tmp_part->bodyhandle->as_string
    There is no "Content-Encoding" header in the raw mail.

    Here's the relevant part of the header:
    X-MimeOLE: Produced By Microsoft Exchange V6.5
    Content-class: urn:content-classes:message
    MIME-Version: 1.0
    Content-Type: multipart/related; type="multipart/alternative"; boundary="----_=NextPart001_01CDF2E2.B6A090C6"

    This is a multi-part message in MIME format.

    ------_=NextPart001_01CDF2E2.B6A090C6
    Content-Type: multipart/alternative; boundary="----_=NextPart002_01CDF2E2.B6A090C6"

    ------_=NextPart002_01CDF2E2.B6A090C6
    Content-Type: text/plain; charset="utf-8"
    Content-Transfer-Encoding: base64

    SGkgU3VlICwKCldpbGwgYmUgaW4gdG91Y2ggaSBhbSB0aGlua2luZyBvZiBkb2luZyBhIHRyaXAg
    dG8gSXRhbH ...
      Content-Type: text/plain; charset="utf-8"

      That's a good hint for that part...
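To make the two layers explicit: base64 is only the transfer encoding, and the charset from Content-Type says how to turn the resulting bytes into characters. A minimal sketch using only core modules, assuming the base64 text is the first line quoted in the thread with the forum's line-wrap marker removed:

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);
use Encode qw(decode);

# First base64 line of the text/plain part quoted in the thread
# (assumption: the "+" in the post is only a line-wrap marker).
my $b64 = 'SGkgU3VlICwKCldpbGwgYmUgaW4gdG91Y2ggaSBhbSB0aGlua2luZyBvZiBkb2luZyBhIHRyaXAg';

# Layer 1: undo Content-Transfer-Encoding (base64 text -> raw bytes).
my $bytes = decode_base64($b64);

# Layer 2: apply the charset from Content-Type (UTF-8 bytes -> characters).
my $text = decode('UTF-8', $bytes);

print $text, "\n";
```

If only layer 1 has been applied, the string is readable for plain ASCII text like this one but is still a byte string, so non-ASCII characters would need the second step.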

        The problem is that I have other emails with the same format where I have to use this instead:
        $MessageBody = " ". decode('UTF-8',decode_base64($tmp_part->bodyhandle->as_string));
        ------_=NextPart002_01CDF2E2.B6A090C6
        Content-Type: text/plain; charset="utf-8"
        Content-Transfer-Encoding: base64

        SGkgU3VlICwKCldpbGwgYmUgaW4gdG91Y2ggaSBhbSB0aGlua2luZyBvZiBkb2luZyBhIHRyaXAg
        dG8gSXRhbH
        The issue appears to be that this call:
        my $tmpMessage = $parser->parse_data($body);
        returns decoded strings for some emails but not for others. For some emails, the output of this call:
        $tmp_part->bodyhandle->as_string;
        has to be decoded manually; for others it does not. I don't understand why it returns a human-readable, decoded string for some base64-encoded emails but not for all of them.

        This is a headache for me: if I change the code to just output the string, it breaks on the emails that need manual decoding. All I am doing is calling "parse_data" and then "bodyhandle->as_string", and I'm not sure where the decoding happens. The original data is definitely base64 encoded, which I can see by looking at the raw email data.
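One way to see where the decoding happens is to ask MIME::Parser what it found for each part. As far as I know, the parser undoes the Content-Transfer-Encoding it reads from each part's own header while parsing and leaves the charset alone, so comparing these values between a working and a non-working mail may show the difference. The sketch below is hypothetical: the sample message, the boundary name demo-boundary, and the helper collect_parts are made up for illustration and are not taken from the original mails.

```perl
use strict;
use warnings;
use MIME::Parser;
use Encode qw(decode);

# Hypothetical sample: one base64-encoded text/plain part.
my $raw = <<'EOM';
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="demo-boundary"

This is a multi-part message in MIME format.

--demo-boundary
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64

SGkgU3VlICwKCldpbGwgYmUgaW4gdG91Y2g=

--demo-boundary--
EOM

my $parser = MIME::Parser->new;
$parser->output_to_core(1);    # keep bodies in memory instead of temp files
my $entity = $parser->parse_data($raw);

# Walk the tree and record, for every leaf part, what the parser saw.
sub collect_parts {
    my ($ent, $report) = @_;
    my @parts = $ent->parts;
    if (@parts) {
        collect_parts($_, $report) for @parts;
        return;
    }
    my $head    = $ent->head;
    my $charset = $head->mime_attr('content-type.charset') // 'us-ascii';
    my $body    = $ent->bodyhandle;
    push @$report, {
        type     => $head->mime_type,
        encoding => $head->mime_encoding,    # transfer encoding the parser used
        charset  => $charset,
        # Transfer encoding is already undone here; only the charset remains.
        text     => defined $body ? decode($charset, $body->as_string) : undef,
    };
}

my @report;
collect_parts($entity, \@report);

for my $part (@report) {
    print "$part->{type} / $part->{encoding} / $part->{charset}\n";
    print $part->{text}, "\n" if defined $part->{text};
}
```

If a mail that needs the manual decode_base64 reports an encoding other than base64 here, the parser presumably never saw a base64 Content-Transfer-Encoding header for that part.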
