pdxperl has asked for the
wisdom of the Perl Monks concerning the following question:
I'm reading email from a POP server and logging it into a tracking system. Since the email comes from users, it can be plain text, HTML, or a mix of both. All I'd like to do is extract the plain text.
If it is a MIME-encoded message, just stripping the HTML means I can wind up with two copies of the message (the plain text section plus the filtered HTML section) along with some MIME-boundary debris, so that doesn't seem like the best way to go.
I'm guessing that I need to look at the email, decide whether it's MIME-encoded, and then see if there is a plain text section. So there are three cases (?) of email:
1) Not MIME, so no decoding needed, just read the message body
2) MIME-encoded with a plain text section -> extract the plain text section
3) MIME-encoded with no plain text section, just HTML -> strip the HTML down to text
Seems like a fair amount of effort, so I wondered if someone else has solved this in a better way. I didn't see anything on CPAN that would be a complete solution (i.e., nothing entitled Mail::ReadAnythingAndExtractPlainText).
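For illustration, the three cases above could be folded into one routine along these lines. This is only a minimal sketch, assuming the CPAN modules Email::MIME and HTML::Parser are installed; the sub names (leaf_parts, html_to_text, part_text, extract_plain_text) are hypothetical, and the HTML handling is deliberately crude (it keeps text nodes and drops tags, with no attempt at layout).

```perl
use strict;
use warnings;
use Email::MIME;     # CPAN; parses headers, MIME structure, transfer encodings
use HTML::Parser;    # CPAN; used here only as a crude HTML-to-text filter

# Return the leaf (non-container) parts of a message, in document order.
# A non-MIME message has no subparts, so it is its own single leaf (case 1).
sub leaf_parts {
    my ($part) = @_;
    my @kids = $part->subparts;
    return @kids ? (map { leaf_parts($_) } @kids) : ($part);
}

# Crude HTML-to-text: keep entity-decoded text nodes, drop tags,
# and skip the contents of <script> and <style>.
sub html_to_text {
    my ($html) = @_;
    my $text = '';
    my $parser = HTML::Parser->new(
        api_version => 3,
        text_h      => [ sub { $text .= $_[0] }, 'dtext' ],
    );
    $parser->ignore_elements(qw(script style));
    $parser->parse($html);
    $parser->eof;
    return $text;
}

# Decoded body of one part. body_str also decodes the charset but can die
# on unknown or missing charsets, so fall back to the byte-level body.
sub part_text {
    my ($part) = @_;
    my $text = eval { $part->body_str };
    return defined $text ? $text : $part->body;
}

# Hypothetical entry point: raw message in, best-effort plain text out.
sub extract_plain_text {
    my ($raw) = @_;
    my $email = Email::MIME->new($raw);

    my ($plain, $html);
    for my $part (leaf_parts($email)) {
        my $disposition = $part->header('Content-Disposition') || '';
        next if $disposition =~ /^\s*attachment/i;    # skip attached files

        # No Content-Type header means text/plain by the MIME default.
        my $type = lc($part->content_type || 'text/plain');
        $plain = $part if !defined $plain && $type =~ m{^text/plain\b};
        $html  = $part if !defined $html  && $type =~ m{^text/html\b};
    }

    return part_text($plain)              if defined $plain;    # cases 1 and 2
    return html_to_text(part_text($html)) if defined $html;     # case 3
    return '';                                                  # no text part found
}
```

Calling extract_plain_text($raw_message) on the full raw message (headers and body as one string) is the intended use. Preferring the first text/plain leaf and only falling back to HTML is what avoids the duplicated-message problem with multipart/alternative mail.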