pdxperl has asked for the wisdom of the Perl Monks concerning the following question:
I'm reading email from a POP server and logging it into a tracking system. Since the email comes from users, it can be plain text, HTML, or a mix. All I'd like to do is extract the plain text.
If it is a MIME-encoded message, just stripping the HTML means I can wind up with two copies of the message (the plain text section plus the filtered HTML section) along with some MIME boundary lines, so that doesn't seem like the best way to go.
I'm guessing that I need to look at the email, decide whether it's MIME-encoded, and then see if there is a plain text section. So there are three cases (?) of email:
1) Not MIME, so no decoding needed, just read the message body
2) MIME-encoded with a plain text section -> extract the plain text section
3) MIME-encoded, no plain text section, just HTML -> decode the HTML
Seems like a fair amount of effort, so I wondered if someone else has solved this in a better way. I didn't see anything on CPAN that would be a complete solution (i.e., nothing entitled Mail::ReadAnythingAndExtractPlainText).
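For what it's worth, the three cases above can probably be folded into one pass over the message parts. The following is only a sketch, assuming the CPAN modules Email::MIME and HTML::FormatText are installed; the helper names `leaf_parts` and `extract_plain_text` are made up for illustration, and taking the first text/plain part (rather than checking Content-Disposition for attachments) is a simplification.

```perl
use strict;
use warnings;
use Email::MIME;        # CPAN: parses plain and MIME messages alike
use HTML::FormatText;   # CPAN: renders HTML as plain text

# Hypothetical helper: flatten a message into its leaf (non-container) parts.
# A non-MIME message has no subparts, so it is returned as its own single leaf.
sub leaf_parts {
    my ($part) = @_;
    my @sub = $part->subparts;
    return @sub ? map { leaf_parts($_) } @sub : ($part);
}

# Hypothetical helper: return the best plain-text rendering of a raw message.
sub extract_plain_text {
    my ($raw) = @_;
    my $email = Email::MIME->new($raw);

    my ($plain, $html);
    for my $part (leaf_parts($email)) {
        # A missing Content-Type header means text/plain by default.
        my $type = lc($part->content_type // 'text/plain');
        if    ($type =~ m{^text/plain\b}) { $plain //= $part->body_str }
        elsif ($type =~ m{^text/html\b})  { $html  //= $part->body_str }
    }

    # Cases 1 and 2: a plain text body or section exists, so use it as-is.
    return $plain if defined $plain;

    # Case 3: HTML only, so convert it to text instead of regex-stripping tags.
    return HTML::FormatText->format_string($html, leftmargin => 0, rightmargin => 72)
        if defined $html;

    return '';    # no text part at all (e.g. attachments only)
}
```

Usage would be along the lines of `my $text = extract_plain_text($raw_message);`, where `$raw_message` is the full message (headers and body) as fetched from the POP server. Preferring the text/plain part avoids the duplicate-copy problem, since the HTML alternative is only consulted when no plain text exists.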
Replies are listed 'Best First'.
Re: Reading POP email that may be plain text, HTML, or both
by almut (Canon) on Jun 19, 2010 at 09:34 UTC
by pdxperl (Sexton) on Jun 20, 2010 at 03:08 UTC
Re: Reading POP email that may be plain text, HTML, or both
by Krambambuli (Curate) on Jun 19, 2010 at 11:45 UTC