PerlMonks  

Simplest Way to Edit EML Files?

by Jim (Curate)
on Jun 15, 2013 at 23:56 UTC (#1039166=perlquestion)
Jim has asked for the wisdom of the Perl Monks concerning the following question:

I have many EML files (plain text, RFC 5322, MIME) in which I need to replace headers (From, To, Cc, Subject, etc.) that contain text damaged by character encoding errors. Most of the headers with corrupted text are in the MIME encoded-word format and are base64 encoded. Can I use Email::Simple and Email::MIME to edit these files? Or should I skip these email modules and just parse the headers myself using regular expression patterns? (I can use MIME::Base64 to decode the encoded-word strings.)

These EML files are from the same source and are quite uniform.
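
A minimal sketch of the do-it-yourself route mentioned above, using only the core modules MIME::Base64 and Encode. The helper name decode_b_words and the sample Subject value are made up for illustration; the sketch assumes every charset label is one Encode recognizes (Encode::decode dies on an unknown label) and it leaves quoted-printable ("Q") encoded-words untouched.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);
use Encode qw(decode);

# Hypothetical helper: decode every base64 ("B") encoded-word found in
# a header value and return the result as a Perl character string.
sub decode_b_words {
    my ($value) = @_;

    # Whitespace between two adjacent encoded-words is not part of the
    # text (RFC 2047), so drop it before decoding.
    $value =~ s/(\?=)\s+(?==\?)/$1/g;

    # =?charset?B?base64-data?=
    $value =~ s{=\?([^?\s]+)\?[Bb]\?([A-Za-z0-9+/=]*)\?=}{
        my ($charset, $b64) = ($1, $2);
        decode($charset, decode_base64($b64));
    }ge;

    return $value;
}

# Sample value, invented for this example.
my $subject = '=?UTF-8?B?SGVsbG8sIHdvcmxk?=';
print decode_b_words($subject), "\n";    # prints "Hello, world"
```

Encode also ships a 'MIME-Header' encoding, so decode('MIME-Header', $value) is an alternative that handles both "B" and "Q" words in one call. The hand-rolled version is only worth it if you need to inspect the raw bytes of each word before decoding them.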

Re: Simplest Way to Edit EML Files?
by CountZero (Bishop) on Jun 16, 2013 at 18:51 UTC
    Can I use Email::Simple and Email::MIME to edit these files?
    Just try it. It could be that your headers are so damaged that they cannot be parsed anymore, but unless you try, you will not know.

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

    My blog: Imperial Deltronics

      Thanks for your reply, CountZero.

      I posted an inquiry on PerlMonks before delving into Email::Simple because it wasn't obvious to me from its documentation whether it could be used to edit EML files in place (effectively, at least). I needed to make small changes to many EML files, but I didn't want to have to construct whole Internet mail messages anew. And it takes me longer than it probably takes most Perl programmers to learn to use a new module.

      It turns out I don't have to repair the EML files after all. But I do have to extract the text damaged by character encoding errors from them and make a best effort to reverse the corruption. (See How to Fix Character Encoding Damaged Text Using Perl?.) I was able to determine that, in my finite collection of EML files, the damaged text is consistently in the same format, so matching the pattern was trivial using a regular expression. The script I wrote is included below.
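
      For readers who want a starting point, here is a rough sketch of regex-based header extraction. It is not the script mentioned above, which does not appear in this thread. The function name header_values, the sample message, and the example.com address are invented for illustration, and the sketch assumes the header block ends at the first empty line and that folded headers continue on lines starting with a space or tab.

```perl
use strict;
use warnings;

# Hypothetical helper: return the unfolded value(s) of one named header
# from the raw text of a message. Only the header block is searched.
sub header_values {
    my ($message, $name) = @_;

    # The header block ends at the first empty line.
    my ($head) = split /\r?\n\r?\n/, $message, 2;

    # Unfold: a line break followed by a space or tab continues the
    # previous header line.
    $head =~ s/\r?\n(?=[ \t])//g;

    # One capture per matching header line; header names are
    # case-insensitive.
    return $head =~ /^\Q$name\E:[ \t]*(.*?)\r?$/mig;
}

# Sample message, invented for this example. The second "Subject:" line
# is in the body and must not be returned.
my $eml = join "\n",
    'From: someone@example.com',
    'Subject: =?UTF-8?B?SGVsbG8s?=',
    ' =?UTF-8?B?IHdvcmxk?=',
    '',
    'Subject: not a header',
    '';

my @subjects = header_values($eml, 'subject');
print "$_\n" for @subjects;
```

      The returned value is still in encoded-word form, so it would then need to be decoded (for example with MIME::Base64, or with Encode's 'MIME-Header' encoding) before any attempt to repair the damaged text.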
