Re^2: Simplest Way to Edit EML Files?

by Jim (Curate)
on Jun 17, 2013 at 18:25 UTC (#1039438)


in reply to Re: Simplest Way to Edit EML Files?
in thread Simplest Way to Edit EML Files?

Thanks for your reply, CountZero.

I posted an inquiry on PerlMonks before delving into Email::Simple because it wasn't obvious to me from its documentation whether it could be used to edit EML files in place (effectively, at least). I needed to make small changes to many EML files, but I didn't want to have to construct whole Internet mail messages anew. And it takes me longer than it probably takes most Perl programmers to learn to use a new module.

It turns out I don't have to repair the EML files after all. But I do have to extract the text damaged by character encoding errors from them and make a best effort to reverse the corruption. (See How to Fix Character Encoding Damaged Text Using Perl?.) I was able to determine that, in my finite collection of EML files, the damaged text always appears in the same format: a header field whose value is a base64-encoded UTF-8 encoded-word. Matching that pattern with a regular expression was trivial. The script I wrote is included below.

#!perl
use v5.14;
use Encode qw( encode decode );
use MIME::Base64;

binmode STDOUT, ':encoding(UTF-8)';

local $, = "\t";
local $\ = "\n";

@ARGV = <@ARGV>; # Expand wildcards...

LINE:
while (<>) {
    next LINE unless m{
        ^(\S+):                 # field name
        \s+
        =[?]utf-8[?]B[?]
        ([^?]+)                 # base64 encoded text
        [?]=
    }ix;

    my ($field_name, $base64_encoded_text) = ($1, $2);

    my $base64_decoded_text = decode_base64($base64_encoded_text);
    my $utf8_decoded_text   = decode('UTF-8', $base64_decoded_text);

    next LINE unless $utf8_decoded_text =~ m{
        [^\p{Script=Common}\p{Script=Latin}]
    }x;

    my $damaged_text  = $utf8_decoded_text;
    my $repaired_text = decode('UTF-8', encode('UCS-2LE', $damaged_text));

    $repaired_text =~ s{\x{00}+$}{};

    print $ARGV, $field_name, $repaired_text, $damaged_text;
}

exit 0;
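
For anyone wondering why the repair step works, here is a minimal round-trip sketch. It assumes the damage arose when the UTF-8 bytes of the original text were misread as UCS-2LE, which is what the encode/decode pair in the script reverses; I can't say for certain that this is how the files were actually damaged. The sample string is hypothetical and not taken from my EML files.

```perl
use strict;
use warnings;
use Encode qw( encode decode );

# Hypothetical sample text ("Resume attached" with e-acute, U+00E9).
my $original = "R\x{e9}sum\x{e9} attached";

# Simulate the assumed damage: UTF-8 bytes misread as UCS-2LE.
my $bytes = encode('UTF-8', $original);
$bytes .= "\x00" if length($bytes) % 2;    # pad to a whole 16-bit unit
my $damaged = decode('UCS-2LE', $bytes);

# Reverse it the same way the script does.
my $repaired = decode('UTF-8', encode('UCS-2LE', $damaged));
$repaired =~ s{\x{00}+$}{};                # drop the padding NUL, if any

binmode STDOUT, ':encoding(UTF-8)';
print "damaged:  $damaged\n";
print "repaired: $repaired\n";
print $repaired eq $original ? "round trip ok\n" : "round trip failed\n";
```

The padding byte in the simulation is also why the script strips trailing NUL characters: text with an odd number of UTF-8 bytes picks up one extra zero byte when it is forced into 16-bit units.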

