PerlMonks  

Re^2: Is it possible to force MIME::Parser to extract text-files on a Windows system without the extra CR's on the end of lines?

by WilliamDee (Initiate)
on Feb 11, 2014 at 04:01 UTC (#1074337)


in reply to Re: Is it possible to force MIME::Parser to extract text-files on a Windows system without the extra CR's on the end of lines?
in thread Is it possible to force MIME::Parser to extract text-files on a Windows system without the extra CR's on the end of lines?

Thank you for the welcome and the idea, Athanasius.

Post-processing only the text files is a possibility, if there are no other options. I'll admit that I'm not keen on the thought of slurping large files (2+ megabytes) into memory again and doing a regex replace like the following:

$fileguts =~ s/\r{2,}\n/\r\n/g;

That should be reasonably efficient. Your idea does raise another thought though: avoiding the mangling of files which come out of Unix-based systems (changing \n to \r\n). It might be preferable to do something like:

$fileguts =~ s/\r+\n/\n/g;
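If slurping is the main worry, the same substitution can be applied one line at a time. The following is only a rough sketch: `normalize_eol` and `normalize_file` are hypothetical helper names, not part of MIME::Parser, and it assumes the extracted file can be rewritten to a second path. Both handles are opened with `:raw` so that Perl's own CRLF translation on Windows does not interfere.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse any run of CRs before an LF.
# $eol is the line ending to emit ("\n" or "\r\n").
# Lines ending in a bare "\n" are left untouched.
sub normalize_eol {
    my ($text, $eol) = @_;
    $text =~ s/\r+\n/$eol/g;
    return $text;
}

# Hypothetical helper: copy $src to $dst one line at a time, so the
# whole file is never held in memory.
sub normalize_file {
    my ($src, $dst, $eol) = @_;
    open my $in,  '<:raw', $src or die "Cannot read $src: $!";
    open my $out, '>:raw', $dst or die "Cannot write $dst: $!";
    while (my $line = <$in>) {
        print {$out} normalize_eol($line, $eol);
    }
    close $in;
    close $out or die "Cannot close $dst: $!";
    return;
}
```

Passing "\r\n" as `$eol` gives the same result as the first regex above (a lone \r\n is simply replaced by itself), while passing "\n" matches the second one.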

In the interest of not potentially mangling files - for the moment I will continue to hang out in the hope of another, MIME::Parser-based fix. :)

Cheers!
William

PS: Another possibility might be to change the original MIME message before writing to disk, say from:

Content-Type: text/plain;

To:

Content-Type: application/x-msexcel;

A bit of an ugly hack to trick MIME::Parser, though probably doable, and it might be preferable to the extra disk-load/regex-replace/disk-save cycle. While I'm not expecting hundreds of files per minute/second, it is best to assume that something like that might happen if an ISP error suddenly causes a surge or someone attempts a DoS/mailbomb attack.
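A minimal sketch of that header swap, assuming the whole message is already held in a string. `force_binary_type` is a hypothetical name. The pattern is anchored to lines that start with Content-Type:, so a stray "text/plain" elsewhere in the body is left alone; it does not handle folded headers where the type sits on a continuation line, and a body line that itself begins with "Content-Type: text/plain" would still be rewritten.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite only Content-Type header lines that
# declare text/plain. /m anchors ^ at each line start, /i ignores the
# header's case, /g covers every MIME part in the message.
sub force_binary_type {
    my ($message) = @_;
    $message =~ s{^(Content-Type:[ \t]*)text/plain}{${1}application/x-msexcel}gim;
    return $message;
}
```

Any parameters after the type (such as a charset) are carried over unchanged.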


Re^3: Is it possible to force MIME::Parser to extract text-files on a Windows system without the extra CR's on the end of lines?
by WilliamDee (Initiate) on Feb 12, 2014 at 03:45 UTC

    Thank you Athanasius, I have gone down the path of changing the content-type to something that will extract text/plain as binary files (application/x-msexcel). The code I'm using now is:

    # open the file in raw/binary output for writing
    open MAILOUT, '>:raw', "$receiving/message-$thetime-$popcount.msg"
        or LogWrite("Unable to open message-$thetime-$popcount.msg for writing: $!");
    # get the email into a temporary variable
    my $hold = $pop->HeadAndBody($popcount);
    # force it to use binary saving
    $hold =~ s/text\/plain/application\/x-msexcel/g;
    # write to file
    print MAILOUT $hold;
    # close the file
    close MAILOUT;

    And the text-files extracted by MIME::Parser are now saved without extra \r characters added to them.

    Cheers for the help! :)
    William

      $cough->binary(1); ... print ...

        Not sure what you mean, my anonymous friend. Would you please explain further?

        Cheers,
        William

        PS: Have altered the documentation in my code to be properly clear, now using:

        # change it to use binary extraction when MIME::Parser extracts text files from the mail message
        $hold =~ s/text\/plain/application\/x-msexcel/g;
