Re^5: How to parse outlook type attachment from inbox

by jdtoronto (Prior)
on Sep 22, 2006 at 17:01 UTC


in reply to Re^4: How to parse outlook type attachment from inbox
in thread How to parse outlook type attachment from inbox

OK perlCrazy,

Time to back up here a little. Two questions:

  • Why are you trying to parse header information from attachments?
  • Why are you using proprietary headers rather than standard ones (X-MS-Has-Attach for example)?
Maybe if you told us what exactly you are trying to achieve it would make it easier for all of us.

jdtoronto

Re^6: How to parse outlook type attachment from inbox
by perlCrazy (Monk) on Sep 23, 2006 at 08:47 UTC
    Why are you trying to parse header information from attachments?

    There are 50k emails lying in the inbox, all of them sent from one address (e.g. staff@lastminute.com). Each one
    carries an email from an end customer as an attachment, and that attached email
    contains the sender's address, which will be a Hotmail address. Each attachment is itself a mail message, which is why I need to
    parse its headers to get those email addresses.

    Why are you using proprietary headers rather than standard ones (X-MS-Has-Attach for example)?


    I opened the Outlook options and found that the attachment information only appears in this form,
    which is why I check for that condition. It is storing all the attachments OK, but I need to get the header info from the attachments:
      #!/usr/bin/perl
      use strict;
      use Mail::POP3Client;
      use MIME::Parser;
      use MIME::Head;
      use Mail::Header;

      #usage() unless scalar @ARGV == 3;
      my $pop = new Mail::POP3Client(
          HOST     => 'server',
          USER     => 'my',
          PASSWORD => 'pass'
      );
      my $tmp_directory = "/tmp/attach";
      my $parser = new MIME::Parser;
      $parser->output_dir($tmp_directory);
      $parser->output_prefix("attachment");
      $parser->output_to_core();
      open (FH,">>/tmp/msgtxt1");
      for (my $i = 1; $i <= $pop->Count(); $i++){
          my $head = $pop->Head($i);
          if ($head =~ /X-MS-Has-Attach: yes/i){
              foreach ( $pop->Head( $i ) ) {
                  if( /^(From|Subject):\s+/i ) {
                      if ( /Vincent/) {
                          my $msg = $pop->HeadAndBody($i);
                          ### Automatically attempt to RFC-1522-decode the MIME headers?
                          $parser->decode_headers(1);             ### default is false
                          ### Parse contained "message/rfc822" objects as nested MIME streams?
                          $parser->extract_nested_messages(0);    ### default is true
                          ### Look for uuencode in "text" messages, and extract it?
                          $parser->extract_uuencode(1);           ### default is false
                          ### Should we forgive normally-fatal errors?
                          $parser->ignore_errors(0);              ### default is true
                          my $entity = $parser->parse_data($msg);
                      }
                  }
              }
              # print "\n";
          }
      }
      close(FH);
      $pop->Close();

      sub usage {
          print "Usage: $0 <mail_server> <username> <password>\n";
          exit;
      }

      The code above does save all the attachments. Now I need to parse those attachments.
      One of the attachments will be a .msg file that contains the header info etc.
      I need to extract that header info and find the sender address.
      Can somebody help me with this?

        I think you're overcomplicating this problem. So here's my final attempt to explain it to you.

        MIME messages are hierarchical in nature. That is to say, a MIME message can contain other MIME messages which can, in turn, also contain other MIME messages and so on. So whatever you use to parse a MIME message, can also be used to parse its children.

        In fact, MIME::Parser is cleverer than that. If you set the extract_nested_messages flag to 1 (which is the default value; I'm not sure why you changed it in your code), it will produce a tree structure containing all of the MIME messages contained within your original message. You can see the structure of this tree with the dump_skeleton method, and you can get the children of any given MIME message by using the parts method.

        So, you've gone as far as getting the top level message in $entity. If you change the nested messages flag to 1 then your entity will contain sub-entities which you can access using $entity->parts. Each of these contained messages will be a MIME::Entity object and you can extract various parts of the message (like the headers) using the methods described in the documentation.
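
        As a rough illustration of the above, here is a minimal sketch rather than a drop-in fix. The sample message, its addresses, and the helper name nested_senders are all made up for the example; in your script the raw text would come from $pop->HeadAndBody($i) instead. It assumes the customer's mail arrives as a message/rfc822 part.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use MIME::Parser;

# Hypothetical sample: an outer mail that carries a customer's mail
# as a message/rfc822 attachment. All addresses are invented.
my $raw = join("\n",
    'From: staff@example.com',
    'To: inbox@example.com',
    'Subject: Fwd: customer enquiry',
    'MIME-Version: 1.0',
    'Content-Type: multipart/mixed; boundary="outer"',
    '',
    '--outer',
    'Content-Type: text/plain',
    '',
    'See the attached message.',
    '',
    '--outer',
    'Content-Type: message/rfc822',
    '',
    'From: customer@hotmail.example',
    'To: staff@example.com',
    'Subject: Booking question',
    'MIME-Version: 1.0',
    'Content-Type: text/plain',
    '',
    'Hello.',
    '',
    '--outer--',
    '',
);

my $parser = MIME::Parser->new;
$parser->output_to_core(1);            # keep bodies in memory for this demo
$parser->extract_nested_messages(1);   # the default: build a tree of entities

my $entity = $parser->parse_data($raw);

# Show the shape of the tree (goes to the currently selected filehandle).
$entity->dump_skeleton;

# Hypothetical helper: walk the tree and collect the From header of every
# message nested inside a message/rfc822 container.
sub nested_senders {
    my ($ent) = @_;
    my @senders;
    for my $part ($ent->parts) {
        if ($ent->mime_type eq 'message/rfc822') {
            my $from = $part->head->get('From', 0);
            if (defined $from) {
                $from =~ s/^\s+|\s+$//g;   # trim whitespace and the newline
                push @senders, $from;
            }
        }
        push @senders, nested_senders($part);   # recurse into children
    }
    return @senders;
}

print "$_\n" for nested_senders($entity);
```

        The recursion matters because the nested message can itself be multipart or contain further attached messages, so the same walk works however deep the nesting goes.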

        Is that clearer?

        --
        <http://dave.org.uk>

        "The first rule of Perl club is you do not talk about Perl club."
        -- Chip Salzenberg
