PerlMonks  

Re: Re: Parsing email files, what are the best modules ?

by peterr (Scribe)
on Nov 10, 2003 at 11:25 UTC ( [id://305851]=note )


in reply to Re: Parsing email files, what are the best modules ?
in thread Parsing email files, what are the best modules ?

Hi Roger,

Many thanks for that example you posted.

I assume that the mail folders are in the format of unix mboxes, ascii-mode, line-by-line.

Yes, they are ASCII, with CR/LF line endings. When I tried the script, I got this message:

D:\Perl\myscripts>\perl\bin\perl.exe checke~1.pl
Can't locate Mail/Box.pm in @INC (@INC contains: D:/Perl/lib D:/Perl/site/lib .) at checke~1.pl line 5.
BEGIN failed--compilation aborted at checke~1.pl line 5.

I then checked for Mail::Box, and it didn't appear to be part of the ActiveState Perl I have installed. I then used PPM (version 3.0.1) and ran an "install Mail::Box" command. It took about 10 minutes and said everything was okay, but the same error message appeared.

The PPM search for Mail::Box displayed

ppm> search Mail::Box
Searching in Active Repositories
  1. Mail-Box-Parser-C [3.003] C parser for Mail::Box

and that is all that was installed: just a file called "C.pm" in the folder D:\Perl\site\lib\Mail\Box\. I did download the file http://perl.overmeer.net/mailbox/source/source-current.tar.gz , and there is a Box.pm in that archive, but I don't know where to put it. Should I somehow point PPM at the tar file for the install? Also, when I run the SET command at the DOS prompt, there are no environment variables for Perl; should there be?
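
On the question of where a module file has to live: perl does not need environment variables to find modules. It searches the directories listed in @INC (the optional PERL5LIB variable can add to that list), and Mail::Box must be reachable as Mail/Box.pm under one of them. A minimal sketch that lists the search path and reports where a module file is found follows. The helper name find_module is made up for this illustration, and it only looks for files without loading them.

```perl
use strict;
use warnings;
use File::Spec;

# Return the full path of every copy of a module found under @INC.
# find_module is a hypothetical helper name, not part of any library.
sub find_module {
    my ($module) = @_;
    my @parts = split /::/, $module;    # Mail::Box -> ('Mail', 'Box')
    $parts[-1] .= '.pm';                # ('Mail', 'Box.pm')
    my @found;
    for my $dir (@INC) {
        next if ref $dir;               # skip code-ref hooks in @INC
        my $candidate = File::Spec->catfile($dir, @parts);
        push @found, $candidate if -f $candidate;
    }
    return @found;
}

# Show which perl is running and where it searches for modules.
print "perl binary: $^X\n";
print "search path: $_\n" for grep { !ref $_ } @INC;

# Check the modules named on the command line (default: Mail::Box).
for my $module (@ARGV ? @ARGV : ('Mail::Box')) {
    my @hits = find_module($module);
    print "$module: ", (@hits ? join(', ', @hits) : 'not found in @INC'), "\n";
}
```

Copying Box.pm alone into one of those directories is unlikely to be enough, because Mail::Box depends on other modules; installing the whole distribution through PPM or the CPAN shell is the safer route.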

I'm trying to read up more on the documentation also.

Many thanks,

Peter

Re: Re: Re: Parsing email files, what are the best modules ?
by Anonymous Monk on Nov 10, 2003 at 11:51 UTC
      Hi,

      Thanks, I have read most of that now, installed 'nmake', and have been using the

      perl -MCPAN -e "shell"

      to install the Mail::Box modules and some others.

      Peter

Re: Re: Re: Parsing email files, what are the best modules ?
by Roger (Parson) on Nov 11, 2003 at 00:23 UTC
    Hi Peter, you could read the PPM documentation, as the Anonymous Monk suggested. Also you probably need all the Mail::Box and its derived modules as well.

    I have completed the code I started earlier. The additional code is an example of the kind of thing you could do with the Mail::Box::Manager module. Pretty handy I think.
    #!C:\Perl\bin\Perl.exe -w
    use strict;
    use IO::File;
    use Data::Dumper;
    use Mail::Box;
    use Mail::Box::Manager;

    # Load mail list
    my $MailList = load_mail_list('./list25B6.txt');
    print Dumper($MailList);

    # Load folder list
    my $MailFolder = load_mail_folders('./hierarch.txt');
    print Dumper($MailFolder);

    # Parse folder files
    foreach (values %{$MailFolder}) {
        parse_mail_folder($_);
    }

    # Optionally output $MailList into another file, etc.
    # And other things ...

    exit(0);

    sub parse_mail_folder
    {
        my $folder_file = shift;

        my $mgr    = Mail::Box::Manager->new();
        my $folder = $mgr->open($folder_file);
        my @email_addr;

        foreach my $message ($folder->messages) {
            my $dest = $message->get('To');   # retrieve the To-address
            next unless defined $dest;        # message has no To header
            @email_addr = split /,/, $dest;   # retrieve multiple addresses

            # assume the email address format is as follows -
            #
            #    John & Jenny Arnold <johnarnold@somedomain.com>
            #
            # you have to tweak a bit if the format is not as expected
            # or use the Mail::Address module to do the trick - to
            # convert the mail address into its canonical form.

            foreach (@email_addr) {
                my ($name, $addr) = /(.*)<(.*)>/;
                next unless defined $addr;    # entry not in "Name <addr>" form

                # note: =~ binds the substitution; a plain = would
                # overwrite the variable with the substitution count
                $name =~ s/^\s+//;   # trim spaces at front
                $name =~ s/\s+$//;   # trim spaces at rear
                $addr =~ s/^\s+//;   # trim spaces at front
                $addr =~ s/\s+$//;   # trim spaces at rear

                # load_mail_list() keeps the addresses under the 'mlist' key
                if (! exists $MailList->{mlist}{$addr}) {
                    # ok, we haven't seen this Email address yet
                    $MailList->{mlist}{$addr} = $name;
                    # and do other things
                }
            }
        }

        $folder->close;
    }

    sub load_mail_list
    {
        my $filename = shift;
        my $f = new IO::File $filename, "r"
            or die "Can not open mail list $filename: $!";

        my %mlist;

        # load the header
        chomp($mlist{title}  = <$f>);
        chomp($mlist{sender} = <$f>);
        chomp($mlist{nosig}  = <$f>);
        <$f>;   # skip the next line

        # load the rest of the email addresses
        my %MailAddress;
        while (<$f>) {
            chomp;
            my ($name, $email) = /^(.*)\s+<(.*)>$/;
            next unless defined $email and $email ne '';
            $MailAddress{$email} = $name;
        }
        $mlist{mlist} = \%MailAddress;

        return \%mlist;
    }

    sub load_mail_folders
    {
        my $filename = shift;
        my $f = new IO::File $filename, "r"
            or die "Can not open folder list $filename: $!";

        my %mbox;
        while (<$f>) {
            chomp;
            next unless ( $_ ne '' and m/^0,0,/ );
            s/"//g;
            my @fld = split /,/;
            my ($folder) = $fld[2] =~ /.*:.*:(.*)/;
            $mbox{$fld[-1]} = "D:/Pmail/mail/$folder.PPM";   # full path to mboxes
        }

        return \%mbox;
    }

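
The name/address extraction inside parse_mail_folder can be tried on its own, without Mail::Box installed. The following is a minimal sketch using core Perl only; the function name parse_recipients and the sample addresses are made up for illustration. It assumes the "Name <addr>" format described in the comments above and, like the original, splits on commas, so a comma inside a quoted display name would break it. Mail::Address is the more robust choice for real headers.

```perl
use strict;
use warnings;

# Split a To: header value into [name, address] pairs.
# parse_recipients is a hypothetical helper, not a Mail::Box function.
sub parse_recipients {
    my ($header) = @_;
    my @out;
    for my $entry (split /,/, $header) {
        my ($name, $addr);
        if ($entry =~ /^(.*)<([^>]*)>/) {
            ($name, $addr) = ($1, $2);       # "Name <addr>" form
        } else {
            ($name, $addr) = ('', $entry);   # bare address, no display name
        }
        # Trim leading and trailing whitespace from both parts.
        for ($name, $addr) { s/^\s+//; s/\s+$//; }
        next if $addr eq '';
        push @out, [ $name, $addr ];
    }
    return @out;
}

my $to = 'John & Jenny Arnold <johnarnold@somedomain.com>, bob@example.org';
for my $pair (parse_recipients($to)) {
    my ($name, $addr) = @$pair;
    print "name='$name' addr='$addr'\n";
}
```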
      Hi Roger,

      Also you probably need all the Mail::Box and its derived modules as well.

      I certainly got plenty of these messages

      Warning: prerequisite Scalar::Util failed to load: Can't locate Scalar/Util.pm in @INC (@INC contains: D:/Perl/lib D:/Perl/site/lib .) at (eval 46) line 3.
      Warning: prerequisite Test::Harness 1.38 not found at D:/Perl/lib/ExtUtils/MakeMaker.pm line 343.

      when running the Makefile.PL from the Mail::Box tar archive. The problem is, I used

      perl -MCPAN -e "shell"
      cpan> install Scalar::Util

      and the error message still appeared, even though the install went okay. Even re-installing Mail::Box

      D:\Perl\myscripts>\perl\bin\perl.exe -MCPAN -e "shell"

      cpan shell -- CPAN exploration and modules installation (v1.59_54)
      ReadLine support available (try 'install Bundle::CPAN')

      cpan> install Mail::Box
      CPAN: Storable loaded ok
      Going to read \.cpan\Metadata
        Database was generated on Tue, 11 Nov 2003 00:45:51 GMT
      Mail::Box is up to date.

      cpan> q
      Lockfile removed.

      and then running the Perl script, still gave the following

      D:\Perl\myscripts>\perl\bin\perl.exe checke~1.pl
      Can't locate Scalar/Util.pm in @INC (@INC contains: D:/Perl/lib D:/Perl/site/lib .) at D:/Perl/site/lib/Mail/Reporter.pm line 9.
      BEGIN failed--compilation aborted at D:/Perl/site/lib/Mail/Reporter.pm line 9.
      Compilation failed in require at (eval 1) line 3.
              ...propagated at D:/Perl/lib/base.pm line 62.
      BEGIN failed--compilation aborted at D:/Perl/site/lib/Mail/Box.pm line 8.
      Compilation failed in require at checke~1.pl line 5.
      BEGIN failed--compilation aborted at checke~1.pl line 5.

      I have checked all the "prerequisite" warning messages, made a note of those modules, then used the CPAN shell (perl -MCPAN) to install them. Each install appears to go okay: it goes out to the internet, works through files on FTP sites, and says that the module has installed okay, which is what puzzles me.

      Going back to where I think (but don't really know) the Perl script is stopping: that is line 9 of Reporter.pm, which has

      use Scalar::Util 'dualvar';

      and I know I have installed _that_ module. The other related code from the error messages is

      # msg - "...propagated at D:/Perl/lib/base.pm line 62."
      die if $@ && $@ !~ /^Can't locate .*? at \(eval /;

      # msg - "compilation aborted at D:/Perl/site/lib/Mail/Box.pm line 8."
      use base 'Mail::Reporter';

      I'm just about all debugged out, and have run out of clues.
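
When an install reports success but "Can't locate" persists, one thing worth ruling out (an assumption, not a diagnosis of this particular setup) is that the module landed somewhere the running perl does not search. A small probe script, run with the same perl.exe as the failing script, shows which modules actually load and from which file. The helper name probe_module is made up for this sketch; the module list is taken from the error messages above.

```perl
use strict;
use warnings;

# Try to load a module at run time. Returns (path, undef) on success
# or (undef, error text) on failure. probe_module is a hypothetical helper.
sub probe_module {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Scalar::Util -> Scalar/Util.pm
    my $ok = eval { require $file; 1 };
    return $ok ? ($INC{$file}, undef) : (undef, $@);
}

# Report which perl is running, so it can be compared with the
# perl that was used to start the CPAN shell.
printf "perl %vd at %s\n", $^V, $^X;

for my $module (qw(Scalar::Util Test::Harness Mail::Reporter Mail::Box)) {
    my ($path, $err) = probe_module($module);
    if (defined $path) {
        my $version = $module->VERSION;
        $version = '(no version)' unless defined $version;
        print "ok   $module $version from $path\n";
    } else {
        my ($first_line) = split /\n/, $err;   # keep the report short
        print "FAIL $module: $first_line\n";
    }
}
```

If a module shows up as FAIL here but the CPAN shell calls it up to date, comparing the path printed on the first line with the perl that launched the CPAN shell is a reasonable next step.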

      I have completed the code I started earlier. The additional code is an example of the kind of thing you could do with the Mail::Box::Manager module. Pretty handy I think.

      Thanks very much for that additional code, Roger. I guess the big question is, what is different between your Perl setup and mine?

      Thanks a lot, :)

      Peter

        Looks like you have some maintenance to do. You need to download and install the Scalar::Util module from CPAN, and probably other modules you find useful too. Can't perl without them. 8^p

        By the way, the Perl version I use is ActivePerl 5.8.1.
