I have been using Netscape or Mozilla as my mail client since 1996, and I have built up a message archive with thousands of messages.

The way I have managed this archive, up to now, is to create folders within the Mozilla mail client. I then use the "Search Mail/News Messages" function in the mail client to find specific messages. This is not scaling well. Searches take a long time because each folder is stored as two text files:

  • one file is the messages themselves concatenated together,
  • the second is a file of metadata corresponding to the messages in the first file.
I want to design an application that ingests my Mozilla mailbox, separates the messages into rows in a database, and provides much more robust and scalable search capabilities. I am considering using MySQL as the database, with Apache and mod_perl as the front end, all running on my local machine, a Linux laptop.
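To make the ingest step concrete, here is a rough sketch of the "one message per row" idea. It is written in Python with SQLite standing in for MySQL purely so that it is self-contained; the same shape should carry over to Perl with DBI. It assumes the Mozilla folder file is in standard mbox format, and the table layout, column names, and the function names `ingest_mbox` and `search` are hypothetical, not part of any existing tool.

```python
import mailbox
import sqlite3


def header(msg, name):
    """Return a header as a plain string, or None if it is absent."""
    value = msg[name]
    return None if value is None else str(value)


def first_text_part(msg):
    """Return the first text/plain part of a message as a string."""
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            return payload.decode(charset, errors="replace")
        except LookupError:
            # Unknown charset label: fall back to a lossless byte mapping.
            return payload.decode("latin-1")
    return ""


def ingest_mbox(mbox_path, conn):
    """Copy each message in an mbox file into a row; return rows added."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS messages (
               id INTEGER PRIMARY KEY,
               message_id TEXT UNIQUE,
               sender TEXT,
               subject TEXT,
               date TEXT,
               body TEXT
           )"""
    )
    added = 0
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            # The UNIQUE constraint plus OR IGNORE makes re-running safe.
            cur = conn.execute(
                "INSERT OR IGNORE INTO messages "
                "(message_id, sender, subject, date, body) "
                "VALUES (?, ?, ?, ?, ?)",
                (
                    header(msg, "Message-ID"),
                    header(msg, "From"),
                    header(msg, "Subject"),
                    header(msg, "Date"),
                    first_text_part(msg),
                ),
            )
            added += cur.rowcount
    finally:
        box.close()
    conn.commit()
    return added


def search(conn, term):
    """Naive substring search over subject and body."""
    like = "%" + term + "%"
    return conn.execute(
        "SELECT id, sender, subject FROM messages "
        "WHERE subject LIKE ? OR body LIKE ? ORDER BY id",
        (like, like),
    ).fetchall()
```

A `LIKE` scan is only a placeholder here; for an archive of this size, a full-text index in the database would probably be the real payoff of moving off the flat files.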

I am not asking for help identifying the Perl modules to parse mail out of a Mozilla mailbox. I think that was covered in a previous question I posted, Netscape/Mozilla Mailbox Processing. But I do wonder if my fellow monks would mind commenting on:

  1. the general merits of the design idea that I've sketched out
  2. any "gotchas" they see in attempting to store email in a MySQL database, or rendering the body of the message in a dynamically generated web page
  3. practical ways to deal with any attachments included with the mails:
    • copy to a place in the file system, store a reference to the location in the database
    • embed the attachments as BLOBs in the database
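For comparison, the first attachment option (file on disk, reference in the database) might look roughly like the sketch below. Again this is Python with SQLite as a stand-in, and the `attachments` table, its columns, and the function name `save_attachments` are assumptions made up for illustration. Files are named by the SHA-256 of their content rather than by the sender-supplied filename, which is one possible way to avoid both name collisions and unsafe paths.

```python
import hashlib
import os
import sqlite3
from email.message import Message


def save_attachments(msg: Message, message_row_id: int,
                     store_dir: str, conn: sqlite3.Connection) -> int:
    """Write each attachment to store_dir and record its path in the DB."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS attachments (
               id INTEGER PRIMARY KEY,
               message_id INTEGER NOT NULL,
               filename TEXT,
               content_type TEXT,
               path TEXT NOT NULL
           )"""
    )
    os.makedirs(store_dir, exist_ok=True)
    saved = 0
    for part in msg.walk():
        filename = part.get_filename()
        if not filename:
            continue  # not an attachment (or a multipart container)
        data = part.get_payload(decode=True)
        if data is None:
            continue
        # Content-addressed name: identical attachments share one file.
        digest = hashlib.sha256(data).hexdigest()
        path = os.path.join(store_dir, digest)
        if not os.path.exists(path):
            with open(path, "wb") as fh:
                fh.write(data)
        # The original filename is kept only as metadata, never as a path.
        conn.execute(
            "INSERT INTO attachments "
            "(message_id, filename, content_type, path) "
            "VALUES (?, ?, ?, ?)",
            (message_row_id, filename, part.get_content_type(), path),
        )
        saved += 1
    conn.commit()
    return saved
```

The BLOB option would replace the file write with a binary column in the same table; the trade-off is simpler backups against a much larger database file.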
Finally, if anyone knows of an Open Source program that provides 80 percent of this functionality, let me know. So far, I've identified SQmaiL (Python) and Gmail (C). Neither is written in Perl, and neither seems to be a particularly active project.


Dave Aiello
Chatham Township Data Corporation

In reply to Managing a Personal Email Archive by dave_aiello
