PerlMonks  


I use IMAPdir, which extends Maildir++ with a folder hierarchy. Maildir++ in turn builds on Maildir and adds a quota system. All three give you one file per e-mail, need no locks, are NFS-safe, and leave the e-mail itself unmodified. You can parse the files with exactly the same tools that you use to parse an e-mail fetched from the net. And yes, you can use grep and all other text processing tools on the files in the Maildir/Maildir++/IMAPdir folders. Sequential access is no problem, just use readdir() or File::Find to iterate over the directory.
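
As a rough illustration of the readdir() approach, here is a minimal sketch. It assumes a plain Maildir-style folder with the usual new/ and cur/ subdirectories; the sub names maildir_messages and subject_of are made up for this example, and the header scan is deliberately naive (it ignores folded header lines), so a real script would hand the file to a proper mail parser instead.

```perl
use strict;
use warnings;

# Return the paths of all message files in the new/ and cur/
# subdirectories of a Maildir-style folder (hypothetical helper).
sub maildir_messages {
    my ($maildir) = @_;
    my @messages;
    for my $sub (qw(new cur)) {
        my $dir = "$maildir/$sub";
        opendir(my $dh, $dir) or next;    # skip a missing subdirectory
        for my $name (sort readdir $dh) {
            next if $name =~ /^\./;       # skip ".", ".." and dotfiles
            my $path = "$dir/$name";
            push @messages, $path if -f $path;
        }
        closedir $dh;
    }
    return @messages;
}

# Return the Subject header of one message file, or undef if the
# header block has none. Naive: does not unfold continuation lines.
sub subject_of {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    my $subject;
    while (my $line = <$fh>) {
        last if $line =~ /^\r?$/;         # blank line ends the headers
        if ($line =~ /^Subject:\s*(.*?)\r?$/i) {
            $subject = $1;
            last;
        }
    }
    close $fh;
    return $subject;
}

# Usage: perl script.pl /path/to/Maildir
if (@ARGV) {
    for my $path (maildir_messages($ARGV[0])) {
        my $subject = subject_of($path);
        print "$path: ", (defined $subject ? $subject : '(no subject)'), "\n";
    }
}
```

Because every message is its own file, nothing here needs to lock or rewrite a mailbox; the loop only reads directory entries and then opens the files it is interested in.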

Storing several hundred files in a single directory on an ext3 filesystem is no problem. With 100_000 files, things begin to look different. It works, but ext3 does not like it and slows down. ReiserFS is said to be faster in that case, but I've never tested it.

I've used the de-facto standard mbox format since the days of Netscape Communicator, but it became slow as hell when the mailboxes filled up. One day, I gave the IMAPdir format a try, split all mailboxes into the IMAPdir format, switched my IMAP daemon from pine's to bincimap, and found that it was much faster.

Alexander

--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

In reply to Re: recommended storage format for email messages? by afoken
in thread recommended storage format for email messages? by perl5ever
