
Re: recommended storage format for email messages?

by afoken (Abbot)
on Jul 14, 2009 at 19:29 UTC (#780043=note)


in reply to recommended storage format for email messages?

I use IMAPdir, which extends Maildir++ with a folder hierarchy. Maildir++ inherits from maildir and adds a quota system. All three store one file per e-mail, need no locks, are NFS-safe, and leave the e-mail itself unmodified. You can parse the files with exactly the same tools that you use to parse an e-mail fetched from the net. And yes, you can use grep and all other text processing tools on the files in the maildir/Maildir++/IMAPdir folders. Sequential access is no problem: just use readdir() or File::Find to iterate over the directory.
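To illustrate the "one file per e-mail, parse with ordinary tools" point, here is a minimal sketch in Python rather than Perl. It assumes a plain maildir layout with `new/`, `cur/` and `tmp/` subdirectories; the function name `iter_maildir` is made up for this example, and IMAPdir's folder naming is not handled.

```python
import os
from email import message_from_binary_file, policy


def iter_maildir(maildir_path):
    """Yield (filename, message) for every delivered message in a maildir."""
    # Delivered mail lives in new/ (not yet seen) and cur/ (already seen).
    # tmp/ holds files that are still being written, so it is skipped.
    for sub in ("new", "cur"):
        folder = os.path.join(maildir_path, sub)
        if not os.path.isdir(folder):
            continue
        for name in sorted(os.listdir(folder)):
            path = os.path.join(folder, name)
            if not os.path.isfile(path):
                continue
            # Each file is one unmodified e-mail, so the standard parser
            # works on it directly.
            with open(path, "rb") as fh:
                yield name, message_from_binary_file(fh, policy=policy.default)
```

Looping with `for name, msg in iter_maildir(path): print(name, msg["Subject"])` would list the subjects. No locking is involved because every message is a separate, complete file.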

Storing several hundred files in an ext3 filesystem is no problem. With 100_000 files, things begin to look different: it works, but ext3 does not like it and slows down. ReiserFS is said to be faster in that case, but I've never tested it.

I had used the de-facto standard mbox format since the days of Netscape Communicator, but it became slow as hell when the mailboxes filled up. One day, I gave the IMAPdir format a try: I split all mailboxes into the IMAPdir format, switched my IMAP daemon from pine's to bincimap, and found that it was much faster.
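Splitting an mbox into one file per message can be sketched with Python's standard `mailbox` module. This is not the tool used for the conversion described above, and it writes a plain maildir, not an IMAPdir hierarchy; the function name `mbox_to_maildir` is hypothetical.

```python
import mailbox


def mbox_to_maildir(mbox_path, maildir_path):
    """Copy every message from an mbox file into a maildir; return the count."""
    src = mailbox.mbox(mbox_path, create=False)
    dest = mailbox.Maildir(maildir_path, create=True)
    copied = 0
    try:
        for message in src:
            # Converting to MaildirMessage maps mbox Status/X-Status flags
            # (read, answered, ...) onto maildir flags where possible.
            dest.add(mailbox.MaildirMessage(message))
            copied += 1
        dest.flush()
    finally:
        src.close()
        dest.close()
    return copied
```

The source mbox is only read, never modified, so it can be kept as a backup until the new folder has been checked.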

Alexander

--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)