
Simple (but robust) email server (receiver)

by zby (Vicar)
on Dec 16, 2008 at 14:00 UTC (#730642)
zby has asked for the wisdom of the Perl Monks concerning the following question:

Dear fellow Perl practitioners,

My current assignment is to build a simple, standalone email (SMTP) server. The plan is to feed the (processed) emails into a database - so the whole delivery part is not necessary. It looks like the current contenders in that area are Net::Server::Mail and POE::Component::Server::SimpleSMTP. Did I miss something? And which one would you advise me to choose? POE seems a bit heavy - but perhaps it is worth the cost?

Thanks in advance.

Re: Simple (but robust) email server (receiver)
by JavaFan (Canon) on Dec 16, 2008 at 14:08 UTC
    My personal preference (but that's just personal preference) is qmail. With qmail, it's trivial to pipe received emails into a different program. But qmail takes care of the nitty gritty details of accepting and queuing mail. No doubt Postfix and a whole batch of other mailservers make it easy too.

    A long time ago, I did something similar - accepting emails, and stuffing them in a database. With qmail it only took a small C program to put them into a DB2 database. With Perl, it would have been shorter. (But the DB2 manuals came with a C program which was easier to adapt than writing a Perl program from scratch.)

      With qmail, it's trivial to pipe received emails into a different program.

      This is not always a good solution though. You seem to recommend creating a perl process for every message that is received and piping the message to that process from qmail. This is vastly more resource-intensive than having a perl daemon that already runs with all required modules pre-loaded.

      Of course, you could have a perl daemon running and have qmail send the message to that, but what's the point? You need to use a mail protocol for transfer between qmail and the daemon, so it's no less work to write and you might as well leave qmail out of the process entirely.

      You also have the problem of what to do with messages your perl process can not or does not want to accept. If there's an intermediate MTA, rejecting the message will create a bounce, which may be superfluous. Running your own receiving daemon you can reject at SMTP time which means it's up to the sender to create a bounce or ignore.

      Now if you're talking about a machine that receives mail for multiple destinations and has to deliver mail to mailboxes as well as to the perl process then I agree, use a real MTA (I personally would use Exim and not qmail but YMWV). But that doesn't seem to be what the OP wants to do.


      All dogma is stupid.

        Creating a new perl process for every message is more resource-intensive in some ways, not in others.

        First off, all the SMTP-handling code is in C. That's got to be less resource-intensive than perl. Second, it's already written, most bugs worked out, and someone else will fix any new bugs found. That's less resource-intensive than writing it yourself (you are a resource, too). Third, spawning off a new perl process, compiling and then running a bunch of perl code (both a .pl and a bunch of modules) is really not that resource-intensive. The only really expensive part here is the DB connect, which you can mitigate with FastCGI or DBD::Proxy if it really becomes a problem (premature optimisation--). Oh, and by using qmail, you get the preforking done for you. No, your code isn't forked - but if multiple emails come in at the same time, qmail will kick you off multiple times, providing better scaling without any thought or design on your part. That, too, makes it less resource-intensive (again, you're a resource). And spewing the message over a pipe (purely in RAM, remember) is trivial - we're talking about copying a couple of KB around (usually). That's not resource-intensive at all, especially considering perl already does that type of work when you pass around scalars instead of references to scalars.

        Personally, I like the "use qmail or postfix" solution as a starting point, as it allows you to focus on the real work you're trying to accomplish (stuffing a database) without worrying about stuff you don't really care about (SMTP, preforking, dropping privileges, etc.). If it turns out that you need more, you can always go back and try cleaning up whatever bottleneck there really is, whether that's a db connection or it's the fork/exec overhead, or something else. But to solve performance issues you may never have is really wasting resources: you.

        This is vastly more resource-intensive than having a perl daemon that already runs with all required modules pre-loaded.
        If that were the case, all the setups that use spamc to talk to spamd (client and daemon of spamassassin) would come to a grinding halt. But that's not true. A lightweight program talking to a daemon doesn't have to be resource-intensive. If it's run very often, it'll be in memory anyway. ;-)
        Of course, you could have a perl daemon running and have qmail send the message to that, but what's the point? You need to use a mail protocol for transfer between qmail and the daemon, so it's no less work to write and you might as well leave qmail out of the process entirely.
        Why on earth would you need SMTP to talk to such a daemon? That's what your MTA is for. Your MTA can also take care of queuing - you may get bursts of mail faster than your database can store them. Or your database may be down, and you still want to accept mail.
        You also have the problem of what to do with messages your perl process can not or does not want to accept. If there's an intermediate MTA, rejecting the message will create a bounce, which may be superfluous. Running your own receiving daemon you can reject at SMTP time which means it's up to the sender to create a bounce or ignore.
        Whether or not the MTA will create a bounce depends on how you configure the MTA, or on the exit status of the delivery program. I can't answer whether there should be a bounce or not - that's up to the OP. I often set qmail systems up in such a way that they never bounce. They accept anything, and just deliver to /dev/null instead of bouncing. There's already enough spam. ;-)

        Anyway, as I said, it's my personal preference. I like qmail, it does its job well and efficiently. It's really easy to configure, and I haven't encountered the resource problems you refer to in the beginning of your post.
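To make the "exit status of the delivery program" point concrete, here is a minimal sketch of a program that qmail could pipe a message into. It assumes the usual qmail conventions for program deliveries (0 for success, 100 for a permanent failure that bounces, 111 for a temporary failure that is retried later). The `deliver` function, the constant names, the storage callback and the choice to treat an empty message as a permanent failure are all hypothetical illustrations, not anything taken from the posts above.

```perl
use strict;
use warnings;

# Exit codes as qmail interprets them for a program delivery
# (the constant names themselves are made up for this sketch).
use constant {
    EX_OK        => 0,      # delivered
    EX_HARD_FAIL => 100,    # permanent failure: the message bounces
    EX_SOFT_FAIL => 111,    # temporary failure: stays queued, retried later
};

# Read one message from $in and hand it to the $store callback.
# $store stands in for the real "process and write to the database" step.
sub deliver {
    my ($in, $store) = @_;
    my $message = do { local $/; <$in> };    # slurp the whole message

    # Assumption: an empty message is not worth retrying.
    return EX_HARD_FAIL unless defined $message && length $message;

    # If storing dies (database down, say), ask the MTA to retry later.
    return EX_SOFT_FAIL unless eval { $store->($message); 1 };
    return EX_OK;
}

# Demonstration with an in-memory handle instead of STDIN.
my $raw = "From: sender\@example.com\nSubject: test\n\nhello\n";
open my $fh, '<', \$raw or die "cannot open in-memory handle: $!";
my @stored;
my $code = deliver($fh, sub { push @stored, $_[0] });
print "delivery program would exit with $code\n";

# In a real delivery script the last line would instead be something like:
#   exit deliver(\*STDIN, \&store_in_database);   # hypothetical callback
```

Such a script would typically be wired up from a `.qmail` file with a line starting with `|` followed by the command. Returning the soft-failure code while the database is down is what lets the MTA's queue absorb the outage.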

      I used to be a big proponent of qmail as a general mail processing solution (I administered a fairly active, but not huge, installation on a campus in the early to mid 90s), even writing a delayed mail notifier for it (which I no longer recommend using in the general case due to the spamminess of the entire concept of DSN). Since that time, the network landscape has changed, and the default delivery method that qmail uses could be abusive.

      For a specific end solution, where you don't use it as an outgoing delivery agent, I can still see it being a very good solution. However, tirwhan makes a very good point about resource usage. If you are aware of this, and deal with the local and remote resource piggyness that qmail can exhibit, it can be a good solution.

      --MidLifeXis

      My personal preference is Postfix (for lesser patching ;-)), but choose the MTA you know/like best (or for which you get the best support in your environment), be it exim, postfix, qmail, or even sendmail (courier-mta I would not recommend ...). I would prefer handing the mail over to a full-blown MTA, if available (with filtering mechanisms like spamassassin attached etc.)

      There are enough spam sources out there, so be sure to validate your code ...
      Just my 2 ct.
      hth MH
Re: Simple (but robust) email server (receiver)
by tirwhan (Abbot) on Dec 16, 2008 at 14:26 UTC

    I can heartily recommend Net::Server::Mail in conjunction with Net::Server::PreFork for this purpose. I've found this to be a very stable and performant solution (code I wrote for a client using this combination handles thousands of messages a second in production and runs stably without any need for maintenance apart from configuration).

    Question though, are you really planning on storing the whole email in a database? There are seldom (IME) good reasons for doing that.

    Update: Question answered :-)


    All dogma is stupid.
      Thanks. And to answer your question - not exactly. The emails will be processed and only some results will be stored.
Re: Simple (but robust) email server (receiver)
by samtregar (Abbot) on Dec 16, 2008 at 18:56 UTC
    I've used Net::Server::Mail, but only as part of test scripts for sending email. Have you considered letting another system handle receiving the email? Then you could just write your code using a POP or IMAP client library to fetch new mail and process it. Easier and more reliable, particularly if it means you don't have to run a mail server at all, for example by setting up the target address at GMail. I did that recently for an automated responder app and it worked great.

    -sam

Re: Simple (but robust) email server (receiver)
by skx (Parson) on Dec 16, 2008 at 22:05 UTC

    qpsmtpd is a Perl SMTP server, which is very extensible and based upon plugins.

    It would be trivial to set it up to receive mails and process them.

    I do use it on my commercial spam filtering site, and it is a real pleasure to work with.

    There is a simple introduction here, and more via the homepage and your favourite search engine.

    Steve
    --
      Thanks - it looks interesting. One question about the plugins - how do you maintain a database connection persistent for a worker process? The docs only say:
      There is no restriction, what you can do in “register()”, but creating database connections and reuse them later in the process may not be a good idea. This initialisation happens before any “fork()” is done. Therefore the file handle will be shared by all qpsmtpd processes and the database will probably be confused if several different queries arrive on the same file handle at the same time (and you may get the wrong answer, if any).
      So is there a way to initialize the connection just after the fork - and before any request is served?

        I don't maintain a database handle throughout the transaction - instead I use the connection object to make notes as the SMTP transaction is completed.

        e.g.

        sub hook_helo {
            my ( $self, $transaction, $host ) = @_;

            #
            # Make sure helo includes a domain
            #
            if ( $host !~ /\./ ) {
                $self->log( LOGWARN, "HELO $host doesn't contain a period." );
                $transaction->notes( "reject", 1 );
                $transaction->notes( "reason", "invalid helo" );
            }
            return DECLINED;
        }

        Then I have a series of plugins which do different things at the last step, either forward the message or reject it, but archive a searchable copy for the recipient's benefit. Here's a simplified version of the reject + archive plugin:

        sub hook_queue {
            my ( $self, $transaction ) = @_;

            #
            # We only log mails which have been rejected.
            #
            if ( 0 == ( $transaction->notes("reject") || 0 ) ) {
                return DECLINED;
            }

            # connect to DB
            # archive message
            # disconnect

            return ( DECLINED,
                "Rejected this is spam: " . $transaction->notes("reason") );
        }

        (Actually this is a polite fiction. I archive messages to local disk if they were to be rejected, then later rsync them to a central location - and import them to MySQL there.)

        Steve
        --
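On the earlier question about initializing a connection after the fork: one common workaround, sketched here as an illustration rather than as anything qpsmtpd itself provides, is to open the connection lazily on first use and remember which process opened it. A worker that finds a cached handle belonging to a different PID opens its own. The function name `handle_for_this_process` and the `$connect` callback are hypothetical, and a plain hash stands in for a real database handle.

```perl
use strict;
use warnings;

{
    my ($handle, $owner_pid);    # cached per process, not shared logic

    # Return a handle that was opened by the current process.
    # $connect is a code ref that opens a fresh connection; with DBI it
    # would wrap DBI->connect (assumption: not shown or tested here).
    sub handle_for_this_process {
        my ($connect) = @_;
        if ( !defined $handle || $owner_pid != $$ ) {
            $handle    = $connect->();
            $owner_pid = $$;
        }
        return $handle;
    }
}

# Demonstration with a stand-in "connection" that records its opener's PID.
my $opened  = 0;
my $connect = sub { $opened++; return { pid => $$ } };

my $first  = handle_for_this_process($connect);
my $second = handle_for_this_process($connect);    # reuses the cached handle

my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ( $pid == 0 ) {
    # The child sees the parent's cached handle, notices the PID
    # mismatch, and opens its own connection instead.
    my $h = handle_for_this_process($connect);
    exit( $h->{pid} == $$ ? 0 : 1 );
}
waitpid( $pid, 0 );

print "connections opened in parent: $opened\n";
print "child opened its own: ", ( $? == 0 ? "yes" : "no" ), "\n";
```

In a plugin, the hooks would call such a function instead of connecting in `register()`, so nothing is opened before the fork. If a real DBI handle ever were inherited from a parent, DBI's `InactiveDestroy` attribute is worth reading about before discarding it in the child.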
