Re: Limit submissions over time?

Well, here are some of the challenges you'll face if you wish to limit how many times a particular individual is able to send you messages (in no particular order):

You cannot rely on environment variables to check IPs or domains. In some cases many users will appear to come from the same IP or domain. In other cases, some users' info simply won't be available. In still other cases, the info that is available can be spoofed or otherwise wrong. So rule CGI environment variables out as a means of 'authentication'.
You can't rely on cookies, unless you require that a cookie be present before a mail message can be sent. The cookie could contain an MD5 hash as an identifier that you keep track of for some period of time. This method would work, but would prevent access for folks who have cookies turned off.
You could require a login, but that means maintaining user lists, which adds complexity and might be inconvenient enough that people won't send a message in the first place.
Even if you do prevent an individual from posting multiple times, you may still be leaving the door open to a many-source DoS attack, where a large number of "bad" machines gang up on you at once.
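
To make the cookie idea above a bit more concrete, here is a minimal sketch. It assumes you hand out an MD5-based token in a cookie and count sends per token. The names (`issue_token`, `allow_send`), the limits, and the in-memory hash are all made up for illustration; a real CGI script would need a file or database, since separate requests don't share memory.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Assumed limits: at most $MAX_SENDS messages per token per $WINDOW seconds.
my $MAX_SENDS = 3;
my $WINDOW    = 3600;

# token => [ timestamps of accepted sends ]. Illustration only: a real
# script would persist this outside the process.
my %sends;

# Create a token to place in a cookie and start tracking it.
# Note: an MD5 of time/pid/rand is an identifier, not a strong secret.
sub issue_token {
    my $token = md5_hex( join ':', time, $$, rand );
    $sends{$token} = [];
    return $token;
}

# Return 1 (and record the send) if the token is known and under its
# limit, otherwise 0. $now is optional and defaults to the current time.
sub allow_send {
    my ( $token, $now ) = @_;
    $now = time unless defined $now;
    return 0 unless defined $token && exists $sends{$token};
    my $log = $sends{$token};
    @$log = grep { $_ > $now - $WINDOW } @$log;    # forget expired sends
    return 0 if @$log >= $MAX_SENDS;
    push @$log, $now;
    return 1;
}
```

A request with no cookie, or with a token you never issued, is simply refused, which is exactly the trade-off mentioned above for people who have cookies turned off.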

Every practical and reliable means of preventing abuse has trade-offs manifesting as reduced convenience and/or reduced compatibility for the end users, while at the same time increasing complexity for your script.

At the very least, you probably ought to look into the CGI::Session module, which could facilitate adding session management to your script. You might also find it helpful to buy, borrow, or check out from the library a copy of "CGI Programming with Perl" (O'Reilly & Associates), 2nd edition. It dedicates a lot of discussion to subjects such as email and session management. It's a good read, IMHO. Also, don't do mail by hand. Use a module such as MIME::Lite, for its simplicity, reliability, and robustness.
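
As a rough illustration of what "don't do mail by hand" looks like with MIME::Lite, here is a small sketch. The addresses, subject, and the `build_message` helper are placeholders of mine, not anything from the original question, and the actual send is left commented out.

```perl
use strict;
use warnings;
use MIME::Lite;

# Build a plain-text message object from named arguments.
# build_message() is a hypothetical helper for this example.
sub build_message {
    my (%args) = @_;
    return MIME::Lite->new(
        From    => $args{from},
        To      => $args{to},
        Subject => $args{subject},
        Type    => 'text/plain',
        Data    => $args{body},
    );
}

my $msg = build_message(
    from    => 'webform@example.com',     # placeholder address
    to      => 'owner@example.com',       # placeholder address
    subject => 'Feedback from the contact form',
    body    => "Someone filled in the form.\n",
);

# $msg->send;    # uncomment to hand the message to the local mailer
print $msg->as_string;
```

The module takes care of headers and encoding, so the script only has to supply the fields.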

Dave

Re^2: Limit submissions over time? by jhourcle (Prior) on Jun 18, 2006 at 12:55 UTC
Well, you can rely on environment variables to check the IP of the machine connecting to the server, as it's set by your local webserver (assuming you trust your local webserver, that is). Yes, there are issues, but I don't think it's worth ruling them out -- for authentication, yes, but they can still be used for authorization, if you know where the problems are.

REMOTE_ADDR is very reliable. The problem is that it might not be the IP of the machine that the person is actually connecting from. Many proxies will also set X-Forwarded-For (which a CGI script sees as HTTP_X_FORWARDED_FOR), but they're not required to, and those IP addresses aren't necessarily routable, which means that a collision in non-routable space may not be a real collision when the requests come through different proxy servers.

If you're just looking for _some_ sort of rate throttling (i.e., better than nothing at all), I'd use a combination of REMOTE_ADDR and HTTP_X_FORWARDED_FOR. I'd probably not worry about the issues with non-routable collisions, and keep track of the following:

```perl
if ( defined $ENV{'HTTP_X_FORWARDED_FOR'} ) {
    # track the forwarded address on its own, and per proxy
    &track( ':' . $ENV{'HTTP_X_FORWARDED_FOR'} );
    &track( $ENV{'REMOTE_ADDR'} . ':' . $ENV{'HTTP_X_FORWARDED_FOR'} );
}
else {
    &track( $ENV{'REMOTE_ADDR'} );
}
```

(Specific tracking code depends on what you're planning, how much memory you have, and what other resources, e.g. a database, you have available.)

Now, let's look at the flaw in my plan -- anyone can send whatever they want in X-Forwarded-For, which would suggest they're a proxy server, and you'd not be rate limiting them if they put something random in it. (It's possible that the original poster would want to rate limit proxy servers at some smaller interval, just to keep the 10,000 possibility down.)

Personally, I'd just impose extra sleep for those times of collisions in the case of a proxy -- if you slow it down to one every 30 seconds or so, it makes it less likely that it'll get abused. (And remember that in whatever tracking system you're using, log at the time that the request comes in, but set the timestamp to the time that it's expected to run, so if something else comes in while it's sleeping, it won't just wait $time, it'll wait $time past the current one finishing.)

Just remember -- anything you can do will never make spamming impossible. You just need to make things harder on the spammer so they'll try somewhere else -- hopefully, without imposing too much of a burden on your legitimate users.
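
The "set the timestamp to the time it's expected to run" idea might look something like the sketch below. This is only one possible shape for a `track` routine: the 30-second gap, the in-memory `%next_slot` hash, and the return value (seconds to sleep) are my assumptions, and real code would need storage shared between requests.

```perl
use strict;
use warnings;

my $MIN_GAP = 30;    # assumed spacing, in seconds, between sends per key
my %next_slot;       # key => earliest time the next message may run

# Record a request for $key arriving at $now and return how many seconds
# the caller should sleep before sending. Each request reserves the slot
# after the previous one, so delays stack up instead of overlapping.
sub track {
    my ( $key, $now ) = @_;
    $now = time unless defined $now;
    my $slot   = $next_slot{$key};
    my $run_at = ( defined $slot && $slot > $now ) ? $slot : $now;
    $next_slot{$key} = $run_at + $MIN_GAP;
    return $run_at - $now;
}
```

With these numbers, two requests for the same key arriving in the same second get delays of 0 and 30 seconds, and a third arriving ten seconds later waits 50 seconds rather than 30, because it queues behind the one that is still sleeping.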
Re^3: Limit submissions over time? by davido (Cardinal) on Jun 18, 2006 at 15:35 UTC
You know, another strategy might be to do some sort of a diff calculation on incoming mail, and if a message appears to be, within a certain tolerance for error, approximately equal to one of the past five messages you received, block it. This could be made even more secure if you also implement session management (CGI::Session, for example), and even more secure if you also require logins. But again, each level of additional protection means additional assumptions about the end user, and/or additional hoops for the end user to jump through.

Dave
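
One way the "approximately equal to one of the past five messages" check could be sketched is shown below. Note that this is not a true diff: it swaps in a simple word-overlap (Jaccard) score, which is easy to do with core Perl. The 0.8 threshold and the function names are assumptions chosen for the example.

```perl
use strict;
use warnings;

my $THRESHOLD = 0.8;    # assumed tolerance: 1.0 means identical word sets
my $KEEP      = 5;      # compare against this many recent messages
my @recent;             # word sets of the most recent messages

# Reduce a message to the set of lowercased words it contains.
sub word_set {
    my ($text) = @_;
    my %set = map { $_ => 1 } grep { length } split /\W+/, lc $text;
    return \%set;
}

# Jaccard similarity: shared words divided by total distinct words.
sub similarity {
    my ( $x, $y ) = @_;
    my %union = ( %$x, %$y );
    return 1 unless %union;    # two empty messages count as identical
    my $shared = grep { exists $y->{$_} } keys %$x;
    return $shared / scalar keys %union;
}

# Return 1 if $text looks like one of the last $KEEP messages, else 0.
# The message is remembered either way.
sub is_duplicate {
    my ($text) = @_;
    my $set = word_set($text);
    my $dup = grep { similarity( $set, $_ ) >= $THRESHOLD } @recent;
    push @recent, $set;
    shift @recent while @recent > $KEEP;
    return $dup ? 1 : 0;
}
```

A word-set comparison ignores word order and punctuation, so it catches lightly edited repeats but could also flag short legitimate messages that happen to share most of their words.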

