http://www.perlmonks.org?node_id=529588

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

As suggested by a monk a few days past, to help put a stop to these bandwidth whores that are scraping my site for images, I was told to look at modperl http://www.modperl.com/book/chapters/ch6.html#Blocking_Greedy_Clients.

The code below is what they say I should use. I understand the code, but not where to use it. This is a module, is it not? Do I just save this as a .pm and use it in my .htaccess file? Or how do I go about getting this code to work?

package Apache::SpeedLimit; # file: Apache/SpeedLimit.pm use strict; use Apache::Constants qw(:common); use Apache::Log (); use IPC::Shareable (); use vars qw(%DB); sub handler { my $r = shift; return DECLINED unless $r->is_main; # don't handle sub-requests my $speed_limit = $r->dir_config('SpeedLimit') || 10; # Accesses +per minute my $samples = $r->dir_config('SpeedSamples') || 10; # Sampling +threshold (hits) my $forgive = $r->dir_config('SpeedForgive') || 20; # Forgive a +fter this period (minutes) my $expire = $r->dir_config('SpeedExpire') || 40; # Expire unus +ed # records from memory after this period (mi +nutes) my $content_type = $r->lookup_uri($r->uri)->content_type; return OK if $content_type =~ m:^image/:i; # ignore images tie %DB, 'IPC::Shareable', 'SPLM', {create => 1, mode => 0644} unless defined %DB; my($ip, $agent) = ($r->connection->remote_ip, $r->header_in('User +-Agent')); my $id = "$ip:$agent"; my $now = time()/60; # minutes since the epoch # lock the shared memory while we work with it tied(%DB)->shlock; my($first, $last, $hits, $locked) = split ' ', $DB{$id}; my $result = OK; my $l = $r->server->log; CASE: { unless ($first) { # we're seeing this client for the first tim +e $l->debug("First request from $ip. Initializing speed cou +nter."); $first = $last = $now; $hits = $locked = 0; last CASE; } if ($now - $last > $forgive) { # beyond the grace period. Tre +at like first $l->debug("$ip beyond grace period. Reinitializing speed +counter."); $last = $first = $now; $hits = $locked = 0; last CASE; } # update the values now $last = $now; $hits++; if ($hits < $samples) { $l->debug("$ip not enough samples to calculate speed."); last CASE; } if ($locked) { # already locked out, so forbid access $l->debug("$ip locked"); $result = FORBIDDEN; last CASE; } my $interval = $now - $first; $l->debug("$ip speed = ", $hits/$interval); if ($hits/$interval > $speed_limit) { $l->debug("$ip exceeded speed limit. Blocking."); $locked = 1; $result = FORBIDDEN; last CASE; } } $r->log_reason("Client exceeded speed limit.", $r->filename) if $result == FORBIDDEN; $DB{$id} = join " ", $first, $now, $hits, $locked; foreach my $key (keys %DB) { my ($first, $last, $hits, $locked) = split(' ', $DB{$key}); if ($now - $last > $expire) { delete $DB{$key}; } } tied(%DB)->shunlock; return $result; } 1; __END__

Replies are listed 'Best First'.
Re: mod_perl blocking greedy clients
by randyk (Parson) on Feb 11, 2006 at 20:59 UTC
    Just above the listing where you got the source for the module, there's a sample set of Apache configuration directives on how to use it:
    <Location /> PerlAccessHandler Apache::SpeedLimit # max 20 accesses/minute PerlSetVar SpeedLimit 20 # 5 hits before doing statistics PerlSetVar SpeedSamples 5 # amnesty after 30 minutes PerlSetVar SpeedForgive 30 </Location>
      I saw that code from the few lines above but it doesn't say where to apply it. What configuration file? Sorry, I'm lost.

        Apache configuration directives traditionally go in the Apache configuration file, unsurprisingly.

        I recommend you understand the code you intend on using and read the Apache documentation before implementing any changes.


        If the information in this post is inaccurate, or just plain wrong, don't just downvote - please post explaining what's wrong.
        That way everyone learns.

Re: mod_perl blocking greedy clients
by brian_d_foy (Abbot) on Feb 11, 2006 at 23:19 UTC