PerlMonks  

Getting bots ips from apache logs.

by idle (Friar)
on Dec 01, 2008 at 16:11 UTC ( [id://727129]=CUFP )

Although most modern firewalls have built-in functions for blocking bots, sometimes that isn't enough.
So here is my basic script for analyzing Apache logs and blocking (or otherwise handling) suspicious addresses.
Feedback is appreciated.
#!/usr/bin/perl -w
use strict;
use warnings;
use POSIX qw(strftime);

my $pattern = "\"GET \/ HTTP\/";             # request index page pattern
my $httpd_log = "/var/log/httpd-access.log"; # log file
my $ok = "1000";                             # allowed connections per ip for $check_period
my $check_period = 1;                        # check period in hours
my $date = strftime("%d/%b/%Y:%H", localtime(time-$check_period*3600)); # date minus $check_period hours
my (%ips, $ip, $start);

open (LOG, $httpd_log) or die $!;
while (<LOG>) {
    next unless m/$date/ || $start;    # skipping old records
    $start=1;
    if (/^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}).*$pattern/go) {    # getting ips
        $ips{$1}++;
    }
}
close LOG;

foreach $ip (keys %ips) {
    if ($ips{$ip} >= $ok) {
        # print "$ip = $ips{$ip}\n";
        next;    # comment out this line if you want to modify firewall rules and uncomment one of the following
        #system("/sbin/pfctl -t bots -T add $ip");              # adding address to table <bots>
        #system("/sbin/ipfw table 5 add $ip");                  # adding address to table 5
        #system("/sbin/iptables -A INPUT -s $ip -j REJECT");    # adding denying rule
    }
}
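A possible variation on the log scan above: rather than waiting for the first line that matches the hour string in $date, each line's timestamp can be parsed and compared with a cutoff, so counting still works when no request happened to be logged during that particular hour. This is only a sketch, assuming the default Apache common/combined log layout with a [dd/Mon/yyyy:HH:MM:SS +ZZZZ] timestamp. The helper names clf_to_epoch and count_hits and the two sample log lines are made up for illustration and are not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::Local qw(timegm);

my %MONTH = (Jan => 0, Feb => 1, Mar => 2, Apr => 3, May => 4,  Jun => 5,
             Jul => 6, Aug => 7, Sep => 8, Oct => 9, Nov => 10, Dec => 11);

# Hypothetical helper: turn "01/Dec/2008:16:11:00 +0100" into epoch seconds.
# Returns undef when the string does not look like an Apache timestamp.
sub clf_to_epoch {
    my ($stamp) = @_;
    my ($d, $mon, $y, $h, $mi, $s, $sign, $oh, $om) =
        $stamp =~ m!^(\d{2})/(\w{3})/(\d{4}):(\d{2}):(\d{2}):(\d{2}) ([+-])(\d{2})(\d{2})$!
        or return;
    return unless exists $MONTH{$mon};
    my $offset = ($oh * 3600 + $om * 60) * ($sign eq '-' ? -1 : 1);
    # timegm treats the fields as UTC and croaks on impossible dates,
    # so guard it with eval and then subtract the logged zone offset.
    my $utc = eval { timegm($s, $mi, $h, $d, $MONTH{$mon}, $y) };
    return unless defined $utc;
    return $utc - $offset;
}

# Hypothetical helper: count requests per IP that contain $pattern and are
# not older than $cutoff (epoch seconds). $fh is any readable filehandle.
sub count_hits {
    my ($fh, $pattern, $cutoff) = @_;
    my %ips;
    while (my $line = <$fh>) {
        my ($ip, $stamp) =
            $line =~ /^(\d{1,3}(?:\.\d{1,3}){3}) \S+ \S+ \[([^\]]+)\]/
            or next;
        next unless $line =~ /\Q$pattern\E/;
        my $when = clf_to_epoch($stamp);
        next unless defined $when && $when >= $cutoff;
        $ips{$ip}++;
    }
    return \%ips;
}

# Example run against an in-memory log with two made-up lines.
my $sample = join '',
    qq{10.0.0.1 - - [01/Dec/2008:16:11:00 +0000] "GET / HTTP/1.1" 200 512\n},
    qq{10.0.0.1 - - [01/Dec/2008:14:00:00 +0000] "GET / HTTP/1.1" 200 512\n};
open my $fh, '<', \$sample or die $!;
my $hits = count_hits($fh, '"GET / HTTP/', timegm(0, 0, 16, 1, 11, 2008));
print "$_ = $hits->{$_}\n" for sort keys %$hits;
```

In a real run the filehandle would come from the access log and the cutoff would be something like time - $check_period*3600; the returned hash can then be compared against $ok just as %ips is in the script above.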

Replies are listed 'Best First'.
Re: Getting bots ips from apache logs.
by oev (Initiate) on Nov 30, 2010 at 11:26 UTC
    I slightly upgraded the script to monitor multiple logs, automatically lift the ban after a set time, send mail, and log all ban incidents:
    #!/usr/bin/perl -w
    use strict;
    use warnings;
    use Sys::Syslog qw(:DEFAULT setlogsock);
    use POSIX qw(strftime);

    my $pattern = "\"GET \/ HTTP\/";    # request index page pattern
    my @httpd_log = </var/log/httpd/domains/*.log>;
    my $ok = "1000";                    # allowed connections per ip for $check_period
    my $check_period = 1;               # check period in hours
    my $date = strftime("%d/%b/%Y:%H",localtime(time-$check_period*3600));
    my (%ips, $ip, $start);
    my (%ips_ban, $ip_ban, $time);
    my $end_time = strftime("%H", localtime(time+$check_period*3600));
    my $start_time;

    setlogsock('unix');
    openlog("http_block", 'ndelay', 'LOG_SECURITY');
    syslog("info","http_block started using ipfw \n");
    system("/sbin/ipfw table 2 flush");

    open my $pipe, "-|", "/usr/bin/tail", "-f", @httpd_log
        or die "could not start tail on file.log: $!";
    while (<$pipe>) {
        $date = strftime("%d/%b/%Y:%H",localtime(time-$check_period*3600));
        next unless m/$date/ || $start;    # skipping old records
        $start=1;
        if (/^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}).*$pattern/go) {
            $ips{$1}++;
        }
        foreach $ip (keys %ips) {
            if ($ips{$ip} >= $ok) {
                if (exists $ips_ban{$ip}) {
                    syslog("info", "BLOCKING $ip with $ips{$ip} conn not working!!!\n");
                }
                if (not exists $ips_ban{$ip}) {
                    #system("/sbin/pfctl -t bots -T add $ip");
                    #system("/sbin/iptables -A INPUT -s $ip -j REJECT");
                    system("/sbin/ipfw table 2 add $ip");
                    syslog("info", "BLOCKING $ip with $ips{$ip} conn \n");
                    system("echo 'BLOCKING $ip with $ips{$ip} conn by http_block '|mail -s 'BLOCKING $ip by http_block' root");
                }
                $ips_ban{$ip} = strftime("%H", localtime(time));
                delete $ips{$ip};
                next;
            }
        }
        $start_time = strftime("%H", localtime(time));
        if ($end_time <= $start_time) {
            foreach $ip (keys %ips_ban) {
                if ($ips_ban{$ip}+1 <= $start_time) {
                    system("/sbin/ipfw table 2 delete $ip");
                    syslog("info", "UNBLOCKED $ip \n");
                    system("echo 'UNBLOCKED $ip by http_block'|mail -s 'UNBLOCKED $ip by http_block' root");
                    delete $ips_ban{$ip};
                }
            }
            $end_time = strftime("%H", localtime(time+$check_period*3600));
        }
    };
    closelog;
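The unban logic above stores and compares hour-of-day strings from strftime("%H", ...). As far as I can tell, a ban recorded during hour 23 is later tested as 24 <= current hour, which can never be true, so such a ban would not be lifted. One possible alternative, shown only as a sketch, is to remember the epoch time at which each ban was placed and expire on elapsed seconds. The names $ban_seconds, %ban_since, ban_ip and expired_bans are hypothetical, the timestamps and addresses are made up, and the firewall call is deliberately left out so nothing here touches ipfw.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $ban_seconds = 3600;   # assumed ban length, matching a 1 hour $check_period
my %ban_since;            # ip => epoch time at which the ban was placed

# Record a ban; returns 1 only the first time an address is banned.
sub ban_ip {
    my ($ip, $now) = @_;
    return 0 if exists $ban_since{$ip};
    $ban_since{$ip} = $now;
    return 1;
}

# Remove and return every address whose ban is at least $ban_seconds old.
sub expired_bans {
    my ($now) = @_;
    my @done = grep { $now - $ban_since{$_} >= $ban_seconds }
               sort keys %ban_since;
    delete @ban_since{@done};
    return @done;
}

# Example with fixed, made-up timestamps.
my $t0 = 1_000_000_000;
ban_ip('192.0.2.10', $t0);           # first ban
ban_ip('192.0.2.20', $t0 + 1800);    # second ban, 30 minutes later
for my $ip (expired_bans($t0 + 3600)) {
    # A real script would remove the firewall entry and log the event here.
    print "UNBLOCKED $ip\n";
}
```

In the monitoring loop, ban_ip($ip, time) would replace the assignment to $ips_ban{$ip}, and expired_bans(time) would drive the "ipfw table 2 delete" and syslog calls. Because only differences of epoch seconds are compared, the result does not depend on the hour of day.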
Re: Getting bots ips from apache logs.
by Anonymous Monk on May 18, 2010 at 20:55 UTC
    Thanks

Node Type: CUFP [id://727129]
Approved by Tanktalus