http://www.perlmonks.org?node_id=219471

mkent has asked for the wisdom of the Perl Monks concerning the following question:

Hey, guys, thanks!!! This is a wonderful resource, and I incorporated some suggestions into the revised script below. I still have some questions, though!

BrowserUk, I decided against using Date::Manip even though I really like that module. Its documentation warns that it is slower than other date modules, and this script will be used most often when the web server is overloaded with requests, so speed is essential.

Abigail-II, a database would be nice, but the server is producing regular logs, so that's what I have to use.

In the following script, here are my questions:

1) Using strict produces errors saying that I don't have a global module loaded; what module is that?

2) The simulated switch statement for $month doesn't work as expected; instead of values 0 through 11, it gives everything a value of 1. I need the month as a number so that timelocal is accurate.

3) At the end, I push all the referrers onto an array. What I need is a count for each unique referrer URL, so that www.you.com is counted x times and www.me.com is counted y times. Then I can report the top referrers for the time period entered on the web page (which just takes hours and minutes). That would let me create output like:
www.you.com 22
www.me.com 19
etc.
How can I count values I don't know in advance and produce this output?
And is an array the best way to do it?
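
On question 3, a hash keyed by referrer is the usual Perl idiom for counting values you don't know in advance; an array works for collecting, but the counting step then needs a hash anyway. A minimal sketch, assuming the referrers are plain strings; the sample data and the `%count` and `@ranked` names are made up for illustration and are not part of the posted script.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the @refers array built by the script.
my @refers = (
    'www.you.com', 'www.me.com', 'www.you.com',
    'www.you.com', 'www.me.com', 'www.them.com',
);

# Count each distinct referrer; a hash entry springs into existence on first use.
my %count;
$count{$_}++ for @refers;

# Sort by descending count, breaking ties alphabetically for stable output.
my @ranked = sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;

# One "referrer count" pair per line, top referrer first.
print "$_ $count{$_}\n" for @ranked;
```

In the posted script, `$count{$referrer}++` in place of the `push` would build the same hash directly, without the intermediate array.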

Any and all ideas welcome, and thanks in advance. I really appreciate the help!

Here's the script, followed by some raw log data:

<code>
#!/usr/local/bin/perl
#use strict;
use CGI qw(:standard);
use CGI::Carp qw(fatalsToBrowser carpout);
use Time::Local;

# Grab information returned by web page
$hour = param ("hour");
$minute = param ("minute");

# Allow perl to write to browser window
print "Content-type: text/html\n\n";

# Current time in seconds
$now = time;

# Convert submitted time to seconds
$compare_time = ($hour * 3600) + ($minute * 60);

# Times extracted by logs must be >= to $target
$target = $now - $compare_time;

open LOGFILE, "datafile.html" || die "Can't open file";
@log_data =<LOGFILE>;

# Grab useful information from each line of the web log
foreach $log_line(@log_data) {

    # Grab date/time and referer
    ($date_string, $referrer) = ($log_line =~ /\[([^\]]+)\] "[^"]+"[^"]+"([^"]+)"/);

    # Replace / and : with spaces
    $date_string =~ s!/! !g;
    $date_string =~ s!:! !g;

    # Dump junk at end of line
    $date_string =~ s! -[0-9]+!!;

    # Split date/time into useful information
    ($day, $month, $year, $hhour, $min, $sec) = split(' ', $date_string);

    # Convert month from text to number
    if ($month == 'Jan') {$month = 0}
    elsif ($month == 'Feb') {$month = 1}
    elsif ($month == 'Mar') {$month = 2}
    elsif ($month == 'Apr') {$month = 3}
    elsif ($month == 'May') {$month = 4}
    elsif ($month == 'Jun') {$month = 5}
    elsif ($month == 'Jul') {$month = 6}
    elsif ($month == 'Aug') {$month = 7}
    elsif ($month == 'Sep') {$month = 8}
    elsif ($month == 'Oct') {$month = 9}
    elsif ($month == 'Nov') {$month = 10}
    else {$month = 11}

    # Calculate time on the log line in seconds
    $log_time = timelocal($sec,$min,$hhour,$day,$month,$year);

    if ($log_time >= $target) {
        push @refers, $referrer;
    }
}
</code>

Some data:

<pre>
216.45.43.42 - - [12/Dec/2002:18:39:15 -0500] "GET /news/opinions/varvel.gif HTTP/1.1" 302 313 "http://www.freerepublic.com/forum/a3a95ca3c24a0.htm" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"
12.222.75.65 - - [12/Dec/2002:18:39:15 -0500] "GET /images/header_aod2_15.gif HTTP/1.1" 200 4162 "http://www.indystar.com/print/articles/1/007735-7671-036.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; MSOCD; Q312461; YComp 5.0.0.0; .NET CLR 1.0.3705)"
12.222.75.65 - - [12/Dec/2002:18:39:15 -0500] "GET /images/storysearch2.gif HTTP/1.1" 200 142 "http://www.indystar.com/print/articles/1/007735-7671-036.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; MSOCD; Q312461; YComp 5.0.0.0; .NET CLR 1.0.3705)"
12.222.75.65 - - [12/Dec/2002:18:39:15 -0500] "GET /users/ads/misc/remax_searchad3.gif HTTP/1.1" 200 2335 "http://www.indystar.com/print/articles/1/007735-7671-036.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; MSOCD; Q312461; YComp 5.0.0.0; .NET CLR 1.0.3705)"
12.222.75.65 - - [12/Dec/2002:18:39:16 -0500] "GET /images/sports_03_aod.gif HTTP/1.1" 200 3195 "http://www.indystar.com/print/articles/1/007735-7671-036.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; MSOCD; Q312461; YComp 5.0.0.0; .NET CLR 1.0.3705)"
12.222.75.65 - - [12/Dec/2002:18:39:16 -0500] "GET /images/email.gif HTTP/1.1" 200 138 "http://www.indystar.com/print/articles/1/007735-7671-036.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; MSOCD; Q312461; YComp 5.0.0.0; .NET CLR 1.0.3705)"
12.222.75.65 - - [12/Dec/2002:18:39:16 -0500] "GET /images/print.gif HTTP/1.1" 200 139 "http://www.indystar.com/print/articles/1/007735-7671-036.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; MSOCD; Q312461; YComp 5.0.0.0; .NET CLR 1.0.3705)"
12.222.75.65 - - [12/Dec/2002:18:39:16 -0500] "GET /images/sidelinksend2.gif HTTP/1.1" 200 1009 "http://www.indystar.com/print/articles/1/007735-7671-036.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; MSOCD; Q312461; YComp 5.0.0.0; .NET CLR 1.0.3705)"
12.222.75.65 - - [12/Dec/2002:18:39:16 -0500] "GET /images/pics2/image-007735-7410.jpg HTTP/1.1" 200 18319 "http://www.indystar.com/print/articles/1/007735-7671-036.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; MSOCD; Q312461; YComp 5.0.0.0; .NET CLR 1.0.3705)"
12.222.75.65 - - [12/Dec/2002:18:39:16 -0500] "GET /images/advertisement_250strip.gif HTTP/1.1" 200 238 "http://www.indystar.com/print/articles/1/007735-7671-036.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; MSOCD; Q312461; YComp 5.0.0.0; .NET CLR 1.0.3705)"
12.222.75.65 - - [12/Dec/2002:18:39:17 -0500] "GET /users/ads/story/macselect/macselect_250_Oct.gif HTTP/1.1" 200 10436 "http://www.indystar.com/print/articles/1/007735-7671-036.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; MSOCD; Q312461; YComp 5.0.0.0; .NET CLR 1.0.3705)"
</pre>

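On questions 1 and 2, two things may be worth checking. If the strict messages read like 'Global symbol "$hour" requires explicit package name', they are about undeclared variables rather than a missing module, and declaring each variable with `my` is the usual fix. And `==` compares numerically, so two month names both evaluate to 0 and the first test succeeds for every line; `eq` is the string comparison. A hash lookup sidesteps the whole if/elsif chain. The following is only a sketch: `%month_num` and `parse_log_line` are hypothetical names, and the sample line is the first line of the data, assuming the timestamp sits in square brackets as in the Apache combined log format.

```perl
use strict;
use warnings;
use Time::Local;

# Month abbreviation to the 0-based month number that timelocal expects.
my %month_num;
@month_num{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = (0 .. 11);

# Hypothetical helper: pull the timestamp and referrer out of one log line.
# Returns (epoch_seconds, referrer), or nothing if the line does not match.
sub parse_log_line {
    my ($line) = @_;
    my ($day, $mon, $year, $hh, $mm, $ss, $referrer) = $line =~
        m{\[(\d+)/(\w+)/(\d+):(\d+):(\d+):(\d+) [^\]]*\] "[^"]*" \d+ \S+ "([^"]*)"}
        or return;
    return unless exists $month_num{$mon};
    # The UTC offset in the log is ignored; timelocal uses the server's time zone.
    return (timelocal($ss, $mm, $hh, $day, $month_num{$mon}, $year), $referrer);
}

my $sample = '216.45.43.42 - - [12/Dec/2002:18:39:15 -0500] '
           . '"GET /news/opinions/varvel.gif HTTP/1.1" 302 313 '
           . '"http://www.freerepublic.com/forum/a3a95ca3c24a0.htm" "Mozilla/4.0"';

my ($log_time, $referrer) = parse_log_line($sample);
print "$referrer at ", scalar localtime($log_time), "\n";
```

Matching the date fields directly in one regex also removes the need for the three substitutions and the split, which may help a little when the server is busy.
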
edited: Fri Dec 13 16:15:35 2002 by jeffa - added <readmore>