http://www.perlmonks.org?node_id=221155

mkent has asked for the wisdom of the Perl Monks concerning the following question:

Thanks to PhiRate, Browseruk, Sauoq and others, I now have working code, but it needs some finishing touches:

1) Specifically, the last block, "Sort and print", prints everything instead of displaying the first 10 and then providing a way to display the next 10 and scroll back to the previous 10. Any ideas on how to do this?

2) The code is a bit slow. On a log with 18,646,915 bytes it takes a little over a minute and a half. But on a big log with 22,623,798 bytes it times out after 3 minutes. Any ideas on how to make it run faster?

Thanks, and here's the code:

use strict;
use warnings;
use Date::Manip;
use CGI qw/:standard/;

# Make sure security is not compromised by calling unpathed programs.
$ENV{PATH} = "/bin:/usr/bin:/usr/local/bin:";
$ENV{IFS}  = "";

# Use CGI to print the header
print header;

# Make variables local only
my %referers = ();
my $row      = 0;
my $counter  = 0;

# Retrieve and security-check parameters
my $site   = param('site');
my $hour   = param('hour');
my $minute = param('minute');
if ($hour !~ /^\d\d?$/)   { die('Invalid hour'); }
if ($minute !~ /^\d\d?$/) { die('Invalid minute'); }

# Get date object for the checkpoint
my $check_date = ParseDate("${hour}hours ${minute}minutes ago");

# Select the server log - current 12/19/02
my $data = '';
if    ($site eq 'star')   { $data = 'indystar/access_log' }
elsif ($site eq 'topics') { $data = 'topics/access_log' }
else                      { $data = 'noblesville/access_log' }

# Create headline for web page
print "<h1>Referrers in the past $hour hours and $minute minutes</h1>";

# File handling, one line at a time; if can't open, say why
# (double quotes so that $data and $! are interpolated into the message)
open(FH, "$data") || die("Could not open $data: $!");
while (my $line = <FH>) {
    next if ($line !~ /^\S+ \S \S \[(\S+) \S+\] "[^"]+" \d+ \d+ "([^"]+)"/);
    my $line_date = ParseDate($1);

    # Check to see if the line date is in the range we're after
    next unless Date_Cmp($line_date, $check_date) > 0;

    # If the referer is new, set to 1 entry, otherwise increment
    if (!$referers{$2}) { $referers{$2} = 1; }
    else                { $referers{$2}++; }
}
close(FH);

# Sort and print
for (sort { $referers{$b} <=> $referers{$a} } keys %referers) {
    print "$_ - $referers{$_}<p>";
    unless (++$counter % 10) {
        print "Press Enter";
        <STDIN>;
    }
}

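On the paging question: a CGI script has no keyboard attached, so reading from STDIN cannot pause the output. One possible approach, sketched below, is to have each request render a single page of ten and link to its neighbours. The helper names referer_page and nav_links, the page parameter, and the demo data are all made up for illustration; in the real script the page number would presumably come from param('page') and be validated like hour and minute. Note that this re-reads the log on every page request.

```perl
use strict;
use warnings;

# Hypothetical helper: pick one page of referers, most frequent first.
# $counts is a hash ref such as \%referers; $page is zero-based.
sub referer_page {
    my ($counts, $page, $per_page) = @_;
    my @sorted = sort { $counts->{$b} <=> $counts->{$a} or $a cmp $b }
                 keys %$counts;
    return (0, 1) unless @sorted;    # nothing to show: page 0 of 1

    # Clamp the requested page into the valid range.
    my $pages = int((@sorted + $per_page - 1) / $per_page);
    $page = 0          if $page < 0;
    $page = $pages - 1 if $page > $pages - 1;

    my $first = $page * $per_page;
    my $last  = $first + $per_page - 1;
    $last = $#sorted if $last > $#sorted;
    return ($page, $pages, @sorted[$first .. $last]);
}

# Hypothetical helper: Previous/Next links that carry the form values along.
# $query is an already HTML-escaped query string without the page number.
sub nav_links {
    my ($page, $pages, $query) = @_;
    my @links;
    push @links, qq{<a href="?$query&amp;page=} . ($page - 1) . qq{">Previous 10</a>}
        if $page > 0;
    push @links, qq{<a href="?$query&amp;page=} . ($page + 1) . qq{">Next 10</a>}
        if $page < $pages - 1;
    return join ' | ', @links;
}

# Demo with made-up counts standing in for %referers.
my %demo = map { ("http://example.com/$_" => 100 - $_) } 1 .. 25;
my ($page, $pages, @rows) = referer_page(\%demo, 1, 10);
print "$_ - $demo{$_}<p>\n" for @rows;
print nav_links($page, $pages, 'site=star&amp;hour=1&amp;minute=30'), "\n";
```

Since referer strings come straight from the log, they should probably be HTML-escaped before printing; that step is left out here to keep the sketch short.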
Here's some sample log data it's reading:

66.149.65.62 - - [19/Dec/2002:09:02:59 -0500] "GET /images/email.gif HTTP/1.1" 304 - "http://www.indystar.com/print/articles/5/009542-7185-036.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; H010818)"
66.149.65.62 - - [19/Dec/2002:09:02:59 -0500] "GET /images/print.gif HTTP/1.1" 304 - "http://www.indystar.com/print/articles/5/009542-7185-036.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; H010818)"
66.149.65.62 - - [19/Dec/2002:09:02:59 -0500] "GET /images/sidelinksend2.gif HTTP/1.1" 304 - "http://www.indystar.com/print/articles/5/009542-7185-036.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; H010818)"
66.72.209.208 - - [19/Dec/2002:09:02:59 -0500] "GET /images/pics2/image-005305-3314.jpg HTTP/1.1" 304 - "http://www.indystar.com/print/articles/8/005305-9938-038.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; .NET CLR 1.0.3705; MSIECrawler)"
66.134.224.29 - - [19/Dec/2002:09:02:59 -0500] "GET /images/header_aod2_01.gif HTTP/1.1" 200 2011 "http://www.indystar.com/print/articles/6/009478-6696-040.html" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"
66.72.209.208 - - [19/Dec/2002:09:02:59 -0500] "GET /print/articles/0/005306-8900-038.html HTTP/1.1" 200 8361 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; .NET CLR 1.0.3705; MSIECrawler)"
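
On the speed question: I have not profiled the script, so this is a guess, but calling ParseDate and Date_Cmp on every log line is a likely hot spot. A possible alternative, sketched below, is to turn each timestamp into epoch seconds with the core Time::Local module and compare plain integers against a cutoff computed once. The name log_epoch and the fixed "now" value are made up for the demo; the real script would use time() and would need its regex to capture the UTC offset as well as the date.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

my %MONTH = (Jan => 0, Feb => 1, Mar => 2, Apr => 3, May => 4,  Jun => 5,
             Jul => 6, Aug => 7, Sep => 8, Oct => 9, Nov => 10, Dec => 11);

# Hypothetical helper: convert "19/Dec/2002:09:02:59 -0500" to epoch seconds.
# Returns undef (or an empty list) if the stamp does not look like that.
sub log_epoch {
    my ($stamp) = @_;
    my ($d, $mon, $y, $h, $mi, $s, $sign, $oh, $om) =
        $stamp =~ m{^(\d\d)/(\w{3})/(\d{4}):(\d\d):(\d\d):(\d\d) ([+-])(\d\d)(\d\d)$}
        or return;
    return unless exists $MONTH{$mon};

    # Offset of the logged local time from UTC, in seconds (-0500 => -18000).
    my $offset = ($oh * 3600 + $om * 60) * ($sign eq '-' ? -1 : 1);

    # timegm reads the fields as if they were UTC; removing the offset
    # converts the logged local time to real UTC epoch seconds.
    return timegm($s, $mi, $h, $d, $MONTH{$mon}, $y) - $offset;
}

# Demo: a fixed "now" stands in for time(); the cutoff is computed only once.
my $now    = log_epoch('19/Dec/2002:10:00:00 -0500');
my $cutoff = $now - (1 * 3600 + 0 * 60);    # "1 hour 0 minutes ago"

for my $stamp ('19/Dec/2002:08:59:59 -0500', '19/Dec/2002:09:02:59 -0500') {
    my $epoch = log_epoch($stamp);
    print "$stamp ", ($epoch > $cutoff ? "kept" : "skipped"), "\n";
}
```

Inside the while loop this would replace the ParseDate/Date_Cmp pair with something like: next unless log_epoch("$1 $2") > $cutoff, given a regex that captures both the date and the offset.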