note
space_monk
<p>It's not a good idea to use lots of print statements to be outputting your webpage. Either use the [mod://CGI] or use HEREDOCS to output your page in a one or two statements. I prefer the latter, but I can understand using the former may allow your pages to automatically keep up with changing standards.</p>
<p>Also try and use the following when creating html:</p>
<ul>
<li>Try to consistently use lowercase elements. Uppercase is so AOL/1995 coding style.</li>
<li>Use th for table heading columns, not td</li>
<li>Use CSS for layout where possible; having a css element saying your table is 500px wide is better than having the width set with the old (deprecated?) width attribute</li>
<li>Run your page through the <a href="http://validator.w3.org/">W3C Validator</a> or a local validator program to see how standards compliant your page is</li>
</ul>
<p>Also, it is probably a good idea not to hard code the name of the logfile and the web output page into the file; these are really things that should be command line parameters so you can read any log file and output yur web page to any file name you like.</p>
<code>perl your_program <access.log >log.html</code>
<p>Your code was resetting the list of IPs on every line, whereas you want a count through the whole logfile</p>
<code>
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use POSIX;
# create this outside the loop - it doesn't change
# in fact it isn't used so why is it here at all? left in just in case
my %dates = (
'Jan' => '01',
'Feb' => '02',
'Mar' => '03',
'Apr' => '04',
'May' => '05',
'Jun' => '06',
'Jul' => '07',
'Aug' => '08',
'Sep' => '09',
'Oct' => '10',
'Nov' => '11',
'Dec' => '12',
);
my $yesterday = strftime("%d/%b/%Y",localtime(time()-86400));
my $yesterdayHits=0;
my $IPcount=0;
my $totalhits=0;
my $startDate;
my $tm = scalar(localtime);
my %ips=();
my @rows;
# read from logfile(s) supplied on command line, instead of fixed file....
foreach my $line (<>) {
$totalhits++;
# (.+) is horrible as '.' includes spaces, this is better ....
my $w = "(\S+?)";
$line =~ m/^$w $w $w \[$w:$w $w\] "$w $w $w" $w $w$/;
# could do all these as one statement, but split for readability...
my ($site, $logName, $fullName) = ($1,$2, $3);
my ($date, $time, $gmt) = ($4, $5, $6);
my ($req, $file, $proto) = ($7, $8, $9);
my ($status, $length) = ($10, $11);
$ips{$site}++;
my ($day,$month,$year)=split"\/",$date;
my $row = <<EOF;
<tr><td>$site</td><td>$line</td></tr>
EOF
push @rows,$row;
}
# Real Men use Data::Dumper :-)
foreach my $key ( sort keys %ips ) {
print STDERR $key, " => ", $ips{$key}, "\n";
}
# write to output file specified on command line instead...
print <<EOF;
<head>
<title>Access Counts</title></head>
<body>
<h1> Today is: $tm</h1>
<h3>Yesterday was $yesterday</h3>
<h3>There are $IPcount unique visitors in the log</h3>
<table BORDER=1 CELLPADDING=10 width='500px'>
<tr><th>IP</th>
<th>LOGFILE</th>
</tr>
@rows
<h2>Start Date is $startDate</h2>
<h2>Total hits: $totalhits</h2>
<h3>Hits Yesterday: $yesterdayHits</h3>
</table></p>
</body>
</html>
EOF
</code>
<!-- Node text goes above. Div tags should contain sig only -->
<div class="pmsig"><div class="pmsig-880879">
A Monk aims to give answers to those who have none, and to learn from those who know more.
</div></div>
1004276
1004276