It's not a good idea to use lots of print statements to be outputting your webpage. Either use the CGI or use HEREDOCS to output your page in a one or two statements. I prefer the latter, but I can understand using the former may allow your pages to automatically keep up with changing standards.
Also try and use the following when creating html:
- Try to consistently use lowercase elements. Uppercase is so AOL/1995 coding style.
- Use th for table heading columns, not td
- Use CSS for layout where possible; having a css element saying your table is 500px wide is better than having the width set with the old (deprecated?) width attribute
- Run your page through the W3C Validator or a local validator program to see how standards compliant your page is
Also, it is probably a good idea not to hard code the name of the logfile and the web output page into the file; these are really things that should be command line parameters so you can read any log file and output yur web page to any file name you like.
perl your_program <access.log >log.html
Your code was resetting the list of IPs on every line, whereas you want a count through the whole logfile
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use POSIX;
# create this outside the loop - it doesn't change
# in fact it isn't used so why is it here at all? left in just in case
my %dates = (
'Jan' => '01',
'Feb' => '02',
'Mar' => '03',
'Apr' => '04',
'May' => '05',
'Jun' => '06',
'Jul' => '07',
'Aug' => '08',
'Sep' => '09',
'Oct' => '10',
'Nov' => '11',
'Dec' => '12',
);
my $yesterday = strftime("%d/%b/%Y",localtime(time()-86400));
my $yesterdayHits=0;
my $IPcount=0;
my $totalhits=0;
my $startDate;
my $tm = scalar(localtime);
my %ips=();
my @rows;
# read from logfile(s) supplied on command line, instead of fixed file
+....
foreach my $line (<>) {
$totalhits++;
# (.+) is horrible as '.' includes spaces, this is better ....
my $w = "(\S+?)";
$line =~ m/^$w $w $w \[$w:$w $w\] "$w $w $w" $w $w$/;
# could do all these as one statement, but split for readability..
+.
my ($site, $logName, $fullName) = ($1,$2, $3);
my ($date, $time, $gmt) = ($4, $5, $6);
my ($req, $file, $proto) = ($7, $8, $9);
my ($status, $length) = ($10, $11);
$ips{$site}++;
my ($day,$month,$year)=split"\/",$date;
my $row = <<EOF;
<tr><td>$site</td><td>$line</td></tr>
EOF
push @rows,$row;
}
# Real Men use Data::Dumper :-)
foreach my $key ( sort keys %ips ) {
print STDERR $key, " => ", $ips{$key}, "\n";
}
# write to output file specified on command line instead...
print <<EOF;
<head>
<title>Access Counts</title></head>
<body>
<h1> Today is: $tm</h1>
<h3>Yesterday was $yesterday</h3>
<h3>There are $IPcount unique visitors in the log</h3>
<table BORDER=1 CELLPADDING=10 width='500px'>
<tr><th>IP</th>
<th>LOGFILE</th>
</tr>
@rows
<h2>Start Date is $startDate</h2>
<h2>Total hits: $totalhits</h2>
<h3>Hits Yesterday: $yesterdayHits</h3>
</table></p>
</body>
</html>
EOF
A Monk aims to give answers to those who have none, and to learn from those who know more.
-
Are you posting in the right place? Check out Where do I post X? to know for sure.
-
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big>
<blockquote> <br /> <dd>
<dl> <dt> <em> <font>
<h1> <h2> <h3> <h4>
<h5> <h6> <hr /> <i>
<li> <nbsp> <ol> <p>
<small> <strike> <strong>
<sub> <sup> <table>
<td> <th> <tr> <tt>
<u> <ul>
-
Snippets of code should be wrapped in
<code> tags not
<pre> tags. In fact, <pre>
tags should generally be avoided. If they must
be used, extreme care should be
taken to ensure that their contents do not
have long lines (<70 chars), in order to prevent
horizontal scrolling (and possible janitor
intervention).
-
Want more info? How to link
or How to display code and escape characters
are good places to start.