PerlMonks  

It's not a good idea to output your webpage with lots of print statements. Either use the CGI module or use heredocs to output your page in one or two statements. I prefer the latter, but I can understand that using the former may allow your pages to keep up automatically with changing standards.
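As a minimal sketch of the heredoc approach (the helper name render_page and its arguments are made up for illustration, not taken from the original script), the whole page can be built in one statement and printed once:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build the whole page with a single heredoc
# instead of a long run of print statements.
sub render_page {
    my ($title, $body) = @_;
    return <<"EOF";
<html>
<head><title>$title</title></head>
<body>
<h1>$title</h1>
$body
</body>
</html>
EOF
}

# One print for the entire page.
print render_page('Access Counts', '<p>No hits yet.</p>');
```

Variables interpolate inside the heredoc just as they do in a double-quoted string, so the markup stays readable as markup.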

Also try to do the following when creating HTML:

  • Use lowercase element names consistently. Uppercase is so AOL/1995 coding style.
  • Use th for table heading cells, not td
  • Use CSS for layout where possible; a CSS rule saying your table is 500px wide is better than setting the width with the old (deprecated?) width attribute
  • Run your page through the W3C Validator or a local validator program to see how standards-compliant your page is
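To make the markup points concrete, here is a small sketch; the helper name counts_table, the class name hits and the sample addresses are assumptions for illustration only. It emits lowercase elements, th for the heading cells, and a CSS rule for the width:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn a hash of IP => hit count into an HTML table.
sub counts_table {
    my ($counts) = @_;

    # One row per IP, sorted so the output is stable.
    my $rows = join '',
        map { sprintf "<tr><td>%s</td><td>%d</td></tr>\n", $_, $counts->{$_} }
        sort keys %$counts;

    # Width, border and padding live in CSS, not in table attributes.
    return <<"EOF";
<style>
table.hits { width: 500px; border-collapse: collapse; }
table.hits th, table.hits td { border: 1px solid #000; padding: 10px; }
</style>
<table class="hits">
<tr><th>IP</th><th>Hits</th></tr>
$rows</table>
EOF
}

print counts_table({ '10.0.0.1' => 3, '10.0.0.2' => 1 });
```

In a real page the style block would normally go in the head or in a separate stylesheet; it is inlined here only to keep the sketch in one piece.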

Also, it is probably a good idea not to hard-code the names of the logfile and the web output page in the script; these really should be supplied on the command line, so you can read any log file and write your web page to any file name you like. With shell redirection that looks like this:

perl your_program <access.log >log.html

Your code was resetting the list of IPs on every line, whereas you want a count across the whole logfile.
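The fix boils down to declaring the hash once, before the loop. A minimal sketch of just that point follows; the helper name count_ips and the sample log lines are invented for illustration, and an in-memory filehandle stands in for the real log:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count hits per client address from a filehandle.
sub count_ips {
    my ($fh) = @_;
    my %ips;    # declared once, before the loop, so counts accumulate
    while (my $line = <$fh>) {
        # The first whitespace-delimited field is the client address;
        # skip lines that have none (e.g. blank lines).
        my ($ip) = $line =~ /^(\S+)/ or next;
        $ips{$ip}++;
    }
    return \%ips;
}

# Made-up sample data, read through an in-memory filehandle.
my $sample = <<'EOF';
10.0.0.1 - - [18/Apr/2024:10:00:00 +0000] "GET / HTTP/1.1" 200 512
10.0.0.2 - - [18/Apr/2024:10:00:05 +0000] "GET /a HTTP/1.1" 200 128
10.0.0.1 - - [18/Apr/2024:10:01:00 +0000] "GET /b HTTP/1.1" 404 0
EOF
open my $fh, '<', \$sample or die "cannot open in-memory log: $!";

my $ips = count_ips($fh);
printf "%s => %d\n", $_, $ips->{$_} for sort keys %$ips;
printf "%d unique visitors\n", scalar keys %$ips;
```

Had the hash been declared inside the loop, every line would start from an empty hash and the unique visitor count could never exceed one.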

#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use POSIX;

# create this outside the loop - it doesn't change
# in fact it isn't used so why is it here at all? left in just in case
my %dates = (
    'Jan' => '01', 'Feb' => '02', 'Mar' => '03', 'Apr' => '04',
    'May' => '05', 'Jun' => '06', 'Jul' => '07', 'Aug' => '08',
    'Sep' => '09', 'Oct' => '10', 'Nov' => '11', 'Dec' => '12',
);

my $yesterday     = strftime("%d/%b/%Y", localtime(time() - 86400));
my $yesterdayHits = 0;
my $totalhits     = 0;
my $startDate     = 'unknown';
my $tm            = scalar(localtime);
my %ips           = ();
my @rows;

# (.+) is horrible as '.' includes spaces, this is better ....
# single quotes, so that \S reaches the regex intact
my $w = '(\S+?)';

# read from logfile(s) supplied on command line (or STDIN), instead of fixed file....
while (my $line = <>) {
    chomp $line;

    # skip anything that doesn't look like a common log format line
    next unless $line =~ m/^$w $w $w \[$w:$w $w\] "$w $w $w" $w $w$/;
    $totalhits++;

    # could do all these as one statement, but split for readability...
    my ($site, $logName, $fullName) = ($1, $2, $3);
    my ($date, $time, $gmt)         = ($4, $5, $6);
    my ($req, $file, $proto)        = ($7, $8, $9);
    my ($status, $length)           = ($10, $11);

    $ips{$site}++;
    $startDate = $date if $totalhits == 1;      # date of the first entry
    $yesterdayHits++   if $date eq $yesterday;  # same dd/Mon/yyyy format
    my ($day, $month, $year) = split '/', $date;

    my $row = <<EOF;
<tr><td>$site</td><td>$line</td></tr>
EOF
    push @rows, $row;
}

# unique visitors = number of distinct IPs seen in the whole log
my $IPcount = keys %ips;

# Real Men use Data::Dumper :-)
foreach my $key ( sort keys %ips ) {
    print STDERR $key, " => ", $ips{$key}, "\n";
}

# write to STDOUT; redirect it to whatever output file you like
print <<EOF;
<html>
<head>
<title>Access Counts</title></head>
<body>
<h1>Today is: $tm</h1>
<h3>Yesterday was $yesterday</h3>
<h3>There are $IPcount unique visitors in the log</h3>
<h2>Start Date is $startDate</h2>
<h2>Total hits: $totalhits</h2>
<h3>Hits Yesterday: $yesterdayHits</h3>
<table border="1" cellpadding="10" style="width: 500px">
<tr><th>IP</th> <th>LOGFILE</th> </tr>
@rows
</table>
</body>
</html>
EOF

A Monk aims to give answers to those who have none, and to learn from those who know more.

In reply to Re: unique visitors from html logfile by space_monk
in thread unique visitors from html logfile by Anonymous Monk
