Thanks to PhiRate, Browseruk, Sauoq and others, I now have working code, but it needs some finishing touches:

1) Specifically, the last block, "Sort and print", prints everything instead of displaying the first 10 and then providing a way to display the next 10 and scroll back to the previous 10. Any ideas on how to do this?

2) The code is a bit slow. On a log with 18,646,915 bytes it takes a little over a minute and a half. But on a big log with 22,623,798 bytes it times out after 3 minutes. Any ideas on how to make it run faster?

Thanks, and here's the code:

use strict;
use warnings;
use Date::Manip;
use CGI qw/:standard/;

# Make sure security is not compromised by calling unpathed programs.
$ENV{PATH} = "/bin:/usr/bin:/usr/local/bin:";
$ENV{IFS}  = "";

# Use CGI to print the header
print header;

# Make variables local only
my %referers = ();
my $row      = 0;
my $counter  = 0;

# Retrieve and security-check parameters
my $site   = param('site');
my $hour   = param('hour');
my $minute = param('minute');
if ($hour   !~ /^\d\d?$/) { die('Invalid hour'); }
if ($minute !~ /^\d\d?$/) { die('Invalid minute'); }

# Get date object for the checkpoint
my $check_date = ParseDate("${hour}hours ${minute}minutes ago");

# Select the server log - current 12/19/02
my $data = '';
if    ($site eq 'star')   { $data = 'indystar/access_log' }
elsif ($site eq 'topics') { $data = 'topics/access_log' }
else                      { $data = 'noblesville/access_log' }

# Create headline for web page
print "<h1>Referrers in the past $hour hours and $minute minutes</h1>";

# File handling, one line at a time; if can't open, say why
# (double quotes so that $data and $! are interpolated into the message)
open(FH, "$data") || die("Could not open $data: $!");
while (my $line = <FH>) {
    next if ($line !~ /^\S+ \S \S \[(\S+) \S+\] "[^"]+" \d+ \d+ "([^"]+)"/);
    my $line_date = ParseDate($1);

    # Check to see if the line date is in the range we're after
    next unless Date_Cmp($line_date, $check_date) > 0;

    # If the referer is new, set to 1 entry, otherwise increment
    if (!$referers{$2}) { $referers{$2} = 1; }
    else                { $referers{$2}++; }
}
close(FH);

# Sort and print
for (sort { $referers{$b} <=> $referers{$a} } keys %referers) {
    print "$_ - $referers{$_}<p>";
    unless (++$counter % 10) {
        print "Press Enter";
        <STDIN>
    }
}
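On point 1, a CGI request produces one page of output and then ends, so reading from STDIN will not pause for a keypress in the browser. One possible direction, sketched here under assumptions, is to accept a page number as an extra parameter and print only that slice of the sorted list, with Previous/Next links. The `page_of` helper, the `page` parameter name, and the example.com data are made up for illustration and are not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): given a hash ref of
# counts, a 1-based page number and a page size, return the clamped page
# number, the last page number, and the keys that belong on that page.
sub page_of {
    my ($counts, $page, $per_page) = @_;

    # Highest count first; ties broken by name so the order is stable
    # from one request to the next.
    my @sorted = sort { $counts->{$b} <=> $counts->{$a} || $a cmp $b }
                 keys %$counts;

    my $last_page = @sorted ? int((@sorted + $per_page - 1) / $per_page) : 1;
    $page = 1          if $page < 1;
    $page = $last_page if $page > $last_page;

    my $start = ($page - 1) * $per_page;
    my $end   = $start + $per_page - 1;
    $end = $#sorted if $end > $#sorted;

    return ($page, $last_page, @sorted[$start .. $end]);
}

# Made-up data standing in for %referers: 25 URLs with counts 1..25.
my %referers = map { ("http://example.com/$_" => $_) } 1 .. 25;

# In the CGI script this would be something like: my $want = param('page') || 1;
my $want = 2;
my ($page, $last_page, @rows) = page_of(\%referers, $want, 10);

print "$_ - $referers{$_}<p>\n" for @rows;
print qq{<a href="?page=} . ($page - 1) . qq{">Previous 10</a> } if $page > 1;
print qq{<a href="?page=} . ($page + 1) . qq{">Next 10</a>}      if $page < $last_page;
print "\n";
```

In the real script the links would also need to carry site, hour and minute, and the page value would need the same kind of digit check as the other parameters. Each page view would re-scan the log unless the counts are cached somewhere between requests.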
Here's some sample log data the script reads:

66.149.65.62 - - [19/Dec/2002:09:02:59 -0500] "GET /images/email.gif HTTP/1.1" 304 - "http://www.indystar.com/print/articles/5/009542-7185-036.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; H010818)"
66.149.65.62 - - [19/Dec/2002:09:02:59 -0500] "GET /images/print.gif HTTP/1.1" 304 - "http://www.indystar.com/print/articles/5/009542-7185-036.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; H010818)"
66.149.65.62 - - [19/Dec/2002:09:02:59 -0500] "GET /images/sidelinksend2.gif HTTP/1.1" 304 - "http://www.indystar.com/print/articles/5/009542-7185-036.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; H010818)"
66.72.209.208 - - [19/Dec/2002:09:02:59 -0500] "GET /images/pics2/image-005305-3314.jpg HTTP/1.1" 304 - "http://www.indystar.com/print/articles/8/005305-9938-038.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; .NET CLR 1.0.3705; MSIECrawler)"
66.134.224.29 - - [19/Dec/2002:09:02:59 -0500] "GET /images/header_aod2_01.gif HTTP/1.1" 200 2011 "http://www.indystar.com/print/articles/6/009478-6696-040.html" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"
66.72.209.208 - - [19/Dec/2002:09:02:59 -0500] "GET /print/articles/0/005306-8900-038.html HTTP/1.1" 200 8361 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; .NET CLR 1.0.3705; MSIECrawler)"
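On point 2, nothing here has been profiled, but a plausible cost is calling ParseDate and Date_Cmp for every matching line. Since the timestamp format is fixed, one option is to convert it to epoch seconds with the core Time::Local module and compare plain integers. The sketch below assumes the date appears in square brackets, as the regex in the script expects, and keeps the same line filter. The names `log_epoch` and `count_referers` are hypothetical, and the cutoff is a fixed stand-in so the 2002 sample lines fall inside the window.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

my %MONTH = (Jan => 0, Feb => 1, Mar => 2, Apr => 3, May => 4,  Jun => 5,
             Jul => 6, Aug => 7, Sep => 8, Oct => 9, Nov => 10, Dec => 11);

# Hypothetical helper: turn "19/Dec/2002:09:02:59 -0500" into epoch seconds.
# Returns undef if the string does not look like an access-log timestamp.
sub log_epoch {
    my ($stamp) = @_;
    my ($d, $mon, $y, $h, $mi, $s, $sign, $oh, $om) =
        $stamp =~ m{^(\d\d)/(\w{3})/(\d{4}):(\d\d):(\d\d):(\d\d) ([+-])(\d\d)(\d\d)$}
        or return;
    return unless exists $MONTH{$mon};

    my $offset = ($oh * 3600 + $om * 60) * ($sign eq '-' ? -1 : 1);

    # timegm reads the fields as UTC, so subtract the zone offset afterwards.
    # eval guards against timegm dying on impossible dates such as day 32.
    my $t = eval { timegm($s, $mi, $h, $d, $MONTH{$mon}, $y) };
    return defined $t ? $t - $offset : undef;
}

# Hypothetical helper: count referers on lines newer than $cutoff (epoch seconds).
sub count_referers {
    my ($fh, $cutoff) = @_;
    my %count;
    while (my $line = <$fh>) {
        # Same filter as the posted regex, but capturing the zone offset too.
        my ($stamp, $ref) =
            $line =~ /^\S+ \S \S \[(\S+ \S+)\] "[^"]+" \d+ \d+ "([^"]+)"/
            or next;
        my $t = log_epoch($stamp);
        next unless defined $t && $t > $cutoff;
        $count{$ref}++;
    }
    return \%count;
}

# Three of the sample lines, read from an in-memory filehandle.
my $sample = <<'LOG';
66.149.65.62 - - [19/Dec/2002:09:02:59 -0500] "GET /images/email.gif HTTP/1.1" 304 - "http://www.indystar.com/print/articles/5/009542-7185-036.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; H010818)"
66.134.224.29 - - [19/Dec/2002:09:02:59 -0500] "GET /images/header_aod2_01.gif HTTP/1.1" 200 2011 "http://www.indystar.com/print/articles/6/009478-6696-040.html" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"
66.72.209.208 - - [19/Dec/2002:09:02:59 -0500] "GET /print/articles/0/005306-8900-038.html HTTP/1.1" 200 8361 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; .NET CLR 1.0.3705; MSIECrawler)"
LOG
open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";

# Stand-in cutoff for the demo; the real script would use something like
#   my $cutoff = time - ($hour * 3600 + $minute * 60);
my $cutoff = log_epoch('19/Dec/2002:09:00:00 -0500');

my $counts = count_referers($fh, $cutoff);
close $fh;
print "$_ - $counts->{$_}\n" for sort keys %$counts;
```

Because access logs are normally written in time order, another untested option is to read the file from the end (for example with the CPAN module File::ReadBackwards) and stop at the first line older than the cutoff, so that most of a large log is never parsed.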


In reply to count sort & output II by mkent
