Obtaining Apache logfile stats?

mvam has asked for the wisdom of the Perl Monks concerning the following question:

Re: Obtaining Apache logfile stats? by DamnDirtyApe (Curate) on Mar 25, 2004 at 20:05 UTC
mvam, I'm unclear about what you want to accomplish, but I suspect you can come up with a better solution that skips the sed/awk portion of your process. Please post:

- The actual problem (maybe with examples) you're trying to solve here
- Your code so far
- Your output
- Your desired output

_______________
DamnDirtyApe
Those who know that they are profound strive for clarity. Those who would like to seem profound to the crowd strive for obscurity.
--Friedrich Nietzsche
Re: Obtaining Apache logfile stats? by sauoq (Abbot) on Mar 25, 2004 at 20:53 UTC
"i wasnt able to get apache::parselog to work"

From this comment and your data sample, I have to guess that you aren't using a standard log format, right? Can you show us a sample of your log data and/or the CustomLog directive you use in your Apache configuration? Without that, we can't help you slice and dice it in Perl.

"is there a simple way with perl to parse the file and have the ability to pass a file name as an argument?"

Yes. Something like the following might work well enough for you depending, of course, on what you haven't told us yet...

    #!perl -lan
    # -n loops over the input lines, -a splits each line into @F, -l handles line endings
    BEGIN { $SUM = $N = 0; $file = shift; }    # first argument: the page to report on
    if ($F[1] eq $file) {
        my ($secs) = ($F[2] =~ /^(\d+)/);      # leading digits of e.g. "5_seconds"
        $SUM += $secs;
        $N++;
    }
    END { print "Average: ", $N ? $SUM / $N : "n/a (no matching lines)"; }

Put that in a file and run it with two arguments: the full pathname of the file you want stats on, and the pathname of the file your parsed log data (i.e. the sample you provided) is in. Something like:

    perl get_stats /manual/misc/perf-tuning.html log.data

-sauoq
"My two cents aren't worth a dime.";
Re: Re: Obtaining Apache logfile stats? by mvam (Acolyte) on Mar 25, 2004 at 21:46 UTC
a sample log line:

    x.x.x.x - 24/Mar/2004:12:26:52 -0800 "GET /manual/misc/perf-tuning.html HTTP/1.1" 200 0 48296 "http://localhost/manual/" "Mozilla/5.0 (X11; U; SunOS sun4u; en-US; rv:1.6) Gecko/20040211 Firefox/0.8"

and this is my logformat line:

    LogFormat "%v %{x-up-subno}i %t \"%r\" %>s %T %b \"%{Referer}i\" \"%{User-Agent}i\"" wap
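With that LogFormat in hand, the raw line can be picked apart directly in Perl, without the awk step. What follows is only a sketch: the pattern is inferred from the one sample line above, `parse_line` and the hash keys are made-up names, and it assumes the x-up-subno header never contains whitespace. The timestamp is matched with or without square brackets, since %t normally emits them but the sample as posted shows none.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pattern for: %v, x-up-subno header, %t, "%r", %>s, %T, %b
# (assumption: inferred from the single sample line in this thread)
my $LOG_RE = qr{
    ^ (\S+) \s+                     # %v: virtual host
      (\S+) \s+                     # x-up-subno request header, "-" if absent
      \[? ([^\]"]+?) \]? \s+        # %t: timestamp, brackets optional
      "(\S+) \s+ (\S+) [^"]*" \s+   # %r: method, path, rest of request line
      (\d{3}) \s+                   # %>s: final status
      (\d+) \s+                     # %T: seconds taken to serve the request
      (\S+)                         # %b: bytes sent, "-" if none
}x;

# Hypothetical helper: returns a hash ref, or nothing if the line does not match.
sub parse_line {
    my ($line) = @_;
    my @f = $line =~ $LOG_RE or return;
    my %rec;
    @rec{qw(vhost subno time method path status seconds bytes)} = @f;
    return \%rec;
}

my $sample = 'x.x.x.x - 24/Mar/2004:12:26:52 -0800 "GET /manual/misc/perf-tuning.html HTTP/1.1" 200 0 48296 "http://localhost/manual/" "Mozilla/5.0 (X11; U; SunOS sun4u; en-US; rv:1.6) Gecko/20040211 Firefox/0.8"';

my $rec = parse_line($sample);
print "$rec->{path} took $rec->{seconds} second(s)\n" if $rec;
```

Lines that don't fit the pattern are simply skipped, so a few odd requests won't stop the run.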
Re: Re: Re: Obtaining Apache logfile stats? by sauoq (Abbot) on Mar 25, 2004 at 22:28 UTC
The quick and dirty approach would be to just carve it up on whitespace, as you are already doing with awk. The conversion is straightforward: use `split` or Perl's `-a` option (as in my example above).

Regardless of how you parse the input, you'll probably find it worthwhile to compute the statistics for every file accessed in one pass through your log. That's a lot more efficient than reading your whole log once for each file you want stats on. It's easy enough: just use a hash to maintain data for each filename as you traverse the log.

-sauoq
"My two cents aren't worth a dime.";
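A rough sketch of that one-pass idea, splitting on whitespace and keeping a count and a running total per path in a hash. The field positions (6th and 9th, the same ones awk calls $6 and $9) assume the layout of the sample log line in this thread; `collect_stats` is a made-up name, and the three input lines are illustrative, shortened lines shaped like the sample (the byte counts are invented).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: one pass over a filehandle, stats for every path.
sub collect_stats {
    my ($fh) = @_;
    my %stats;    # path => { count => hits, total => seconds }
    while ( my $line = <$fh> ) {
        my @f = split ' ', $line;
        # Skip lines that do not have a numeric 9th field (%T in this format)
        next unless @f >= 9 && $f[8] =~ /^\d+$/;
        $stats{ $f[5] }{count}++;
        $stats{ $f[5] }{total} += $f[8];
    }
    return \%stats;
}

# Illustrative input only; a real run would open the log file instead.
my $data = <<'END';
x.x.x.x - [24/Mar/2004:12:27:53 -0800] "GET /manual/mod/mod_rewrite.html HTTP/1.1" 200 5 1000
x.x.x.x - [24/Mar/2004:12:29:53 -0800] "GET /manual/mod/mod_rewrite.html HTTP/1.1" 200 6 1000
x.x.x.x - [24/Mar/2004:12:26:52 -0800] "GET /manual/misc/perf-tuning.html HTTP/1.1" 200 0 1000
END

open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";
my $stats = collect_stats($fh);

for my $path ( sort keys %$stats ) {
    my $entry = $stats->{$path};
    printf "%-35s hits=%d avg=%.2fs\n",
        $path, $entry->{count}, $entry->{total} / $entry->{count};
}
```

Memory use grows with the number of distinct paths rather than with the size of the log, which matters once the file gets large.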
Re: Re: Re: Re: Obtaining Apache logfile stats? by mvam (Acolyte) on Mar 25, 2004 at 23:14 UTC
Re: Re: Re: Re: Re: Obtaining Apache logfile stats? by sauoq (Abbot) on Mar 26, 2004 at 00:41 UTC
Re: Re: Re: Obtaining Apache logfile stats? by sauoq (Abbot) on Mar 25, 2004 at 21:56 UTC
How are you calculating the "0_seconds" portion of your sample data?

-sauoq
"My two cents aren't worth a dime.";
Re: Re: Re: Re: Obtaining Apache logfile stats? by DamnDirtyApe (Curate) on Mar 25, 2004 at 22:05 UTC
Re: Re: Re: Re: Obtaining Apache logfile stats? by mvam (Acolyte) on Mar 25, 2004 at 22:01 UTC
Re: Obtaining Apache logfile stats? by mvam (Acolyte) on Mar 25, 2004 at 20:18 UTC
OK, here we go: the problem I need to solve is taking an Apache log file that's rotated daily and getting the time taken to serve each page in the log. I was able to get this data in a basic format using

    awk '{print $3, $6, $9}' > /tmp/resultsfile

This does a nice job of outputting the relevant fields in the log file. The next step would be to process this output in such a way that I could type, say, 'mod_rewrite.html' and find out how many times it was served and what the average of those times is.
Re: Re: Obtaining Apache logfile stats? by DamnDirtyApe (Curate) on Mar 25, 2004 at 20:35 UTC
Alright, perhaps try something along these lines:

    #! /usr/bin/perl
    use strict;
    use warnings;

    my $file = shift @ARGV;
    my @times = map { /(\d+)_seconds/; $1 } grep { /$file/ } <DATA>;

    my $totaltime;
    $totaltime += $_ for @times;

    my $avgtime = $totaltime / @times;

    print "Average time: $avgtime\n\n";

    __DATA__
    [24/Mar/2004:12:26:52 /manual/misc/perf-tuning.html 0_seconds
    [24/Mar/2004:12:27:33 /manual/mod/mod_status.html 0_seconds
    [24/Mar/2004:12:27:39 /manual/mod/module-dict.html 0_seconds
    [24/Mar/2004:12:27:46 /manual/misc/rewriteguide.html 0_seconds
    [24/Mar/2004:12:27:53 /manual/mod/mod_rewrite.html 5_seconds
    [24/Mar/2004:12:27:53 /manual/images/mod_rewrite_fig1.gif 0_seconds
    [24/Mar/2004:12:27:53 /manual/images/mod_rewrite_fig2.gif 0_seconds
    [24/Mar/2004:12:28:05 /manual/new_features_1_3.html 0_seconds
    [24/Mar/2004:12:29:53 /manual/mod/mod_rewrite.html 6_seconds
    [24/Mar/2004:12:29:54 /manual/mod/mod_rewrite.html 7_seconds
    [24/Mar/2004:12:29:55 /manual/mod/mod_rewrite.html 8_seconds
    [24/Mar/2004:12:29:56 /manual/mod/mod_rewrite.html 9_seconds

I still think you should try the format manipulation in Perl, though; it's easy to do, and you'll only have one script to maintain.

_______________
DamnDirtyApe
Those who know that they are profound strive for clarity. Those who would like to seem profound to the crowd strive for obscurity.
--Friedrich Nietzsche
Re: Re: Re: Obtaining Apache logfile stats? by mvam (Acolyte) on Mar 25, 2004 at 21:34 UTC
This did produce the average time, but it also printed this warning:

    Use of uninitialized value in regexp compilation at ./avgtime.pl line 6, <DATA> line 12.

It repeated for each line in DATA. I'm a Perl moron as you can see, but I'm trying. The downside of these log files is that they can reach 2GB in a matter of hours, so creating the temp result file can get somewhat expensive. I'm thinking about grepping out anything with a zero value, since really we only want to see when the server is under load.
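That warning is what Perl prints when `$file` is undefined at the point `/$file/` is compiled, which is most likely what happens if the script is run without a filename argument. Below is a minimal sketch that checks the argument, reads one line at a time rather than pulling everything into a list, and can leave out zero-second entries. `average_time` and the fallback filename are made up for illustration; the sample lines are copied from the __DATA__ block earlier in the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: averages the N in "N_seconds" for lines naming $file.
# Reads line by line, so memory use does not grow with the input size.
sub average_time {
    my ( $fh, $file, $skip_zero ) = @_;
    my ( $total, $count ) = ( 0, 0 );
    while ( my $line = <$fh> ) {
        # \Q...\E keeps the dot in the filename from acting as a wildcard
        next unless $line =~ /\Q$file\E\s+(\d+)_seconds/;
        next if $skip_zero && $1 == 0;
        $total += $1;
        $count++;
    }
    return $count ? ( $total / $count, $count ) : ( undef, 0 );
}

my $file = shift @ARGV;
unless ( defined $file ) {
    # Fallback only so the sketch runs standalone; a real script might die here.
    warn "No filename given, falling back to mod_rewrite.html\n";
    $file = 'mod_rewrite.html';
}

my $data = <<'END';
[24/Mar/2004:12:27:46 /manual/misc/rewriteguide.html 0_seconds
[24/Mar/2004:12:27:53 /manual/mod/mod_rewrite.html 5_seconds
[24/Mar/2004:12:29:53 /manual/mod/mod_rewrite.html 6_seconds
[24/Mar/2004:12:29:54 /manual/mod/mod_rewrite.html 7_seconds
END

open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";
my ( $avg, $count ) = average_time( $fh, $file, 1 );

if ($count) {
    printf "%s: %d hits, average %.2f seconds\n", $file, $count, $avg;
}
else {
    print "$file: no matching non-zero entries\n";
}
```

For a real log, the in-memory handle would be replaced by STDIN or the opened log file, so no temp result file is needed.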
Re(4): how do i sort thee? by DamnDirtyApe (Curate) on Mar 25, 2004 at 21:57 UTC
Re: Obtaining Apache logfile stats? by Not_a_Number (Prior) on Mar 25, 2004 at 23:29 UTC
    use strict;
    use warnings;

    my %HoA;

    while ( <DATA> ) {
        next unless (split '/')[-1] =~ /(.*)\s+(\d+)_seconds?$/;
        $HoA{$1}[0] += $2;  # Total seconds per 'user'
        $HoA{$1}[1] ++;     # Total times accessed per 'user'
    }

    # Print average access time for a given 'user':
    my $user = 'mod_rewrite.html';

    print "Unknown user: $user" and exit unless $HoA{$user};

    print "User: $user\n";
    print "Total seconds: $HoA{$user}[0]\n";
    print "Total accesses: $HoA{$user}[1]\n";
    print "Av. access time: ", $HoA{$user}[0] / $HoA{$user}[1];
    print "\n\n";

    # Print the whole HoA:
    print "$_: @{ $HoA{$_} }\n" for keys %HoA;

    __DATA__
    [24/Mar/2004:12:26:52 /manual/misc/perf-tuning.html 0_seconds
    [24/Mar/2004:12:27:33 /manual/mod/mod_status.html 0_seconds
    [24/Mar/2004:12:27:33 /manual/mod/mod_status.html 33_seconds
    [24/Mar/2004:12:27:39 /manual/mod/module-dict.html 0_seconds
    [24/Mar/2004:12:27:46 /manual/misc/rewriteguide.html 0_seconds
    [24/Mar/2004:12:27:53 /manual/mod/mod_rewrite.html 5_seconds
    [24/Mar/2004:12:27:53 /manual/images/mod_rewrite_fig1.gif 0_seconds rabbit!!!
    [24/Mar/2004:12:27:53 /manual/images/mod_rewrite_fig2.gif 0_seconds
    [24/Mar/2004:12:28:05 /manual/new_features_1_3.html 0_seconds
    [24/Mar/2004:12:29:53 /manual/mod/mod_rewrite.html 6_seconds
    [24/Mar/2004:12:29:54 /manual/mod/mod_rewrite.html 7_seconds
    [24/Mar/2004:12:29:55 /manual/mod/mod_rewrite.html 8_seconds
    [24/Mar/2004:12:29:56 /manual/mod/mod_rewrite.html 9_seconds

dave

