PerlMonks  

A good way to parse Apache logs

by larsen (Parson)
on Aug 10, 2001 at 18:09 UTC (#103874)

larsen has asked for the wisdom of the Perl Monks concerning the following question:

I'd like to find a feasible way to parse Apache logs. I faced this problem some time ago, so I installed Apache::ParseLog, but, as has been mentioned in this thread (Apache::ParseLog), I ran into some problems.

Now I ask you: what is your preferred way to parse logs and extract information from them? In particular, how do you deal with these problems?

  • Parsing log (precompiled RE?)
  • Storing logs (store raw logs and parse them at need? how do you store them? what portion of logs do you store?)
  • What modules could be useful to perform these tasks? (what about Apache::LogDBI?)
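
For the first bullet, here is a minimal sketch of the precompiled-RE approach. It assumes the server writes the stock "common" or "combined" log format; the names $combined_re and parse_line are made up for illustration and do not come from any module. The pattern is compiled once with qr// and reused for every line.

```perl
use strict;
use warnings;

# Compiled once with qr//, reused for every line.
# Assumption: stock Apache "common" or "combined" LogFormat.
my $combined_re = qr{
    ^(\S+)\s+                 # remote host
    (\S+)\s+                  # ident
    (\S+)\s+                  # authenticated user
    \[([^\]]+)\]\s+           # timestamp
    "((?:[^"\\]|\\.)*)"\s+    # request line
    (\d{3})\s+                # status code
    (\S+)                     # bytes sent, or a dash
    (?:\s+"((?:[^"\\]|\\.)*)"\s+"((?:[^"\\]|\\.)*)")?   # optional referer, user agent
    \s*$
}x;

# Hypothetical helper: returns a hashref of fields, or undef if the line does not match.
sub parse_line {
    my ($line) = @_;
    my @fields = $line =~ $combined_re or return;
    my %rec;
    @rec{qw(host ident user time request status bytes referer agent)} = @fields;
    return \%rec;
}

my $sample = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
           . '"GET /apache_pb.gif HTTP/1.0" 200 2326';
my $rec = parse_line($sample);
print "$rec->{status} $rec->{request}\n" if $rec;
# prints: 200 GET /apache_pb.gif HTTP/1.0
```

With a common-format line the referer and agent fields are simply undef. A custom LogFormat would need its own pattern.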

Replies are listed 'Best First'.
Re: A good way to parse Apache logs
by echo (Pilgrim) on Aug 10, 2001 at 18:42 UTC
    Since you mention Apache::LogDBI, I suspect you're interested in more than just parsing. Here's a solution that scales: use mod_log_spread. Each server in your web farm sends its log to a multicast group. You can then set up any number of listeners that receive the logs and act on them. This way you decouple not only the parsing but also the storage of the logs from the web servers.
Re: A good way to parse Apache logs
by stefan k (Curate) on Aug 10, 2001 at 18:13 UTC
    Hmm,
    this might not be what you want to hear, but anyway...
    Do you really need to write YAALP (yet another apache logfile parser)? I think there must be mere millions of them out there (maybe even more than email- and irc-clients *grin*)

    Of course this might be kind of a practice for you or you might need some information that those can't deliver. If so, please discard this posting to /dev/null :)

    Regards... Stefan

      No, I don't think I'm going to write a new logfile parser. Or rather, I'm not going to write a publicly available logfile parser. I think I'll write one just to learn why other parsers are better :)). What I'd like to find is the commonly used solution for this task (by solution I mean some combination of scripts and software, i.e. Perl + MySQL + something else...). And I'd like to know the rationale that makes this solution preferable.
Re: A good way to parse Apache logs (HTTPD::Log::Filter)
by ChemBoy (Priest) on Oct 26, 2004 at 19:16 UTC

    This is just for the historical record, since I'm sure you solved your problem years ago, but I'm having some luck (so far) with HTTPD::Log::Filter. One-line parse-and-extract:

    perl -MHTTPD::Log::Filter -lne'BEGIN{$f = HTTPD::Log::Filter->new(format=>"ELF",capture=>[qw(authexclude request referer)])} $f->filter($_);printf "%s %s\n",$f->authexclude,$f->referer' /var/your_httpd_log_here

    Granted, I have a longer definition of "one-line" than some, but it still fits in my tcsh buffer...



    If God had meant us to fly, he would *never* have given us the railroads.
        --Michael Flanders

      I also had good results with HTTPD::Log::Filter extracting the query strings from my access_log.
      my $hlf = HTTPD::Log::Filter->new(capture => [ qw(request status) ]);
      ...
      next unless $hlf->filter($ac_line);
      next unless $hlf->status == 200;
      my ($method, $query, $http_version) = split(' ', $hlf->request);
