Beefy Boxes and Bandwidth Generously Provided by pair Networks
laziness, impatience, and hubris
 
PerlMonks  

Comment on

( #3333=superdoc: print w/ replies, xml ) Need Help??

Web Logs Using DBI

If you're running mod_perl under Apache, you can do some neat tricks with DBI and your access logs. Instead of just logging accesses to a flat file, you can stuff the information into a relational database (like MySQL or Oracle).

Then you can use SQL select statements to pull the information back out and generate reports, much more easily (in my opinion) than if you had the data stored in a flat file. For example, you could select all accesses from a particular day with one select statement.

Apache Log Handling

To start with, it's helpful to understand that Apache handles requests in phases, with each phase being "handled" by a particular handler. mod_perl lets you tie a Perl program into that API--thus you can use a specialized logging program to log accesses. We're going to associate our logging program with the "PerlLogHandler" phase; you can do so by adding a line to your httpd.conf file:
PerlLogHandler Foo::DBILogger
You should change "Foo" to some unique identifier for your particular site; you don't have to do this, but it helps to organize your Perl libraries.

Foo::DBILogger will be a Perl module that you'll write and install somewhere in your @INC ("/usr/local/lib/perl5/site_perl/Foo/DBILogger.pm", for example). It must contain a subroutine called "handler"--this sub will be called by Apache for each request.

Creating the Database Table

Next we'll need to create the database table in which we'll store the log data. For this example, we'll assume that we're working with MySQL, but this should work with any relational database. You'll need to create the table in the mysql interactive program:
create table log ( stamp datetime not null, host varchar(255) not null, method varchar(8) not null, url varchar(255) not null, user varchar(50), browser varchar(255), referer varchar(255), status smallint(3), bytes int(8) );

Writing the Handler

This'll create the database table; now you just need to write the module that will populate it. So let's create that module, then. We'll start with the following statements:
package Foo::DBILogger; use Apache::DBI; use DBI; use POSIX 'strftime'; use Apache::Constants qw/OK/;
The first says that we're starting the package Foo::DBILogger. This should be, of course, whatever you called your log handler in the "PerlLogHandler" statement above. The next two lines pull in Apache::DBI (for persistent database connections, so that our logger is fast enough to handle many simultaneous accesses), and DBI, which provides our database connectivity. The POSIX line pulls in the strftime function to format the timestamp as MySQL likes to receive its timestamps. And the final line pulls in an Apache-specific constant that we'll want to send back to Apache after we've finished handling the logging.

Now we'll define our handler subroutine, starting with this:

sub handler { my $r = shift;
When Apache invokes the handler subroutine, it gives it as an argument an Apache::Request object, which contains all sorts of information about the request. It's from this information that we'll populate the table.

Next, we set up our database and our statement handle:

my $dbh = DBI->connect('dsn', 'user', 'password', 'mysql', { RaiseError => 1 }) or die "Can't connect: ", $DBI::errstr; my $sth = $dbh->prepare_cached(<<SQL) or die $dbh->errstr; insert into log (stamp, host, method, url, user, browser, referer, status, bytes) values (?, ?, ?, ?, ?, ?, ?, ?, ?) SQL
Remember, we're using Apache::DBI, which gives us persistent database connections, so we can just set up our database normally. Also note that we're using the prepare_cached method so that we're most likely using a pre-compiled version of our SQL statement.

Now all that remains is to fetch the information from the request object and insert it into the database by executing the statement. We can do that quite easily:

$sth->execute( strftime("%Y-%m-%d %H:%M:%S", localtime), $r->get_remote_host, $r->method, $r->uri, $r->connection->user, $r->header_in('Referer'), $r->header_in('User-agent'), $r->status, $r->bytes_sent ); $sth->finish;
This should be fairly self-explanatory, considering that the method names that we're calling are quite clear. Then we dispose of the statement handle by calling finish on it.

So we're done, and all that's left to do is return the OK message and finish up our module:

return OK; } 1;
Ending, of course, with "1;".

The Whole Thing

So, putting the whole thing together, we get:
package Foo::DBILogger; use Apache::DBI; use DBI; use POSIX 'strftime'; use Apache::Constants qw/OK/; sub handler { my $r = shift; my $dbh = DBI->connect('dsn', 'user', 'password', 'mysql', { RaiseError => 1 }) or die "Can't connect: ", $DBI::errstr; my $sth = $dbh->prepare_cached(<<SQL) or die $dbh->errstr; insert into log (stamp, host, method, url, user, browser, referer, status, bytes) values (?, ?, ?, ?, ?, ?, ?, ?, ?) SQL $sth->execute( strftime("%Y-%m-%d %H:%M:%S", localtime), $r->get_remote_host, $r->method, $r->uri, $r->connection->user, $r->header_in('Referer'), $r->header_in('User-agent'), $r->status, $r->bytes_sent ); $sth->finish; return OK; } 1;

Getting Data Back

So now you're getting data into your database; suppose you want to get some back out. Let's say that you want to get all of the hosts that have accessed pages in the last day. Use something like this:
select host, url from log where to_days(now()) - to_days(stamp) <= 1;

See Also

Take a look at Randal Schwartz's Web access logs with DBI, which includes some more stuff to put into your database, including data on the CPU time spent handling the transaction; and also at Lincoln Stein's Cool Tricks with Perl and Apache.

In reply to Web Logs Using DBI by btrott

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • Outside of code tags, you may need to use entities for some characters:
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?
    Username:
    Password:

    What's my password?
    Create A New User
    Chatterbox?
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others contemplating the Monastery: (14)
    As of 2014-07-23 18:22 GMT
    Sections?
    Information?
    Find Nodes?
    Leftovers?
      Voting Booth?

      My favorite superfluous repetitious redundant duplicative phrase is:









      Results (149 votes), past polls