|Perl Monk, Perl Meditation|
Web Logs Using DBIIf you're running mod_perl under Apache, you can do some neat tricks with DBI and your access logs. Instead of just logging accesses to a flat file, you can stuff the information into a relational database (like MySQL or Oracle).
Then you can use SQL select statements to pull the information back out and generate reports, much more easily (in my opinion) than if you had the data stored in a flat file. For example, you could select all accesses from a particular day with one select statement.
Apache Log HandlingTo start with, it's helpful to understand that Apache handles requests in phases, with each phase being "handled" by a particular handler. mod_perl lets you tie a Perl program into that API--thus you can use a specialized logging program to log accesses. We're going to associate our logging program with the "PerlLogHandler" phase; you can do so by adding a line to your httpd.conf file:
You should change "Foo" to some unique identifier for your particular site; you don't have to do this, but it helps to organize your Perl libraries.
Foo::DBILogger will be a Perl module that you'll write and install somewhere in your @INC ("/usr/local/lib/perl5/site_perl/Foo/DBILogger.pm", for example). It must contain a subroutine called "handler"--this sub will be called by Apache for each request.
Creating the Database TableNext we'll need to create the database table in which we'll store the log data. For this example, we'll assume that we're working with MySQL, but this should work with any relational database. You'll need to create the table in the mysql interactive program:
Writing the HandlerThis'll create the database table; now you just need to write the module that will populate it. So let's create that module, then. We'll start with the following statements:
The first says that we're starting the package Foo::DBILogger. This should be, of course, whatever you called your log handler in the "PerlLogHandler" statement above. The next two lines pull in Apache::DBI (for persistent database connections, so that our logger is fast enough to handle many simultaneous accesses), and DBI, which provides our database connectivity. The POSIX line pulls in the strftime function to format the timestamp as MySQL likes to receive its timestamps. And the final line pulls in an Apache-specific constant that we'll want to send back to Apache after we've finished handling the logging.
Now we'll define our handler subroutine, starting with this:
When Apache invokes the handler subroutine, it gives it as an argument an Apache::Request object, which contains all sorts of information about the request. It's from this information that we'll populate the table.
Next, we set up our database and our statement handle:
Remember, we're using Apache::DBI, which gives us persistent database connections, so we can just set up our database normally. Also note that we're using the prepare_cached method so that we're most likely using a pre-compiled version of our SQL statement.
Now all that remains is to fetch the information from the request object and insert it into the database by executing the statement. We can do that quite easily:
This should be fairly self-explanatory, considering that the method names that we're calling are quite clear. Then we dispose of the statement handle by calling finish on it.
So we're done, and all that's left to do is return the OK message and finish up our module:
Ending, of course, with "1;".
The Whole ThingSo, putting the whole thing together, we get:
Getting Data BackSo now you're getting data into your database; suppose you want to get some back out. Let's say that you want to get all of the hosts that have accessed pages in the last day. Use something like this:
See AlsoTake a look at Randal Schwartz's Web access logs with DBI, which includes some more stuff to put into your database, including data on the CPU time spent handling the transaction; and also at Lincoln Stein's Cool Tricks with Perl and Apache.