
How to view and filter logs in a database

by chrestomanci (Priest)
on Oct 26, 2012 at 10:18 UTC ( [id://1001033] : perlquestion )

chrestomanci has asked for the wisdom of the Perl Monks concerning the following question:

Greetings wise brothers.

In my day-to-day Perl projects, I mostly use Log4perl for logging. It is flexible and it works quite nicely.

Recently I have joined a complex project involving a cluster of dozens of servers, each running several different daemons. The cluster processes around 100_000 jobs per week, with different parts of each job running on different servers in the cluster. Log files are stored locally on each server: each daemon logs to stdout, which is redirected to a log file.

Needless to say, it is difficult to investigate issues when something goes wrong, as there are many log files on different servers to look at, and the only way to isolate the log messages that relate to a particular job is to grep for them. The volume of the log files is also a problem: we currently log a lot of detail because it might be needed for later analysis, but this makes for huge log files and slow grepping.

For that reason I am investigating putting the log messages into a database using Log::Log4perl::Appender::DBI or the like. This would have the advantage of putting all the logs in one place, and would make it easier to purge verbose messages on a schedule while keeping errors for much longer.
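For reference, the appender can be wired up in the usual Log4perl config style; the sketch below follows the shape of the appender's documented configuration, but the DSN, table name, and column names here are placeholders, not defaults:

```
log4perl.logger                       = DEBUG, DBAppndr
log4perl.appender.DBAppndr            = Log::Log4perl::Appender::DBI
log4perl.appender.DBAppndr.datasource = DBI:Pg:dbname=logs;host=loghost
log4perl.appender.DBAppndr.username   = loguser
log4perl.appender.DBAppndr.password   = secret
log4perl.appender.DBAppndr.sql        = \
    insert into logs (level, message) values (?, ?)
log4perl.appender.DBAppndr.params.1   = %p
# Any trailing ? not bound via params.N is filled with the message itself.
log4perl.appender.DBAppndr.usePreparedStmt = 1
log4perl.appender.DBAppndr.layout     = Log::Log4perl::Layout::NoopLayout
log4perl.appender.DBAppndr.warp_message = 0
```

Getting the job ID into its own column (via %X{...} and the MDC, say) is what would make per-job filtering cheap later.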

My problem is how to view and filter the logs in a user-friendly way. While in theory I could write SQL like SELECT * FROM logs WHERE jobID = 12345, in practice that is not a user-friendly way of doing things.
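Whatever front end sits on top, a query like that only beats grep if the table is indexed for it. A sketch of the kind of schema that would support per-job and per-level filtering (all names and types here are assumptions, not anything Log::Log4perl::Appender::DBI creates for you):

```
-- Hypothetical schema; adjust names and types to taste.
CREATE TABLE logs (
    id        BIGSERIAL   PRIMARY KEY,
    logged_at TIMESTAMP   NOT NULL,
    job_id    INTEGER,
    level     VARCHAR(8)  NOT NULL,
    message   TEXT        NOT NULL
);

-- Indexes so per-job and per-level queries do not scan the whole table.
CREATE INDEX logs_job_id_idx          ON logs (job_id);
CREATE INDEX logs_level_logged_at_idx ON logs (level, logged_at);
```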

I have seen that log4javascript has a web interface tool that lets you search and filter log messages. (See their demo.) Does anyone know of a web interface that can be used to view and filter log messages stored in a database? My Google searches are coming up empty.

Replies are listed 'Best First'.
Re: How to view and filter logs in a database
by Corion (Patriarch) on Oct 26, 2012 at 10:31 UTC

    Have you looked at using (for example) Elasticsearch for logging? That would make it somewhat easier to store your log lines and other information, and potentially also makes it easier to filter on the relevant terms.

    I think I remember clinton either showing or having implemented something like that, but my search doesn't find it.

Re: How to view and filter logs in a database
by iguanodon (Priest) on Oct 26, 2012 at 11:59 UTC
    You might want to look at Splunk. It's not free, and it's not an RDBMS under the hood. But it collects and indexes logs from different systems, it has an SQL-like query language, and there is a web interface.

Re: How to view and filter logs in a database
by Anonymous Monk on Oct 26, 2012 at 10:58 UTC
Re: How to view and filter logs in a database
by Anonymous Monk on Oct 26, 2012 at 15:28 UTC
    If the volume of log files is bad, the volume in a database might be an even worse problem. A scanner, perhaps hooked into logrotate, could comb through the files and pick out the meaningful messages. Then gzip the files, keep them for a couple of days, and deep-six them. The script that gathers up the log messages could put them into whatever database table or tables you want, and those tables would be indexed appropriately.
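    The scanner idea above can be sketched in a few lines of shell; the file names, the log format, and the notion of "meaningful" (here: WARN and above) are all assumptions for illustration:

```shell
# Hypothetical post-rotate scanner: keep the meaningful lines,
# then compress the verbose original for short-term retention.
LOG=/tmp/demo-daemon.log.1
printf 'INFO job 42 started\nERROR job 42 failed\nDEBUG noise\n' > "$LOG"

# Pick out the messages worth keeping (or loading into the database).
grep -E 'WARN|ERROR|FATAL' "$LOG" > "${LOG%.1}.important"

# Compress the full file; a cron job could delete it after a couple of days.
gzip -f "$LOG"

cat "${LOG%.1}.important"
```

    In a real deployment the two steps would live in logrotate's postrotate hook, with the grep replaced by whatever loads rows into the database.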

      Fair point about the volume of the data, though I am hoping that logging to a database will help solve that.

      At the moment I am logging quite verbosely, as the verbose logs are needed when working out why something crashed, but I am also keeping the log files for about two months for statistics and reporting (which do not need the verbose messages).

      My plan with logging to a database is that I can log verbosely and then purge all the verbose messages after a couple of days (unless something crashes), leaving behind a much smaller volume of non-verbose messages that will be kept for the standard two-month window.
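      The purge itself would just be a scheduled pair of deletes; a sketch, assuming a logs table with level and logged_at columns and Postgres-style interval syntax:

```
-- Drop verbose rows older than two days...
DELETE FROM logs
 WHERE level IN ('TRACE', 'DEBUG', 'INFO')
   AND logged_at < NOW() - INTERVAL '2 days';

-- ...and everything older than the two-month retention window.
DELETE FROM logs
 WHERE logged_at < NOW() - INTERVAL '2 months';
```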

      This would rely on using a database that will efficiently compact and re-use the space from deleted records. I have already had bad experiences in that area with MySQL; Postgres or CouchDB look promising.