|go ahead... be a heretic|
How to quickly make sense of 1800 log filesby afoken (Abbot)
|on Apr 08, 2017 at 09:10 UTC||Need Help??|
Sorry, no code here, just a description. The code is propritary, and very specific to the problem. It does not make much sense without knowing the product and its source code.
After a large clinical test of one of our products, we got back about 1800 log files. The task: make sense of those logs. What were the product's problems, what errors that should never happen did happen, how did the users (ab-)use the product, short: How can we improve the product?
The log files are strictly line-oriented, roughly similar to logfiles generated by the Apache webserver. Each line starts with a time stamp, followed by a log source (which component caused the log line), followed by a log message. The three parts are separated by spaces. Log messages from one component are almost always further divided into a named log level, a source code reference (source file and line), a log message from a set of about 200 fixed messages, and some variable text. Other components also use fixed messages, but variable data might be mixed in (think of printf("Foo is %.2f cm, bar is %.2f N\n",foo,bar)).
Perl to the rescue:
A first script extracts the fixed log messages from the source code, and generates a perl module that contains the fixed messages and a function to mark the fixed message in a log line.
A second script uses the module generated by the first script, DBI, DBD::SQLite, and File::Find to read all log files, extract the device serial number from the log files' path, splits each line into timestamp, source, and raw message. The raw message is then handled differently for the different sources, to extract fixed messages and variable parts, log level, and source reference, if present. All of the data is then inserted into an SQLite database containing a single, not normalised table, using columns for serial number, file name, line number, timestamp, source, raw message, fixed message, variable part, log level, source reference.
A third script uses DBI, DBD::SQLite, and Text::CSV_XS reads a list of named SQL SELECT statements from a text file (example below), then creates a CSV file for each named SELECT statement, runs the query against the SQLite database, and writes the data fetched into the CSV file. Existing CSV files won't be overwritten.
Exampe for the query file (from memory):
The workflow is quite simple. Scripts 1 and 2 are executed once to create the database, then someone edits the query file and runs the third script. CSV files are opened in Excel. We make some educated guesses about what happened, add a few new queries to the query file, run the third script again, and look at the new output, again in Excel. Wash, rinse, repeat.
Why CSV and not Excel? I could have used Spreadsheet::WriteExcel or Excel::Writer::XLSX to generate native Excel files. But we may need to draw some graphs from the log data, and gnuplot needs CSV input. Plus, writing CSV files from Text::CSV_XS is much easier than writing Excel files.
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)