|Welcome to the Monastery|
Defensive Programming and Audit Trailsby FoxtrotUniform (Prior)
|on Aug 06, 2002 at 05:31 UTC||Need Help??|
Why write a logfile?
When processing data programatically, writing a log of what's being done, and to which data, is extremely helpful. It gives you a blow-by-blow description of what your code's doing, verifying (or falsifying) your assumptions at every step of the process. If it's detailed enough, you can use it to recover from small disasters ("Crap! I didn't mean to delete that record!"), and it's essential in tracking down errors in large, multi-step data grinders. The small cost of keeping a text file around (which, once your program's run is complete, can easily be gzipped...) is negligible compared to the extra debugging muscle.
davorg covered some of this ground quite well in his excellent Data Munging with Perl book; I'm going to expand on the material there, probably repeat some of it, and digress a little bit into good and bad methods of "Defensive Programming".
What to log?
The quick, and usually wrong, answer is "everything!" It's great to know exactly what your code's doing during development, but in production, logging every byte of every packet you process isn't just silly, it defeats the purpose. Remember, most of your logs are going to be mind-numbingly normal, and most of the time, if you're looking at the logs you're looking for problems. Having to sort through megabytes of chaff makes your job much harder.
davorg suggests tuneable levels of logging, and I strongly concur. I'd suggest making the logging level a command-line parameter (the -v switch seems to be popular for this purpose), rather than davorg's suggested environment variable, if only because it's more visible to the user.
In general, it's probably best to log more data than less. Definitely log key data: if you're munging sendmail logs looking for mail to abuse, root, or postmaster, then log the To: address of each mail sent, and whether you processed or discarded the record.
If you're going to be counting on the logs as backups (this may make more sense than doing "real" backups if the "real" backup would entail a database dump, for instance, using much more space than a flat-text log), you will of course have to log all the data necessary to rebuild a useful system. This might mean logging everything, or you may be able to skip "nice-to-have" data that aren't strictly required for functional operation.
Log all error conditions! Log any unexpected input! Make these log entries immediately obvious! (I tend to prefix any such entries with ***ERROR or ***WARNING, depending on severity.) Check all important assumptions made about the data, and as many unimportant ones as is practical.
What not to log?
How to log
The easiest way is to just write to STDERR. There's no guarantee that anything will get saved: you'll have to redirect the output from the shell. This is sometimes useful, and often catastrophic (when you realize five minutes into a job that can't be backed out that all the output's going to the console, for instance). On the other hand, any logging's better than no logging at all, and writing to STDERR is often good enough, especially if you're just writing a quick hack.
The second-easiest way is to pull something from CPAN and use it instead of rolling your own logging code. A quick search found:
Whew! If you choose the third way (write your own logging routines), don't bother posting it to CPAN.
Keep in mind that your logfiles should be easy for both humans and computers to read. I usually find that I identify a problem by scanning the logs by hand, then write a script to parse them, identifying each instance of a problem and fixing (or at least hiding) it programatically. Try to separate input records with an empty line, for instance. Good coding style usually makes good logging style, too; read perlstyle.
While we're on the subject, the more standard your logs can be, the better, since tools written to work on one set can be applied to another. (Suggestions?)
I don't like XML logs; I find them too difficult to read in a standard text editor (which, if I'm logging in remotely, is probably all I have).
What other resources are available?
LogReport is a general online resource for logging software, knowledge, and practices.
A note about defensive programming
Defensive programming doesn't mean "recover seamlessly from all input errors without a peep", it means "recognize the errors and do what's appropriate". If you don't know that your input's screwed, and your "defensive" program is substituting in reasonable defaults, you could have a big problem (bogus results, broken software upstream making it into production, etc) that doesn't make itself known until it's too late to fix (or at least to fix easily). On the other hand, if your input's broken in a consistent and harmless manner, you don't want kilobytes of benign errors drowning out other, more significant problems in your logs. What to do?
In general, I'd lean towards more logging, rather than less, and more fine-grained control over logging verbosity. If you know that you're going to be encountering the same sort of brain-damaged input over and over (Word-generated HTML, for instance), a simple "got bogus input foo, ignoring" the first time you see it is probably okay.
Update: added readmore tag. Thanks trs80!
Update 2002 Aug. 9: added Resources section