Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask
 
PerlMonks  

Balancing Logging and Code Readability

by ack (Deacon)
on Sep 14, 2009 at 15:44 UTC ( #795164=perlquestion: print w/ replies, xml ) Need Help??
ack has asked for the wisdom of the Perl Monks concerning the following question:

Over the past few years, I've been progressively interested in some of the more sophisticated and useful strategies to integrate logging into my Perl scripts. My interest is both for debugging and maintenance purposes and also for capturing the results of my scripts.

But what I have found is that as I integrate logging into my scripts, it is becoming harder and harder to see and follow the logic of the "heavy-lifting" that my scripts are accomplishing...in essence, I am having trouble "seeing the forest for the trees."

My scripts are not continuous-running or long-running scripts; though they often taken many minutes to complete. Many of them are a multitude of scripts that are integrated together using some master control script that doles out the work to the other scripts.

I used to "roll my own" logging so that I could print results to the screen and to log files...but it was tedious and time-consuming to reinvent the wheel every time I wrote a script. So I have, in the last few months, migrated to Log4Perl. It is wonderful and gives me everything I need and a tremendous amount of flexibility...just like it promised to do. I am still a novice and am learing more and more every day about how to use it more effectively.

But with its power comes in vivid colorful obviousness the old problem that I found even with my own home-grown logging: the plethora of logging invocations throughout my code makes it increasingly difficult to follow the real logic of my scripts. The logging invocations are distracting and become "noise" in my logic.

This seems like something is wrong and seems to almost defeat the whole idea of flexible, integrated and pervasive logging in scripts.

So my interest is: How does one balance the effective use of logging while maintaining the sense and view of the intrinsic logic of the scripts being logged? How does one ensure that the logging doesn't become so much "noise" that the original intent of the code is lost?

UPDATE: Thanks to everyone for all the great advice and comments. One thing I didn't note (slipped my mind) is that most of the logging that I use is really not for debugging. Rather I use it to allow me a lot of flexibility in how to capture the script's results. Using, for example, Log4perl I can easily have the output go just to the screen...which I use for most initial debugging and output formatting adjustments, etc...or with simple changes I can have the output got to a log file(s) and to the screen. I can also get only certain types of output to the screen (for example, I can just send notifications when certains stages are processing completed to the screen and a log file and then have all of the details go just to the log file). Even with Log4perl I still tend to do, as one responder apparently does, too...loggings or prints that I ultimately remove when the application script gets into production. That sort of helps me "clean up" the script to regain a sense of the logic and script flow. Anyway, thanks to all of the great feedback. It is comforting to know that I'm not the only that's faced the dilema of balancing code and logging readability.

ack Albuquerque, NM

Comment on Balancing Logging and Code Readability
Re: Balancing Logging and Code Readability
by Illuminatus (Curate) on Sep 14, 2009 at 16:27 UTC
    It has always seemed to me that the 8-level priority scale of syslog makes it pretty easy to classify your logging. While levels 0-1 probably don't apply to most scripts (unless they are truly mission-critical), levels 2-7 are pretty self-explanatory. Levels 2-4 should all be for messages that report conditions that are known problems the script recognizes, and most likely require some action (or at least investigation). Levels 5-6 are for messages that report conditions that may or may not be problems (as far as you know). Analysis of consistent reporting of such conditions over time may warrant them being 'promoted' to a higher level. Level 7 is for 'everything else'. When you know you have a problem, but you also know that none of the other logging identifies the condition(s), copious level-7 logging of data and conditions that are currently assumed as 'always OK' is done, since at least one of these assumptions must be invalid (or some of the higher-level logging would catch it).

    Unless there is a performance issue, lots of level-7 logging is almost never a bad thing. Since all log messages can be identified by severity, it is an easy thing to filter out this 'noise'.

    fnord

      I really like when people actually use syslog in custom inhouse solutions. But unfortunately it seems that most developers leans toward ease of implementation rather than ease of maintenance. I guess if you make a small project it makes sense not to log to syslog but if you make something that is in production and is critical to a production system it SHOULD use syslog.

      When using syslog it is ALOT easier to get all logs to a central location and actually filter them with tools like Swatch or Logwatch.

      I hope more developers could put on the system administrator's cape and learn the mantra re-use and NOT re-invent.

Re: Balancing Logging and Code Readability
by SuicideJunkie (Priest) on Sep 14, 2009 at 16:55 UTC

    How to keep the logging from swamping out your code logic is a good question.

    One thing I would suggest is that info and debug level logging can replace comments. Just like code, there is no sense duplicating non-code either.

    As for the warn and error levels, testing for those cases could mostly be done at the beginning of a function, while you're gathering and validating parameters. Rather than putting a log message in the center of a function where it makes a call, try to move the logging into the function being called. The function was probably written to keep things clear and avoid duplication of code, so you might as well use it to avoid duplication of logging commands too.

    If you can keep the conditional logging confined to the beginning/end and thus out of the logic, and then replace one line comments with one line debug/info logging, the code should stay clean. The trick would be learning to read debug-info logging as if it were a comment. Perhaps by adding a rule to the code highlighting in your editor.

      I've had the same problem as the OP and the idea of adding a syntax rule to recognize log lines and highlight them as comments crossed my mind as well. Then I remembered I saw a perl module on CPAN once that allowed you to write comments in a special way that would be logged. I can't seem to find it after ~10 minutes of searching but it did look interesting so if this rings a bell for someone else, please post a follow-up.
        Since you're already using Log::Log4perl, you might check out it's built-in :resurrect tag or the similar functionality provided by Filter::Log4perl.

      Wow! Excellent suggestions and some interesting strategies that I hadn't ever thought of.

      One thing I would suggest is that info and debug level logging can replace comments. Just like code, there is no sense duplicating non-code either.

      That is something that had never occured to me. I really like that. It will take some getting used to, however. I tend do copious commenting and have found that, to some extent, the commenting (as with the logging) can easily get in the way of viewing the actual code logic and flows. The idea of using logging as an alternative to commenting sounds really powerful and offers the opportunity to "kill two birds with one stone" as they say.

      As for the warn and error levels, testing for those cases could mostly be done at the beginning of a function, while you're gathering and validating parameters.

      Again, a great suggestion. I will work on that.

      Rather than putting a log message in the center of a function where it makes a call, try to move the logging into the function being called. The function was probably written to keep things clear and avoid duplication of code, so you might as well use it to avoid duplication of logging commands too.

      So true and an excellent observation and strategy.

      The trick would be learning to read debug-info logging as if it were a comment. Perhaps by adding a rule to the code highlighting in your editor.

      I agree; it is something I'll have to work on.

      One thing that your suggestions leave me curious about is sort of highlighted by that last comment (re: "...learning to read debug-info as if it were a comment..."). Does that bode poorly for maintainability of code by others? That is, does that tend to make the code less accessible to others who may not have developed the facility of reading debug-info as comments?

      Just curious. Thanks so much for the great ideas and suggestions. I really appreciate it.

      ack Albuquerque, NM
Re: Balancing Logging and Code Readability
by kwaping (Priest) on Sep 14, 2009 at 23:59 UTC
    I think you've hit upon a very basic dilemma that most, if not all, good programmers have. I know that issue is something I struggle with as well.

    My solution to the problem has been two-fold:
    1. I make sure that the first thing on the line is the "important" code.
    2. I sometimes line-break then indent my logging code, to further emphasize its subordinate status.

    Here's an example:
    # bad log("This code is broken!") unless $obj->method_call(arg1 => 'abc', ar +g2 => 123); # good $obj->method_call(arg1 => 'abc', arg2 => 123) or log("This code is broken!");
    I'm not saying my way is the One True Way, but that's what I do and it works for me.

    ---
    It's all fine and dandy until someone has to look at the code.

      I have been thinking about a similar approach. I think mine is a bit more drastic (probalby unnecessarily so). I had thought about moving all of my logging invocations over to the right side of the code margins so that they are drastically dislocated from the code. I was thinking that it would give a more visual continuity to the code distinctly separate and different from the logging.

      I haven't tried it yet. I thought I'd take one of my more lengthy scripts and try it out just to see how it looks visually.

      Your reflection upon your strategy gives me additional motivation to look into trying this strategy, too.

      As I think about the dilema, I am feeling a strong sense that putting most all of these suggestions together as an overall strategy could be an excellent solution to my challenge.

      Thanks...to everyone!

      ack Albuquerque, NM
        I'm interested to see what your final strategy is. Please keep us updated!

        ---
        It's all fine and dandy until someone has to look at the code.
Re: Balancing Logging and Code Readability
by Marshall (Prior) on Sep 15, 2009 at 07:43 UTC
    Here is a performance tip for you: If you have some debug statement that was useful/needed in some low level routine that potentially might get called a million times in the real app, use the "constant" pragma to set the value of your debug flag. (see example below). The advantage is that Perl 5.10 and maybe 5.8?, will detect that the "if" statement below could never be true if DEBUG => 0, and that statement is just not even compiled! So there is no run time check of that flag like there would be if say "if $DEBUG were used.
    #!/usr/bin/perl -w use strict; use constant DEBUG => 1; print "this is debug" if DEBUG;
    Update:
    I don't know of any "one size fits all" advice for logging. When I write the code initially, I often have lots of print statements so that I can verify that my logical thinking is right and that the code is doing what I think it is doing. I write some code, debug, then take out most of these print statements. Perl compiles and runs so quickly that using "print" is often easier than fiddling with the debugger!

    When I release an application to the users, the objective of the logged info is not for me to "debug" the problem meaning know exactly why it happened at a fine level of detail. The objective is for me to have enough information so that I can re-create your problem. Once I can re-create the problem on my machine, then I will figure out "why", maybe by turning on some flags like above in my example.

      Another common thing to accomplish something similar is to declare a vprint() or vwarn() function and have the check for DEBUG only occur there (instead of all over the place in your code). Then you just vprint("debug stuff") everywhere. Still doesn't get rid of the clutter but it ends up being a bit less than trailing if DEBUG everywhere.

        Very good suggestion and straegy. That strategy can be extended to several other challenges I've encountered with logging...and with debugging, too. In fact, whether one uses logging or (as I noted to Marshall) just straight prints for debugging, gathering the frequently repeated code(s) into subroutines always makes my code much more readable and much less "cluttered."

        Excellent observation and suggestion. Thanks, saberworks.

        ack Albuquerque, NM

      I don't know of any "one size fits all" advice for logging.

      I totally agree. In fact, I find that "one size" doesn't even work for my own challenges.

      Thanks for the ideas...especially about use constants. I haven't had any reall performance problems with the logging, but I had wondered if there might be a way to handle at least some of the challenge if, in the future, I experienced a performance problem.

      As I noted in my Update to my original post, I tend to use logging mostly for its flexibility in capturing script progress and capturing script results.

      I, like you, tend to use print statements for my debugging. At the moment I attribute my preference for prints vs. logging to the reality that I'm still new to the logging capabilities that Log4perl offers. I have presumed that with more experience and understanding I would tend to migrate to using logging for debugging, too.

      However, I've never had any real problem with using prints for debugging so I haven't really had any strong incentive or desire to migrate away from that strategy. I am, however, experimenting with using logging and have found that it has some interesting and useful contributions to make to my debugging. So, for me, the vote is still out on using logging for debugging.

      Thanks so much for your reply. I really appreciate it.

      ack Albuquerque, NM
      Here is a performance tip for you:
      use constant DEBUG => 1; print "this is debug" if DEBUG;
      While this is all well and fine, it can't be changed in run-time. Log::Log4perl has a mechanism is_level that allows for reasonably high performance while allowing configuration changes in run-time (init_and_watch).
      $logger->debug("Erroneous array: @super_long_array") if($logger->is_debug());
      Benchmark your code and see if you can live with the penalty. Performance is not everything.
      --
      No matter how great and destructive your problems may seem now, remember, you've probably only seen the tip of them. [1]
        Performance is not everything.
        Quite true!

        When you make a logging file, you have to think about how big can it get? And how and when to you "clean it up?".

        I have log files that I "prune" once per month or even once per year!

        I have applications where I tell the users to turn on some .cfg flags when they are having troubles. My goal is to get enough info so that I can replicate their problem on my machine.

        Ok, now that I can replicate the problem, I am in my development environment (super fancy!:, ie Textpad). I can turn on debug things that would otherwise "just bomb" the user system in terms of the shear amount of output.

        So there are reasons for this "turn on option x, requires changing the code" other than just execution performance, turning some option on might generate so much output that that in itself causes a problem. I allow log options in the ".cfg" file that will be helpful to me in replicating the problem. I have other other options in the code that I use to debug the problem and find out "why".

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://795164]
Approved by Old_Gray_Bear
Front-paged by Old_Gray_Bear
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others contemplating the Monastery: (6)
As of 2014-08-20 09:19 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The best computer themed movie is:











    Results (108 votes), past polls