http://www.perlmonks.org?node_id=498310

Has anyone worked in an environment where program logs were placed in a central location so support people can easily access them? Does central logging work well, or is it generally too much effort for too few gains?

I work with a group that creates bioinformatics applications with Perl and have been asked whether a central logging system is both feasible and useful. These applications typically call both commercial and open source applications as part of their operation. Some of these don't behave well and/or change often, so it can be an effort to create logs that are both useful and make sense to someone other than the author.

I'm guessing that central logs could help support people isolate general problems, but that in most cases authors would need to be called to interpret application errors. It also likely depends on the experience and background of the support people. What do you think?

Re: Central logging methods and thoughts
by 5mi11er (Deacon) on Oct 07, 2005 at 21:37 UTC
    Centralizing syslogs can definitely be a good thing, but it can also cause problems or be difficult to implement.

    Why it's a good thing: better time synchronization, and thus easier deciphering of interoperation issues.

    The potential problems: firewalls get in the way, locations are geographically dispersed, too many devices trying to report too much information to one box can cause network problems, syslog information can get lost, etc.

    If things are geographically dispersed, it might make sense to have a hierarchy of syslog servers involved. One server at each location could collect the syslog information for that site, then forward the logs to a central server located elsewhere. Closely related to that issue is the fact that firewalls can get in the way of the syslog traffic flow; obviously, rules on the firewall need to be created or modified to allow the traffic.
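
    To make that concrete: a Perl program can aim its own log traffic at the per-site collector instead of the local daemon. A minimal sketch, assuming a reasonably recent Sys::Syslog; the host, port, and program name are placeholders:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use Sys::Syslog qw(:DEFAULT setlogsock);

        # Log over UDP to the per-site syslog collector rather than the
        # local /dev/log socket.  Host and port are placeholders.
        setlogsock({ type => 'udp', host => 'syslog.site1.example.com', port => 514 });
        openlog('bioinfo-pipeline', 'pid', 'local0');
        syslog('info', 'analysis run started for %s', 'sample-042');
        closelog();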

    And if you've got too much syslog data for a network or server to handle, then you really need to look hard at whether it's still a good idea. Generally, at that point, the corporation needs to distribute the syslog load, and a single central log point is no longer feasible.

    And finally, check into syslog-ng (next generation) if you haven't already. We're getting ready to roll it into production here, but we've got at least a month's worth of work to ensure everything's ready to migrate over...

    -Scott

Re: Central logging methods and thoughts
by Tanktalus (Canon) on Oct 07, 2005 at 21:04 UTC

    Are you talking about logging or tracing? I delineate the two: logging is to let the user know what you're doing, and tracing is to let the author know how the user screwed you up ;-)

    Central logs are fine (I'm thinking especially of syslog here). Central tracing, I'm not too sure it matters, as long as it's easy to find to give to the author.

Re: Central logging methods and thoughts
by Zaxo (Archbishop) on Oct 07, 2005 at 21:05 UTC

    IMO, it should be possible to write a log file parser which a user can apply to filter out unwanted entries. That will be much easier if everything which writes to a single file uses the same logging format. Inconsistent loggers should write to a different file.
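
    A minimal sketch of such a filter, assuming one agreed-upon line format (the format below is an example, not a standard):

        #!/usr/bin/perl
        # Read a consistently formatted log on STDIN and keep only the
        # requested level, e.g.:  logfilter.pl ERROR < app.log
        use strict;
        use warnings;

        my $want = shift @ARGV || 'ERROR';
        while (my $line = <STDIN>) {
            # Expected: "2005-10-07 21:05:33 host prog[1234] LEVEL: message"
            my ($level) = $line =~ /^\S+ \S+ \S+ \S+\[\d+\] (\w+):/
                or next;    # skip lines that don't match the format
            print $line if $level eq $want;
        }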

    After Compline,
    Zaxo

Re: Central logging methods and thoughts
by aufflick (Deacon) on Oct 09, 2005 at 06:57 UTC
    One of the wackier ideas I have had to solve this problem is to use a private IRC server.

    (waits for laughter to subside...)

    But seriously, I think it could work quite well. Standard syslog is flawed for central logging at a large site because we ran out of channels (facilities), and that got me thinking. Here are some of the things that I think would be good about using IRC for application logging:

    • With IRC it is easy to have bots save the log to disk, and you can do that in more than one location to have redundancy of storage
    • When you are tracing a live problem, you can just log in to the appropriate logging IRC channel (with the appropriate security) and watch the log
    • Support staff could use regular IRC client features to alert them to error strings as soon as they occur within the log
    • The protocol is lightweight, well supported, and requires no extra daemons/libraries to be installed on your servers.
    • Multiple servers running the same application (eg. a web farm) could log to a single channel, thus automatically interleaving into a single time-sorted log (each server would use a different nick to allow easy source identification)

    Syslog does, of course, have many benefits. Not least of which is the ability to choose local/remote/local+remote logging via the syslog config. There are also extensions to syslog which address some of its weaknesses.

    I'd be very interested if anyone has ever done anything like this. Of course I have left many issues unaddressed, like security etc.

    Update: Specifically, I'm very interested to hear if anyone has ever implemented some form of logging server using POE
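
    For what it's worth, here is a rough sketch of what the log-to-disk bot might look like with POE::Component::IRC. The server, channel, and file names are invented, and security and reconnect handling are left out:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use POE;
        use POE::Component::IRC;

        my $irc = POE::Component::IRC->spawn(
            nick   => 'logbot',
            server => 'irc.internal.example.com',   # hypothetical private server
        ) or die "spawn failed: $!";

        POE::Session->create(
            package_states => [ main => [qw(_start irc_001 irc_public)] ],
        );

        sub _start {
            $irc->yield( register => 'all' );
            $irc->yield( connect  => {} );
        }

        sub irc_001 {    # connected; join the logging channel
            $irc->yield( join => '#applog' );
        }

        sub irc_public {    # each public message on the channel is one log line
            my ($who, $where, $msg) = @_[ARG0, ARG1, ARG2];
            my ($nick) = split /!/, $who;    # nick identifies the sending host
            if (open my $fh, '>>', '/var/log/irc-applog') {
                print $fh scalar localtime(), " $nick $where->[0]: $msg\n";
                close $fh;
            }
        }

        POE::Kernel->run();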

      Wow, I'm not an IM'er, never have been; but, I think this borders on brilliance.

      The thing that sucked me in the most about your idea is the channels, and with the ability to name channels however you'd like, there are potentially infinite channel names (ok, pedantically, it only approaches infinity; I assume there is a character limit for the channel names). I imagine being able to create a channel for, perhaps, different network device types, say, Cisco Routers, and another for, say, Nortel Switches, and yet another for Checkpoint Firewall routers.

      Changing hats, as a server guy, I could create my own set of server channels keeping track of resources like drive space, memory, cpu usage; and changing hats again, as an application developer/baby-sitter, I can create channels for the interoperation of various applications that all work together, etc.

      Now some of these abilities are already in syslog, but we're pretty limited in the number of channels we can use, so trying to coordinate between all the groups to agree on the "standards" to keep from "polluting" one another's syslog files could get pretty ugly.

      I also like the relatively light weight of the "broadcast" ability for the syslog information. I'm not very familiar with the actual IRC protocol implementations, but way back when, I think I recall that if you wanted to create an IRC 'server', that server just had to ask (and receive permission) to receive the IRC messages; similarly, a client simply had to ask a server to receive the appropriate messages. This seems to be fairly light weight, and things are even better if IRC can now actually use multicasting.

      ++ many times for this very cool idea.

      -Scott

      On its face, it might seem like a good idea. The problem is that IRC was intentionally designed to accommodate delays in communication. The timestamp in a given log is the timestamp for when the client receives the message. Lag in the network, on the IRC server, or on the client machine could easily lead to inaccurate timestamp data -- even to the point of causing events (from different processes) to appear in a different order from the one in which they happened.

      A partial solution would be sender-side timestamps, but then you have authority issues as well (how do you *know* someone doesn't accidentally duplicate a login ID for a given application? what about multiple instances?). Most of these are solvable, but rely heavily on the senders to do the right thing.

      A solution which I have seen work well is implemented over a database, with a logging daemon running on each local host. It works sort of like this: an application performs IPC (in this case, an XML message to the local daemon over a telnet-style protocol), sending a few pieces of information (pid, status-code {1=warn, 2=err, etc.}, description). The local daemon timestamps the messages in the order received and creates DB transactions that log the relevant info from the daemon (including its timestamp, the host name, etc.).

      In this setup, all applications log verbosely (not quite 'trace', but about 'debug' level), and the daemon can be configured to drop or forward messages at various levels. So, we can move to 'debug' on a given *machine* with one instruction to its daemon.

      There are some problems with the whole thing, but it has served us well overall.
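
      To make the shape of it concrete, here is a minimal sketch of such a local daemon. This is not our actual code: the XML-over-telnet message is replaced with a simple pipe-delimited line, and the DSN, table, and column names are made up:

          #!/usr/bin/perl
          # Toy local logging daemon: accept "pid|status|description" lines
          # on a local port, stamp them, and insert them into the log DB.
          use strict;
          use warnings;
          use IO::Socket::INET;
          use Sys::Hostname;
          use DBI;

          my $dbh = DBI->connect('dbi:Pg:dbname=logs;host=dbhost',
                                 'logger', 'secret', { RaiseError => 1 });
          my $sth = $dbh->prepare(
              'INSERT INTO app_log (logged_at, host, pid, status, description)
               VALUES (now(), ?, ?, ?, ?)');

          my $server = IO::Socket::INET->new(
              LocalAddr => '127.0.0.1', LocalPort => 5140,
              Listen    => 5,           ReuseAddr => 1,
          ) or die "listen failed: $!";

          my $host = hostname();
          while (my $client = $server->accept) {
              while (my $line = <$client>) {
                  chomp $line;
                  my ($pid, $status, $desc) = split /\|/, $line, 3;
                  next unless defined $desc;    # ignore malformed lines
                  $sth->execute($host, $pid, $status, $desc);
              }
          }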

      <-radiant.matrix->
      A collection of thoughts and links from the minds of geeks
      The Code that can be seen is not the true Code
      "In any sufficiently large group of people, most are idiots" - Kaa's Law
        Agreed that a simple protocol like IRC has issues with security & integrity. You would have to trust yourself and your colleagues not to be stupid or evil.

        With the system you use, do you find that you have scaling problems with the db inserts? I assume that the local daemon will retry if the db becomes unavailable, but what does your app do if the local daemon becomes unavailable?

Re: Central logging methods and thoughts
by pg (Canon) on Oct 08, 2005 at 03:01 UTC

    This depends on how tight the relationship is between programs.

    I prefer that all programs in the same application send their logs to the same repository, or at least do so for each logical group. This helps: when you have an issue with an application, it is not always possible to determine at the outset which program is causing the problem, and if the logs are scattered everywhere, it will be very laborious to search around.

    Obviously the log messages should be nicely formatted and, besides the error message itself, also provide the following (a sketch of a small helper follows the list):

    • The name of the binary
    • Time of logging
    • If possible, the line of the source code (or at least somewhere close)
    • It will be very helpful if you can log things like the data used, the SQL statement submitted, the SQLCODE, etc. (things that allow you to determine the problem without reproducing it, or that at least make reproducing it possible)
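
    A tiny helper along those lines; the output format here is just one example, nothing standard:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use POSIX qw(strftime);

        # Stamp each entry with the program name, the time, and the
        # caller's file/line, followed by the message itself.
        sub logmsg {
            my ($level, $msg) = @_;
            my ($file, $line) = (caller)[1, 2];
            printf STDERR "%s %s %s:%d [%s] %s\n",
                strftime('%Y-%m-%d %H:%M:%S', localtime),
                $0, $file, $line, $level, $msg;
        }

        logmsg('ERROR', "insert failed: SQLCODE=-911, stmt='INSERT INTO runs ...'");
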
Re: Central logging methods and thoughts
by Perl Mouse (Chaplain) on Oct 10, 2005 at 09:52 UTC
    Large companies tend to do their logging/monitoring centrally. They have thousands of devices (computers, disk arrays, switches, routers, tape robots, etc.) and only a handful of staff to monitor them. Centralization is a necessity. There are commercial products for this, like HP OpenView and Tivoli.

    Central logging has another benefit: central logging implies remote logging. Remote logging means that if a machine goes haywire (or, in a hostile environment, gets corrupted), it's less likely that the logs get wiped out.

    In many places I've worked, be it as an employee or a contractor, some sort of central logging was done. From things as simple as FTP'ing local logs in a nightly batch job, to thousands of machines monitored/logged with Tivoli, displaying the current status of the environment on a monitor 3 metres wide.
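
    The nightly-batch variant really can be that simple. A sketch with Net::FTP; the host, credentials, and paths are invented:

        #!/usr/bin/perl
        # Nightly cron job: push today's local log to the central log host.
        use strict;
        use warnings;
        use Net::FTP;
        use POSIX qw(strftime);
        use Sys::Hostname;

        my $today = strftime('%Y%m%d', localtime);
        my $local = "/var/log/myapp/app.$today.log";

        my $ftp = Net::FTP->new('loghost.example.com')
            or die "connect failed: $@";
        $ftp->login('logdrop', 'secret')     or die 'login failed: ', $ftp->message;
        $ftp->cwd('/incoming/' . hostname()) or die 'cwd failed: ',   $ftp->message;
        $ftp->put($local)                    or die 'put failed: ',   $ftp->message;
        $ftp->quit;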

    Personally, I'd go for central logs. If only because that means I know where to go digging for possible log file entries. But then, I look at the problem more from a sysadmin's angle than a programmer's or end-user's.

    Perl --((8:>*