http://www.perlmonks.org?node_id=462081

Jaap has asked for the wisdom of the Perl Monks concerning the following question:

I am currently maintaining a web server for a (long-term) project at school. It would be nice to have some sort of alarm bell go off if the thing stops working.

I could make something myself, but it would have to run on a 2nd server somewhere offsite for best results.

I can imagine other sysadmins are in the same situation. Does anyone know of a free server monitoring tool/site/something? How would you monitor the server? Any thoughts?

Replies are listed 'Best First'.
Re: [OT] Server Monitoring
by Anonymous Monk on May 31, 2005 at 15:08 UTC
      Pretty nice. Except when the whole machine or the network goes down.
        You can run it from another machine. It's pretty mature, and widely used.
        I'm a little curious - what sort of site monitor would you expect to survive the whole machine or system going down?

        It's not very difficult to set up redundant nagios monitoring, you could have an on and an off-site monitor just by sync'ing the configuration files (which probably don't change very much at all).

Re: [OT] Server Monitoring
by dragonchild (Archbishop) on May 31, 2005 at 15:10 UTC
    There are several "stops working" scenarios you might want to look at.
    • httpd stops working.

      Use a cron job on the webserver to make sure an httpd process exists.

    • Database stops working.

      Use a cron job on the DB server to make sure the DB process is running.

    • The machine itself stops working.

      Use a cron job on some other machine to ping the server.

    Cron jobs and Mail::Sendmail to your cellphone are all you really need.
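
A minimal sketch of the cron-driven checks described above, not a finished tool. It assumes a POSIX-style `ps -e -o comm=`, that Mail::Sendmail is installed and can reach a local mail server, and that the addresses, the script path, and the names `process_listed`, `process_running`, `host_reachable`, and `alert` are hypothetical. Instead of an ICMP ping (which normally needs root), it swaps in a TCP connect through Net::Ping.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Net::Ping;

# Hypothetical addresses; point $ALERT_TO at your phone's email-to-SMS gateway.
my $ALERT_TO   = 'pager@example.com';
my $ALERT_FROM = 'monitor@example.com';

# True if $name matches the basename of any command in @commands.
sub process_listed {
    my ($name, @commands) = @_;
    for my $command (@commands) {
        $command =~ s/^\s+|\s+$//g;          # trim whitespace and newline
        next unless length $command;
        my $base = (split m{/}, $command)[-1];
        return 1 if $base eq $name;
    }
    return 0;
}

# Ask ps for the command name of every process (assumes POSIX ps options).
sub process_running {
    my ($name) = @_;
    return process_listed($name, `ps -e -o comm=`);
}

# TCP "ping": a refused connection still counts as reachable, because the
# machine answered. Only a timeout or an unreachable host fails.
sub host_reachable {
    my ($host, $port) = @_;
    my $pinger = Net::Ping->new('tcp', 5);   # 5 second timeout
    $pinger->port_number($port);
    my $ok = $pinger->ping($host);
    $pinger->close;
    return $ok ? 1 : 0;
}

# Mail::Sendmail is loaded only when an alert actually has to go out.
sub alert {
    my ($subject) = @_;
    require Mail::Sendmail;
    no warnings 'once';
    Mail::Sendmail::sendmail(
        To      => $ALERT_TO,
        From    => $ALERT_FROM,
        Subject => $subject,
        Message => "$subject at " . localtime() . "\n",
    ) or warn "Could not send alert: $Mail::Sendmail::error\n";
}

# Usage: check.pl process httpd
#        check.pl host www.example.com
my ($mode, $target) = @ARGV;
if (defined $mode && defined $target) {
    if ($mode eq 'process') {
        alert("Process $target is not running") unless process_running($target);
    }
    elsif ($mode eq 'host') {
        alert("Host $target is unreachable") unless host_reachable($target, 80);
    }
}
```

A crontab entry such as `*/5 * * * * /usr/local/bin/check.pl process httpd` would run the process check every five minutes on the web server; the `host` mode belongs on a different machine. The process name may be `apache2` or similar on some systems, and some `ps` implementations truncate long command names.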


    • In general, if you think something isn't in Perl, try it out, because it usually is. :-)
    • "What is the sound of Perl? Is it not the sound of a wall that people have stopped banging their heads against?"
      True. But I was kind of hoping there would be a ping service out there already.
        If it takes you more than 30 minutes to write these three scriptlets, I'd be shocked.

Re: [OT] Server Monitoring
by marto (Cardinal) on May 31, 2005 at 15:38 UTC
    Hi,

    You may want to take a look at the open source tool Server Monitor.
    I guess you could alter the code to get it to do exactly what you want, if it does not already.
    Hope this helps.
    Cheers,

    Martin
Re: [OT] Server Monitoring
by terce (Friar) on May 31, 2005 at 15:24 UTC
    If the machine is accessible from the public internet, there are many third party services (some free, some not) which will ping a given url and alert you if it's not accessible. Try googling for "website uptime".
Re: [OT] Server Monitoring
by cbrandtbuffalo (Deacon) on May 31, 2005 at 16:21 UTC
    No one has mentioned Big Brother. Although they have been bought by Quest, there is still a free version. And it's written in Perl with many Perl plug-ins.
      And what about Big Sister? It's written in Perl, it's Open Source and it should work, according to some colleagues of mine. Which I don't trust much ;)

      Flavio (perl -e 'print(scalar(reverse("\nti.xittelop\@oivalf")))')

      Don't fool yourself.
Re: [OT] Server Monitoring
by zentara (Archbishop) on May 31, 2005 at 16:34 UTC
    You have two monitoring options. One is a daemon that tests for the existence of the server's process, like intelli-monitor.pl.

    Here is something I did a while back in a similar vein; maybe it will give you some ideas. Consider a two-pronged approach: a daemon, plus an LWP script to test whether Apache is locked up or overloaded.


    I'm not really a human, but I play one on earth. flash japh
Re: [OT] Server Monitoring
by TedPride (Priest) on May 31, 2005 at 17:40 UTC
    All you need is something that requests a file from the server every x minutes (even if the file is only one character) and raises some sort of alert if the request times out n times in a row while the script is still able to load pages from other sites. You have to test the latter, or it could be your Internet connection having problems and not the server. A few minutes of coding using the proper modules should cover this, assuming you have a computer connected to the Internet that can run this in the background. See HTTP::Request and LWP::UserAgent.
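
One way the polling idea above might look, as a rough sketch only. The URLs, the interval, the failure threshold, the `--run` flag, and the names `classify`, `next_failure_count`, `fetch_ok`, and `monitor` are all assumptions for illustration. It uses the `get` shortcut of LWP::UserAgent rather than building an HTTP::Request by hand, and the alert is just a `warn` that you would replace with mail or SMS.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical settings; adjust to taste.
my $TARGET_URL   = 'http://www.example.com/alive.txt';  # small file on your server
my $CONTROL_URL  = 'http://www.example.org/';           # unrelated site as a control
my $MAX_FAILURES = 3;                                   # failures in a row before alerting
my $INTERVAL     = 300;                                 # seconds between checks

# Interpret one round of checks. The control result only matters when the
# target failed: it separates "server down" from "our own connection is down".
sub classify {
    my ($target_ok, $control_ok) = @_;
    return 'up' if $target_ok;
    return $control_ok ? 'server_down' : 'network_down';
}

# Count consecutive server failures. A success resets the count; a local
# network problem leaves it unchanged, since it says nothing about the server.
sub next_failure_count {
    my ($count, $status) = @_;
    return 0          if $status eq 'up';
    return $count + 1 if $status eq 'server_down';
    return $count;
}

# A timeout inside LWP comes back as an unsuccessful response, so one
# is_success test covers both HTTP errors and timeouts.
sub fetch_ok {
    my ($ua, $url) = @_;
    return $ua->get($url)->is_success ? 1 : 0;
}

sub monitor {
    require LWP::UserAgent;
    my $ua = LWP::UserAgent->new(timeout => 15);
    my $failures = 0;
    while (1) {
        my $target_ok  = fetch_ok($ua, $TARGET_URL);
        my $control_ok = $target_ok ? 1 : fetch_ok($ua, $CONTROL_URL);
        $failures = next_failure_count($failures, classify($target_ok, $control_ok));
        # Alert once, at the moment the threshold is reached.
        warn "ALERT: $TARGET_URL failed $failures times in a row\n"
            if $failures == $MAX_FAILURES;
        sleep $INTERVAL;
    }
}

# Only start polling when explicitly asked: perl monitor.pl --run
monitor() if @ARGV && $ARGV[0] eq '--run';
```

Keeping the decision logic in `classify` and `next_failure_count` separate from the network calls makes it easy to check the alert rules without touching the network.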