The best thing I ever did with Nagios was defining my tests in the test itself and not on the Nagios server.
That probably doesn't make sense and I bet there is an actual term for what I'm trying to describe. Basically, you make your tests produce something like TAP output that says "I'm about to test blah. I expect blah. I got blah... " and so on. Then your Nagios plugin just has to interpret the results, instead of interpreting the behavior of the service.
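To make the "plugin just interprets results" idea a bit more concrete, here is a rough sketch in Python that reads TAP-style lines and reports which tests failed. The function name `summarize_tap` is made up for illustration, and it only looks at plain `ok` / `not ok` lines; it ignores the plan line and TAP directives like TODO and SKIP, so treat it as a starting point rather than a real TAP parser.

```python
import re

# Matches "ok 1 - description" and "not ok 2 - description".
# The test number and the dash are both optional.
TAP_LINE = re.compile(r"^(not ok|ok)\b\s*(\d+)?\s*-?\s*(.*)$")


def summarize_tap(text):
    """Return (total_tests, list_of_failed_descriptions)."""
    failed = []
    total = 0
    for line in text.splitlines():
        m = TAP_LINE.match(line.strip())
        if not m:
            continue  # plan lines ("1..N"), "# ..." diagnostics, blanks
        total += 1
        if m.group(1) == "not ok":
            # Fall back to the test number if there is no description.
            failed.append(m.group(3) or "test %s" % m.group(2))
    return total, failed
```

The point is that nothing in this code knows what is being tested. The descriptions come from the test output, so the plugin stays the same when the tests change.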
Why is that cool? Normally, every time you change a test you have to alter the bit that runs the test and then alter the bit that interprets the test to match. If you use a test that describes itself, you don't need the test description to be on the Nagios server. You can change the test on the tested server and you're done. It makes development and maintenance of tests soooo much easier.
Hooo... I bet that didn't make sense either...
Example: You have a web server that runs SSI and you want to test that SSI is working. Later you decide you want to make sure that exec is disabled.
Normal: You create a web page that uses SSI. You set up a Nagios plugin to read that page. Later you add something that uses SSI exec to the page. You edit the Nagios plugin to look for evidence that the SSI exec failed.
Better: You create a web page that tests SSI and produces the results as something parseable*. You set up Nagios to read that page and parse the test results. Later you edit that page to add the exec test, and it reports the new result in the same parseable format. No fiddling with Nagios required.
*Parseable test results can be super simple. I do a thing where each test is an h3 block containing the test name, followed by two h2 blocks: the first holds the literal expected output and the second holds the generated output. If the expected matches the generated, the test passes.
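Here is a minimal sketch of what the Nagios side of that h3/h2 scheme could look like in Python, using only the standard library. The names `ResultParser` and `check` are hypothetical, not from any real plugin, and the choice to report UNKNOWN when a test is missing one of its h2 blocks is my own assumption. The exit codes (0 = OK, 2 = CRITICAL, 3 = UNKNOWN) are the usual Nagios plugin conventions.

```python
from html.parser import HTMLParser

OK, CRITICAL, UNKNOWN = 0, 2, 3  # usual Nagios plugin exit codes


class ResultParser(HTMLParser):
    """Collects [name, expected, actual] from h3, h2, h2 runs."""

    def __init__(self):
        super().__init__()
        self.tests = []    # one list per test: [name, expected, actual]
        self._tag = None   # heading tag we are currently inside, if any
        self._buf = []     # text collected inside that heading

    def handle_starttag(self, tag, attrs):
        if tag in ("h2", "h3"):
            self._tag = tag
            self._buf = []

    def handle_data(self, data):
        if self._tag:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return
        text = "".join(self._buf).strip()
        if tag == "h3":
            self.tests.append([text])          # start a new test
        elif self.tests and len(self.tests[-1]) < 3:
            self.tests[-1].append(text)        # expected, then actual
        self._tag = None


def check(html):
    """Return (exit_code, message) for a page of self-describing tests."""
    parser = ResultParser()
    parser.feed(html)
    parser.close()
    if not parser.tests:
        return UNKNOWN, "UNKNOWN - no tests found"
    incomplete = [t[0] for t in parser.tests if len(t) != 3]
    if incomplete:
        return UNKNOWN, "UNKNOWN - incomplete: " + ", ".join(incomplete)
    failed = [name for name, exp, got in parser.tests if exp != got]
    if failed:
        return CRITICAL, "CRITICAL - failed: " + ", ".join(failed)
    return OK, "OK - %d tests passed" % len(parser.tests)
```

A real plugin would fetch the page first (for example with `urllib.request.urlopen`), print the message, and call `sys.exit` with the code. Notice that adding the SSI exec test later means adding one more h3/h2/h2 group to the page; this code doesn't change.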
Wheeee... I hope that made sense and doesn't sound too stupid!