PerlMonks  

DeathClock

by bpoag (Monk)
on Feb 21, 2007 at 21:48 UTC
Category: Utility Scripts
Author/Contact Info bpoag@comcast.net
Description: DeathClock attempts to determine how much time remains before a given filesystem has zero free space remaining, based on how quickly existing storage is being utilized. It will send a panic message via email to one or more recipients depending upon how confident it is that death (zero free space in the filesystem) is imminent. DeathClock can be used as a monitoring tool as well, spitting out predictions of woeful and untimely filesystem demise at user-specified intervals. It can either be run as a one-time gauge, or set to monitor a filesystem constantly. DeathClock is a morbid script that lives a mostly solitary life, with the possible exception of his partially mummified mother, in a gothic-inspired home on a hill overlooking a motel. It enjoys taxidermy, quiet dinners with the motel guests, and attacking unsuspecting customers while they shower.

Example output:

# /usr/local/bin/deathclock.pl /prod 37 1 1 0 foo@bar.com


DeathClock: Starting up..
DeathClock: Collecting 37 seconds of growth information for /prod. Please wait......................................
DeathClock: 85094.79 MB remaining in /prod. Estimated time of death: 7w 5d 9h 56m 9s.


#!/usr/bin/perl
#
# DeathClock 0.1 written 021307:1105 by BJP
#
# DeathClock attempts to predict how much time remains until a given filesystem runs out of storage.
#
# Usage: deathclock.pl <filesystem> <number_of_samples> <sleep_interval> <cycles> <panic_threshold> <email_address>
#
# 
#
$version=0.1;

use Mail::Sendmail;

startupRoutine();
mainRoutine();

sub mainRoutine()
{
  while(1) # loop until exit() below; the bareword "true" only worked by accident
  {
    collectStatusInfo();
    determineTimeRemaining();
    $counter++;

    if ($samples==$counter-1)
    {
      print"\n";
    }
         
    if ($samples<=$counter-1)
    {
      print "DeathClock: $spaceRemaining MB remaining in $filesystem. Estimated time of death: $timeRemaining.\n";
    }
         
    else
    {   
      print ".";
    }
         
    if ($counter-$samples>=$duration && $duration!=0)
    {
      print"\n";
      exit();   
    }
         
    sleep($sleep);
  }
}


sub collectStatusInfo()
{
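   # Grab the df line whose mount point matches $filesystem exactly.
   @dfStuff=`df -m | grep " $filesystem\$"`;
   @filesystemState=split /\s+/,$dfStuff[0];
   # Column layout as printed by AIX-style df: [1] is size in MB, [2] is free MB.
   # GNU df prints Used in [2] and Available in [3], so adjust the index there.
   $filesystemSize=$filesystemState[1];
   $spaceRemaining=$filesystemState[2];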
   unshift(@history,$spaceRemaining);
   if ($counter>$samples)
   {
     pop(@history);
   }
}
sub determineTimeRemaining()
{
  $timeRemaining=$spaceRemaining/((($history[$samples]-$history[0])/($samples*$sleep))+0.0001);

  ## 1) ^^^^ A really, REALLY long way of saying y=mx+b and solving for x :)
  ## 2) The +0.0001 there is to prevent div-by-zero errors when the delta stays flat for the entire sample set.


  if($timeRemaining<5270400 && $timeRemaining>0) # 2 months
  {
    panicCheck();

    $wks=int($timeRemaining/604800);
    $timeRemaining=int($timeRemaining%604800);
    $day=int($timeRemaining/86400);
    $timeRemaining=int($timeRemaining%86400);
    $hrs=int($timeRemaining/3600);
    $timeRemaining=int($timeRemaining%3600);
    $min=int($timeRemaining/60);
    $sec=int($timeRemaining%60);
 
    $timeRemaining="".$wks."w ".$day."d ".$hrs."h ".$min."m ".$sec."s";
  }

  else
  {
    $timeRemaining="Immortal (>2 months)";
  }
}


sub panicCheck()
{

  if ($timeRemaining<3600 && $timeRemaining>=0)
  {
    $panicStateDuration++;
  }
 
  else
  {
    $panicStateDuration=0;
  }
 
  if ($panicStateDuration>=$panicThreshold && $panicThreshold!=0)
  {
    print "DeathClock: Panic threshold exceeded!\n";
    notifyRecipients();
  }
}


sub notifyRecipients()
{
 
 if ($counter>=$lastPanic+$silencerWaitTime || $firstPass==1) # If we've waited long enough between sending emails..
 {
   print "DeathClock: Sending panic notification to $mailRecipients.\n";
 
   %mail=( To => $mailRecipients,
   From => $mailSender,
   Subject => "DeathClock Panic Notification",
   Message => "DeathClock strongly believes less than 1 hour remains until $filesystem on $whereAmI runs out of space." );
 
   sendmail(%mail) or die("Send failed: $Mail::Sendmail::error\n");
   $lastPanic=$counter;
   $firstPass=0;
 }
 
  else
  {
   print "DeathClock: Suppressing email notification. Recipient(s) were notified less than 10 minutes ago.\n";
  } 
}
 

sub startupRoutine()
{
 
  if ($#ARGV!=5)
  {
    if ($#ARGV<5)
    {
      print"\n\nToo few arguments specified. See below.";
    }

    else
    {
      print"\n\nToo many arguments specified. See below.";
    }

    # Note: the @ in the example address is escaped so Perl does not try to interpolate an array named @bar.
    print "\n\nDeathClock v$version written 021307:1105 by BJP";
    print "\n\n    Usage:  $0 <filesystem> <number_of_samples> <sleep_interval> <cycles> <panic_threshold> <email_address>";
    print "\n  Example:  $0 /tmp 10 2 34 18 foo\@bar.com";
    print "\n  English:  Monitor /tmp, based on 10 readings, checking every 2 seconds, and do so 34 times. If DeathClock sees less than an hour";
    print "\n            remains for 18 cycles in a row, it will notify foo\@bar.com\n";
    print "\nSpecifying 0 for the cycles parameter will run DeathClock indefinitely. Specifying 0 for panic_threshold will disable panic reporting.";
    print "\nMultiple email recipients can be specified using a comma delimiter, and no spaces between the addresses.";
    print "\n\n\n";

    exit();
  }
 
  else
  {
    # Usage: deathclock.pl <filesystem> <number_of_samples> <sleep_interval> <cycles> <panic_threshold> <email_address>

    print "\n\nDeathClock: Starting up..\n";
    $whereAmI=`hostname`;
    chomp($whereAmI); # strip the trailing newline before the hostname goes into the sender address
    $mailSender=$ENV{"USER"}."\@"."$whereAmI.yourcompanyname.com";
    $filesystem=$ARGV[0];
    $samples=$ARGV[1];
    $sleep=$ARGV[2];
    $duration=$ARGV[3];
    $panicThreshold=$ARGV[4];
    $mailRecipients=$ARGV[5];
    @history=();
    $queueLength=scalar(@history)+0;
    $silencerWaitTime=600/$sleep; # hardcoding this at 10 minutes..
    $firstPass=1;
    $wait=($sleep*$samples)+0;
    print "DeathClock: Collecting $wait seconds of growth information for $filesystem. Please wait.";

  }
 
}
Replies are listed 'Best First'.
Re: DeathClock
by bpoag (Monk) on Feb 21, 2007 at 22:02 UTC
    I wrote this script not so much as a general-purpose utility but rather as a tool to tell me how much time remains when the proverbial shit hits the fan; how much time I have to work with has a bearing on which method of solving the problem I choose. (Deleting stale files versus lining up more storage off the SAN, etc.)

    DeathClock does a few clever things that may not be immediately recognizable on the surface..

    It will attempt to predict when you'll run out of space based on a running average of ($samples) readings. The larger the number of samples, the more accurate the reading will be, but it's not recommended that you use a large value on filesystems where the amount of free space remaining is very volatile. For most purposes, a sample window of about 30 cycles does nicely.
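
    A minimal sketch of the windowed estimate described above, kept separate from the script itself. The helper name seconds_until_full and its argument order are assumptions made for illustration. Instead of the script's +0.0001 fudge factor, it returns undef when free space is flat or growing.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of DeathClock): estimate the seconds left until
# free space reaches zero. @history holds free-space readings in MB, newest
# first, taken $interval seconds apart.
sub seconds_until_full {
    my ($interval, @history) = @_;
    return if @history < 2 || $interval <= 0;

    my $consumed = $history[-1] - $history[0];   # oldest minus newest, in MB
    my $elapsed  = $interval * (@history - 1);   # seconds spanned by the window
    my $rate     = $consumed / $elapsed;         # MB consumed per second

    return if $rate <= 0;                        # flat or growing free space
    return $history[0] / $rate;
}

# 1000 MB free now, 1030 MB free three readings (6 seconds) ago: 5 MB/s.
my $eta = seconds_until_full(2, 1000, 1010, 1020, 1030);
print defined $eta ? "$eta seconds\n" : "no growth detected\n";
```

    A wider window smooths out short bursts, at the cost of reacting more slowly when the growth rate really does change.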

    DeathClock will also build up something resembling "confidence" that a panic state is imminent. Right now, I have this value hardcoded to 1 hour, i.e. if DeathClock sees that a filesystem has consistently less than one hour remaining, it will nudge its confidence value ($panicStateDuration) higher and higher until it exceeds the user specified panic threshold value ($panicThreshold).
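
    The confidence counter can be sketched as a small closure. This is an illustration under assumptions, not the script's own code: make_panic_checker is a hypothetical name, and the one-hour limit is passed in rather than hardcoded.

```perl
use strict;
use warnings;

# Hypothetical closure (not part of DeathClock): returns a checker that reports
# true once the estimate has stayed under $limit seconds for $threshold
# consecutive readings. A threshold of 0 disables panic reporting.
sub make_panic_checker {
    my ($threshold, $limit) = @_;
    my $streak = 0;
    return sub {
        my ($eta) = @_;
        # A reading inside the limit extends the streak; anything else resets it.
        $streak = (defined $eta && $eta >= 0 && $eta < $limit) ? $streak + 1 : 0;
        return ($threshold > 0 && $streak >= $threshold) ? 1 : 0;
    };
}

my $check = make_panic_checker(3, 3600);
for my $eta (1800, 5000, 1200, 900, 600) {
    print "$eta: ", ($check->($eta) ? "panic" : "ok"), "\n";
}
```

    In this run only the last reading reports a panic, because the 5000-second estimate resets the streak.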

    Once this happens, an email is fired off, a timer is set so that the email recipient isn't bothered for another 10 minutes, and DeathClock continues to monitor the situation. Once the 10 minute window has elapsed, and it still sees a panic-worthy situation, it will fire off another email.
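
    A minimal sketch of the 10 minute silencer. Note the swap: the script counts loop cycles (600 divided by the sleep interval), while this version compares wall-clock seconds. make_throttle is a hypothetical name, and the optional timestamp argument exists only so the example can be run without waiting.

```perl
use strict;
use warnings;

# Hypothetical closure (not part of DeathClock): allows a notification only if
# at least $quiet seconds have passed since the last one that was allowed.
sub make_throttle {
    my ($quiet) = @_;
    my $last;                          # time of the last allowed send
    return sub {
        my ($now) = @_;
        $now = time unless defined $now;
        return 0 if defined $last && $now - $last < $quiet;
        $last = $now;
        return 1;
    };
}

my $may_send = make_throttle(600);
for my $t (0, 300, 599, 600, 900) {
    print "t=$t: ", ($may_send->($t) ? "send" : "suppress"), "\n";
}
```

    The first call always sends, which plays the same role as the script's $firstPass flag.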

    You can set up DeathClock to spit out a single answer:
    deathclock.pl /home 30 2 1 0 foo@bar.com

    ..Or, you can have DeathClock monitor the situation live:
    deathclock.pl /home 45 1 0 0 foo@bar.com

    Play around with it, you'll see. :)
