Actively monitor space

by Anonymous Monk
on Aug 07, 2012 at 11:50 UTC (#985949, perlquestion)
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello all. I currently work in a setting where I have access to cluster-style computing. We have recently been having a lot of issues with jobs filling up all available disk space. I was wondering if someone knows the best way to actively monitor disk space, and whether there is a way to pause or otherwise modify submitted jobs to halt processes before the disk quota is reached.

Currently I am just using the df -H command to check how much space is available. I could see maybe coding this into a bash script, but I feel like a Perl script designated for this task would be more useful. Does anyone know of any modules or methodologies that would help me actively monitor disk space and take action on currently running processes if necessary? I am very much a noob, so please bear with me if my question lacks appropriate detail. Thank you for your time!
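
As a starting point for the Perl version of the df check described above, here is a minimal sketch. It assumes a Unix-like system where `df -P -k` prints the POSIX column layout; the helper names `parse_df` and `over_limit` and the 90% threshold are made up for illustration and do not come from any module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse POSIX-format `df -P -k` output lines into a list of hash refs.
# parse_df is a hypothetical helper, not part of any CPAN module.
sub parse_df {
    my @lines = @_;
    my @fs;
    for my $line (@lines) {
        next if $line =~ /^Filesystem/;    # skip the header row
        # Columns: device, 1024-blocks, used, available, capacity%, mount point
        my ($dev, $total, $used, $avail, $pct, $mount) =
            $line =~ /^(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)%\s+(.+?)\s*$/
            or next;                       # ignore lines that do not fit
        push @fs, {
            device   => $dev,
            avail_kb => $avail,
            pct_used => $pct,
            mount    => $mount,
        };
    }
    return @fs;
}

# Return the file systems whose usage is at or above $limit percent.
sub over_limit {
    my ($limit, @fs) = @_;
    return grep { $_->{pct_used} >= $limit } @fs;
}

# Run the check only when executed directly as a script.
unless (caller) {
    my $limit = 90;    # assumed threshold, in percent
    my @full  = over_limit($limit, parse_df(`df -P -k`));
    warn "WARNING: $_->{mount} is $_->{pct_used}% full\n" for @full;
}
```

Run from cron or inside a loop with sleep, this only reports. What to do with a job once a warning fires depends on the scheduler in use.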

Replies are listed 'Best First'.
Re: Actively monitor space
by Jenda (Abbot) on Aug 07, 2012 at 12:17 UTC

    I don't think there is or can be anything general. This is something the system accepting and scheduling the jobs has to do. If you have the source code, find the best places where a disk space check could be added and jobs either rejected or postponed. Whether there is a way to "pause or otherwise modify submitted jobs" depends on the scheduler. If we don't know what you use, we can't help.

    Also keep in mind that blindly pausing all jobs might not help. There may be jobs producing a lot of data and jobs using lots of temporary space, but also jobs that aggregate data produced by other jobs. If you pause those, you are eventually going to run out of space anyway, as there will be nothing to clean up the intermediate results. You need to know the scheduler and you need to know the jobs!

    Enoch was right!
    Enjoy the last years of Rome.
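
Jenda's warning about pausing blindly can be sketched as follows. Everything here is an assumption for illustration: the job records, the hand-assigned `role` tags ('producer' and 'cleaner'), and the idea of suspending jobs with SIGSTOP/SIGCONT directly, which only reaches processes on the local machine and which a real scheduler may handle through its own commands instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pick the jobs that are safe to pause: only those tagged 'producer'.
# Jobs tagged 'cleaner' keep running so intermediate files still get removed.
# The role tags are an assumption; someone who knows the jobs must assign them.
sub jobs_to_pause {
    my @jobs = @_;
    return grep { $_->{role} eq 'producer' } @jobs;
}

# Suspend or resume local processes with SIGSTOP / SIGCONT.
# kill returns the number of processes successfully signalled.
sub pause_jobs {
    my @pids = map { $_->{pid} } @_;
    return kill 'STOP', @pids;
}

sub resume_jobs {
    my @pids = map { $_->{pid} } @_;
    return kill 'CONT', @pids;
}
```

A monitor would call `pause_jobs(jobs_to_pause(@jobs))` when space runs low and `resume_jobs` on the same list once the cleaners have freed enough room.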

Re: Actively monitor space
by RichardK (Parson) on Aug 07, 2012 at 12:05 UTC

    You didn't say exactly what OS you're using, but you might try using quota. It should be built into your version of Linux/Unix, and you just have to configure it and turn it on.

    A quick Google search will turn up lots of info about "linux quota", for example, or ask your local sysop.

Re: Actively monitor space
by sundialsvc4 (Abbot) on Aug 07, 2012 at 12:55 UTC

    There are many job-monitor systems out there for Unix and Linux, and one of the things they can regulate is “expected disk-space usage.” But they really can’t observe how much disk space any particular job is using, nor can they necessarily regulate it (even with quota). You need to know the jobs.

    Many shops try to do job scheduling with crontabs, and they get into trouble because you really can’t predict the completion time of one unit of work. You wind up either wasting time or doubling up on resources.

    So-called “workflow monitoring” systems can be surprisingly effective. The various components of what we call “jobs” can in fact all run as child processes of a parent that is simply forking children and waiting for them to complete. Heck, if the workflow is predictable and unchanging, that can be how you run jobs with a great deal of control, e.g. driven by a Perl script. All the work is done under the auspices of one parent pid.
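
The parent-forks-children arrangement described above might look roughly like this. It is a sketch under stated assumptions, not a workflow system: `run_workflow` is a hypothetical name, each step is an array ref holding a command and its arguments, and `$have_space` is whatever disk check you choose to pass in.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Run each step as a child process and wait for it before starting the next.
# $have_space is a code ref that returns true while it is safe to continue.
# Returns the number of steps that completed successfully.
sub run_workflow {
    my ($have_space, @steps) = @_;
    my $done = 0;
    for my $step (@steps) {
        unless ($have_space->()) {
            warn "Low on disk space, stopping before: @$step\n";
            last;
        }
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: replace this process with the step's command.
            exec { $step->[0] } @$step or die "cannot exec @$step: $!";
        }
        waitpid($pid, 0);    # parent blocks until this child exits
        last if $? != 0;     # stop the chain when a step fails
        $done++;
    }
    return $done;
}

# Each command-line argument is treated as one step, split on whitespace.
# The always-true check is a placeholder for a real disk-space test.
if (@ARGV) {
    my @steps = map { [ split ' ', $_ ] } @ARGV;
    my $done  = run_workflow(sub { 1 }, @steps);
    print "$done of ", scalar(@steps), " steps completed\n";
}
```

Because the parent decides when the next child starts, the disk check happens at a point where nothing is half-written. The trade-off is that a step already running is not interrupted.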

Re: Actively monitor space
by ww (Bishop) on Aug 07, 2012 at 13:48 UTC

    Clumsy pseudo-pseudo-code... from one who doesn't work with clusters... nor has any ken of their peculiarities, but could something like this conceivably work?

    # monitor disk space in a cluster
    # create @list_machines -- which do you need to check
    my $i = 0;                       # create counter (flag)

    for $_ (@list_machines) {
        my $machine = $_;            # select machine
                                     # This and the next line == the WEAK POINTS?
        # log in/open $machine       # if necessary,
        my @dirs = selection of dirs to check for change
        for my $dir (@dirs) {
            readdir the dirs in preceding line;   # dir /as for win to obtain dir of all files in all subdirs
            write dir results to local RAM file, $dir.$machine.$i
            $i++;
            if ( $i == 2 ) {
                $i = 0;
            }
            next;
        }
    }
    sleep (some amount of time);

    for $_ (@list_machines) {
        my $machine = $_;            # select machine
        if ( (exists $dir.$machine.0) && (exists $dir.$machine.1) ) {
            compare those two files (diff, df, pure-perl list comparison)
            ...PROFIT...
        }
    }
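
The snapshot-and-compare idea in the pseudo-code above can be reduced to a runnable single-machine sketch. It swaps the RAM files for in-memory hashes and leaves out the remote login step (the stated weak point) entirely. The names `dir_size`, `snapshot` and `growth`, and the 60-second interval, are assumptions for illustration; File::Find ships with Perl.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Total size in bytes of all plain files under $dir, including subdirectories.
sub dir_size {
    my ($dir) = @_;
    my $total = 0;
    find(sub { $total += ((-s _) || 0) if -f $_ }, $dir);
    return $total;
}

# One snapshot: a hash ref mapping each directory to its current size.
sub snapshot {
    my @dirs = @_;
    my %snap = map { ($_ => dir_size($_)) } @dirs;
    return \%snap;
}

# Growth in bytes per directory between two snapshots.
# Directories missing from the older snapshot are skipped.
sub growth {
    my ($old, $new) = @_;
    my %delta;
    for my $dir (keys %$new) {
        next unless exists $old->{$dir};
        $delta{$dir} = $new->{$dir} - $old->{$dir};
    }
    return \%delta;
}

# Directories to watch come from the command line.
if (@ARGV) {
    my $before = snapshot(@ARGV);
    sleep 60;    # assumed polling interval, in seconds
    my $delta = growth($before, snapshot(@ARGV));
    printf "%-40s %+d bytes\n", $_, $delta->{$_} for sort keys %$delta;
}
```

Walking large directory trees this way is slow and adds I/O load of its own, so on a busy cluster it may be better suited to a few known output directories than to whole file systems.
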
Re: Actively monitor space
by bfdi533 (Friar) on Aug 08, 2012 at 03:38 UTC

Node Type: perlquestion [id://985949]
Approved by Ratazong
Front-paged by Corion