PerlMonks
Using flag files to monitor cluster jobs

by bwelch (Curate)
on Oct 26, 2005 at 20:45 UTC ( #503164=perlquestion )
bwelch has asked for the wisdom of the Perl Monks concerning the following question:

I'm starting with a script that does around a dozen lengthy analysis tasks, and I'm trying to update it to use a cluster and run the jobs in parallel. Each analysis script is submitted via the load sharing facility (LSF). Using LSF itself to monitor the analysis jobs is frowned upon, as that tends to put a load on LSF and keep it from doing more important things. So I need to use flag files that tell me when each analysis is done.

This submits a job and waits for completion:

use strict;

my $jobID = `bsub -q long -o $cl_log -J seqPipe "$jobCmd"`;

print "Waiting for analysis work.\n";
while (1) {
    last if -e "$resultsDir/analysis_completed";
    print LOG ".";
    sleep 15;
}
print "\nFound completion flag. Analysis jobs are done.\n";

Assuming each analysis job creates its own flag to indicate completion, this will get more complicated when I'm trying to monitor a dozen jobs. Later on, there's the possibility that some jobs will have their own set of analysis jobs that they need to monitor.

I'm thinking this while statement will end up with a set of tests, one for each job. Still, it seems inefficient to keep testing for all those files until the last one has finished. Any ideas on better ways to grow this system?
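For a dozen jobs, one way the loop could grow is to keep a list of outstanding flag files and prune it on each pass, so only unfinished jobs get re-tested. A minimal, self-contained sketch — the job directories are hypothetical, and the flags are pre-created here only so the demo terminates; in production each LSF job would drop its own flag on exit:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Sandbox for the demo: in real use these would be the per-job result dirs.
my $base = tempdir( CLEANUP => 1 );
my @dirs = map { my $d = "$base/$_"; mkdir $d or die $!; $d }
           qw( job1 job2 job3 );

# Pretend the cluster jobs have already finished; in production each
# LSF job would create its own analysis_completed file on exit.
for my $d (@dirs) {
    open my $fh, '>', "$d/analysis_completed" or die $!;
    close $fh;
}

my @pending = map { "$_/analysis_completed" } @dirs;
while (@pending) {
    @pending = grep { !-e $_ } @pending;   # keep only flags not yet present
    last unless @pending;
    sleep 15;
}
print "All analysis jobs are done.\n";
```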

Re: Using flag files to monitor cluster jobs
by GrandFather (Cardinal) on Oct 26, 2005 at 20:58 UTC

    It may sound a little bizarre at first glance, but you could use email to manage completion processing. If the analysis tasks take a long time and the completion processing doesn't have to be particularly prompt, then using email for signaling can be quite a viable option. :)


    Perl is Huffman encoded by design.
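    If email signaling fits, each job's wrapper can send a short notice when it finishes (LSF can also mail its own job report with bsub -N). A minimal sketch of composing such a notice — the job name, addresses, and SMTP host below are all hypothetical, and the actual send via the core Net::SMTP module is left commented out as an untested sketch:

```perl
use strict;
use warnings;

# Build the completion notice a job wrapper would send on exit.
sub completion_message {
    my ($job, $status) = @_;
    return join "\n",
        "Subject: [cluster] $job finished: $status",
        "",
        "Job $job completed with status '$status'.",
        "";
}

my $msg = completion_message( 'seqPipe', 'ok' );   # hypothetical job name
print $msg;

# Sending it is one Net::SMTP call away (untested sketch, hosts hypothetical):
# use Net::SMTP;
# my $smtp = Net::SMTP->new('mailhost.example.org') or die "no SMTP: $!";
# $smtp->mail('cluster@example.org');
# $smtp->to('bwelch@example.org');
# $smtp->data($msg);
# $smtp->quit;
```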
Re: Using flag files to monitor cluster jobs
by BrowserUk (Pope) on Oct 26, 2005 at 21:15 UTC

    Stick the names you are looking for in an array and splice them out as they are found. When the array is empty, all your jobs are done.

    Update: See benizi's post below for an important correction to this untested logic.

    my @dirs   = qw[ ... ];
    my @rFiles = map { "$_/analysis_completed" } @dirs;
    while ( @rFiles and sleep 15 ) {
        -e $rFiles[$_] and splice @rFiles, $_ for 0 .. $#rFiles;
    }
    print "All done";

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      When using splice in a loop like that, don't forget to reverse the indices. (Splicing at index N doesn't affect indices 0..N-1, but it shifts indices N+1..$#array down to N..$#array-1.) And you should specify the length (1, in this case) of the slice to be removed.

      e.g. Using your code, with @dirs = qw/A B C/;, and running "touch A/analysis_completed" from another shell.

      $ tree
      .
      |-- 503170.pl
      |-- A
      |-- B
      `-- C
      $ perl -l 503170.pl
      All done
      $ tree
      .
      |-- 503170.pl
      |-- A
      |   `-- analysis_completed
      |-- B
      `-- C

      When A/analysis_completed exists, your code splices the entire @rFiles array. The following would do the right thing:

      my @dirs   = qw[ ... ];
      my @rFiles = map { "$_/analysis_completed" } @dirs;
      while ( @rFiles and sleep 15 ) {
          -e $rFiles[$_] and splice @rFiles, $_, 1 for reverse 0 .. $#rFiles;
      }
      print "All done";
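      The difference is easy to demonstrate in isolation. A tiny self-contained example: splicing matching elements while walking forward lets an adjacent match slide into the just-emptied slot and escape, while walking in reverse removes them all:

```perl
use strict;
use warnings;

# Forward pass: removing index 0 shifts the second 1 into slot 0,
# which the loop has already passed, so it survives.
my @forward = ( 1, 1, 0 );
$forward[$_] and splice @forward, $_, 1 for 0 .. $#forward;

# Reverse pass: splicing never disturbs the indices still to be visited.
my @reverse = ( 1, 1, 0 );
$reverse[$_] and splice @reverse, $_, 1 for reverse 0 .. $#reverse;

print "forward leaves: @forward\n";   # forward leaves: 1 0
print "reverse leaves: @reverse\n";   # reverse leaves: 0
```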

        Very good point.


Re: Using flag files to monitor cluster jobs
by chromatic (Archbishop) on Oct 26, 2005 at 22:32 UTC

    If you're working on an operating system that has some kind of file monitoring or file notification system, you can avoid the (admittedly simple) sleep loop by registering interest in file creation or deletion. SGI::FAM is an old module that gives access to SGI's FAM library. (See The Watchful Eye of FAM for more.)
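    On Linux, the analogous facility is inotify. A sketch using the CPAN module Linux::Inotify2 (assumptions: a Linux kernel and that module installed; the flag name matches this thread, everything else is hypothetical) that blocks until a flag file appears instead of polling:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use POSIX qw(_exit);
use Linux::Inotify2;    # CPAN module; assumes Linux with inotify support

my $dir = tempdir( CLEANUP => 1 );    # stands in for a job's results dir
my $seen;

my $inotify = Linux::Inotify2->new or die "inotify init failed: $!";
$inotify->watch( $dir, IN_CREATE, sub {
    my $event = shift;
    $seen = $event->name;             # name of the file that appeared
} );

# Simulate a cluster job dropping its completion flag from another process.
my $pid = fork;
die "fork failed: $!" unless defined $pid;
if ( $pid == 0 ) {
    sleep 1;
    open my $fh, '>', "$dir/analysis_completed" or die $!;
    close $fh;
    _exit(0);    # skip END blocks so the child doesn't clean up the tempdir
}

$inotify->poll;                       # blocks until the flag is created
waitpid $pid, 0;
print "flag seen: $seen\n";
```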

Node Type: perlquestion [id://503164]
Approved by GrandFather
Front-paged by kwaping