RFC: A new module to help avoid running multiple instances of the same script (via cron, for example)

by scorpio17 (Canon)
on Dec 02, 2009 at 19:26 UTC

Sometimes you have a script that needs to be run at regular intervals (weekly, daily, hourly, etc.). On Unix, this is usually accomplished using cron. However, if a cron job takes sufficiently long to run, the next instance may begin before the current one completes. Depending on what the script does, this can create problems such as consuming computer resources (CPU, memory, disk, bandwidth), corrupting database tables, or damaging files and directory structures.

In my case, I have a script that parses log files from several different locations (some at remote sites) and aggregates the results into a single database. There is a web app that allows users to view this data, issue queries, generate reports, etc. Parsing all of the log files may take several minutes, so doing this "live" every time someone accesses the web app is not feasible. Instead, I run my "database update" once per hour, using cron, and let the web app pull from the database. The disadvantage is that "new" data may not show up for up to 60 minutes. Originally, this was not a problem. But as more people have begun using the web app, there have been requests for more "dynamic updates". The easy fix was to simply run the cron job every 10 minutes. The update usually takes no more than 5 minutes, so this works. But occasionally a big update may take 15 minutes or more. Trust me when I tell you that running an update while another update is in progress causes Very Bad Things To Happen™.

My initial fix was to create a "lock file" as the script began running, then delete it when it finished. I checked for the existence of the lock file before creating it; if it was already there, I assumed another job was running and simply exited.
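In code, that first version looked roughly like this (a sketch from memory; the lock file path is just an example):

my $lock_file = '/tmp/update.lock';   # example path only

# If the lock file exists, assume another instance is running.
# (Note that the -e test followed by the open is not atomic, so two
# jobs starting at almost the same instant could both get past it.)
exit if -e $lock_file;

open my $fh, '>', $lock_file
    or die "Can't create $lock_file: $!\n";
close $fh;

# ... do the actual work ...

unlink $lock_file
    or die "Can't remove $lock_file: $!\n";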

This seemed to work, until we had an unexpected power outage (in the middle of an update, of course). Since the script failed to exit normally, it didn't delete the lock file. Several days later, users began to complain about missing data. When I investigated the cause, I discovered that the old lock file was causing all the updates to abort.

At that point, I decided to actually store the job's process id in the lock file. Then, if a job sees a lock file, it can read the old process id and check whether a process with that id is actually running: if so, exit; otherwise, assume the file is left over from an interrupted job and continue on, cleaning up as necessary.
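The "is it actually running" test is just a signal 0 probe, something like this ($old_pid here being the pid read from the file):

# Signal 0 is never actually delivered; kill just reports whether the
# process could be signalled. A true result means a process with that
# pid exists (and we have permission to signal it).
my $still_running = kill 0, $old_pid;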

Then another problem arose, completely unrelated, except that it also needed a script to be run via cron, and bad things would happen if multiple instances ran at the same time. So I started thinking about how I could abstract this "only run one instance via cron" functionality out into a module, so that I could easily add it to any script with a single line (something like 'use Cron::AvoidMultipleRuns;', for example).

I have not been able to find anything that does this on CPAN, nor via Google, but I may be overlooking it. If anyone knows of anything like this, please let me know.

Otherwise, I'm posting my solution here. If you have any comments or suggestions for ways to improve it, please let me know.

Here's an example script; pretend that it is run via cron:

#!/usr/bin/perl
use strict;

# make it impossible to run this script more than once at a time
use Cron::AvoidMultipleRuns;

print "Running... pid = $$\n";

# pretend this script actually does something here...
sleep 20;

Here is the module abstracting out the "run only once" behavior:

package Cron::AvoidMultipleRuns;
use strict;

my $cleanup;

INIT {
    $cleanup = 0;
    (my $pid_file = "$0.pid") =~ s/\.pl//;

    if ( -e $pid_file ) {
        open my $fh, '<', $pid_file
            or die "Can't open $pid_file for reading: $!\n";
        my $line = <$fh>;
        my (undef, $pid) = split(/\s+/, $line);
        if ($pid) {
            print "Found a pid file: pid = $pid.\n";
            my $status = kill(0, $pid);
            if ($status) {
                print "old job is still running.\n";
                exit;
            }
            else {
                print "The old job is no longer running.\n";
            }
        }
    }

    open my $fh, '>', $pid_file
        or die "Can't open $pid_file for writing: $!\n";
    flock($fh, 2)
        or die "can't obtain exclusive lock on file $pid_file: $!\n";
    print $fh "pid: ", $$, "\n";
    close $fh;
    $cleanup = 1;
}

END {
    (my $pid_file = "$0.pid") =~ s/\.pl//;
    if ( -e $pid_file && $cleanup ) {
        unlink $pid_file
            or die "can't unlink $pid_file : $!\n";
    }
}

1;

Here's how it works: if the script is called 'demo.pl', then when it runs, a file called 'demo.pid' is created containing a line like this: "pid: 12345", where '12345' is the process id number. When the job completes, 'demo.pid' is deleted. If you try running demo.pl again, in another terminal, while the first job is still running, it should see the pid file and exit immediately. If you create a bogus pid file, then try running demo.pl, it should see that no process with that id is actually running, and go ahead and create a new pid file with the current, valid, process id.

I originally used BEGIN and END, but I didn't like that 'perl -c demo.pl' actually caused a pid file to be created/deleted. So I changed BEGIN to INIT (to avoid the creation), then introduced the $cleanup variable to avoid the attempted deletion. It works, but seems a little kludgy. My other concern is whether or not this module would have some kind of unwanted interaction with other modules that also use INIT and END blocks.

Re: RFC: A new module to help avoid running multiple instances of the same script (via cron, for example)
by JavaFan (Canon) on Dec 02, 2009 at 23:43 UTC
    I often use:
    use Fcntl qw !LOCK_EX LOCK_NB!;

    die "Another instance is already running"
        unless flock DATA, LOCK_EX|LOCK_NB;

    ... your code here ...

    # Don't forget __END__ or __DATA__
    __END__
    Note that file locking solutions only prevent concurrent runs on the same OS instance. If the job can be run from different machines, and you want to prevent concurrent runs across boxes, you'll need a different solution (acquiring a database lock, for instance).
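    With MySQL, for example, you can piggyback on its advisory locks through DBI. A rough sketch (DSN, credentials, and lock name are placeholders):

    use DBI;

    my $dbh = DBI->connect('dbi:mysql:database=jobs;host=dbhost',
                           'user', 'password', { RaiseError => 1 });

    # GET_LOCK returns 1 if the lock was acquired, or 0 if the timeout
    # (0 seconds here) expired because another session holds it. The
    # server releases the lock when the connection ends, so a crashed
    # job can't leave it stuck.
    my ($got_lock) = $dbh->selectrow_array(
        q{SELECT GET_LOCK('my_cron_job', 0)}
    );
    die "Another instance is already running\n" unless $got_lock;

    # ... your code here ...

    $dbh->selectrow_array(q{SELECT RELEASE_LOCK('my_cron_job')});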

      That works? That's hot.

      It's beautiful.
Re: RFC: A new module to help avoid running multiple instances of the same script (via cron, for example)
by merlyn (Sage) on Dec 02, 2009 at 19:57 UTC
    You need the Highlander solution ("there can be only one").

    -- Randal L. Schwartz, Perl hacker

    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

      ++merlyn. I followed the highlander link and spotted this in the code listing at line 4:
      open HIGHLANDER, ">>/tmp/renew.cgi.highlander"
          or die "Cannot open highlander: $!";
      I never (ever!) thought I would have cause to correct merlyn, but this contains one of my pet peeves: error messages that don't tell you what's happening, e.g.
      Cannot open highlander: No disk space at ...
      Cannot open highlander: Permission denied at ...
      Analyzing any problem then requires the support person to read the code to discover the problem filename. They don't want to do that (and in a compiled language they can't); they just want to fix the problem. Suggest this:
      my $highlander = '/tmp/renew.cgi.highlander';
      open HIGHLANDER, ">>$highlander"
          or die "Cannot open $highlander: $!";
      Cannot open /tmp/renew.cgi.highlander: No disk space at ...
      Cannot open /tmp/renew.cgi.highlander: Permission denied at ...
        autodie both adds the file name to the error message and saves you from having to do the or die ...; dance:
        # open.pl
        use strict;
        use warnings;
        use autodie;

        open HIGHLANDER, ">>/root/renew.cgi.highlander";
        when run (as a non-root user), gives:
        % perl open.pl
        Can't open '>>/root/renew.cgi.highlander' for appending: 'Permission denied' at open.pl line 6
        %

        Wouldn't it be nice if the $! error message came with the filename already embedded?


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
        That code was written pretty early. The suggestion you're making is one I started preaching a bit later.

        -- Randal L. Schwartz, Perl hacker

        The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

Re: RFC: A new module to help avoid running multiple instances of the same script (via cron, for example)
by ikegami (Patriarch) on Dec 02, 2009 at 20:18 UTC

    At that point, I decided to actually store the job's process id in the lock file.

    The OS is better at controlling concurrency because it can clean up after dead processes. Whether the file is locked or not can then be used to indicate concurrency.

    package Cron::AvoidMultipleRuns;

    use strict;
    use warnings;

    use Cwd                   qw( realpath );
    use Errno                 qw( EWOULDBLOCK );
    use Fcntl                 qw( LOCK_EX LOCK_NB );
    use File::Spec::Functions qw( rel2abs );

    my $lock_file;
    my $lock_fh;

    {
        $lock_file = realpath(rel2abs($0)) . '.lock';

        open($lock_fh, '+>>', $lock_file)
            or die("Can't create lock file \"$lock_file\": $!\n");

        if (!flock($lock_fh, LOCK_EX|LOCK_NB)) {
            undef $lock_fh;
            if ($! == ($^O =~ /Win32/ ? 33 : EWOULDBLOCK)) {
                die("Another instance of this program is running. Exiting.\n");
            }
            else {
                die("Cannot lock lock file \"$lock_file\": $!\n");
            }
        }
    }

    END {
        if (defined($lock_fh)) {
            undef $lock_fh;  # Release lock
            unlink($lock_file);  # or warn("Can't unlink lock file \"$lock_file\": $!\n");
        }
    }

    1;

    You had a bug where the lock can be defeated (accidentally or otherwise) by using a symlink to the script. It's fixed by realpath(rel2abs()) in my code.

    die is overkill (punny!) for errors in removing the pid file. In fact, even a warning sounds unnecessary to me.

    An even better method for Windows would be to create a named mutex instead of creating a file. That way, it gets cleaned up automatically.
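    With Win32::Mutex (from the Win32::IPC distribution), that might look something like this (the mutex name is arbitrary, as long as every instance uses the same one):

    use Win32::Mutex;

    # Create the named mutex, or open it if another instance already did.
    my $mutex = Win32::Mutex->new(0, 'Cron_AvoidMultipleRuns');

    # wait(0) tries to acquire the mutex without blocking: true on
    # success, 0 if another process currently owns it.
    $mutex->wait(0)
        or die("Another instance of this program is running. Exiting.\n");

    # ... your code here ...
    # The OS releases the mutex automatically when the process exits.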

    Alternatively, you could simplify the code a lot by simply locking the script itself.

    package Cron::AvoidMultipleRuns;

    use strict;
    use warnings;

    use Errno qw( EWOULDBLOCK );
    use Fcntl qw( LOCK_EX LOCK_NB );

    our $lock_fh;

    {
        open($lock_fh, '<', $0)
            or die("Can't open script \"$0\": $!\n");

        if (!flock($lock_fh, LOCK_EX|LOCK_NB)) {
            undef $lock_fh;
            if ($! == ($^O =~ /Win32/ ? 33 : EWOULDBLOCK)) {
                die("Another instance of this program is running. Exiting.\n");
            }
            else {
                die("Cannot lock script \"$0\": $!\n");
            }
        }
    }

    1;

    I really dislike the name of the module. The module has nothing to do with cron and is not just useful for cron scripts, for starters. Then there's the problem that you don't want to prevent multiple runs. You want to prevent simultaneous runs.

    A useful improvement would be to allow the caller to specify a lock file name if he's not happy with the default.

Re: RFC: A new module to help avoid running multiple instances of the same script (via cron, for example)
by moritz (Cardinal) on Dec 03, 2009 at 08:54 UTC
    I think Cron:: is a bad top level namespace when the module is not specific to cron at all.

    What about using Process:: instead? Something like Process::AvoidConcurrentRuns, Process::AvoidMultipleInstances, or Process::Singleton?

    Also if you put this on CPAN, you might think about different possible use cases:

    • Users might want to use it exactly as you have shown
    • Users might want to use it as you've shown, but without the verbose output
    • Users might want to just query if another instance is already running, and behave as they see fit
    The interfaces for that might look like this:
    # first case:
    use Process::Singleton qw(:auto_verbose);

    # second case:
    use Process::Singleton qw(:auto);

    # third case:
    use Process::Singleton qw(is_running);
    if (is_running()) {
        ...
    }
    else {
        ...
    }

    This is of course just an example, but I want to encourage you to think more about the interface.

Re: RFC: A new module to help avoid running multiple instances of the same script (via cron, for example)
by scorpio17 (Canon) on Dec 03, 2009 at 22:07 UTC

    Thanks for the feedback, everyone.

    Here's my revised version:

    package HighLander;

    use strict;
    use Fcntl qw( LOCK_EX LOCK_NB );

    open(our $fh, '<', $0)
        or die("Can't open \"$0\": $!\n");

    unless ( flock($fh, LOCK_EX|LOCK_NB) ) {
        print "Another instance of \"$0\" is already running. Exiting...\n";
        exit(0);
    }

    1;

    Here's an example of something using it:

    #!/usr/bin/perl
    use strict;
    use HighLander;

    # if this script is already running, you'll never make it this far...
    print "Running...\n";
    sleep 10;

    A few notes:

    • I like using flock much better than creating a pid file. But creating a lock file is just as annoying. Getting a lock on the script itself is brilliant! When the process goes away, so does the lock, so the issue with bogus pid files lying around after interrupted jobs goes away, too.
    • I've tested this on Red Hat Linux (RHEL5). I've also tried it with a run of the real script alongside a run through a symbolic link, and it does the right thing in that case also.
    • On WinXP, using perl v5.8.7 (ActiveState build 813), it seems to work, sort of: the second job exits, but I don't see the "Another instance is already running" message. But the original intent was to use this with cron, which doesn't apply to WinXP anyway. Win32::Mutex is probably more appropriate for Windows users.
    • If you really need the process id, Proc::Pidfile seems like a better choice.

    ikegami: I don't understand this line from your code:

    if ($! == ($^O =~ /Win32/ ? 33 : EWOULDBLOCK)) {

    This tests for an error condition, after getting a lock, and if you're running on Windows... 33?! It seems to work okay without this check, but I'd like to understand this better, for future reference.

    Also - it seems like the file handle ($fh) needs to be a global variable, so I used 'our' instead of 'my'. If I use 'my', then when the package goes out of scope, the lock gets dropped!

    moritz: I'd like to add something like this:

    use Highlander qw( :verbose );
    To turn on the print statement. Something like this:
    ...
    unless ( flock($fh, LOCK_EX|LOCK_NB) ) {
        if ($verbose) {
            print "Another instance of \"$0\" is already running. Exiting...\n";
        }
        exit(0);
    }
    ...

    ... but I don't see how to accomplish this using only an export tag. Any hints?

    I'm not planning to upload this to CPAN. It seems too simple to bother. Besides, merlyn solved this problem over 9 years ago! He should get the credit, not me.

    Thanks again for the helpful comments, everyone.

      I don't understand this line from your code

      It checks why flock failed instead of assuming it's because the file is already locked.

      flock fails with EWOULDBLOCK on unixy systems and with error 33 (ERROR_LOCK_VIOLATION) on Windows.

      it seems like the file handle ($fh) needs to be a global variable, so I used 'our' instead of 'my'. If I use 'my'

      Correct. In my version, the END sub kept it alive.

      I'd like to add something like this: use Highlander qw( :verbose );

      First, :name traditionally has a meaning already (it denotes an export tag). Dashes are usually used for options. As a bonus, -foo means the same thing as '-foo' even when strict is in use, so less quoting is needed.

      Secondly, suppressing the message by default is a bad idea. The option should silence the message when provided.

      On to the good stuff,

      use Highlander -silent;
      is the same as
      BEGIN {
          require Highlander;
          Highlander->import(-silent);
      }

      so you need to create a method called import that looks like

      my $opt_silent;

      sub import {
          my $class = shift;

          $opt_silent = 0;
          for (@_) {
              if ($_ eq -silent) {
                  $opt_silent = 1;
              }
              else {
                  require Carp;
                  Carp::carp("Unrecognized symbol $_");
              }
          }
      }

      But it won't work. The problem is that the module has already been executed (and the lock obtained) by require before import is called.
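      One way around that would be to do nothing at require time and take the lock inside import itself. An untested sketch:

      package Highlander;

      use strict;
      use warnings;

      use Fcntl qw( LOCK_EX LOCK_NB );

      our $lock_fh;

      sub import {
          my $class  = shift;
          my $silent = grep { $_ eq -silent } @_;

          # Nothing happened at require time; the lock is taken here,
          # after the options have been seen.
          open($lock_fh, '<', $0)
              or die("Can't open script \"$0\": $!\n");

          unless (flock($lock_fh, LOCK_EX|LOCK_NB)) {
              print "Another instance of \"$0\" is already running. Exiting...\n"
                  unless $silent;
              exit(0);
          }
      }

      1;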

      But you know what? The option isn't really needed. The message can always be suppressed on the command line by using output redirection.
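      For example, a crontab entry along these lines (the path is made up) silences it:

      */10 * * * * /usr/local/bin/update.pl >/dev/null 2>&1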

      I'm not planning to upload this to CPAN. It seems too simple to bother.

      Yet you had to write it and needed help to do so. You can also credit whoever you want in the docs.

Re: RFC: A new module to help avoid running multiple instances of the same script (via cron, for example)
by tokpela (Chaplain) on Dec 03, 2009 at 10:25 UTC

    I would call this a Mutex and name it accordingly. Maybe something like Process::Mutex?

    There are a few mutex implementations already on CPAN. I have used the Win32::Mutex module a bit in my development.

    If you are going to create a generic module, I would provide different solutions depending on the OS. The above solutions using flock should work well on *nix systems, while Win32::Mutex provides an interface for Windows.

      flock works on Windows too.
Re: RFC: A new module to help avoid running multiple instances of the same script (via cron, for example)
by kyle (Abbot) on Dec 03, 2009 at 03:51 UTC
Re: RFC: A new module to help avoid running multiple instances of the same script (via cron, for example)
by atcroft (Abbot) on Dec 02, 2009 at 19:59 UTC

    On *nix systems, if you know the pid of the process in question, you can also send it a signal 0, and check the results. Might be helpful....

      He's already doing that.
