PerlMonks  

Lightweight Solution To "Only 1 Process Running" On AIX

by Limbic~Region (Chancellor)
on May 01, 2009 at 14:15 UTC ( [id://761299]=perlquestion )

Limbic~Region has asked for the wisdom of the Perl Monks concerning the following question:

All,
There are a number of non-perfect solutions to running only 1 instance of a process at a time. The common solutions are pid files, lock files, checking the process table - or some combination. The trouble with most of them is that they are not bullet proof (you can delete a locked file for instance). Dominus published a presentation on File Locking Tips and Traps back in 2003. One of the suggested solutions was to lock the file of the running process itself:
    open SELF, "< $0" or die ...;
    flock SELF, LOCK_EX | LOCK_NB or exit;

or

    flock DATA, LOCK_EX | LOCK_NB or exit;
    ...
    __DATA__
In the few instances I have needed this, one of these has worked. Recently I suggested that a co-worker improve his code by using one of these techniques, and he told me neither worked on AIX. The error in both cases is "unable to lock: A file descriptor does not refer to an open file."

Does anyone have any insight into why this is or have any suggestion on an alternative lightweight and nearly bullet proof way to accomplish the original objective?

Cheers - L~R

Replies are listed 'Best First'.
Re: Lightweight Solution To "Only 1 Process Running" On AIX
by Bloodnok (Vicar) on May 01, 2009 at 15:11 UTC
    ... hence your CB comment - with which I wholeheartedly agree :D

    I've seen code such as your 2nd example fail if there's no __DATA__ section in the file.

    As to the 1st example, well that's a different kettle of fish...

    Given a simple script (tst.pl):

        >cat tst.pl
        use warnings;
        use strict;
        use Fcntl qw/:flock/;
        open SELF, "< $0" or die ;
        flock SELF, LOCK_EX | LOCK_NB or die "$!";

    A simple run results in...

        >perl tst.pl
        A file descriptor does not refer to an open file. at tst.pl line 7.

    Re-running and generating a truss log, using >truss -f perl tst.pl  > log 2>&1, the (business) end of which is...

        . . .
        389372: open("tst.pl", O_RDONLY|O_LARGEFILE) = 3
        389372: kioctl(3, 22528, 0x00000000, 0x00000000) Err#25 ENOTTY
        389372: fstatx(3, 0x30020848, 128, 010) = 0
        389372: kfcntl(3, F_SETFD, 0x00000001) = 0
        389372: kfcntl(3, F_SETLK, 0x2FF222A0) Err#9 EBADF
        389372: access("/usr/lib/nls/msg/en_GB/libc.cat", 0) Err#2 ENOENT
        389372: access("/usr/lib/nls/msg/en_US/libc.cat", 0) = 0
        389372: _getpid() = 389372
        389372: open("/usr/lib/nls/msg/en_US/libc.cat", O_RDONLY) = 4
        389372: kioctl(4, 22528, 0x00000000, 0x00000000) Err#25 ENOTTY
        389372: kfcntl(4, F_SETFD, 0x00000001) = 0
        389372: kioctl(4, 22528, 0x00000000, 0x00000000) Err#25 ENOTTY
        389372: kread(4, "\0\001 &#65533;\007\007 I S O 8".., 4096) = 4096
        389372: lseek(4, 0, 1) = 4096
        389372: lseek(4, 0, 1) = 4096
        389372: lseek(4, 0, 1) = 4096
        389372: _getpid() = 389372
        389372: lseek(4, 0, 1) = 4096
        389372: close(4) = 0
        A file descriptor does not refer to an open file. at tst.pl line 7.
        389372: kwrite(2, " A f i l e d e s c r".., 68) = 68
        389372: kfcntl(2, F_GETFL, 0x00000008) = 1
        389372: kfcntl(1, F_GETFL, 0x00000008) = 1
        389372: kfcntl(2, F_GETFL, 0x00000008) = 1
        389372: close(3) = 0
        389372: kfcntl(2, F_GETFL, 0x00000008) = 1

    We see, from 389372: open("tst.pl", O_RDONLY|O_LARGEFILE)        = 3 that the script is successfully opened on file descriptor 3.

    Later, we see that the file on descriptor 3 is open, that operations on it are valid, and that it will (or should) be closed across an exec call, since F_SETFD with a value of 1 sets the close-on-exec flag...

        389372: fstatx(3, 0x30020848, 128, 010) = 0
        389372: kfcntl(3, F_SETFD, 0x00000001) = 0

    From 389372: kfcntl(3, F_SETLK, 0x2FF222A0)          Err#9  EBADF, we can see that the kernel considers FD3 to refer to a closed file - but we haven't seen a close - either explicitly (via a call to close()) or implicitly (via an intervening call to exec()).

    Thus we can only conclude that there is an underlying problem with AIX.

    As to a suitable answer ... I'm afraid I can't help you (aside from the fact that there is, in perl 5.8, nothing untoward mentioned in perlaix - but I expect you already know that! :-) :-((

    A user level that continues to overstate my experience :-))
      Bloodnok,
      Thanks! Would you be able to try the same code using File::FcntlLock as suggested by tye? I don't have an environment where I can play, as this is really coming from another group (I work mostly on Solaris boxes). At least knowing whether that will work, or whether this is strictly an AIX issue, will go a long way toward saving time.

      Again, thanks for confirming this is an issue on AIX.

      Cheers - L~R

        As the trace shows, flock is implemented via fcntl on AIX ... so there shouldn't be any difference.
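
One way to see which locking primitives a given perl was built with is to ask Config. This is a minimal sketch; `locking_config` is a hypothetical helper name, while `d_flock`, `d_fcntl_can_lock` and `d_lockf` are standard Config entries. It only reports what Configure detected: a libc flock() that itself wraps fcntl, as the truss output suggests for AIX, would presumably still show `d_flock` as defined.

```perl
use strict;
use warnings;
use Config;

# Report which locking primitives this perl was configured with.
# Returns a hash of Config key => value ('undef' when not defined).
sub locking_config {
    return map { $_ => (defined $Config{$_} ? $Config{$_} : 'undef') }
           qw(d_flock d_fcntl_can_lock d_lockf);
}

my %cfg = locking_config();
printf "%-18s %s\n", $_, $cfg{$_} for sort keys %cfg;
```

The same values can be read from the shell with `perl -V:d_flock`.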

        The intelligent reader will judge for himself. Without examining the facts fully and fairly, there is no way of knowing whether vox populi is really vox dei, or merely vox asinorum. — Cyrus H. Gordon
        No probs LR - at least _you_ have a decent box/OS on which to work :D.

        I hope that there's no rush for the test - I'm not sure I'll have the time 'til Tuesday (Monday being a bank holiday in the UK an' all).

        Will keep you posted...

        A user level that continues to overstate my experience :-))
Re: Lightweight Solution To "Only 1 Process Running" On AIX (open for writing)
by almut (Canon) on May 01, 2009 at 16:41 UTC

    It seems the filehandle must be opened for writing, on AIX.  This works for me (tested on AIX 5.1, 5.3 and 6.1, with Perl 5.8.2 and 5.8.4):

        use strict;
        use warnings;
        use Fcntl qw(LOCK_EX LOCK_NB);

        open DATA, ">>", $0 or die $!;
        flock DATA, LOCK_EX | LOCK_NB or die "already running\n";
        print "started $$\n";
        sleep 300;
        __DATA__

    For example

        $ ./761299.pl
        started 1220690
        ^Z
        Suspended
        $ bg
        [1] ./761299.pl &
        $ ./761299.pl
        already running
        $ kill 1220690
        [1] Terminated ./761299.pl
        $ ./761299.pl
        started 1220692

    truss shows:

        kfcntl(3, F_SETLK, 0x2FF22430) = 0            # when not already running
        kfcntl(3, F_SETLK, 0x2FF22430) Err#13 EACCES  # otherwise

    (instead of __DATA__, you can of course also open SELF, or whatever)

    Update: replaced open DATA, ">>", "/dev/null" with open DATA, ">>", $0  (with more than one program, you'd use the same file, otherwise)

    A problem with this approach would be that the program couldn't be installed on/run from a read-only file system...
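
A possible refinement of the append-mode approach, offered as a sketch rather than a tested AIX recipe: tell "someone else holds the lock" apart from other failures, so that a problem such as a read-only file system is not reported as "already running". The helper name `acquire_single_instance_lock` is hypothetical. Treating EACCES as "locked" is an assumption based on the truss output above.

```perl
use strict;
use warnings;
use Errno ();                      # ties %! for symbolic errno checks
use Fcntl qw(LOCK_EX LOCK_NB);

# Hypothetical helper: returns an open, locked handle on success,
# undef if another holder already has the lock, and dies on any other error.
sub acquire_single_instance_lock {
    my ($path) = @_;
    open(my $fh, '>>', $path)
        or die "cannot open '$path' for append: $!\n";
    return $fh if flock($fh, LOCK_EX | LOCK_NB);
    # EWOULDBLOCK/EAGAIN is the usual "already locked" errno; the truss
    # output above shows EACCES for the same condition on AIX.
    return undef if $!{EWOULDBLOCK} || $!{EAGAIN} || $!{EACCES};
    die "flock on '$path' failed: $!\n";
}
```

A caller would keep the returned handle in scope for the life of the process, for example `my $lock = acquire_single_instance_lock($0) or exit;`, because closing the handle releases the lock.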

Re: Lightweight Solution To "Only 1 Process Running" On AIX (fcntl)
by tye (Sage) on May 01, 2009 at 15:05 UTC

    tye predicts PEBKAC

    Tell your friend, "Where's your code?". Fourth-party debugging without the code involved seems likely doomed. Posting a node about this seems like it really shouldn't have been your next step.

    I like fcntl-based locks better than flock, so you could ask your friend to try using those instead (just in case flock is just fundamentally broken on his copy of Perl). See fcntl, Fcntl, and File::FcntlLock.

    Update: http://www.nntp.perl.org/group/perl.perl5.porters/2009/04/msg145477.html was interesting, if perhaps more tangential than on-target here.

    - tye        

      tye,
      I wrote and ran code myself at your and JavaFan's prodding - shown here. I get the same error. The perl is 5.8.2 (I can provide the -V if desired, but I doubt it will help). I will look into File::FcntlLock, but I was hoping to avoid having to compile anything since that is a long process in this environment.

      Cheers - L~R

        You can do fcntl-based locking in pure Perl. Now that you mention that File::FcntlLock has an XS component, I bet it only does that to avoid what might seem somewhat hackish use of pack. I suspect google can pretty quickly find you examples of how to do this in plain Perl (or just follow the documentation -- it isn't rocket surgery).
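
A rough sketch of what fcntl-based locking in pure Perl can look like. The pack template is an assumption and the fragile part: it matches struct flock on 64-bit Linux, not AIX, and would have to be adapted from the system headers on any other platform. `try_fcntl_lock` is a hypothetical name.

```perl
use strict;
use warnings;
use Config;
use Fcntl qw(:DEFAULT :seek);   # F_SETLK and F_WRLCK by default; SEEK_SET via :seek

# ASSUMPTION: struct flock layout of 64-bit Linux:
#   short l_type; short l_whence; (4 bytes padding)
#   off_t l_start; off_t l_len; pid_t l_pid; (4 bytes padding)
# Other systems, AIX included, order and size these fields differently.
my $FLOCK_TEMPLATE = 's s x4 q q i x4';

# Hypothetical helper: try to take a non-blocking exclusive (write) lock
# on the whole file. Returns 1 on success, 0 (with $! set) on failure.
sub try_fcntl_lock {
    my ($fh) = @_;
    die "pack template only matches 64-bit Linux\n"
        unless $^O eq 'linux' && $Config{ptrsize} == 8;
    # l_start = 0 and l_len = 0 together mean "the entire file".
    my $lock = pack $FLOCK_TEMPLATE, F_WRLCK, SEEK_SET, 0, 0, 0;
    return fcntl($fh, F_SETLK, $lock) ? 1 : 0;
}
```

With a handle opened `'+<'` the call should succeed. With a handle opened `'<'` it should fail with EBADF, which is the condition the fcntl documentation quoted elsewhere in this thread describes.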

        - tye        

Re: Lightweight Solution To "Only 1 Process Running" On AIX (access)
by tye (Sage) on May 01, 2009 at 16:40 UTC

    fcntl on EBADF:

    The argument cmd is F_SETLK or F_SETLKW, the type of lock (l_type) is an exclusive lock (F_WRLCK), and fd is not a valid file descriptor open for writing.

    Open the file for write access.

    - tye        

      tye,
      Thanks but opening $0 for write access seems like a bad idea :-)

      almut provided the /dev/null solution to do the same thing though.

      Cheers - L~R

        Actually, the /dev/null solution I initially suggested is a bad idea :)  because you could only ever have one such program per system...

        Based on your response, I'm guessing that you are thinking of ">", which is "write access and create/truncate". There are lots of other ways to get write access to a file.
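
To illustrate the difference between the open modes, here is a small sketch that opens a file in a given mode, closes it again, and reports the resulting size. `size_after_open` is a hypothetical helper for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: open $path in $mode, close it, return the file size.
#   '+<'  read/write, file must exist, contents kept
#   '>>'  append (write access), contents kept
#   '>'   write, contents truncated to zero length
sub size_after_open {
    my ($path, $mode) = @_;
    open(my $fh, $mode, $path) or die "open '$mode' $path: $!";
    close $fh;
    return (stat $path)[7];
}
```

Both `'+<'` and `'>>'` give a handle with write access, which is what an exclusive fcntl-style lock needs, without touching the script's contents.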

        - tye        

Re: Lightweight Solution To "Only 1 Process Running" On AIX
by hossman (Prior) on May 01, 2009 at 23:10 UTC

    It's been 11 years since I've used an AIX machine, but if I remember correctly, binding a socket to a specific port works on AIX (and every other OS I've ever tried)...

        #!/usr/bin/perl -l
        use warnings;
        use strict;
        use Socket;

        my $sock_fh;
        BEGIN {
            # any constant specific to this app; must be a valid TCP port
            # (1-65535, and above 1023 unless running as root)
            my $port = 48765;
            socket($sock_fh, PF_INET, SOCK_STREAM, getprotobyname('tcp'))
                or die "socket: $!";
            bind($sock_fh, sockaddr_in($port, INADDR_LOOPBACK))
                or (warn "another process is already running ($!)\n" and exit(1));
        }

        # BEGIN: where you do your work
        print "send interrupt to kill process";
        while (<>) { sleep 1; }
        # END: where you do your work

        END {
            # plain close() works without loading IO::Handle
            close($sock_fh) if defined $sock_fh;
        }

Re: Lightweight Solution To "Only 1 Process Running" On AIX
by JavaFan (Canon) on May 01, 2009 at 14:58 UTC
    The error in both cases is "unable to lock: A file descriptor does not refer to an open file."

    Considering your code does a plain exit if the flock fails, I'm quite surprised you get any error message at all.

    Perhaps the code you are showing isn't the code you are running?

      JavaFan,
      The code shown is the extract from the slides presented by Dominus. I provided it here so people wouldn't have to go to an offsite location to see them. Here is complete running code that demonstrates the problem:
          #!/usr/bin/perl
          use strict;
          use warnings;
          use Fcntl qw/:DEFAULT :flock/;

          open(SELF, '<', $0) or die "Unable to open '$0' for reading: $!";
          flock SELF, LOCK_EX | LOCK_NB or exit;
          print "I must be the only process running\n";
          sleep 300;

      and

          #!/usr/bin/perl
          use strict;
          use warnings;
          use Fcntl qw/:DEFAULT :flock/;

          flock DATA, LOCK_EX | LOCK_NB or exit;
          print "I must be the only process running\n";
          sleep 300;
          __DATA__

      Cheers - L~R
