PerlMonks  

Proc::PID::File problem generating pid files, or: does it matter where a pid file lives?

by tospo (Hermit)
on May 21, 2010 at 16:24 UTC (#841112, perlquestion)
tospo has asked for the wisdom of the Perl Monks concerning the following question:

Hi all

I'm working on a daemon for some asynchronous processing and I'm using Proc::PID::File to check if the daemon is already running, as suggested here.
My problem boils down to the creation of the pid file. The following simple code:

    #!/usr/bin/env perl
    use Proc::PID::File;
    die "Already running!" if Proc::PID::File->running();
produces the following error:
pid "/var/run/proc_pid.pl.pid" already locked: Bad file descriptor
I guess this is due to permissions because I can solve it by pointing Proc::PID::File to a different directory:
    #!/usr/bin/env perl
    use Proc::PID::File;
    die "Already running!" if Proc::PID::File->running(dir => "/my/home/dir");
This works fine and if I use that in my daemon script it also works and "ps" also lists it fine.
My main question is: is there a problem with putting pid files in non-standard locations? Thanks for your help!

Re: Proc::PID::File problem generating pid files, or: does it matter where a pid file lives?
by JavaFan (Canon) on May 21, 2010 at 16:52 UTC
    My main question is: is there a problem with putting pid files in non-standard locations?
    Not in principle. Pid files have their problems (something may throw away the file, the pid may have been reused, race conditions if you don't use locking (but if you use locking, why have a pid file?)), but they are just a file. Of course, if some program checks for pid files, it may need to be told where the pid file is, but other than that, not really. You may run a higher risk of accidentally deleting the pid file if it lives in your home dir, but that's more of a procedural issue.
      Not in principle. Pid files have their problems (something may throw away the file, the pid may have been reused, race conditions if you don't use locking (but if you use locking, why have a pid file?)), but they are just a file.
      In discussing whether or not a "pid" (or "lock") file is good or bad, it is useful to define what one should be used for. The point of the lock file is to signal other processes that access to a system resource is controlled (for whatever reason).
      Take a simple, single-user environment. If I start a program in a window that uses several specific data files, and those files should only be updated by my program, the simple way is to implement file locking. That way, if another instance of the program is started, it will fail on file access. Not very pretty (if multiple files are accessed, you have to lock all of them), but effective. A better way is to check for the existence of a lock file, a way of saying "this set of resources is in use". Cleanup is easy. It also allows for graceful recovery when the user mistakenly starts another instance of the program.
      Now, extend that to long running processes (daemons or server processes) that require exclusive access to specific system resources (database, web, ftp etc). Such access may be relatively easy to handle (network ports, etc), or relatively expensive (cleanly shutting down a large database server can be time consuming, with failure extremely painful). Under these circumstances, you would need to be able to control the server from an arbitrary location (not necessarily the terminal process from which it was started).
      Enter the "pid" file (a file whose existence signals that the service is, or was, running). The content of the file is the process id (hence the name) of the controlling process. The file's (non)existence can communicate the state of the process:
      • File exists, PID points to running process:
        Controlling process is running, should be available (not always true).
      • File exists, PID points to non-existent process:
        Process was running, but did not shut down cleanly. Some cleanup will probably be necessary.
      • File does not exist, but process appears in system process table:
        • Programming logic error: Process did not properly create environment on startup (more information later).
        • Some other process deleted the PID file.
      • File does not exist and process does not appear in system process table:
        Startup should be safe.
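
The states listed above can be told apart with a few lines of Perl. This is only a minimal sketch: pid_file_state is a hypothetical helper name, and it assumes the pid file holds a single decimal process id on its first line. It cannot detect the third case (no file, but the process is running), since that needs a look at the process table rather than the file.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a pid file as 'running', 'stale' or 'absent'.
# A file that cannot be opened for any reason is reported as 'absent'.
sub pid_file_state {
    my ($path) = @_;

    open my $fh, '<', $path or return 'absent';
    my $line = <$fh>;
    close $fh;

    # An empty or garbled file says nothing about a live process.
    my ($pid) = defined $line ? $line =~ /^\s*(\d+)\s*$/ : ();
    return 'stale' unless defined $pid && $pid > 0;

    # Signal 0 only performs the existence and permission check; nothing
    # is delivered. EPERM means the process exists but belongs to
    # another user.
    return 'running' if kill(0, $pid) || $!{EPERM};
    return 'stale';
}
```

Note that 'running' only means that some process currently has that id. If the pid has been reused by an unrelated program, this check cannot tell the difference.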
      When starting up such a server process (which will usually be started as the "root" user), the following sequence should normally be followed:
      1. Check for existence and program access to:
        • PID directory (typically /var/run or /usr/local/var/run)
        • Log file directory
        • Data file directory
        • Other system resources (network ports, specific hardware)
      2. Reset user credentials (first reset group ID, then user ID)
      3. Create PID file and lock for exclusive access.
      4. Open log file(s)
      5. Allocate resource(s)
      Orderly process shutdown is generally in the reverse order.
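
Step 3 of the sequence above (create the PID file and lock it for exclusive access) might look roughly like this. It is a sketch under assumptions, not what Proc::PID::File does internally: acquire_pid_file is a hypothetical name, and the directory checks and credential changes from steps 1 and 2 are left out.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);
use IO::Handle;

# Hypothetical helper for step 3 only. Returns the open, locked handle on
# success, or undef if another process already holds the lock. The caller
# must keep the handle open for the lifetime of the daemon, because
# closing it releases the lock.
sub acquire_pid_file {
    my ($path) = @_;

    # Open without truncating, so a running instance's pid is not wiped
    # before we know whether we own the lock.
    sysopen(my $fh, $path, O_RDWR | O_CREAT, 0644)
        or die "Cannot open $path: $!";

    # Non-blocking: fail at once instead of waiting for the other instance.
    flock($fh, LOCK_EX | LOCK_NB) or return;

    # We hold the lock, so it is now safe to replace the contents.
    truncate($fh, 0)    or die "Cannot truncate $path: $!";
    print {$fh} "$$\n"  or die "Cannot write $path: $!";
    $fh->flush;
    return $fh;
}
```

A daemon would call it once at startup, for example my $pid_fh = acquire_pid_file($path) or die "Already running!\n";, and hold on to $pid_fh until it exits.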

      As noted above, the PID file is typically located in /var/run (or /usr/local/var/run). This allows for orderly and generic startup and shutdown procedures, as well as troubleshooting. If you do not have to go searching for pid files, it is much quicker.

      The PID file also serves another purpose: it allows external resources to communicate with the service in a specified manner. Take, for instance, the "cron" process, which runs scheduled jobs. To change what it runs, you can either edit the configuration files and restart the process, or you can send it a message to reread its files. The "standard" unix method for maintaining the configuration is through the "crontab" program. When the configuration is changed (by crontab -e or crontab -r), the program sends a "SIGHUP" signal to the process whose id is contained in the PID file. The cron process does not shut down, but simply rereads its configuration.
      The "syslog" daemon works on the same basis. If you were to shut down the daemon, then restart it, you would lose messages. Instead, it simply rereads the configuration file on the fly and acts appropriately.
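
The signalling pattern described above has two halves: a daemon that reacts to SIGHUP, and a controller that looks up the pid in the PID file. A rough sketch of both follows. The names $reload_requested and signal_from_pid_file are made up for illustration, and the daemon's main loop and config reader are only implied.

```perl
use strict;
use warnings;

# Daemon side: the handler only sets a flag. The daemon's main loop would
# check the flag and reread its configuration when it is set.
our $reload_requested = 0;
$SIG{HUP} = sub { $reload_requested = 1 };

# Controller side (hypothetical helper): read the pid from the pid file
# and send it a signal. Returns the pid that was signalled, or undef if
# the file is unusable or the signal could not be delivered.
sub signal_from_pid_file {
    my ($path, $signal) = @_;

    open my $fh, '<', $path or return;
    my $line = <$fh>;
    close $fh;

    my ($pid) = defined $line ? $line =~ /^\s*(\d+)\s*$/ : ();
    return unless defined $pid && $pid > 0;

    return kill($signal, $pid) ? $pid : undef;
}
```

A controller script would then call something like signal_from_pid_file('/var/run/mydaemon.pid', 'HUP'), where the path is whatever location the daemon was told to use.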

      Yes, the use of such files adds some complexity to the program, but in many cases, the added functionality is worth it (at least in my not so humble opinion :-D ). YMMV.

      Update: Added explicit reference to pid file location.

        1. Check for existence and program access to:
          • PID directory (typically /var/run or /usr/local/var/run)
          • Log file directory
          • Data file directory
          • Other system resources (network ports, specific hardware)
        2. Reset user credentials (first reset group ID, then user ID)
        3. Create PID file and lock for exclusive access.
        4. Open log file(s)
        5. Allocate resource(s)

      Are you aware that this still has a race condition? You run a lot of tests in step 1, most of those tests involve system calls. Step 2 has two system calls. Each and every system call may cause a task switch to a malicious program that -- with a little bit of luck and good timing -- can change what you checked for in step 1, causing the following steps to fail rather unexpectedly. And each and every system call may cause a task switch to a second instance fighting for the PID file.

      Daemons do not need PID files, and most daemons contain code that they don't really need, for backgrounding, logging, restarting, dropping privileges, and to prevent multiple instances. The daemontools reduce code complexity in daemons and they take care of backgrounding, logging, restarting, dropping privileges, and single instances. Even communication via signals works completely without PID files (with a patch, SIGUSR1 and SIGUSR2 can also be sent). Daemontools may look strange, and some of DJB's decisions (errno, location in filesystem, ...) may cause a little bit of confusion, but once you understand what happens, the daemontools are the most natural way to implement daemons on Unix and derivatives.

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
        The PID file also serves another purpose: it allows external resources to communicate with the service in a specified manner.
        I would say, that's the only reason to use PID files. There's no need to use PID files to prevent simultaneous access to resources; for that, lock files are enough. And if all you care about is preventing concurrent running of the same program (which is what the OP needs), all you need to do is obtain a lock on yourself (no external files needed):
        use Fcntl qw(:flock);  # script must also contain a __DATA__ (or __END__) section
        flock DATA, LOCK_EX | LOCK_NB or die "Another instance is already running";
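
A slightly fuller sketch of the "lock on yourself" idea is below. It is a variation, not the exact one-liner: instead of the DATA handle it opens the script file itself (by default $0), and it adds LOCK_NB so that a second instance fails immediately rather than waiting for the first one to finish. lock_self is a hypothetical name.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical helper: take a non-blocking exclusive lock on a file, by
# default the running script itself ($0). Returns the locked handle, or
# undef if another process holds the lock. Keep the handle alive for as
# long as the lock is needed; the lock is dropped when it is closed or
# when the process exits.
sub lock_self {
    my ($path) = @_;
    $path = $0 unless defined $path;

    open my $fh, '<', $path or die "Cannot open $path: $!";
    return flock($fh, LOCK_EX | LOCK_NB) ? $fh : undef;
}
```

Typical use would be my $lock = lock_self() or die "Another instance is already running\n"; near the top of the script. One caveat: on platforms where Perl emulates flock with fcntl, an exclusive lock needs a handle opened for writing, so the read-only open here would have to change.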
Re: Proc::PID::File problem generating pid files, or: does it matter where a pid file lives?
by graff (Chancellor) on May 22, 2010 at 03:57 UTC
    Have you tried putting the pid file in /tmp? That's a more typical location for processes that don't run with root privileges, because there are no permission problems there (every user can write to /tmp, whereas write access to /var/run/ is more likely to be restricted).
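
One way to act on this suggestion is to try the standard location first and fall back to the temp directory. The sketch below assumes a hypothetical helper named pick_pid_dir. The writability test is only advisory, since permissions can change between the check and the actual file creation.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the first candidate that is an existing,
# writable directory, or the system temp directory (usually /tmp on
# Unix) if none qualifies.
sub pick_pid_dir {
    my @candidates = @_;
    for my $dir (@candidates) {
        return $dir if defined $dir && -d $dir && -w _;
    }
    return File::Spec->tmpdir;
}
```

The result can be passed to the dir option already shown in the question, for example Proc::PID::File->running(dir => pick_pid_dir('/var/run')).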
Re: Proc::PID::File problem generating pid files, or: does it matter where a pid file lives?
by nagalenoj (Friar) on May 22, 2010 at 05:49 UTC

    From my point of view, it is a matter of accessibility to processes.

    Before selecting the pid file directory, take note of the following points:

    * Number of users accessing the machine (either locally or remotely).
    * Access permissions on the directory and the pid file for other users. The file can be removed, renamed, or modified (knowingly or unknowingly) if it is accessible to other users, so grant only as much permission as is needed.
    * Other processes that may need to access the pid file for inter-process communication (e.g. one process could read the pid file and send a signal to your process).

    Overlooking the above points could create unnecessary problems sooner or later.

Re: Proc::PID::File problem generating pid files, or: does it matter where a pid file lives?
by tospo (Hermit) on May 24, 2010 at 08:22 UTC

    Thanks a lot guys for all your helpful comments. Wow, proceng, that's a whole book chapter you wrote there - thanks for taking the time to explain all of that!
