http://www.perlmonks.org?node_id=185704


in reply to flock() broken under FreeBSD?

The important part is:
64310 shares.
64310 waiting for previous instance(s) to exit...
64308 owns
which shows process 64310 getting a shared lock (and holding it) and then the other process (64308) successfully getting an exclusive lock.
Are you assuming that you can hold the lock while up- or down-grading it? I did, but the FreeBSD flock(2) man page provides this enlightenment:
A shared lock may be upgraded to an exclusive lock, and vice versa, simply by specifying the appropriate lock type; this results in the previous lock being released and the new lock applied (possibly after other processes have gained and released the lock).
I'm guessing that when 64310 hits the (blocking) flock( \*DATA, LOCK_EX ); call, the shared lock it held is released and 64308 sneaks in to own the lock, leaving 64310 blocked. Then 64310 seems to go away, though I can't tell whether that's permanent or a side effect of your killing off the scripts.
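If the man page is taken at face value, an "upgrade" is really an unlock followed by a fresh lock request. Here is a minimal sketch of coding to that assumption: do the two steps explicitly, then re-read the file once the exclusive lock is held. The helper names (upgrade_and_recheck, slurp) are made up for illustration and are not from the original locktest script.

```perl
use strict;
use warnings;
use Fcntl qw(:flock :seek);

# Hypothetical helper: read the whole file from the start.
sub slurp {
    my ($fh) = @_;
    seek($fh, 0, SEEK_SET) or die "seek failed: $!";
    local $/;                          # slurp mode
    my $data = <$fh>;
    return defined $data ? $data : '';
}

# Hypothetical helper: trade a shared lock for an exclusive one and
# report whether the file changed while no lock was held.
# Assumes the caller already holds LOCK_SH on $fh.
sub upgrade_and_recheck {
    my ($fh) = @_;

    my $before = slurp($fh);           # state seen under the shared lock

    # Treat the upgrade as "unlock, then lock" so the gap is visible
    # in the code instead of hidden inside a single flock() call.
    flock($fh, LOCK_UN) or die "unlock failed: $!";
    flock($fh, LOCK_EX) or die "exclusive lock failed: $!";

    my $after = slurp($fh);            # state seen under the exclusive lock
    return $before eq $after ? 1 : 0;  # 1 = nothing changed in the gap
}
```

A caller that gets 0 back should throw away whatever it decided under the shared lock and start over from the fresh contents.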

Re: Re: flock() broken under FreeBSD?
by dws (Chancellor) on Jul 27, 2002 at 17:25 UTC
    Following the debugging technique of "imagine how something might happen, then go confirm it," here's a story for how the behavior tye observes might happen. It involves an imagined implementation of FreeBSD flock(2), and might provide some guidance for someone who cares to dig into the FreeBSD source.

    Assume an OS implementation of flock() that either intentionally or inadvertently gives priority to non-blocking requests. That is, a non-blocking flock() request will be satisfied without unblocking other processes that are waiting to acquire a lock, even though the non-blocking request releases its prior lock first. (Ignore whether this is sensible, and just assume that it's coded that way.)

    Now consider this scenario: Process A holds a shared lock on F. Process B blocks on a blocking request to acquire an exclusive lock. Process A makes a non-blocking request to "upgrade" its lock to exclusive. Now, according to the flock(2) man page, this means releasing the shared lock first. But, since the request is a non-blocking one, and since the flock() routine is coded to give priority to non-blocking requests, process A acquires an exclusive lock, even though B was waiting first. B is still blocked. Following the same logic, A can then repeatedly "downgrade" the lock to shared, and upgrade to exclusive, all without unblocking B. B is starved until either A makes a blocking flock() request, or A releases the lock by an explicit close or by process termination.
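Process A's side of that story can be sketched in a few lines. This is only an illustration of the request pattern, not tye's actual test code; toggle_lock is a made-up name. Whether a blocked process B really starves while this runs depends on the kernel, which is exactly the open question.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical helper playing the role of process A: flip a lock
# between exclusive and shared using only non-blocking requests.
# Returns an array ref describing what happened at each step.
sub toggle_lock {
    my ($fh, $rounds) = @_;
    my @log;
    for my $round (1 .. $rounds) {
        for my $step ([LOCK_EX, 'owns'], [LOCK_SH, 'shares']) {
            my ($mode, $label) = @$step;
            if (flock($fh, $mode | LOCK_NB)) {
                push @log, $label;
            }
            else {
                # The conversion failed. The previous lock may already
                # have been released, so stop instead of assuming it
                # is still held.
                push @log, "failed: $!";
                return \@log;
            }
        }
    }
    return \@log;
}
```

Run it in one process while a second process sits in a blocking flock($fh, LOCK_EX) on the same file to see which side the OS favors.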

    This is how it might happen, given the code tye provides. Can someone with access to FreeBSD sources (and the will to use them) confirm whether this is what's going on?

      Well I tried my code on Linux and the failure case looks a little different:

      $ ./locktest & sleep 4; ./locktest
      [1] 1553
      Using flock()...
      1553 shares.
      1553 owns
      1553 shares
      Using flock()...
      1557 shares.
      1557 waiting for previous instance(s) to exit...
      1553 owns
      1557 owns
      Running...
      1557 owns
      1553 can't revert self lock to shared: Resource temporarily unavailable
      1557 shares
      ^C
      [1]+ Exit 11 ./locktest
      Which demonstrates that Linux doesn't have the strong preference for non-blocking requests like FreeBSD appears to have.

      Having lock up-/down-grading introduce a race condition where the lock is freed first is such a horrid design choice to my mind that I didn't even consider the possibility when reading "man flock" (this is not even mentioned in Linux's extremely short version of "man flock" tho my test cases show that it is the case there as well).

      Thanks for the enlightenment. Now I have one more reason to hate flock. I should find a module that provides a convenient wrapper for fcntl locks... (:

              - tye (but my friends call me "Tye")
        The FreeBSD man pages hint at this behavior:

        A shared lock may be upgraded to an exclusive lock, and vice versa, simply by specifying the appropriate lock type; this results in the previous lock being released and the new lock applied (possibly after other processes have gained and released the lock).

        Requesting a lock on an object that is already locked normally causes the caller to be blocked until the lock may be acquired. If LOCK_NB is included in operation, then this will not happen; instead the call will fail and the error EWOULDBLOCK will be returned.
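In Perl the LOCK_NB case looks roughly like this. It is a small sketch with a made-up helper name (try_lock); it treats EWOULDBLOCK and EAGAIN alike, since on many systems they are the same value.

```perl
use strict;
use warnings;
use Errno;                 # makes the %! hash of errno names available
use Fcntl qw(:flock);

# Hypothetical helper: returns 1 if the lock was acquired, 0 if someone
# else holds a conflicting lock, and dies on any other error.
sub try_lock {
    my ($fh, $mode) = @_;
    return 1 if flock($fh, $mode | LOCK_NB);
    return 0 if $!{EWOULDBLOCK} || $!{EAGAIN};
    die "flock failed: $!";
}
```

A 0 from try_lock($fh, LOCK_EX) only says the lock was busy at that instant. If $fh held a lock before the call, do not assume it still does afterwards.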


        So, if you try to upgrade your shared lock, you release the lock and get in line behind everyone else blocked for an exclusive lock. If you try to downgrade your exclusive lock to a shared lock, you release the exclusive lock and get in line (behind everyone blocking for an exclusive lock) for a shared lock.
        What would you want to happen if 2 processes get read-only locks and then try to "upgrade" them at the same time to read-write? What reasonable semantics can you suggest for resolving this?

        If there is no reasonable way to implement upgrades, then it is not unreasonable to say, Silly programmer. Upgrades don't exist. If you will want to write, then you should tell me that from the start!

        Getting a shared lock is a promise that you won't be doing any modifications based on your read. Don't make that promise if it isn't true.
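In code, "tell me that from the start" means taking LOCK_EX before the read whenever a write may follow. A minimal sketch, assuming a file that holds a single integer; increment_counter is a made-up example, not something from the original script.

```perl
use strict;
use warnings;
use Fcntl qw(:flock :seek);

# Hypothetical read-modify-write that never upgrades: the exclusive
# lock is taken before the read and held until the handle is closed.
sub increment_counter {
    my ($path) = @_;
    open(my $fh, '+<', $path) or die "cannot open $path: $!";
    flock($fh, LOCK_EX)       or die "cannot lock $path: $!";

    my $count = <$fh>;
    $count = 0 unless defined $count;
    chomp $count;
    $count = 0 unless $count =~ /^\d+$/;   # treat junk as zero
    $count++;

    seek($fh, 0, SEEK_SET)  or die "seek failed: $!";
    truncate($fh, 0)        or die "truncate failed: $!";
    print {$fh} "$count\n"  or die "write failed: $!";
    close($fh)              or die "close failed: $!";  # flushes, then releases the lock
    return $count;
}
```

Readers that really are read-only can keep using LOCK_SH; only the path that might write pays for the exclusive lock.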