Following the debugging technique of "imagine how something might
happen, then go confirm it," here's a story for how the behavior tye
observes might happen. It involves an imagined implementation of FreeBSD flock(2), and might provide some guidance for someone who cares to dig into the FreeBSD source.
Assume an OS implementation of flock() that either intentionally or inadvertently gives priority to non-blocking requests. That is, a non-blocking flock() request will be satisfied without unblocking other processes that are waiting to acquire a lock, even though the non-blocking request releases its prior lock first. (Ignore whether this is sensible, and just assume that it's coded that way.)
Now consider this scenario: Process A holds a shared lock on F. Process B blocks on a blocking request to acquire an exclusive lock. Process A makes a non-blocking request to "upgrade" its lock to exclusive. Now, according to the flock(2) man page, this means releasing the shared lock first. But, since the request is a non-blocking one, and since the flock() routine is coded to give priority to non-blocking requests, process A acquires an exclusive lock, even though B was waiting first. B is still blocked. Following the same logic, A can then repeatedly "downgrade" the lock to shared, and upgrade to exclusive, all without unblocking B. B is starved until either A makes a blocking flock() request, or A releases the lock by an explicit close or by process termination.
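To make the story concrete, here is a toy Python model of the imagined policy. It is only a sketch of the hypothesis above, not FreeBSD's actual code: the ToyFlock class, its method names, and the "granted"/"blocked"/"EWOULDBLOCK" return strings are all invented for illustration, and "processes" are just labels rather than real processes.

```python
class ToyFlock:
    """Toy model of the *imagined* flock() policy (hypothetical, not FreeBSD's)."""

    def __init__(self):
        self.holders = {}   # process name -> "SH" or "EX"
        self.waiters = []   # queued blocking requests, in arrival order

    def _compatible(self, proc, mode):
        # An exclusive lock needs no other holders; a shared lock only
        # needs that no other process holds an exclusive lock.
        others = [m for p, m in self.holders.items() if p != proc]
        return not others if mode == "EX" else "EX" not in others

    def _wake(self):
        # Grant queued blocking requests in FIFO order while possible.
        while self.waiters and self._compatible(*self.waiters[0]):
            proc, mode = self.waiters.pop(0)
            self.holders[proc] = mode

    def request(self, proc, mode, nonblocking=False):
        # Per flock(2), converting a lock releases the old one first.
        self.holders.pop(proc, None)
        if nonblocking:
            # The assumed quirk: a non-blocking request is satisfied
            # immediately if it can be, WITHOUT waking queued waiters.
            if self._compatible(proc, mode):
                self.holders[proc] = mode
                return "granted"
            self._wake()
            return "EWOULDBLOCK"
        # A blocking request goes to the back of the queue.
        self.waiters.append((proc, mode))
        self._wake()
        return "granted" if self.holders.get(proc) == mode else "blocked"


lock = ToyFlock()
first = lock.request("A", "SH")      # A takes a shared lock
b_result = lock.request("B", "EX")   # B queues behind A

cycles = []
for _ in range(3):
    # A upgrades and downgrades with non-blocking requests only.
    cycles.append(lock.request("A", "EX", nonblocking=True))
    cycles.append(lock.request("A", "SH", nonblocking=True))

starved = ("B", "EX") in lock.waiters  # is B still waiting?
final = lock.request("A", "EX")        # A finally makes a blocking request

print(first, b_result, cycles, starved, final, lock.holders)
```

In this model B stays queued for as long as A sticks to non-blocking requests, and B gets the lock only once A issues a blocking request and lands behind B in the queue. Whether the real kernel behaves this way is exactly the open question.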
This is how it might happen, given the code tye provides. Can someone with access to FreeBSD sources (and the will to use them) confirm whether this is what's going on?