I guess I still do not understand the basic objection to flock - does it not do what it claims to? My understanding is that the OS ensures that only one process can lock a file at a time. If process #2 changes a file between the
time that process #1 opens it and locks it, what harm is done? #1 locks it, and then has #2's changes. As long as
all your processes are using flock, I still can't quite
see the problem with the race condition. Could you please explain it again? Thanks.
P.S. I will be out of the country, so may not reply for a
week, but I am interested in this. :)
I don't object to flock() at all. It does what it claims to do, but using it incorrectly can be the problem. A race condition arises when you have this type of run of events, which is somewhat common, especially in older scripts (I have seen this a lot in CGI scripts):
- Open FH for reading
- Lock FH
- Read FH
- Close FH
- Re-open FH for writing
- Lock FH
- Write to FH
- Close FH
Here you should be able to see the race. Another process can get an exclusive lock on the FH during the read open (read-only opens don't generally get exclusive locks), and between the close of the read and the open of the write. Hence, you can have multiple processes working on the file in a way you do not want, which could corrupt your data ("Hey! Why is my counter file suddenly blank??").
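The racy flow above might look like this in Perl — a sketch, with `counter.txt` as a hypothetical data file. The gap between the first close and the second open is where another process can slip in (and note that `>` clobbers the file before the second lock is even attempted):

```perl
use Fcntl qw(:flock);

# set up a demo counter file so the sketch is runnable stand-alone
open(my $init, '>', 'counter.txt') or die "open: $!";
print $init "1\n";
close($init);

# Step 1: open read-only and lock
open(my $fh, '<', 'counter.txt') or die "open: $!";
flock($fh, LOCK_SH)              or die "flock: $!";
my $count = <$fh>;
close($fh);                      # the lock is released here!

# RACE WINDOW: between the close above and the open below,
# another process can read or rewrite the file.

# Step 2: re-open for writing; '>' clobbers the file contents
# *before* the lock below is acquired
open($fh, '>', 'counter.txt')    or die "open: $!";
flock($fh, LOCK_EX)              or die "flock: $!";
print $fh $count + 1, "\n";
close($fh);
```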
Consider this flow:
- Proc A open FH for write (using > which clobbers the file contents)
- Proc B opens FH for reading (no lock attempt since it won't likely get an exclusive lock granted)
- Proc A locks FH
- Proc A works with FH
- Proc A closes FH
One race concern here is that if another process wants to read the contents of this file, it will get garbage since proc A clobbered the file contents. Having proc B attempt an exclusive lock is futile since they are not generally granted to r/o opens. By using semaphores, you can avoid this situation.
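One common way to apply the semaphore idea is to lock a separate sentinel file, and only touch the data file once that lock is held — so the data file is never clobbered while anyone else can see it. A sketch, with `data.lock` and `data.txt` as hypothetical names:

```perl
use Fcntl qw(:flock);

# Lock a dedicated semaphore file first; '>>' creates it if needed
# without touching its contents.
open(my $sem, '>>', 'data.lock') or die "open lock: $!";
flock($sem, LOCK_EX)             or die "flock: $!";

# Only now clobber the data file. A reader following the same
# protocol must also hold data.lock first, so it can never open
# data.txt while it is empty or half-written.
open(my $fh, '>', 'data.txt')    or die "open data: $!";
print $fh "fresh contents\n";
close($fh);

close($sem);                     # releases the semaphore lock
```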
These are just two examples (there is also the issue of the hardware not physically being done writing to disk before another process opens the file). A good idea is to write the flow of your locks on a whiteboard and see what would happen if multiple processes were running that same flow at once (I generally add sleeps at key points to show myself this, like in the example script in node 14140).
I hope this makes more sense; if not, let me know.
Cheers,
KM
Well, using anything incorrectly can lead to problems, but I still think a simple flock is best - just be careful about it.
> Here you should be able to see the race. Another process
> can get an exclusive lock on the FH during the read open
> (read-only opens don't generally get exclusive locks),
> and between the close of the read and open of the write.
I don't agree with this. First, another process cannot get
an exclusive lock while *any* other lock is on the file. So if
process A locks a file for reading (shared lock) and then
process B tries to get an exclusive lock, process B cannot
get the lock until *all* the locks are gone - shared
and exclusive. In the second case, yes it's a problem, but
that's a bad coding problem, not a problem with flock.
The right way to do it of course is to open the file for
read/write, get an exclusive lock, read in the file, rewind
(usually), write to the file, and close it, releasing the
exclusive lock.
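That flow might be sketched like this (using a hypothetical `counter.txt`; `+<` opens the file read/write without clobbering it, so nothing is destroyed before the lock is actually held):

```perl
use Fcntl qw(:flock :seek);

# set up a demo counter file so the sketch is runnable stand-alone
open(my $init, '>', 'counter.txt') or die "open: $!";
print $init "1\n";
close($init);

# '+<' opens read/write without touching the existing contents
open(my $fh, '+<', 'counter.txt') or die "open: $!";
flock($fh, LOCK_EX)               or die "flock: $!"; # blocks until all other locks are gone

my $count = <$fh>;                # read in the file
seek($fh, 0, SEEK_SET)            or die "seek: $!";  # rewind
truncate($fh, 0)                  or die "truncate: $!";
print $fh $count + 1, "\n";       # write the new value

close($fh);                       # releases the exclusive lock
```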
> - Proc A open FH for write (using > which clobbers the file contents)
> - Proc B opens FH for reading (no lock attempt since it won't likely get an exclusive lock granted)
> - Proc A locks FH
> - Proc A works with FH
> - Proc A closes FH
No need for a semaphore, just change the above a bit:
- Proc A opens FH for read/write (the file is not changed at all yet)
- Proc B opens FH for reading (and gets a shared lock)
- Proc A locks FH exclusively, after B has released its shared lock
- Proc A works with FH
- Proc A closes FH
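Proc B's side of that flow is just a shared lock on a read-only open, which flock grants without trouble — a sketch, again with a hypothetical `counter.txt`:

```perl
use Fcntl qw(:flock);

# set up a demo counter file so the sketch is runnable stand-alone
open(my $init, '>', 'counter.txt') or die "open: $!";
print $init "5\n";
close($init);

# Proc B: read-only open with a *shared* lock
open(my $fh, '<', 'counter.txt') or die "open: $!";
flock($fh, LOCK_SH)              or die "flock: $!"; # blocks while a writer holds LOCK_EX
my $count = <$fh>;
close($fh);                      # releases the shared lock

print "counter is $count";
```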
And yes, I need to update my tutorial. :)