PerlMonks  

flock and read-ahead buffering on input

by dbooth (Novice)
on Apr 17, 2014 at 19:54 UTC

dbooth has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to figure out how flock interacts with read-ahead buffering on input, and I have not been able to find any documentation about it. Suppose I open a file and lock it for exclusive access with flock, as apparently advised in the Perl Cookbook:

use Fcntl qw(:DEFAULT :flock);   # O_RDWR, O_CREAT, LOCK_EX
sysopen(FH, "numfile", O_RDWR|O_CREAT) or die "can't open numfile: $!";
flock(FH, LOCK_EX) or die "can't write-lock numfile: $!";
$num = <FH> || 0;

(I am using an exclusive lock because after reading the file, I intend to re-write it.) Between the calls to sysopen and flock, the file has been opened but has not yet been locked. Therefore, if a read-ahead buffer were filled when the file was sysopen'ed, and another process wrote to the file after that but before flock was called, then it would seem to me that the example above would be unsafe, because reading (from the read-ahead buffer) immediately after the flock could return the old file contents instead of the current contents . . . unless flock automatically invalidates the read-ahead buffer, or the read-ahead buffer is not filled until the file is actually read. But the perl documentation on sysopen, read and flock says nothing about read-ahead buffering, so I am left to guess.

Can anyone point to conclusive documentation that indicates whether the above code is actually safe (i.e., that the read operation is guaranteed to return the current file content)? Or documentation about when read-ahead buffering is done? Or how it interacts (or not) with flock? Or explain why the above code is correct, in light of this analysis?

Replies are listed 'Best First'.
Re: flock and read-ahead buffering on input
by wazat (Monk) on Apr 18, 2014 at 03:06 UTC

    I'm not sure I understand your question.

    Assuming advisory locking, your exclusive lock is useless unless other processes are also using locking.

    If other processes are using locking, then they should obtain at least a shared lock before reading. If they obtain a lock before you attempt an exclusive lock, then you won't get your lock. If you get your lock first, then they won't get one.

    I assume that by read-ahead buffering you are referring to read-ahead by the OS kernel. Any writes by your process should invalidate the relevant block buffered in the kernel.

    If you are referring to buffering in the user space of the other processes via the standard IO library, I don't think this will happen unless the other processes perform a read of the file before trying to get a lock.

    If you are dealing with networked file systems, things become murky.

    I'm most familiar with *NIX. If you are running on MS Windows then behaviour may differ.
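
The point above about readers taking at least a shared lock can be sketched as follows. This is only an illustration, not code from the thread: the helper name `read_number_shared` is made up, and it assumes the file holds a single number on its first line, as in the original question.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical helper: read the number stored in $path under a shared lock.
# It only cooperates with writers that also call flock on the same file.
sub read_number_shared {
    my ($path) = @_;

    open(my $fh, '<', $path) or die "can't open $path: $!";

    # Waits here while another process holds LOCK_EX on the file.
    flock($fh, LOCK_SH) or die "can't read-lock $path: $!";

    # The first read happens only after the lock has been granted.
    my $line = <$fh>;
    close($fh);    # closing the handle releases the lock

    return 0 unless defined $line;
    chomp $line;
    return $line =~ /^\d+$/ ? $line + 0 : 0;
}
```

Several readers can hold LOCK_SH at the same time; a writer asking for LOCK_EX waits until all of them have released it.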

Re: flock and read-ahead buffering on input
by eyepopslikeamosquito (Archbishop) on Apr 17, 2014 at 22:34 UTC

    FWIW, I have successfully used Perl flock on both Unix (many flavors) and Windows for many years. Essentially, I use the approach described here. I found that using a separate zero-size lock file to (advisory) lock the (whole) file being updated, rather than attempting to lock the file being changed, improved robustness and overall simplicity of the code, especially on Windows, where I ran into a number of glitches when attempting to lock the file being modified.
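
A minimal sketch of the separate-lock-file idea described above. The linked approach is not reproduced here, so the details are assumptions: the helper name `with_lock_file` and the `"$path.lock"` naming convention are invented for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical helper: run $code while holding an exclusive lock on a
# separate lock file. "$path.lock" is an assumed naming convention.
sub with_lock_file {
    my ($path, $code) = @_;
    my $lock_path = "$path.lock";

    # Append mode creates the lock file if it is missing and never truncates it.
    open(my $lock_fh, '>>', $lock_path)
        or die "can't open $lock_path: $!";
    flock($lock_fh, LOCK_EX)
        or die "can't lock $lock_path: $!";

    # The data file is touched only inside the callback, after the lock is held.
    my @result = eval { $code->($path) };
    my $err = $@;

    close($lock_fh);    # closing the handle releases the lock
    die $err if $err;   # re-throw any error from the callback
    return wantarray ? @result : $result[0];
}
```

Because the data file is opened only inside the callback, no handle on it exists before the lock is held, which sidesteps the sysopen-then-flock gap the question asks about. As with any advisory lock, it only helps if every process that touches the data file goes through the same lock file.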

Re: flock and read-ahead buffering on input
by dave_the_m (Monsignor) on Apr 17, 2014 at 21:18 UTC
    I find your description of a potential problem hard to follow. What actor are you worried about performing the buffering? The process running your code above? Some other process? The OS?

    Your code example looks perfectly safe. Except note that flock doesn't do mandatory locking. Locking the file doesn't stop other processes doing something with the file; it only stops other processes locking the file. So if all processes that access the file are under your control, and you ensure that all such processes do a sysopen/flock before doing anything else with the file, then this code should be safe.

    (Except if the file is being accessed over a network, in which case all bets are probably off)

    Dave.
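
For reference, the full sysopen/flock/read/rewrite cycle discussed above might look like the sketch below. It follows the pattern from the original question; the helper name `increment_counter` and the choice to increment the number are assumptions made for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock :seek);

# Hypothetical helper: increment the number stored in $path.
# Safe only if every process touching the file takes the same flock first.
sub increment_counter {
    my ($path) = @_;

    sysopen(my $fh, $path, O_RDWR | O_CREAT)
        or die "can't open $path: $!";
    flock($fh, LOCK_EX)
        or die "can't write-lock $path: $!";

    # First read on this handle, issued only after the lock is held.
    my $line = <$fh>;
    my $num  = 0;
    if (defined $line) {
        chomp $line;
        $num = $line if $line =~ /^\d+$/;
    }
    $num++;

    # Rewind and rewrite the file while still holding the lock.
    seek($fh, 0, SEEK_SET) or die "can't rewind $path: $!";
    truncate($fh, 0)       or die "can't truncate $path: $!";
    print {$fh} "$num\n"   or die "can't write $path: $!";

    # close flushes the buffered write and then releases the lock.
    close($fh) or die "can't close $path: $!";
    return $num;
}
```

The lock is released by close rather than by an explicit LOCK_UN, so the buffered output is flushed before any other process can acquire the lock.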

      I am worried about perl or the OS performing read-ahead buffering as a result of the sysopen call. I am aware that flock is advisory locking, not mandatory. And I am assuming a local file system.

      Clearly read-ahead buffering is normally done (at least on linux), since this post, for example, discusses controlling linux's read-ahead buffer size, and this post discusses controlling the read-ahead buffer size from perl. Therefore, given that read-ahead buffering is normally done, how can the above code possibly be safe? I am fully prepared to believe that it is, but I have not been able to find any authoritative evidence to support that belief, and I want my code to be safe.

      For example, if flock was guaranteed to invalidate the read-ahead buffer, or if the act of another process writing to the file was guaranteed to immediately invalidate the read-ahead buffer, or if the read-ahead buffer was guaranteed to not be filled until this process did a read on the file handle, then the above code would indeed be safe. Can anyone point me to any evidence that any of these is true, or any other evidence that indicates why the above code is safe, given the fact that read-ahead buffering normally occurs when reading a file?

        Perl won't do any read-ahead buffering of the file until you perform the first read action. If the kernel performs read-ahead, it has a global picture of the file and all processes, and will be capable of invalidating any buffered data if necessary.

        So that code is perfectly safe for your needs.

        Dave.
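
The claim that nothing is buffered before the first read can be checked with a small experiment. This is a sketch, not an authoritative test: the helper name `read_after_late_write` is made up, and a second handle in the same process stands in for "another process" writing between sysopen and flock.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);

# Hypothetical experiment: change the file between sysopen and flock,
# then see what the first read on the earlier handle returns.
sub read_after_late_write {
    my ($path) = @_;

    # Seed the file with the "old" contents.
    open(my $seed, '>', $path) or die "can't create $path: $!";
    print {$seed} "1\n";
    close($seed) or die "can't close $path: $!";

    # The reader opens the file but does not read yet.
    sysopen(my $reader, $path, O_RDWR) or die "can't open $path: $!";

    # A writer updates the file in the gap before the reader locks it.
    open(my $writer, '>', $path) or die "can't reopen $path: $!";
    print {$writer} "2\n";
    close($writer) or die "can't close $path: $!";

    flock($reader, LOCK_EX) or die "can't lock $path: $!";
    my $line = <$reader>;    # first read: the buffer is filled only now
    close($reader);
    return $line;
}
```

On a local Unix-like file system I would expect this to return "2\n", the contents written after the sysopen, which is consistent with the explanation above. It says nothing about networked file systems.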

Re: flock and read-ahead buffering on input
by Anonymous Monk on Apr 17, 2014 at 22:35 UTC
    I think that fseek, even with a zero-length move, flushes all buffers.
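
The post above is about C's fseek; the sketch below tries the equivalent with Perl's own seek on a buffered handle, which is an assumption about what was meant. The helper name `second_line_after_overwrite` is invented, and the file is deliberately tiny so that the first read pulls all of it into the handle's buffer.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);

# Hypothetical experiment: read line 1, let a second handle overwrite
# line 2 in place, optionally do a zero-length seek, then read line 2.
sub second_line_after_overwrite {
    my ($path, $do_seek) = @_;

    open(my $seed, '>', $path) or die "can't create $path: $!";
    print {$seed} "aaaa\nbbbb\n";
    close($seed) or die "can't close $path: $!";

    open(my $reader, '<', $path) or die "can't open $path: $!";
    my $first = <$reader>;    # buffers the whole (tiny) file

    # Overwrite the second line without changing the file length.
    open(my $writer, '+<', $path) or die "can't reopen $path: $!";
    seek($writer, 5, SEEK_SET) or die "can't seek $path: $!";
    print {$writer} "BBBB\n";
    close($writer) or die "can't close $path: $!";

    if ($do_seek) {
        # Zero-length move relative to the current position.
        seek($reader, 0, SEEK_CUR) or die "can't seek $path: $!";
    }

    my $second = <$reader>;
    close($reader);
    return $second;
}
```

With the seek, the second read goes back to the OS and should see "BBBB\n"; without it, the stale "bbbb\n" comes out of the handle's buffer. None of this matters for the code in the original question, where no read happens before the lock is held.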

Node Type: perlquestion [id://1082675]
Approved by kcott