Best practices for modifying a file in place: q's about opening files, file locking, and using the rename function
by davebaker (Pilgrim) on Nov 03, 2006 at 00:30 UTC
davebaker has asked for the wisdom of the Perl Monks concerning the following question:
I would like to modify a file in place using the method recommended as the "best" method in the Perl Cookbook, 2d ed., recipe 7.15, Modifying a File in Place with a Temporary File. There is one thing the authors say that I don't understand, and it seems critical to understand it.
(The file to be modified in my case is an important flat-file database (one record per line); web users can use a CGI script to either add data to the file or to edit their own records; I'm concerned about possible file corruption when two or more users are submitting new or revised data at about the same instant. I know I could use a real database but I really want to figure out file locking using Perl. Seems like this issue must come up all the time in a multiuser environment, whether web or internal network.)
The code provided in the recipe is:
Some discussion follows, then the authors say:
Note that rename won't work across filesystems, so you should create your temporary file in the same directory as the file being modified.
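To make the same-directory point concrete, here is a minimal sketch of the write-to-a-temporary-file-then-rename pattern. It is not the recipe's own listing. The helper name rewrite_in_place and the 'dbXXXXXX' template are hypothetical, and File::Temp is my own choice for creating the temporary file next to the original.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Temp qw(tempfile);

# Hypothetical helper: rewrite $old by passing each line through $edit.
sub rewrite_in_place {
    my ($old, $edit) = @_;

    open(my $in, '<', $old) or die "can't open $old: $!";

    # Create the temporary file in the same directory as $old,
    # so that rename() never has to cross a filesystem boundary.
    my ($out, $new) = tempfile('dbXXXXXX', DIR => dirname($old));

    while (my $line = <$in>) {
        print {$out} $edit->($line) or die "can't write $new: $!";
    }
    close($out) or die "can't close $new: $!";
    close($in)  or die "can't close $old: $!";

    # Replace the original only after the new copy is complete.
    rename($new, $old) or die "can't rename $new to $old: $!";
    return;
}
```

A call such as rewrite_in_place($file, sub { uc $_[0] }) would upper-case every record. Note that File::Temp creates its file with restrictive permissions, so a real script would probably need to chmod the new file to match the old one before renaming. This sketch does no locking at all.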
(Emphasis supplied by me.)
Q1: In "The truly paranoid programmer would lock the file", which file are the authors referring to?
Q2: Regarding the reason for being "truly paranoid" -- is this because we don't want another running instance of this script to be writing to $new while we are, so we ought to revise this script (and hence both instances) to get a LOCK_EX before writing to $new?
To get the desired file lock, the authors caution that the "tricky part" is to first open the file for writing without clobbering its contents. I have read elsewhere in the book that "open (OUT, ">", $out)" would "clobber" any existing file named $out before a script would have a chance to get a lock on the file, and I've read (p. 421 of Programming Perl, 3d ed.) that the best method for writing to a file is to use sysopen, which does not clobber any file that exists, as in:
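A minimal sketch of that sysopen/flock/truncate sequence as I understand it follows. This is not the book's listing, and the helper name open_locked_for_write is hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);

# Hypothetical helper: open $path for writing, take an exclusive lock,
# and only then discard whatever the file held before.
sub open_locked_for_write {
    my ($path) = @_;

    # O_WRONLY|O_CREAT creates the file if needed but, without O_TRUNC,
    # leaves any existing contents alone.
    sysopen(my $fh, $path, O_WRONLY | O_CREAT)
        or die "can't open $path: $!";

    # Block until no other process holds a lock on the file.
    flock($fh, LOCK_EX) or die "can't lock $path: $!";

    # Now that we hold the lock, it is safe to empty the file.
    truncate($fh, 0) or die "can't truncate $path: $!";

    return $fh;
}
```

The caller would print to the returned handle and then close it; closing the handle releases the lock. The point of the ordering is that nothing destructive happens until after flock has succeeded. It only protects against other processes that also call flock, since the lock is advisory.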
Q3: I'm not sure I completely understand the hazards of "clobbering." Is the problem that $new might already exist because another instance of this script, running at the same time, created it a split-second ago in connection with its own update of $old? Our process would then destroy the contents of that $new due to the way ">" works, causing the other instance (e.g., another web user submitting data via the same page's form) to produce mangled or empty data when it renames $new to $old. Yikes, there goes the database.
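A small experiment illustrates the scenario I am worried about. This is only a sketch: both "processes" are simulated inside one script, and the file name new.txt is a hypothetical stand-in for $new.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/new.txt";    # stands in for $new

# Pretend another process has already written part of its update.
open(my $other, '>', $path) or die "can't create $path: $!";
print {$other} "record written by the other process\n";
close($other) or die "can't close $path: $!";

my $size_before = (stat $path)[7];

# A plain ">" open truncates the file at open time,
# before any flock call could possibly have run.
open(my $mine, '>', $path) or die "can't open $path: $!";
my $size_after = (stat $path)[7];
close($mine) or die "can't close $path: $!";

print "before: $size_before bytes, after open: $size_after bytes\n";
```

The second size should come out as zero: the other process's data is gone as soon as the ">" open succeeds, whether or not a lock is requested afterwards.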
Q4: In a multi-user environment, does a careful programmer need to use "sysopen/flock LOCK_EX/truncate" every time a script needs to write a file? If a plain open ">" technique is used there would seem to be a potential clobbering problem.
Q5: A final wrinkle on the addition of a file lock for $new in the recipe: wouldn't we want to keep $new open (and hence the LOCK_EX in place) until after the "rename( $new, $old )"? Would that work, though? I'm concerned that the rename function implicitly closes the file being renamed and breaks the lock on it before doing something as drastic as renaming it.
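One way to probe Q5 empirically is a small test script. This is a minimal sketch, assuming a POSIX-like system where rename acts on directory entries rather than on open handles; it may behave differently on other platforms. The file names db.txt and db.txt.new are hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my $old = "$dir/db.txt";        # stands in for $old
my $new = "$dir/db.txt.new";    # stands in for $new, same directory

# Seed the "database" with one record.
open(my $seed, '>', $old) or die "can't create $old: $!";
print {$seed} "original record\n";
close($seed) or die "can't close $old: $!";

# Open $new without clobbering, lock it, then write the update.
sysopen(my $fh, $new, O_WRONLY | O_CREAT) or die "can't open $new: $!";
flock($fh, LOCK_EX) or die "can't lock $new: $!";
truncate($fh, 0)    or die "can't truncate $new: $!";
print {$fh} "updated record\n" or die "can't write $new: $!";

# Rename while the handle is still open and locked.
rename($new, $old) or die "can't rename $new to $old: $!";

# Is the handle still open, and does it refer to the file now named $old?
my $still_open = defined fileno($fh) ? 1 : 0;
my $same_file  = ((stat $fh)[1] == (stat $old)[1]) ? 1 : 0;

close($fh) or die "can't close handle: $!";    # flushes data, releases the lock
print "still open: $still_open, same file: $same_file\n";
```

If both flags come out true, the rename did not close the handle. That alone does not prove the scheme is safe against every race, since a lock on $new says nothing about who is reading $old at the time.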