Re: how to tell if a file is still being modified (use the filename as a communications channel)
by grinder (Bishop) on Sep 15, 2003 at 20:08 UTC
In cases like these I use the name of the file itself as a channel to other processes to let them know whether they are allowed to play with it or not. This does, however, require that you have control over the process that is sending you the files.
All you have to do is to arrange for the sender to put files on your server according to a specific filename convention (e.g. PUT sekret.data or PUT sekret.data.uploading in ftp parlance).
After the transfer is complete, the sender then sends down another command to rename the file: RENAME sekret.data sekret.data.ready or RENAME sekret.data.uploading sekret.data, respectively. Whatever works best for you. The trick is that the sender must do this; the receiver cannot, because only the sender knows when the transfer has really finished.
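For the sender's side, here is a minimal sketch in Python of the second variant (upload as sekret.data.uploading, then rename to sekret.data). The function name upload_then_rename and its parameters are made up for illustration; the only thing it assumes is an object with the storbinary and rename methods that Python's ftplib.FTP provides.

```python
import io


def upload_then_rename(ftp, data, final_name, suffix=".uploading"):
    """Upload under a temporary name, then rename once the transfer is done.

    `ftp` is assumed to behave like ftplib.FTP (storbinary and rename).
    The function name and the default suffix are illustrative only.
    """
    temp_name = final_name + suffix
    # While this runs, the receiver only ever sees the temporary name.
    ftp.storbinary("STOR " + temp_name, io.BytesIO(data))
    # The rename is the signal: the final name appears only after STOR returned.
    ftp.rename(temp_name, final_name)
    return temp_name
```

If storbinary raises, the rename never happens, so a half-transferred file keeps its .uploading name and the receiver leaves it alone.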
As a receiver, you only have to search for files with the agreed-upon extension (.ready or whatever). You can even take this a step further and rename the file on the receiving side (e.g. sekret.data.done) so that the sending side knows that the file has been processed, should the housekeeping be their responsibility.
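The receiving side could look something like the sketch below. It assumes the files land in a local directory and that the agreed-upon suffixes are .ready and .done; the function name process_ready_files and the handle callback are placeholders for whatever your processing actually is.

```python
from pathlib import Path


def process_ready_files(inbox, handle, ready=".ready", done=".done"):
    """Process every file whose name ends in `ready`, then rename it to `done`.

    `inbox`, `handle` and both suffixes are assumptions for this sketch.
    """
    processed = []
    for path in sorted(Path(inbox).glob("*" + ready)):
        handle(path.read_bytes())
        # Swap the suffix only after handle() returned without raising.
        done_path = path.with_name(path.name[: -len(ready)] + done)
        path.rename(done_path)
        processed.append(done_path.name)
    return processed
```

Anything still named *.uploading, or with no suffix at all, is never touched, which is the whole point.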
This is also pretty robust in terms of sudden death reboots. Since the name of each file records how far it got, it becomes trivial to determine if files need to be resent or reprocessed.
This is a language- and platform-agnostic technique. You can use it pretty much anywhere you can give names to things. If you can't rename (sometimes not possible with anonymous ftp uploads), you can always create another file alongside the main file (e.g. sekret.data.is-ready), possibly zero-length, possibly containing an MD5 checksum, to achieve a similar result.
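A rough sketch of the marker-file variant, again in Python and again with made-up function names (mark_ready, is_ready). It assumes the marker holds the hex MD5 of the data file; a zero-length marker would only need the existence check.

```python
import hashlib
from pathlib import Path

MARKER = ".is-ready"  # suffix from the example above; any agreed name works


def mark_ready(data_path):
    """Sender side: write the data file's MD5 into a companion marker file."""
    data_path = Path(data_path)
    digest = hashlib.md5(data_path.read_bytes()).hexdigest()
    marker = data_path.with_name(data_path.name + MARKER)
    marker.write_text(digest)
    return marker


def is_ready(data_path):
    """Receiver side: True only if the marker exists and its checksum matches."""
    data_path = Path(data_path)
    marker = data_path.with_name(data_path.name + MARKER)
    if not (data_path.exists() and marker.exists()):
        return False
    expected = marker.read_text().strip()
    return hashlib.md5(data_path.read_bytes()).hexdigest() == expected
```

The sender has to write the marker after the data file, never before. The checksum then also catches a data file that was truncated or changed after the marker went up.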
The main point to remember is that you don't want to try to second-guess the sender on the receiving side. Trying to do so will cause untold pain. Just get the sender to tell you.