I rememberred that we had this problem with our automatic job processing too. We will kick off processing when certain file has arrived from FTP. We came up with several solutions:
in reply to how to tell if a file is still being modified
1) By periodically checking the size of the file coming in, and if the file size has stopped growing, then we would assume that the file transfer has stopped. However this does *NOT* work, at least not reliablly! We had one perticular case where the FTP has paused / died and the system thought that the file has been received properly, and started to process the file. It made a total mess that took many days to resolve.
2) A better approach than the first one is to modify the system to receive the data file first, followed by a trigger file. The system will act on the arrival of the trigger file. This approach is more reliable than the first one, however, it assumes that the sender works properly. We had a case when the sender program/script sent half the data file, and then somehow sent the trigger file without checking the completion of the data file. This of cause caused another mess.
3) The best senario is when the data file has an integrated verification mechanism, like a ZIP file. You can be certain if the incoming ZIP file has arrived completely by periodically testing the integrity of the ZIP file with the zip -T switch. This method works 100% of the time.
The 3rd method, with a self validating file format, is the preferred method, if the client/sender can produce such format. If not possible, then fall back to the 2nd method with additional trigger file. If this is not possible, then fall back to the 1st method and pray. :-D