
select, stat, and waiting for files

by oakbox (Chaplain)
on Sep 12, 2005 at 11:47 UTC ( #491227=perlquestion )

oakbox has asked for the wisdom of the Perl Monks concerning the following question:

I have a remote process that takes between 10 and 30 seconds to complete. This process creates a 1 to 3 meg file and then scp's that file to my server. The process on my server knows where that scp'd file will show up, but needs to wait until it is completed before sending it on to the user.

Let's assume that the remote process is doing its thing, and that I am now sitting on my server, waiting for the result file.

    my $sanity;
    while (1) {
        if (-e $pdffile) { last; }
        sleep 1;
        $sanity++;
        if ($sanity > 60) { return(0); }
    }
When I get through that chunk, I know the file exists. Yeah! Now I need to wait for it to stop growing...
    my $sizer = -1;                           # no size seen yet
    while (1) {
        select(undef, undef, undef, 0.5);     # .5 second wait
        my $usizer = (stat($pdffile))[7];     # how big is that file?
        if ($usizer == $sizer) { last; }      # did it grow at all?
        $sizer = $usizer;                     # set size for next round
    }
Now for the question: Is this the *right* way to do this? Another programmer mentioned that using select/stat isn't reliable due to disk buffering and network concerns.

- Thank you,

Replies are listed 'Best First'.
Re: select, stat, and waiting for files
by osunderdog (Deacon) on Sep 12, 2005 at 12:12 UTC

    First off, I assume you're on a *nix machine; I figured I should state that up front.

    One problem you are facing is that the file comes into existence but isn't quite all there yet. I've seen this handled two ways: 1) create the payload file in another location and then use a local system move to put it in the monitor directory, or 2) move the payload file, then move a trigger file (a zero-byte file) to the monitor directory.

    Neither of these techniques is perfect, but they might be good enough™
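The first of those two approaches can be sketched in Perl as follows. This is only an illustration under stated assumptions: `publish_file`, the directory names, and `report.pdf` are hypothetical, and the temporary directories stand in for the real upload and monitor locations. It relies on `rename` being atomic when both paths are on the same filesystem, so the watcher either sees no file or the whole file.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical helper: move a fully written file into the monitored
# directory. rename() is atomic only when both paths are on the same
# filesystem, so the staging directory must live next to the target.
sub publish_file {
    my ($staged, $final) = @_;
    rename $staged, $final
        or die "Cannot rename $staged to $final: $!";
    return $final;
}

# Demo with throwaway directories standing in for the real ones.
my $base    = tempdir(CLEANUP => 1);
my $staging = File::Spec->catdir($base, 'incoming.tmp');
my $monitor = File::Spec->catdir($base, 'monitor');
for my $dir ($staging, $monitor) {
    mkdir $dir or die "mkdir $dir: $!";
}

# Write the payload completely in the staging area first.
my $staged = File::Spec->catfile($staging, 'report.pdf');
open my $out, '>', $staged or die "open $staged: $!";
print {$out} 'x' x 1024;
close $out or die "close $staged: $!";

# Only now does the file appear in the monitored directory.
my $final = publish_file($staged, File::Spec->catfile($monitor, 'report.pdf'));
print "published ", (stat $final)[7], " bytes\n";
```

With scp, the sender would upload to the staging name and then trigger the move on the server, for example over ssh.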

    Another option might be to use something like dnotify or its ilk to tell you when the file has arrived, although these will depend on your OS.

    I've tried to use an OS-dependent directory monitor; however, I ran into problems when the monitor directory went from a local disk to an NFS mount. That's a much harder problem to solve.

    Just some thoughts

    Hazah! I'm Employed!

Re: select, stat, and waiting for files
by gri6507 (Deacon) on Sep 12, 2005 at 12:44 UTC
    To go along with the previous idea: before you start scp'ing your file, create a filename.size file that contains only the size of the file you are about to scp. That way, your server can look at the size file and figure out whether the actual file is completely there.
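A rough sketch of the receiving side of that idea, assuming the sender writes the byte count as a single decimal number into a file named after the payload with `.size` appended. The helper name `transfer_complete` is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true once $file has reached the byte count that
# the sender recorded in "$file.size" before starting the transfer.
sub transfer_complete {
    my ($file) = @_;
    my $size_file = "$file.size";
    return 0 unless -e $size_file && -e $file;

    # Read the expected size; treat anything unparsable as "not ready".
    open my $in, '<', $size_file or return 0;
    my $line = <$in>;
    close $in;
    return 0 unless defined $line && $line =~ /^\s*(\d+)\s*$/;
    my $expected = $1;

    # Compare against the size the filesystem currently reports.
    my $actual = (stat $file)[7];
    return (defined $actual && $actual == $expected) ? 1 : 0;
}
```

The server would call `transfer_complete($pdffile)` in its polling loop instead of checking whether the size stopped changing, so a stalled transfer is not mistaken for a finished one.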

      Or even better, if you're going to do it like that: scp the file and, at the end, ssh in and touch a filename.ok. When filename.ok shows up, you know that the first file has already been sent.
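Waiting for such a marker could look roughly like this. It is a sketch, not a tested production loop: `wait_for_marker` and its default timeout and polling interval are assumptions, and the marker is taken to be the payload name with `.ok` appended.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical helper: poll until both $file and its "$file.ok" marker
# exist, or until $timeout seconds have passed.
# Returns 1 on success and 0 on timeout.
sub wait_for_marker {
    my ($file, $timeout, $interval) = @_;
    $timeout  = 60  unless defined $timeout;   # seconds to wait in total
    $interval = 0.5 unless defined $interval;  # seconds between checks

    my $deadline = time() + $timeout;
    while (1) {
        return 1 if -e "$file.ok" && -e $file;
        return 0 if time() >= $deadline;
        sleep $interval;                       # fractional sleep via Time::HiRes
    }
}
```

Because the marker is created only after scp has returned, its presence says the transfer finished, which a size that has merely stopped growing cannot.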

      Or, scp the file and log the scp session in a separate file. Then, scp this second file. When the second file arrives you know the first one has arrived. Later on you can check the contents of the separate file to look for transfer errors.

      if ( 1 ) { $postman->ring() for (1..2); }
Re: select, stat, and waiting for files
by sasikumar (Monk) on Sep 12, 2005 at 13:23 UTC
    Hi Richard,

    I would send a signal to the server process stating that the scp is over.
    I feel that would be the best method, but I am not sure, as I have not dealt with such huge files across a network.

      The problem with this approach is that the signal will probably appear before the file is actually completely written to the disk, due to the write-caching of most modern *nix systems. This makes it dangerous to assume anything based on the scp.
        It doesn't matter whether the file is physically written to disk; the kernel will pretend that it is, and scp will use the kernel to access the file so will see it that way.

        The only time having the file physically on disk would matter would be if the machine shut off unexpectedly, or if something was accessing the disk directly instead of using the kernel's filesystem drivers.
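A small experiment along these lines, assuming a local filesystem (network mounts such as NFS may cache attributes differently). The file name and size are arbitrary. The writer uses `syswrite` so the data goes straight to the kernel, and nothing is synced or closed before the checks.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/payload.bin";

# Hand 4096 bytes to the kernel; syswrite bypasses Perl's own buffering.
open my $writer, '>', $path or die "open $path: $!";
my $data = 'x' x 4096;
defined syswrite($writer, $data) or die "syswrite: $!";

# No fsync and no close yet: ask the kernel what it thinks is there.
my $size = (stat $path)[7];

# A second, independent handle reads the same data back.
open my $reader, '<', $path or die "open $path: $!";
my $got = '';
my $n = sysread($reader, $got, length $data);
defined $n or die "sysread: $!";

print "stat reports $size bytes, reader got $n bytes\n";
close $reader;
close $writer;
```

Any process that goes through the filesystem sees the data as soon as the write call returns, whether or not it has reached the physical disk.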

Re: select, stat, and waiting for files
by samizdat (Vicar) on Sep 12, 2005 at 13:00 UTC
    After you complete the send, have your program send a tiny file along. When that file's send is complete, you know you have all of the big one. This should take much less than 1/2 second :D
