PerlMonks  

Re^2: Perl Best Practices book: is this one a best practice or a dodgy practice?

by eyepopslikeamosquito (Canon)
on Sep 03, 2005 at 05:09 UTC ( #488859=note )


in reply to Re: Perl Best Practices book: is this one a best practice or a dodgy practice?
in thread Perl Best Practices book: is this one a best practice or a dodgy practice?

... the second suggested solution--using the IO::InSitu module--does use a back-up strategy to ensure that data is not lost if the program abends.

True. But it is still not re-runnable, which makes it dangerous in the hands of naive users who interrupt a program with CTRL-C, then re-run it. If they do that, they may suffer permanent data loss without being aware of it.

It seems to me that you can get re-runnability with little extra effort: simply write the temporary file first and only overwrite the original (via atomic rename) after the temporary has been successfully written.
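
A minimal sketch of that write-then-rename idea follows. It is not how IO::InSitu works; `rewrite_file_atomically` is a hypothetical helper name, and the sketch assumes the temporary file can live in the same directory as the original so that rename stays on one filesystem.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# Hypothetical helper: rewrite $path by passing each line through $transform.
# The original is replaced only after the new contents have been completely
# written and closed, so an interrupted run leaves the original untouched.
sub rewrite_file_atomically {
    my ($path, $transform) = @_;

    open my $in, '<', $path or die "Cannot read $path: $!";

    # Create the temporary file next to the original so that rename()
    # never has to cross a filesystem boundary.
    # Note: tempfile() creates the file with restrictive permissions,
    # which this sketch does not copy from the original.
    my ($out, $tmp_name) = tempfile('rewriteXXXXXX', DIR => dirname($path));

    my $ok = eval {
        while (my $line = <$in>) {
            print {$out} $transform->($line)
                or die "Cannot write $tmp_name: $!";
        }
        close $out or die "Cannot close $tmp_name: $!";
        1;
    };
    if (!$ok) {
        my $err = $@;
        close $out;          # harmless if already closed
        unlink $tmp_name;    # discard the partial output
        die $err;
    }
    close $in;

    # Only now does the original get replaced.
    rename $tmp_name, $path
        or die "Cannot rename $tmp_name to $path: $!";
    return;
}
```

With this in place, the test program's loop would become something like `rewrite_file_atomically('fred.tmp', sub { sleep 1; "hello:" . $_[0] })`. Pressing CTRL-C midway should leave fred.tmp unchanged plus a stray temporary file, so a blind re-run starts again from the original data.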

As a test, I pressed CTRL-C midway through running this test program:

    use strict;
    use warnings;
    use IO::InSitu;

    my $infile_name  = 'fred.tmp';
    my $outfile_name = $infile_name;
    my ($in, $out) = open_rw($infile_name, $outfile_name);
    for my $line (<$in>) {
        print {$out} transform($line);
    }

    # Try pressing CTRL-C while file is being updated.
    sub transform {
        sleep 1;
        return "hello:" . $_[0];
    }

This is what I saw:

    total 20
    drwxrwxr-x  2 andrew andrew 4096 Sep 3 14:44 ./
    -rw-rw-r--  1 andrew andrew    0 Sep 3 14:42 fred.tmp
    -rw-rw-r--  1 andrew andrew  191 Sep 3 14:42 fred.tmp.bak
    drwxrwxr-x 11 andrew andrew 4096 Sep 3 14:42 ../
    -rw-rw-r--  1 andrew andrew  288 Sep 3 14:41 tsitu1.pl

Now, of course, blindly re-running the test program resulted in permanent data loss (an empty fred.tmp file in this example).

Update: Just to clarify, this problem is broader than the naive user scenario given above and may bite you anytime a script is automatically rerun after an interruption -- a script that is run automatically at boot time, for example.


Re^3: Perl Best Practices book: is this one a best practice or a dodgy practice?
by TheDamian (Priest) on Sep 03, 2005 at 05:35 UTC
    ... which makes it dangerous in the hands of naive users who interrupt a program with CTRL-C, then re-run it. If they do that, they may suffer permanent data loss without being aware of it.
    To quote Oscar Wilde's Miss Prism: "What a lesson for him! I trust he will profit by it." ;-)
    It seems to me that you can get re-runnability with little extra effort: simply write the temporary file first and only overwrite the original (via atomic rename) after the temporary has been successfully written.
    The IO::InSitu module could certainly be reworked to operate that way. Except that then it would fail to preserve the inode of the original file. :-( Perhaps I will add an option to allow it to work whichever way (i.e. "inode-preserving" vs "rerunnable") the user prefers.
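
    The inode trade-off can be seen with a small experiment. This is only an illustrative sketch: the file names and the helpers `write_file` and `inode_of` are made up for the example, and it assumes a POSIX-style filesystem where stat reports meaningful inode numbers (on Win32 the inode field is generally not useful).

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/data.txt";

# Hypothetical helper: (over)write a file with the given text.
sub write_file {
    my ($name, $text) = @_;
    open my $fh, '>', $name or die "Cannot write $name: $!";
    print {$fh} $text;
    close $fh or die "Cannot close $name: $!";
    return;
}

# Hypothetical helper: inode number as reported by stat.
sub inode_of { return (stat($_[0]))[1] }

write_file($path, "original\n");
my $inode_before = inode_of($path);

# Overwrite in place: the same file is truncated and refilled.
write_file($path, "updated in place\n");
my $inode_in_place = inode_of($path);

# Replace via rename: a different file takes over the name.
write_file("$path.new", "replaced via rename\n");
rename "$path.new", $path or die "Cannot rename: $!";
my $inode_renamed = inode_of($path);

printf "in place: inode %s\n",
    $inode_in_place == $inode_before ? 'kept' : 'changed';
printf "rename:   inode %s\n",
    $inode_renamed == $inode_before ? 'kept' : 'changed';
```

    On such a filesystem the in-place write should keep the inode and the rename should change it, which is why anything tied to the original inode (hard links, for instance) does not follow a file replaced by rename.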

    Bear in mind though that an "atomic rename" isn't really atomic under most filesystems, so even this approach still isn't going to absolutely guarantee rerunnability.

      Bear in mind though that an "atomic rename" isn't really atomic under most filesystems

      rename is atomic on POSIX systems. Win32 has an atomic rename, and I just checked: rename uses it on modern Win32 operating systems. That qualifies as "most" of the Perl universe in my book (covering the two most common Perl environments, even if TheDamian chooses to call one of the top two "obscure"). Perhaps you have evidence to the contrary, or perhaps you are thinking of pre-rename methods using link/unlink?

      - tye        

        My mistake. I hadn't realized we were talking about "atomic rename(1)", rather than more general renaming (such as link/unlink sequences). Sorry for the confusion.
