... the second suggested solution--using the IO::InSitu module--does use a back-up strategy to ensure that data is not lost if the program abends.
True, but it is still not re-runnable, which makes it dangerous in the hands of naive users who interrupt a program with CTRL-C and then re-run it.
If they do that, they may suffer permanent data loss
without even being aware of it.
It seems to me that you can get re-runnability with little
extra effort: simply write the temporary file first and
only overwrite the original (via atomic rename) after
the temporary has been successfully written.
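A minimal sketch of that idea follows. Note that rewrite_file_atomically is a hypothetical helper of my own, not part of IO::InSitu, and the sketch assumes a POSIX-style filesystem where rename within one directory replaces the target atomically (Win32 behaviour may differ).

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# Hypothetical helper (not part of IO::InSitu): apply $transform to each
# line of $file, writing the result to a temporary file in the same
# directory, and rename it over the original only after the temporary
# has been completely written and closed.
sub rewrite_file_atomically {
    my ($file, $transform) = @_;

    open my $in, '<', $file or die "Cannot open '$file': $!";

    # Create the temporary file next to the original so that rename()
    # never has to cross a filesystem boundary.
    my ($out, $tmp_name) =
        tempfile('rewriteXXXXXX', DIR => dirname($file), UNLINK => 0);

    my $ok = eval {
        while (my $line = <$in>) {
            print {$out} $transform->($line) or die "Write failed: $!";
        }
        close $out or die "Cannot close '$tmp_name': $!";
        close $in;
        rename $tmp_name, $file
            or die "Cannot rename '$tmp_name' to '$file': $!";
        1;
    };
    if (!$ok) {
        my $err = $@;
        unlink $tmp_name;    # discard the partial output; original is untouched
        die $err;
    }
    return;
}
```

With this approach an interrupted run leaves the original file as it was, so re-running simply starts again from the same input. Two caveats with the sketch: CTRL-C bypasses the eval, so a stray temporary file may be left behind, and the replacement file gets the temporary file's permissions rather than the original's.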
As a test, I pressed CTRL-C midway through running this test program:
use strict;
use warnings;
use IO::InSitu;
my $infile_name = 'fred.tmp';
my $outfile_name = $infile_name;
my ($in, $out) = open_rw($infile_name, $outfile_name);
for my $line (<$in>) {
    print {$out} transform($line);
}
# Try pressing CTRL-C while file is being updated.
sub transform {
    sleep 1;
    return "hello:" . $_[0];
}
This is what I saw:
total 20
drwxrwxr-x 2 andrew andrew 4096 Sep 3 14:44 ./
-rw-rw-r-- 1 andrew andrew 0 Sep 3 14:42 fred.tmp
-rw-rw-r-- 1 andrew andrew 191 Sep 3 14:42 fred.tmp.bak
drwxrwxr-x 11 andrew andrew 4096 Sep 3 14:42 ../
-rw-rw-r-- 1 andrew andrew 288 Sep 3 14:41 tsitu1.pl
Now, of course, blindly re-running the test program resulted
in permanent data loss (an empty fred.tmp file in this example).
Update: Just to clarify, this problem is
broader than the naive user scenario given above
and may bite you anytime a script is automatically
rerun after an interruption -- a script that is run
automatically at boot time, for example.
Further update: More detail on Win32 rename, related to tye's response below, can now be found at Re^7: Read in hostfile, modify, output.