[RFC] File::Replace

Many of you are probably aware of the pattern of opening a temporary file, reading from the original file and writing the modified contents to the temporary file, and then renaming the temporary file over the original file, which is often an atomic operation (depending on OS & FS). I recently wrote a module to encapsulate this behavior, and here is one of the three interfaces that are available in File::Replace. There are several options to configure the behavior, including the ability to specify PerlIO layers, what happens if the file doesn't exist yet, etc.

use File::Replace 'replace2';
 
my ($infh,$outfh) = replace2($filename);
while (<$infh>) {
    # write whatever you like to $outfh here
    print $outfh "X: $_";
}
close $infh;   # closing both handles will
close $outfh;  # trigger the replace
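
For comparison, the manual temp-file-and-rename pattern described at the top might look roughly like the following. This is only a minimal sketch using the core modules File::Temp and File::Basename; the helper name `rewrite_file` is made up for illustration, and this is not how File::Replace is implemented. It does not try to preserve the original file's permissions or ownership, and whether the final rename is atomic still depends on the OS and filesystem.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# Hypothetical helper (not part of File::Replace): rewrite $filename by
# passing every line through the callback $code.
sub rewrite_file {
    my ($filename, $code) = @_;
    open my $in, '<', $filename or die "open $filename: $!";
    # Create the temporary file in the same directory as the original,
    # so that the rename below does not cross filesystems.
    my ($out, $tmpname) = tempfile( 'rewriteXXXXXX',
        DIR => dirname($filename), UNLINK => 0 );
    my $ok = eval {
        while ( my $line = <$in> ) {
            print {$out} $code->($line) or die "write $tmpname: $!";
        }
        close $out or die "close $tmpname: $!";
        close $in  or die "close $filename: $!";
        # Replace the original only after everything was written.
        rename $tmpname, $filename
            or die "rename $tmpname to $filename: $!";
        1;
    };
    if ( !$ok ) {
        my $err = $@;
        unlink $tmpname;    # do not leave the temporary file behind
        die $err;
    }
    return 1;
}
```

With this helper, `rewrite_file($filename, sub { "X: $_[0]" })` would have roughly the same effect as the `replace2` loop above.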

Since I hope this is something that you might find useful, I would be happy about any feedback you might have!

To give a practical example, here is an update of my code from this node. As you can see, I was able to get rid of eight lines of fairly complicated code while keeping the main loop entirely unchanged. The module also adds some robustness, as it incorporates a few more checks on whether operations were successful.

#!/usr/bin/env perl
use warnings;
use strict;
use File::Replace 'replace2';

my $filename = "/tmp/test.html";
my @to_insert = ( '<p>Hello,',
    'World! It is '.gmtime(time).' UTC</p>' );

my ($ifh,$tfh) = replace2($filename);
my $found;
while (<$ifh>) {
    print $tfh $_;
    if (/<!--\s*INSERT\s+HERE\s*-->/i) {
        $found=1;
        print $tfh "$_\n" for @to_insert;
    }
}
close $ifh;
close $tfh;
die "Marker not found" unless $found;

Replies are listed 'Best First'.
Re: [RFC] File::Replace by haukex (Archbishop) on Jan 03, 2019 at 19:20 UTC
Well, I dove into the rabbit hole that is tying `ARGV` in order to overload Perl's magic `<>` operator... quite an adventure, but at least I have something to show for it:

Tie::Handle::Argv - A base class for tying Perl's magic `ARGV` handle
File::Replace::Inplace - Emulation of Perl's `-i` switch via File::Replace

The former is probably only interesting if you want to implement your own module that overloads `ARGV`, while the latter is hopefully of more general interest: it's a drop-in replacement for Perl's `-i` switch:

$ echo -en "hello\nworld\n" >foo.txt
$ perl -i -ple '$_ = ucfirst' foo.txt
$ cat foo.txt
Hello
World
$ echo -en "quz\nbaz\n" >bar.txt
$ perl -MFile::Replace=-i -ple '$_ = ucfirst' bar.txt
$ cat bar.txt
Quz
Baz

So what's the advantage of this, especially given that `-i`'s behavior was recently updated to operate in a fashion similar to File::Replace? Well, the above works on older Perls too (on *NIX, I've tested all the way down to 5.8.1, although I recommend ≥5.16; on Windows I have a bit more trouble testing, but it should work down to at least 5.10). This should be an advantage, since you get guaranteed behavior even if you're not sure whether the box you're currently on has 5.28+ installed. Also, File::Replace is a lot more stringent in its checks: every step is checked for errors, which are generally fatal.
Re: [RFC] File::Replace by Anonymous Monk on Sep 20, 2017 at 22:27 UTC
A guy|gal after my own heart! Thanks for creating the module. I am, however, curious to know why you want to support more than one method (3!) to interact with your module.
Re^2: [RFC] File::Replace by haukex (Archbishop) on Sep 21, 2017 at 07:19 UTC
I'm glad you like it!

"I am, however, curious to know why you want to support more than one method (3!) to interact with your module."

The TL;DR is that I know people have different preferences, and since all three interfaces are fully tested and provide the same functionality (same parameters, same safety features, etc.), it really is just a matter of preference.

The slightly longer answer is: I started with only the "single magic tied filehandle" interface because I thought it was kind of neat, and it helps keep short scripts short. However, I realized that not everyone likes too much magic, plus it might be too hard to remember which I/O functions operate on the input file and which on the output file, so I implemented the "two filehandles" interface because I thought it would be the most natural. Indeed, as I demonstrated in the root node, for code that is already using two filehandles, no major changes to the I/O code should be needed.

By that point, however, I was packing too much into one module and I was running into implementation problems, so I made the cleaner separation of putting the core functionality in an OO module and wrapping that functionality with the tied filehandles. I figured that some people might prefer OO and/or dislike tied filehandles. The OO interface has the added benefit of being the only one of the three interfaces that doesn't use tied filehandles at all, in case the user wants to do something fancy with the underlying filehandles, like tying them on their own.
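
To illustrate the "which I/O functions operate on which file" point, here is a minimal sketch of a single tied handle whose reads go to one underlying filehandle and whose writes go to another. The class name `DemoDualHandle` and the helper `demo_dual_handle` are hypothetical and exist only for this example; this is not File::Replace's actual implementation, and it uses in-memory files instead of doing any replacing.

```perl
use strict;
use warnings;

package DemoDualHandle;    # hypothetical name, for illustration only

sub TIEHANDLE {
    my ($class, $in, $out) = @_;
    return bless { in => $in, out => $out }, $class;
}

# Reading (<$fh>, readline) is forwarded to the input handle.
sub READLINE {
    my $self = shift;
    my $in   = $self->{in};
    return wantarray ? <$in> : scalar <$in>;
}

# Writing (print) is forwarded to the output handle.
sub PRINT {
    my $self = shift;
    return print { $self->{out} } @_;
}

# Closing the tied handle closes both underlying handles.
sub CLOSE {
    my $self   = shift;
    my $ok_in  = close $self->{in};
    my $ok_out = close $self->{out};
    return $ok_in && $ok_out;
}

package main;
use Symbol 'gensym';

# Hypothetical demo: read lines from the string $input through the tied
# handle and collect everything printed to it in a second string.
sub demo_dual_handle {
    my ($input) = @_;
    my $output = '';
    open my $in,  '<', \$input  or die "open input: $!";
    open my $out, '>', \$output or die "open output: $!";
    my $fh = gensym;
    tie *$fh, 'DemoDualHandle', $in, $out;
    while ( my $line = <$fh> ) {    # read side: calls READLINE
        print $fh "X: $line";       # write side: calls PRINT
    }
    close $fh or die "close: $!";   # calls CLOSE
    return $output;
}
```

Calling `demo_dual_handle("a\nb\n")` returns "X: a\nX: b\n". Functions the sketch does not implement (`eof`, `binmode`, `printf` and so on) would each need the same decision about which underlying handle they act on, which is the part that can be hard to remember.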
Re: [RFC] File::Replace by Anonymous Monk on Sep 20, 2017 at 21:14 UTC
Do not assume that it will be an atomic operation – it very likely will not be. There can also be issues with directory caches on network file systems. (Some remember that a file does not exist, and continue to say that it does not when it does. And vice-versa.)
Re^2: [RFC] File::Replace by haukex (Archbishop) on Sep 21, 2017 at 07:05 UTC
"Do not assume that it will be an atomic operation"

Absolutely, which is why I make a point of that both in the root node and the module's documentation; the module's job is to do the rename. I'll highlight that caveat a little more in the next version.
Re^3: [RFC] File::Replace by Anonymous Monk on Sep 26, 2017 at 23:41 UTC
Documentation sections that specifically address NFS (at least NFSv3 and NFSv4) would be beneficial. There are many subtle issues regarding both client-side and server-side (filename) caching in this very popular network filesystem, which directly intersect with what your module will be doing. Freely reference good external websites and pages ...
