http://www.perlmonks.org?node_id=1199720

Many of you are probably aware of the pattern of opening a temporary file, reading from the original file and writing the modified contents to the temporary file, and then renaming the temporary file over the original file, which is often an atomic operation (depending on OS & FS). I recently wrote a module to encapsulate this behavior, and here is one of the three interfaces available in File::Replace. There are several options to configure the behavior, including the ability to specify PerlIO layers, what happens if the file doesn't exist yet, etc.

    use File::Replace 'replace2';

    my ($infh,$outfh) = replace2($filename);
    while (<$infh>) {
        # write whatever you like to $outfh here
        print $outfh "X: $_";
    }
    close $infh;   # closing both handles will
    close $outfh;  # trigger the replace
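For comparison, here is a minimal sketch of the manual temp-file-and-rename pattern described above, using only core modules. This is not File::Replace's implementation; the helper name rewrite_file and the temp file template are made up for illustration. It also does not copy the original file's permissions to the new file, and the rename is only as atomic as the OS and filesystem make it.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# Hypothetical helper (not part of File::Replace): pass every line of
# $filename through $transform and rename the result over the original.
sub rewrite_file {
    my ($filename, $transform) = @_;
    open my $in, '<', $filename or die "open $filename: $!";
    # Create the temp file in the same directory as the target, because
    # rename() cannot move a file across filesystems.
    my ($out, $tmpname) = tempfile('.rewrite_XXXXXX',
        DIR => dirname($filename), UNLINK => 0);
    my $ok = eval {
        while (my $line = <$in>) {
            print {$out} $transform->($line) or die "write $tmpname: $!";
        }
        close $out or die "close $tmpname: $!";
        close $in  or die "close $filename: $!";
        rename $tmpname, $filename or die "rename $tmpname: $!";
        1;
    };
    unless ($ok) {
        my $err = $@;
        unlink $tmpname;   # do not leave the temp file behind on failure
        die $err;
    }
    return;
}
```

A call such as rewrite_file($filename, sub { "X: $_[0]" }) would then do roughly what the replace2 loop above does, minus the module's options and additional checks.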

Since I hope this is something that you might find useful, I would be happy about any feedback you might have!

To give a practical example, here is an update of my code from this node. As you can see, I was able to get rid of eight lines of fairly complicated code while keeping the main loop entirely unchanged. The module also adds some robustness, as it incorporates a few more checks on whether operations were successful.

    #!/usr/bin/env perl
    use warnings;
    use strict;
    use File::Replace 'replace2';

    my $filename = "/tmp/test.html";
    my @to_insert = (
        '<p>Hello,',
        'World! It is '.gmtime(time).' UTC</p>' );

    my ($ifh,$tfh) = replace2($filename);
    my $found;
    while (<$ifh>) {
        print $tfh $_;
        if (/<!--\s*INSERT\s+HERE\s*-->/i) {
            $found=1;
            print $tfh "$_\n" for @to_insert;
        }
    }
    close $ifh;
    close $tfh;
    die "Marker not found" unless $found;

Re: [RFC] File::Replace
by haukex (Archbishop) on Jan 03, 2019 at 19:20 UTC

    Well, I dove into the rabbit hole that is tying ARGV in order to overload Perl's magic <> operator... quite an adventure, but at least I have something to show for it:

    The former is probably only interesting if you want to implement your own module that overloads ARGV, while the latter is hopefully something interesting: It's a drop-in replacement for Perl's -i switch:

        $ echo -en "hello\nworld\n" >foo.txt
        $ perl -i -ple '$_ = ucfirst' foo.txt
        $ cat foo.txt
        Hello
        World
        $ echo -en "quz\nbaz\n" >bar.txt
        $ perl -MFile::Replace=-i -ple '$_ = ucfirst' bar.txt
        $ cat bar.txt
        Quz
        Baz

    So what's the advantage of this, especially given that -i's behavior was recently updated to operate in a fashion similar to File::Replace? Well, the above works on older Perls too (on *NIX, I've tested all the way down to 5.8.1, although I recommend ≥5.16; on Windows I have a bit more trouble testing, but it should work down to at least 5.10). This should be an advantage, since you get guaranteed behavior even if you're not sure whether the box you're currently on has 5.28+ installed. Also, File::Replace is a lot more stringent in its checks: every step is checked for errors, which are generally fatal.
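For readers who have only used -i on the command line, here is a rough sketch of the same in-place edit done from inside a script with core Perl's $^I variable and the magic <> operator. It does not use File::Replace, and the helper name ucfirst_inplace is made up for illustration. Being plain -i, it carries whatever -i behavior the running Perl version has.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of File::Replace): upper-case the first
# character of every line of $file in place, like perl -i -pe '$_ = ucfirst'.
sub ucfirst_inplace {
    my ($file) = @_;
    # Setting $^I to '' turns on in-place editing without a backup file.
    # Both variables are restored when the sub returns.
    local ($^I, @ARGV) = ('', $file);
    while (<>) {
        # While in-place editing is active, print goes to the new file.
        print ucfirst;
    }
    return;
}
```

Calling ucfirst_inplace('foo.txt') is roughly the in-script equivalent of the first one-liner above.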

Re: [RFC] File::Replace
by Anonymous Monk on Sep 20, 2017 at 22:27 UTC

    A guy|gal after my own heart! Thanks for creating the module.

    I am, however, curious to know why you want to support more than one method (3!) to interact with your module.

      I'm glad you like it!

      I am, however, curious to know why you want to support more than one method (3!) to interact with your module.

      The TL;DR is that I know people have different preferences, and since all three interfaces are fully tested and provide the same functionality (same parameters, same safety features, etc.), it really just is a matter of preference. The slightly longer answer is:

      1. I started with only the "single magic tied filehandle" interface because I thought it was kind of neat, and it helps keep short scripts short. However, I realized that not everyone likes too much magic, plus it might be too hard to remember which I/O functions operate on the input file and which on the output file, so
      2. I implemented the "two filehandles" interface because I thought it would be the most natural, and indeed, as I demonstrated in the root node, for code that is already using two filehandles, no major changes to the I/O code should be needed.
      3. By that point, however, I was packing too much into one module and running into implementation problems, so I made the cleaner separation of putting the core functionality in an OO module and wrapping that functionality with the tied filehandles. I figured that some people might prefer OO and/or dislike tied filehandles. It has the added benefit of being the only one of the three interfaces that doesn't use tied filehandles at all, in case the user wants to do something fancy with the underlying filehandles, like tying them on their own.
Re: [RFC] File::Replace
by Anonymous Monk on Sep 20, 2017 at 21:14 UTC
    Do not assume that it will be an atomic operation – it very likely will not be. There can also be issues with directory caches on network file systems. (Some remember that a file does not exist, and continue to say that it does not when it does. And vice-versa.)
      Do not assume that it will be an atomic operation

      Absolutely, which is why I make a point of that both in the root node and the module's documentation - the module's job is to do the rename. I'll highlight that caveat a little more in the next version.

        Documentation sections that specifically address NFS (at least NFSv3 and NFSv4) would be beneficial. There are many subtle issues regarding both client-side and server-side (filename) caching in this very popular network filesystem, which directly intersect with what your module will be doing. Freely reference good external web-sites and pages ...