http://www.perlmonks.org?node_id=488838


in reply to Re: Perl Best Practices book: is this one a best practice or a dodgy practice?
in thread Perl Best Practices book: is this one a best practice or a dodgy practice?

I'm not even sure what the advantage might be to using the same filename for input and output.
I certainly don't suggest (either here or in the book) that you should use the same file for input and output. I merely observe that, if you allow people to independently specify input and output filenames on the command-line, some of them inevitably will use the same name for both files (either intentionally or accidentally). So you have to be aware of the possibility and cope with it somehow.

In the book, I suggest two solutions, one of which is more efficient but apparently fails on certain obscure operating systems (and, yes, I will definitely update the book to reflect that limitation, as soon as I can).


Re^3: Perl Best Practices book: is this one a best practice or a dodgy practice?
by spiritway (Vicar) on Sep 03, 2005 at 04:47 UTC

    if you allow people to independently specify input and output filenames on the command-line, some of them inevitably will use the same name for both files (either intentionally or accidentally).

    Ah, that makes perfect sense now... it's preemptive programming. You're right - someone will eventually do that...

Re^3: Perl Best Practices book: is this one a best practice or a dodgy practice?
by fizbin (Chaplain) on Sep 05, 2005 at 12:08 UTC
    In that case, might it not be better to make the "best practice" exposition something like this?
    When accepting command-line arguments that specify both an input file and an output file, be aware that eventually someone using your program will specify the same filename for both input and output. If your code looks like this:
    # Standard open for input, open for output code
    then the result is going to be a blank output file (or whatever your program produces given a blank input file). This isn't very friendly to the user. A slightly more user-friendly approach is to check for this condition in your code:
    if ($infile eq $outfile) { die "Same filename given for both input and output!"; }
    This protects against a user who accidentally uses the same filename for input and output, but doesn't protect against filenames that are different strings yet name the same file (for example, "foo" and "./foo", or "foo" and "bar", if "bar" is a symbolic link to "foo").
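To make the failure concrete, here is a minimal sketch of the naive open order being handed the same name twice. The helper name `naive_filter` and the uppercasing step are made up for illustration; they are not taken from the book.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );

# Hypothetical filter: copies $infile to $outfile, uppercasing each line,
# using the naive "open input, then open output" order.
sub naive_filter {
    my ($infile, $outfile) = @_;
    open my $in,  '<', $infile  or die "Can't read '$infile': $!";
    # Opening with '>' truncates the file, even though $in already has it open.
    open my $out, '>', $outfile or die "Can't write '$outfile': $!";
    my $lines = 0;
    while (my $line = <$in>) {
        print {$out} uc $line;
        $lines++;
    }
    close $out or die "Can't close '$outfile': $!";
    close $in;
    return $lines;
}

# Set up a scratch file with two lines in it.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/data.txt";
open my $fh, '>', $file or die "Can't create '$file': $!";
print {$fh} "one\ntwo\n";
close $fh or die "Can't close '$file': $!";

# Same name for input and output: the data is gone before it is read.
my $copied = naive_filter($file, $file);
my $size   = (-s $file) || 0;
print "lines copied: $copied, size afterwards: $size bytes\n";
```

Nothing dies and nothing warns here, which is what makes the mistake easy to miss.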

    However, it may be the case that the occasional user actually does want the output file to replace the input file, much as perl's own -i option does. Certainly, if possible, we should allow this since we want our programs to be useful to the user. In this case: (... and here continue on with the unlink trick, etc.)
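For readers without the book to hand, the following is a rough sketch of what the unlink trick is usually understood to mean: open the input first, unlink the output name, then create the output afresh. It assumes Unix-like semantics, where an unlinked file stays readable through an already-open handle. The helper name `filter_with_unlink` and the transform callback are hypothetical, and this is not claimed to be the book's exact code.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );

# Hypothetical helper: safe even when $infile and $outfile name the same file,
# provided the OS lets us unlink a file that is still open.
sub filter_with_unlink {
    my ($infile, $outfile, $transform) = @_;
    open my $in, '<', $infile or die "Can't read '$infile': $!";
    # Remove the directory entry only; the data remains reachable via $in.
    if (-e $outfile) {
        unlink $outfile or die "Can't unlink '$outfile': $!";
    }
    # This creates a brand-new file, so nothing $in refers to is truncated.
    open my $out, '>', $outfile or die "Can't write '$outfile': $!";
    while (my $line = <$in>) {
        print {$out} $transform->($line);
    }
    close $out or die "Can't close '$outfile': $!";
    close $in;
    return;
}

# Scratch file used as both input and output.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/data.txt";
open my $fh, '>', $file or die "Can't create '$file': $!";
print {$fh} "one\ntwo\n";
close $fh or die "Can't close '$file': $!";

filter_with_unlink($file, $file, sub { uc $_[0] });

# Read the result back to see what survived.
open my $check, '<', $file or die "Can't reread '$file': $!";
my $result = do { local $/; <$check> };
close $check;
print $result;
```

On systems that refuse to unlink an open file, the `unlink` fails and this sketch dies rather than clobbering anything. Note also that if the output name is a symbolic link to the input, `unlink` removes the link itself, so the output ends up as a new regular file and the input is left untouched.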

    --
    @/=map{[/./g]}qw/.h_nJ Xapou cets krht ele_ r_ra/; map{y/X_/\n /;print}map{pop@$_}@/for@/
      If you want to be a bit more thorough, you could do something like
      use File::Spec::Functions qw( rel2abs canonpath );
      # Make both paths absolute and tidy them up before comparing
      my $full_inpath  = canonpath( rel2abs( $infile ) );
      my $full_outpath = canonpath( rel2abs( $outfile ) );
      if ( $full_inpath eq $full_outpath ) { ... }
      Yes, there are ways to fool this. Ideally, there would be some is_same_real_file() in File::Spec that would take into account symlinks, case sensitivity, volume names, and the like.
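One possible approximation of such a function, on filesystems that report meaningful device and inode numbers, is to compare the first two fields returned by `stat`. This is only a sketch: `is_same_real_file` is the hypothetical name from above, not something File::Spec provides, and the approach is not reliable on platforms where `stat` reports a zero inode (historically the case on Windows).

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );

# Hypothetical helper: true if both paths resolve to the same device and inode.
# stat() follows symlinks, so a link and its target compare as the same file.
sub is_same_real_file {
    my ($path_a, $path_b) = @_;
    my @stat_a = stat $path_a;
    my @stat_b = stat $path_b;
    # A path that doesn't exist yet can't be the same file as one that does.
    return 0 if !@stat_a || !@stat_b;
    return ( $stat_a[0] == $stat_b[0] && $stat_a[1] == $stat_b[1] ) ? 1 : 0;
}

# Two scratch files to compare.
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw( foo other )) {
    open my $fh, '>', "$dir/$name" or die "Can't create '$dir/$name': $!";
    print {$fh} "data\n";
    close $fh or die "Can't close '$dir/$name': $!";
}

print "foo vs ./foo:  ", is_same_real_file("$dir/foo", "$dir/./foo"), "\n";
print "foo vs other:  ", is_same_real_file("$dir/foo", "$dir/other"), "\n";
print "foo vs absent: ", is_same_real_file("$dir/foo", "$dir/absent"), "\n";
```

Hard links also compare as the same file this way, which a path-based check cannot detect.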

      My criteria for good software:
      1. Does it work?
      2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?

        You might be interested in Cwd::abs_path -- but in general, what you want to do here isn't always possible. Consider a Windows box that SMB-exports two directories, one of which is a child of the other.
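As a small illustration of what `Cwd::abs_path` buys you, the sketch below compares resolved paths for a file, a "./" spelling of it, and (where the platform supports them) a symbolic link to it. The scratch file names are made up, and it cannot help with cases like the SMB one, where two genuinely different paths reach the same file.

```perl
use strict;
use warnings;
use Cwd qw( abs_path );
use File::Temp qw( tempdir );

# Scratch file to resolve in several ways.
my $dir  = tempdir( CLEANUP => 1 );
my $real = "$dir/foo";
open my $fh, '>', $real or die "Can't create '$real': $!";
print {$fh} "data\n";
close $fh or die "Can't close '$real': $!";

# abs_path() collapses "." components and resolves symlinks.
my $dotted_matches = abs_path("$dir/./foo") eq abs_path($real) ? 1 : 0;
print "foo vs ./foo resolve to the same path: $dotted_matches\n";

# symlink() dies on platforms that lack it, hence the eval.
my $link = "$dir/bar";
my $have_symlink = eval { symlink $real, $link };
if ($have_symlink) {
    my $link_matches = abs_path($link) eq abs_path($real) ? 1 : 0;
    print "bar (symlink) vs foo resolve to the same path: $link_matches\n";
}
else {
    print "symlinks not available here, skipping that comparison\n";
}
```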


        Warning: Unless otherwise stated, code is untested. Do not use without understanding. Code is posted in the hopes it is useful, but without warranty. All copyrights are relinquished into the public domain unless otherwise stated. I am not an angel. I am capable of error, and err on a fairly regular basis. If I made a mistake, please let me know (such as by replying to this node).