PerlMonks  

Temporary text files with File::Temp

by jarich (Curate)
on Feb 26, 2008 at 03:25 UTC (#670208)
jarich has asked for the wisdom of the Perl Monks concerning the following question:

I teach Perl to programmers using both *nix and Windows operating systems. One of the suggestions I give them is to use File::Temp for temporary files. It's a great module, especially because it's standard so everyone will already have it.

Unfortunately, I've just discovered that File::Temp doesn't do exactly what I want it to. When my students using *nix open temporary text files, everything just works. When my students using Windows open temporary text files, their newlines (\n) are not translated into CRLF (\r\n).

John M. Dlugosz touched on the problem a couple of years ago in Why does File::Temp use sysopen? but he was blaming the wrong cause. File::Temp uses binmode (or an equivalent) if it can. It says so in the documentation.

BINMODE

The file returned by File::Temp will have been opened in binary mode if such a mode is available. If that is not correct, use the binmode() function to change the mode of the filehandle.

Note that you can modify the encoding of a file opened by File::Temp also by using binmode().

The binmode documentation says:

For the sake of portability it is a good idea to always use it when appropriate, and to never use it when it isn't appropriate.

I realise I can "fix" File::Temp's use of binmode for text files by calling binmode:

    my ($tmp_fh, $tmp_name) = tempfile();
    binmode($tmp_fh, ":crlf");

but I don't want to teach that. I want an easy, 100% portable way of doing it. I want my students to be able to write exactly the same code on either system and have it work - even for text files.

Am I missing an easier way of doing this?

Re: Temporary text files with File::Temp
by bcrowell2 (Friar) on Feb 26, 2008 at 04:09 UTC

    I believe ActiveState Perl on Windows implements the POSIX module -- http://aspn.activestate.com/ASPN/docs/ActivePerl/5.8/lib/POSIX.html -- so maybe:

        use POSIX qw(tmpnam);
        my $tmp_name = tmpnam();

    In general, I think it's a laudable goal to write code that's 100% portable, and works properly on all platforms without any special casing. Realistically, I don't think that's possible for any perl program of any significant complexity. Maybe it's doable if everything your students are writing is just short exercises. The one time I wrote a cross-platform perl app to run on both Windows and Unix, all I managed to do was isolate all the OS-dependent stuff in one file, but I did have to have a bunch of special casing in that file. I remember lots of hassles with, e.g., filename globbing.

      I think it's a laudable goal to write code that's 100% portable, and works properly on all platforms without any special casing. Realistically, I don't think that's possible for any perl program of any significant complexity.

      To a large extent, I agree. It can be very difficult to write Perl programs of significant complexity that are 100% portable, especially if you're not practiced in doing such.

      However, I'm not talking about writing Perl programs of any complexity in this question. I don't think it should be hard to open a temporary text file and be able to print stuff including newlines to it, and have the right thing happen. Regardless of operating system. This is an introductory course to Perl after all.

      My students should be able to write:

          # Reverses each line in the given file
          # and replaces that file with the reversed copy
          use File::Temp qw(tempfile);
          use File::Copy qw(move);
          use Fatal qw(open close move);

          my $filename = shift or die "Usage: $0 filename";

          open(my $in, "<", $filename);
          my ($tmp_fh, $tmp_name) = tempfile();

          while(<$in>) {
              chomp;
              print {$tmp_fh} scalar(reverse($_)), "\n";
          }

          close $in;
          close $tmp_fh;

          move($tmp_name, $filename);

      and expect it to work flawlessly on any operating system that has Perl installed. There's nothing in that code which would suggest to a Perl novice (or me until last week) that that code isn't portable.

      I realise that it's fundamentally a question of what the logical default case should be. File::Temp has several choices: always binmode the file, never binmode the file, binmode the file if passed a flag, or don't binmode the file when passed a flag. They've chosen the first; I was hoping there might have been an undocumented way to get the last.

      Alternately if there is a special layer I can pass to binmode itself to say "treat as text" so that Perl can do its special text-file magic by translating \n to the appropriate thing, I'd love to hear about that. This isn't achieved by using the :crlf layer, because that layer is Windows specific, and won't do the right thing under the *nixes.
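To make the trade-off concrete, here is a minimal sketch of the platform-conditional workaround, which is exactly the kind of special casing the question is trying to avoid. The helper name make_text_mode is hypothetical, and the check against 'MSWin32' is an assumption that only native Windows builds of Perl need the :crlf layer.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: switch a binary-mode handle to the platform's
# text convention. Assumption: only native Windows Perl ($^O eq 'MSWin32')
# needs the :crlf layer; everywhere else the handle is left alone.
sub make_text_mode {
    my ($fh) = @_;
    if ($^O eq 'MSWin32') {
        binmode($fh, ':crlf') or die "Cannot set :crlf layer: $!";
    }
    return $fh;
}

# UNLINK => 1 asks File::Temp to remove the file when the program exits.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
make_text_mode($tmp_fh);
print {$tmp_fh} "one line\n";
close $tmp_fh or die "Cannot close $tmp_name: $!";
```

The calling code is the same on every system, but the special case has only been moved into the helper, not removed.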

      Any other suggestions?

      thanks, jarich

        The problem is that the default for normal open and the default for tempfile are different.

        Teach to always specify the mode of a file, text or binary, because on some platforms it matters. For that matter, be clear about the character encoding too.

        Maybe you should update the File::Temp module.

        Or, write your own tempfile function that calls the regular open. That way you have consistent semantics. That's what I do.

        —John
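One way to read the "write your own tempfile function" suggestion is sketched below. It is not John's actual code: the name text_tempfile is hypothetical, and the sketch assumes it is acceptable to let File::Temp create the file and then reopen it by name with a regular open, so the handle gets the platform's default text layers. Reopening by name gives up part of File::Temp's protection against races, so treat it as an illustration rather than a hardened implementation.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical wrapper: File::Temp picks a safe name and creates the file,
# then we reopen it with plain open() so the default (text) layers apply.
sub text_tempfile {
    my @options = @_;
    my ($bin_fh, $name) = tempfile(@options);
    close $bin_fh or die "Cannot close $name: $!";
    open(my $fh, '>', $name) or die "Cannot reopen $name: $!";
    return ($fh, $name);
}

# Same calling convention as tempfile(); UNLINK => 1 removes the file at exit.
my ($tmp_fh, $tmp_name) = text_tempfile(UNLINK => 1);
print {$tmp_fh} "hello\n";
close $tmp_fh or die "Cannot close $tmp_name: $!";
```

Because the second handle comes from a regular open, "\n" is written the same way it would be for any other text file on that platform.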

        jarich, you wrote:

        "Alternately if there is a special layer I can pass to binmode itself to say "treat as text" so that Perl can do it's special text-file magic by translating \n to the appropriate thing; I'd love to hear about that."

        I was recently investigating the use of the I/O layers and seem to recall that there may be a way to do what you asked about. But I'm by no means an expert on the topic so it isn't something I've got much experience with.

        But wouldn't that constitute exactly what you said you didn't want...special processing for each environment? And wouldn't that defeat your goal of keeping it simple for your students?

        ack Albuquerque, NM

      Regarding POSIX's tmpnam, you might want to check the documentation on that too:

      tmpnam

      Returns a name for a temporary file.

      $tmpfile = POSIX::tmpnam();

      For security reasons, which are probably detailed in your system's documentation for the C library tmpnam() function, this interface should not be used; instead see File::Temp.

      With respect to actually writing portable code, it's probably gotten a lot easier since you tried it. File::Spec, File::Copy, improvements to glob and File::Glob etc, all help.

Re: Temporary text files with File::Temp
by jwkrahn (Monsignor) on Feb 26, 2008 at 04:26 UTC

    You can also create a temporary file with open using undef as the third argument:

    open my $TEMP, '+>', undef or die "Cannot open temporary file: $!";

      If teaching his students to use binmode wasn't acceptable, neither is this.

        We teach binmode, for use when we expect the file to be binary. If there were a layer I could pass to binmode which turned off binary, or turned everything back to text processing (text/reset/default/native) which didn't require another module to do it, I'd be thrilled. Unfortunately this doesn't seem to be the case.

        I don't mind teaching my students that File::Temp assumes that temporary files will be binary by default. I was just hoping that someone could show me the easy, portable way we could revert that. But it doesn't seem to exist in Perl 5.10 and below. Perhaps I'll get a patch into File::Temp for 5.10.1, or alternately have some luck getting a layer for binmode which turns off binary.

        Unfortunately in-memory files aren't really what I was looking for. They're cool, but they don't really solve the problem I was having. I want to be able to write out a file to a temporary location, then when I know that it's been fully written, I want to move the file over the original atomically. This means that at no point can someone access that file and get invalid data. Old previously-valid data, sure, but never only partly-written data.

        thanks, jarich.

Re: Temporary text files with File::Temp
by Anonymous Monk on Feb 26, 2008 at 06:16 UTC
    The use of O_BINARY where available was in the very first back-of-the-envelope suggestion from Tom Christiansen in March 2000. I admit to not really thinking through the consequences, but clearly Tom imagined that most temp files would be binary rather than text. There is a bit of a backwards compatibility issue with changing the default to text, so any option would have to be provided in addition to current usage. I would be open to a TEXTMODE patch that has been tested on Windows. I've never really had a Windows box for testing things on, and no user has ever offered patches to get it working properly on Windows.

    Couple of asides:

    • tmpnam is not safe. By the time you get round to using it the file could have been taken by someone else.
    • You should be teaching people the OO interface rather than the subroutine interface. It has the advantage that the file is deleted at the correct time. This is a real problem on Windows, where you can't delete an open file (at least not without using Win32 calls that I can't test). Using the new() and newdir() methods gives much better control and more predictable file deletion instead of relying on END blocks.
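A short sketch of the OO interface mentioned in the second aside, assuming a File::Temp recent enough to provide newdir(). The variable names are illustrative only. Both the file and the directory are removed when their objects go out of scope, rather than at program exit.

```perl
use strict;
use warnings;
use File::Temp ();

my ($file_name, $dir_name);
{
    # The object is both a filehandle and a handle on the file's lifetime.
    my $tmp = File::Temp->new(SUFFIX => '.txt');
    $file_name = $tmp->filename;
    print {$tmp} "scratch data\n";
    close $tmp or die "Cannot close $file_name: $!";

    # newdir() returns an object for a temporary directory.
    my $dir = File::Temp->newdir();
    $dir_name = $dir->dirname;
}
# Leaving the block destroys both objects, which deletes the file
# and removes the directory.
```

Closing the handle before the object is destroyed matters on Windows, where an open file cannot be deleted.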
Re: Temporary text files with File::Temp
by adamk (Chaplain) on Feb 27, 2008 at 05:55 UTC
    As a small aside, on Unix at least, the use of File::Temp-generated temp files in their default location is inappropriate if you are using them for the purpose of atomic overwrite.

    The reason is that most of the time /tmp is not on the same partition as the file you want to overwrite, so the resulting move is not atomic: it has to copy the data across filesystems.

    If you want atomic overwrite, you HAVE to write the initial file to somewhere on the same partition as the target, which generally means to the same directory.
      File::Temp allows you to write the temp file to any directory you choose. There is no requirement to have it end up in /tmp.
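A minimal sketch combining the two points above: create the temporary file in the target's own directory with the DIR option, then rename it into place. The function name replace_file is hypothetical. The sketch assumes a POSIX-style filesystem, where rename within one filesystem replaces the target atomically; behaviour on other platforms should be checked separately. Note that the handle is still in File::Temp's binary mode, so the lines are written exactly as given.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Temp qw(tempfile);

# Hypothetical helper: write @lines to a temp file next to $target,
# then rename it over $target so readers never see a partial file.
sub replace_file {
    my ($target, @lines) = @_;

    # DIR puts the temp file in the same directory (and so, normally,
    # the same filesystem) as the target.
    my ($tmp_fh, $tmp_name) = tempfile(DIR => dirname($target));

    print {$tmp_fh} @lines or die "Cannot write $tmp_name: $!";
    close $tmp_fh or die "Cannot close $tmp_name: $!";

    rename($tmp_name, $target)
        or die "Cannot rename $tmp_name to $target: $!";
    return;
}
```

File::Temp creates its files with restrictive permissions, so the replaced file may end up with different permissions from the original unless they are reset explicitly.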
