PerlMonks  

Tip on better performance: Open and close output file or leave it open?

by Anonymous Monk
on Sep 07, 2014 at 15:18 UTC

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,
I would like to ask which approach is better for my code.
I have a script that takes maybe 15 minutes or more to run. The results are written to an output file. My question: is it better to open the output file once at the beginning and keep it open (I mean the filehandle), or to open and close it whenever I have something to print to it?

Replies are listed 'Best First'.
Re: Tip on better performance: Open and close output file or leave it open?
by davido (Cardinal) on Sep 07, 2014 at 15:50 UTC

    ...is it better to open the output file once, in the beginning and keep it open ..., or open and close it whenever I have something to print into it?

    Opening is a system call. Closing results in system calls. Both of these acts consume some amount of time greater than zero. How significant that time is can only be quantified by knowing how many times the open/close cycle happens in your script, and how much of your script's runtime is spent inside these calls.

    You can find out how significant the open/close calls are within your runtime by profiling with Devel::NYTProf. But if your script takes 15 minutes to run on your data set, first create a sample data set that is about 10-20% of that size while remaining representative of your real data. Then profile, and see where your problems are.


    Dave

Re: Tip on better performance: Open and close output file or leave it open?
by AppleFritter (Vicar) on Sep 07, 2014 at 20:00 UTC

    Generally speaking: when optimizing, measure, don't guess. Try both variants and check which actually runs faster, and by how much.
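    A minimal sketch of "try both variants" using the core Benchmark module. The row count, the file names, and the two helper subs are made up for illustration and are not from the original poster's script; real numbers depend on your filesystem and how often you write.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);
use File::Temp qw(tempdir);

# Scratch directory, removed automatically when the program exits.
my $dir  = tempdir(CLEANUP => 1);
my $rows = 1_000;    # hypothetical number of writes per run

# Variant 1: open once, print everything, close once.
sub write_keep_open {
    my ($path) = @_;
    open my $fh, '>', $path or die "Cannot open $path: $!";
    print {$fh} "line $_\n" for 1 .. $rows;
    close $fh or die "Cannot close $path: $!";
}

# Variant 2: open in append mode and close again for every single line.
sub write_reopen_each_time {
    my ($path) = @_;
    unlink $path;    # start each run from an empty file
    for my $i (1 .. $rows) {
        open my $fh, '>>', $path or die "Cannot open $path: $!";
        print {$fh} "line $i\n";
        close $fh or die "Cannot close $path: $!";
    }
}

# Run each variant 20 times and print the timings for comparison.
timethese(20, {
    keep_open => sub { write_keep_open("$dir/keep.txt") },
    reopen    => sub { write_reopen_each_time("$dir/reopen.txt") },
});
```

    Both variants produce the same file, so any difference in the reported times comes from the extra open/close calls.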

    That said, opening and closing a file each time you need to write to it seems wasteful to me. Why are you doing that? If you need to ensure that your data hits the disk instead of getting buffered, explicitly flushing the file and/or enabling output autoflush may be a better option (but again: measure, don't guess!). perlfaq5 has some information on how to do this.
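    A rough sketch of the two options, using the autoflush and flush methods from the core IO::Handle module. The file name and the printed lines are placeholders. Note that flushing hands the data from Perl's buffer to the operating system; it does not by itself guarantee the data is physically on disk.

```perl
use strict;
use warnings;
use IO::Handle;                  # autoflush() and flush() methods on filehandles
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/results.txt";   # hypothetical output file

open my $out, '>', $path or die "Cannot open $path: $!";

# Option 1: autoflush, so every print on this handle is flushed immediately.
$out->autoflush(1);
print {$out} "first result\n";

# Option 2: normal buffering, with an explicit flush at chosen checkpoints.
$out->autoflush(0);
print {$out} "second result\n";
$out->flush or die "Cannot flush $path: $!";

close $out or die "Cannot close $path: $!";
```

    Either way the filehandle stays open for the whole run, so there is no repeated open/close cost.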

    If you're only looking to make your script run faster in general, then (again!) measure where it's slow, don't guess; use a profiler to find the hotspots, and work on optimizing those.

Re: Tip on better performance: Open and close output file or leave it open?
by Anonymous Monk on Sep 07, 2014 at 15:25 UTC

    We don't really know enough about your code to give a definitive answer. I think it's pretty obvious that opening and closing a file each time you write to it will take more CPU cycles than keeping it open the whole time. However, there are cases where you might have to close and re-open a file, for example if other processes are accessing the same file during the run of the script. Please see How do I post a question effectively?

    It also helps to profile your code to check where the bottlenecks really are.

Re: Tip on better performance: Open and close output file or leave it open?
by mr_mischief (Monsignor) on Sep 08, 2014 at 13:37 UTC

    There are a handful of reasons to close and reopen a filehandle repeatedly:

    • you have another process reading the file while you're writing it
    • you want to be really extra sure you get partial output safely to disk in case the program doesn't complete
    • you have to reopen the filehandle redirected from or to multiple sources or destinations over the course of the program

    If none of these apply, you can save some cycles by leaving it open. You may save a small bit of memory while it's not open. Whether either of these is worthwhile to worry about I can't say without measuring. I doubt this is a matter of any concern unless you're closing and reopening in a tight loop.

Re: Tip on better performance: Open and close output file or leave it open?
by Anonymous Monk on Sep 07, 2014 at 21:23 UTC

    Consider STDOUT and STDERR: they are almost always already open for you, and a program rarely closes and reopens them. How does your usage compare?

    Consider flock or File::Lockfile: could some other program read or update the file you're printing to? Maybe you want to write to a temporary file (for example one created with tempfile() from File::Temp) and rename it to the final filename when you're done with it?
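    The tempfile-then-rename idea could look roughly like this, assuming File::Temp's tempfile() and a final file in the same directory. The directory, file names, and printed lines are invented for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir tempfile);

my $dir   = tempdir(CLEANUP => 1);   # stand-in for your real output directory
my $final = "$dir/results.txt";      # hypothetical final file name

# Create the temporary file in the same directory as the final file,
# so that the rename stays within one filesystem.
my ($tmp_fh, $tmp_name) = tempfile('results_XXXXXX', DIR => $dir, UNLINK => 0);

# Write all output through the one open handle.
print {$tmp_fh} "result $_\n" for 1 .. 3;
close $tmp_fh or die "Cannot close $tmp_name: $!";

# Readers only ever see the complete file under its final name.
rename $tmp_name, $final or die "Cannot rename $tmp_name to $final: $!";
```

    Other programs that look for the final name never see a half-written file, and the script still opens and closes its output only once.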

Re: Tip on better performance: Open and close output file or leave it open?
by sandy105 (Scribe) on Sep 08, 2014 at 10:05 UTC

    Yep, the common consensus is that repeatedly opening and closing filehandles is wasteful.

    But yes, go ahead and time it.
