"open" Best Practices
by haukex (Bishop)
on Jul 11, 2019 at 14:41 UTC
TL;DR: open my $fh, '<', $filename or die "$filename: $!";
You will see styles of open such as

    open FILE, $filename;
    open(LOG, ">$filename") || die "Could not open $filename";

in many places. These mainly date from versions of Perl before 5.6.0 (released in 2000), the version that introduced lexical filehandles and the three-argument open. Since then, these features have become a best practice, for the reasons below.
1. Use Lexical Filehandles
Instead of open FILE, ..., say: open my $fh, ....
Lexical filehandles have the advantage of not being global variables, and such a filehandle is automatically closed when the variable goes out of scope (provided no other references to it remain). You can use them just like any other filehandle, e.g. instead of print FILE "Output", you just say print $fh "Output". They're also more convenient to pass as parameters to subs.

"Bareword" filehandles like FILE, on the other hand, have a potential for conflicts with package names (see choroba's reply for details). Unlike lexical filehandles, they also don't protect against typos: under strict, a misspelled $fh is a compile-time error, while a misspelled bareword simply refers to a different, unopened handle. (For two recent discussions on lexical vs. bareword filehandles, see this and this thread.)
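The points about scoping and passing handles to subs can be sketched as follows. This is only a minimal illustration: write_greeting is a hypothetical helper, and the scratch file from File::Temp is an assumption made so the snippet runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: a lexical filehandle is an ordinary scalar,
# so it can be passed to a sub like any other argument.
sub write_greeting {
    my ($out, $name) = @_;
    print {$out} "Hello, $name\n";
}

# Scratch file so the example is self-contained.
my (undef, $filename) = tempfile(UNLINK => 1);

{
    open my $fh, '>', $filename or die "$filename: $!";
    write_greeting($fh, 'World');
}   # $fh goes out of scope here, so the file is flushed and closed

# Read the line back to confirm that the data reached the file.
open my $in, '<', $filename or die "$filename: $!";
my $line = <$in>;
close $in;
print $line;
```

Note that errors during an implicit close are not reported, so for files you write to, an explicit close $fh or die "$filename: $!" is still worth considering.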
2. Use the Three-Argument Form
Instead of open my $fh, ">$filename", say: open my $fh, '>', $filename.
In the two-argument form of open, the filename has to be parsed for the presence of mode characters such as >, +<, or |. If you say open my $fh, $filename, and $filename contains such characters, the open may not do what you want, or worse, if $filename is user input, this may be a security risk! The two-argument form can still be useful in rare cases, but I strongly recommend playing it safe and using the three-argument form instead.
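To make the risk concrete, here is a small sketch that compares both forms on the same hostile string. The names $victim and $user_input are made up for this illustration, and everything happens on a scratch file created with File::Temp.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file with some content that we do not want to lose.
my ($tmp, $victim) = tempfile(UNLINK => 1);
print {$tmp} "important data\n";
close $tmp or die "$victim: $!";

# Hypothetical hostile "filename": the leading '>' is a mode character.
my $user_input = "> $victim";

# Three-argument form: the whole string is taken as a literal filename,
# so this open simply fails and the scratch file is left untouched.
my $three_arg_ok     = open(my $fh3, '<', $user_input) ? 1 : 0;
my $size_after_three = (stat $victim)[7];

# Two-argument form: Perl parses the '>' out of the string and opens
# the scratch file for writing, which truncates it.
my $two_arg_ok = open(my $fh2, $user_input) ? 1 : 0;
close $fh2 if $two_arg_ok;
my $size_after_two = (stat $victim)[7];

print "three-arg opened: $three_arg_ok, size afterwards: $size_after_three\n";
print "two-arg opened:   $two_arg_ok, size afterwards: $size_after_two\n";
```

With a trailing | instead of a leading >, the two-argument form would run the string as a command, which is why unchecked user input is so dangerous here.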
In the three-argument form, $filename will always be taken as a filename. Plus, the mode can include "layers", so instead of having to do a binmode after the open, you can just say e.g. open my $fh, "<:raw", $filename, or you can specify an encoding such as open my $fh, ">:encoding(UTF-8)", $filename. (Note: If you're on Windows, to decode UTF-16 properly, you need to say "<:raw:encoding(UTF-16):crlf", because otherwise the default :crlf layer works on the undecoded bytes and will incorrectly mangle Unicode characters such as U+0D0A or U+0A0D, whose UTF-16 encoding contains the byte pair 0D 0A.)
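As a rough sketch of how layers in the mode argument behave, the following writes a short string through an encoding layer and then reads it back twice, once as raw bytes and once decoded. The sample string and the scratch file are assumptions made for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my (undef, $filename) = tempfile(UNLINK => 1);

my $text = "caf\x{E9}";   # 4 characters, the last one is U+00E9

# Encode on output: the layer turns characters into UTF-8 bytes.
open my $out, '>:encoding(UTF-8)', $filename or die "$filename: $!";
print {$out} $text;
close $out or die "$filename: $!";

# Read the raw bytes back, with no decoding and no newline translation.
open my $raw, '<:raw', $filename or die "$filename: $!";
my $bytes = do { local $/; <$raw> };
close $raw;

# Read the same file again, this time decoding UTF-8 into characters.
open my $in, '<:encoding(UTF-8)', $filename or die "$filename: $!";
my $chars = do { local $/; <$in> };
close $in;

printf "bytes: %d, characters: %d\n", length($bytes), length($chars);
# prints "bytes: 5, characters: 4"
```

The byte count is larger because U+00E9 takes two bytes in UTF-8; no separate binmode call is needed in either direction.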
3. Check and Handle Errors
    open my $fh, '<', $filename;                          # Bad: No error handling!
    open my $fh, '<', $filename  || die ...;              # WRONG! [1]
    open my $fh, '<', $filename  or die "open failed";    # error is missing info
    open my $fh, '<', $filename  or die "$filename: $!";  # good
    open(my $fh, '<', $filename) or die "$filename: $!";  # good
    open(my $fh, '<', $filename) || die "$filename: $!";  # works, but risky! [1]

    use autodie qw/open/;  # at the top of your script / code block
    open my $fh, '<', $filename;                          # ok, but read autodie!
You should check the return value of the open function, and if it returns a false value, report the error that is available in the $! variable. It is best to report the filename as well, and of course you're free to customize the message as needed (see the tips below for some suggestions).
[1] It is a common mistake to use open my $fh, '<', $filename || die ... - because || binds more tightly than the argument list of open, it actually means open( my $fh, '<', ($filename || die ...) ), so the die is only reached if $filename itself is false. So to avoid mistakes, I would suggest just staying away from || in this case (as is also highlighted in these replies by AM and eyepopslikeamosquito).
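The precedence issue can be checked with a short experiment. The sketch below tries to open a file that cannot exist and records which spellings actually reach die; the helper dies and the variable names are made up for this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# A path that cannot exist: a made-up name inside a fresh, empty temp dir.
my $dir     = tempdir(CLEANUP => 1);
my $missing = "$dir/no-such-file.txt";

# Hypothetical helper: returns 1 if the given code died, 0 otherwise.
sub dies {
    my ($code) = @_;
    return eval { $code->(); 1 } ? 0 : 1;
}

# '||' applies to $missing, which is a true string, so die is never
# reached and the failed open goes unnoticed.
my $pipes_died = dies(sub { open my $fh, '<', $missing || die "$missing: $!" });

# 'or' has very low precedence, so it applies to the result of open.
my $or_died = dies(sub { open my $fh, '<', $missing or die "$missing: $!" });

# With parentheses around the arguments, '||' also sees the result of open.
my $paren_died = dies(sub { open(my $fh, '<', $missing) || die "$missing: $!" });

print "no parens, ||: ", ($pipes_died ? "died" : "failure ignored"), "\n";
print "no parens, or: ", ($or_died    ? "died" : "failure ignored"), "\n";
print "parens,    ||: ", ($paren_died ? "died" : "failure ignored"), "\n";
```

Only the first variant lets the failure slip through, which matches the parse described in the footnote.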
Note that a failing open does not necessarily have to be a fatal error; see some examples of alternatives here. Also, note that the effect of autodie is limited to its lexical scope, so it's possible to turn it on for only smaller blocks of code (as discussed in kcott's reply).
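Both remarks can be illustrated together. The following is one possible sketch, not the only way to do it: the optional config file and the default value are invented for the example, and the fallback strategy is just one of many alternatives to dying.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir     = tempdir(CLEANUP => 1);
my $missing = "$dir/optional.conf";   # hypothetical optional config file

# Non-fatal handling: fall back to defaults if the file cannot be opened.
my @settings;
if ( open my $fh, '<', $missing ) {
    chomp( @settings = <$fh> );
    close $fh;
}
else {
    warn "$missing: $! (using defaults)\n";
    @settings = ('default');
}

# autodie is lexically scoped: it only affects open calls in this block.
my $autodie_error;
{
    use autodie qw(open);
    eval { open my $fh, '<', $missing; 1 } or $autodie_error = $@;
}

# Outside the block, a failed open just returns false again.
my $plain_result = open(my $fh2, '<', $missing) ? 'opened' : 'failed quietly';

print "settings: @settings\n";
print "autodie threw an exception: ", (defined $autodie_error ? "yes" : "no"), "\n";
print "plain open outside the block: $plain_result\n";
```

The exception thrown by autodie is an object that stringifies to a message containing the filename and the error, so it can be printed or inspected as needed.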
4. Additional Tips
Fellow Monks: I wrote this so I would have something to link to instead of repeating these points again and again. If there's something you think is worth adding, please feel free to suggest it!
Updates:
2019-07-12: Added section "Additional Tips", mentioned bareword filehandles, and added a bit more on autodie. Thanks to everyone for your suggestions!
2019-07-13: Added more suggestions from replies, thanks!
2020-04-19: Added mention of typo prevention, as inspired by lexical vs. local file handles.
2020-06-07: Added links to threads about bareword vs. lexical handles and added note about :crlf and UTF-16 interaction on Windows.