PerlMonks  

Perl Idioms Explained - my $string = do { local $/; <FILEHANDLE> };

by jeffa (Chancellor)
on Aug 29, 2003 at 13:41 UTC ( #287647=perlmeditation )


open FILEHANDLE, 'somefile.txt' or die $!;
my $string = do { local $/; <FILEHANDLE> };

The above idiom is a concise way to "slurp" the entire contents of a file into a scalar without using a loop, such as:

open FILEHANDLE, 'somefile.txt' or die $!;
my $string = '';
while (<FILEHANDLE>) {
    $string .= $_;
}

How it works

The first piece of this code is the do "function", which returns the value of the last expression executed in its block. In this case, that last expression is simply <FILEHANDLE>.

The expression <FILEHANDLE> will either return the next line from the filehandle if it is called in scalar context, or it will return the entire contents from the filehandle as a list if called in list context:

my $scalar = <FILEHANDLE>; # one line
my @array = <FILEHANDLE>; # whole file
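
To see the two contexts side by side, here is a minimal sketch. It assumes an in-memory filehandle (Perl 5.8 or later) so it runs without a real file; the sample text and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data, read through an in-memory filehandle
# so the sketch does not depend on a file on disk.
my $text = "one\ntwo\nthree\n";
open my $fh, '<', \$text or die "cannot open in-memory file: $!";

my $first = <$fh>;    # scalar context: the next record only
my @rest  = <$fh>;    # list context: every record that is left

close $fh;

print "first line: $first";
print "remaining lines: ", scalar(@rest), "\n";
```

Note that the list-context read picks up where the scalar read stopped, so @rest holds the second and third lines only.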

In scalar context <FILEHANDLE> returns only one line because, by default, the built-in Perl variable $/ is set to a single newline. $/ is the input record separator, and it controls how much is read each time you use the diamond operator (<FILEHANDLE>). The docs explain that if $/ is set to undef, then reading <FILEHANDLE> in scalar context will grab everything until the end of the file:

undef $/;
my $scalar = <FILEHANDLE>; # whole file

However, changing Perl's built-in variables can be dangerous. Imagine you wrote a module that others use. Inside this module you set $/ to undef, thinking that everywhere else $/ will be the default value. Well, wrong. You just changed $/ for everyone that uses your module. This is one of those few places where local is the right choice.

Which brings us to the FIRST expression in our do block:

local $/;

This is the same thing as explicitly assigning $/ to undef:

local $/ = undef;

But not the same as:

$/ = undef; # Danger Will Robinson! Danger!

Because we are inside a do block when we use local, the value of $/ is temporarily changed, and we can rest assured that it will not affect code outside of our block (or scope). If we were not inside another block or scope, local $/ would stay in effect until the end of the enclosing file or eval, but it's better to contain local $/ inside a temporary scope, unless you enjoy debugging hard-to-find bugs.
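
A small sketch of that restore behaviour, again assuming an in-memory filehandle and made-up sample data: the value of $/ is recorded before and after the do block to show that local put it back.

```perl
use strict;
use warnings;

my $text = "alpha\nbeta\n";    # hypothetical sample data
open my $fh, '<', \$text or die "cannot open in-memory file: $!";

my $before  = $/;                        # normally "\n"
my $slurped = do { local $/; <$fh> };    # $/ is undef only while the block runs
my $after   = $/;                        # restored as soon as the block exits

close $fh;

print "separator restored\n" if defined $after && $after eq $before;
print "slurped ", length($slurped), " characters\n";
```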


Summary

The do block is used to create a temporary scope. Inside this temporary scope, we temporarily assign undef to $/ and retrieve the "next line" from our filehandle. Since $/ is undefined, this "next line" is "everything until End Of File" - hence, the entire file.
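
Putting the pieces together, here is one way the whole idiom might look as a runnable script. This is a sketch, not the only form: it assumes a lexical filehandle, three-argument open, and a throwaway file made with the core File::Temp module so that there is something to read.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway file so the example is self-contained (assumption).
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "line 1\nline 2\n";
close $out or die "close failed: $!";

# The idiom itself, with the open moved inside the temporary scope.
my $string = do {
    local $/;                                        # undef: read to end of file
    open my $in, '<', $path or die "cannot open $path: $!";
    <$in>;                                           # last expression = return value
};

print $string;
```

Because $in is a lexical declared inside the block, the handle is closed when the block ends, and $/ is back to its previous value at the same moment.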


Caveats

Memory! Any time you store an entire file you should be aware of its potential size. If you only need to deal with one line at a time, then use a loop instead.
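
For comparison, a line-at-a-time loop keeps only the current line in memory. The sketch below counts lines and characters; the tiny in-memory sample stands in for a large file and is an assumption of the example.

```perl
use strict;
use warnings;

my $text = "a\nbb\nccc\n";    # stand-in for a large file
open my $fh, '<', \$text or die "cannot open in-memory file: $!";

my ($lines, $chars) = (0, 0);
while (my $line = <$fh>) {    # only one line is held at a time
    chomp $line;
    $lines++;
    $chars += length $line;
}
close $fh;

print "$lines lines, $chars characters\n";
```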

Also, a popular use for this idiom is in conjunction with the built-in DATA filehandle:

my $string = do {local $/;<DATA>};

This is handy for scripts that use modules such as Text::Template and HTML::Template, but note that both modules allow you to pass some kind of reference to a file handle, so this idiom is not needed. For example, instead of:

my $data = do {local $/;<DATA>};
my $template = HTML::Template->new(scalarref => \$data);

You can simply say:

my $template = HTML::Template->new(filehandle => \*DATA);

jeffa

L-LL-L--L-LL-L--L-LL-L--
-R--R-RR-R--R-RR-R--R-RR
B--B--B--B--B--B--B--B--
H---H---H---H---H---H---
(the triplet paradiddle with high-hat)

Re: Perl Idioms Explained - my $string = do { local $/; <FILEHANDLE> };
by liz (Monsignor) on Aug 29, 2003 at 14:51 UTC
    I've ++ed this, but I wonder whether it isn't time to remove glob file handles from documentation and replace them with lexically bound file handles? Instead of:
    open FILEHANDLE,'filename' or die "open of 'filename' failed: $!\n";
    at least show the more modern idiom:
    open my $handle,'<','filename' or die "open of 'filename' failed: $!\n";
    I realize the title is "Idioms explained", but you're also teaching people new idioms this way. So from a teaching point of view, I should at least mention that there is a new and better way of opening files since Perl 5.6.0 (if I'm not mistaken).

    Liz

Re: Perl Idioms Explained - my $string = do { local $/; <FILEHANDLE> };
by Zaxo (Archbishop) on Aug 29, 2003 at 14:52 UTC

    It may be useful to restrict the scope of the file handle as well, since slurping is a one pass approach. That may be done with

    my $string = do { local $/; open local(*FILEHANDLE), 'somefile.txt' or die $!; <FILEHANDLE> };
    or, better if lexical filehandles are available,
    my $string = do { local $/; open my $fh, 'somefile.txt' or die $!; <$fh> };
    since the lexical handle is properly closed when it goes out of scope. Either way ensures no interference with other handles of the same name.

    After Compline,
    Zaxo

      How about opening it in binary mode? I would historically use a binmode after opening, but in 5.8 I should be able to use :raw or something like that. When I tried once, I spent a few minutes trying to make it work, and went back to the old way. What's the polished form of the idiom that uses the new syntax for binary files?

      The truly modern PerlIO approach would be,
      open my $fh, '<:raw', '/path/to/somefile.txt' or die $!;
      but I'd do it with binmode still, right after opening.
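
      As a rough sketch of the binmode variant: the temporary file and its byte content are assumptions made so the example can run anywhere, and binmode with no layer argument is used in place of the '<:raw' form.

      ```perl
      use strict;
      use warnings;
      use File::Temp qw(tempfile);

      # Write a few arbitrary bytes to a throwaway file (assumption).
      my ($out, $path) = tempfile(UNLINK => 1);
      binmode $out;
      print {$out} "\x00\x01\r\n\xff";
      close $out or die "close failed: $!";

      my $bytes = do {
          local $/;
          open my $fh, '<', $path or die "cannot open $path: $!";
          binmode $fh;    # no newline translation, no decoding
          <$fh>;
      };

      printf "%d bytes read\n", length $bytes;
      ```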

      After Compline,
      Zaxo

Re: Perl Idioms Explained - my $string = do { local $/; <FILEHANDLE> };
by VSarkiss (Monsignor) on Aug 29, 2003 at 15:25 UTC

    One of the best explanations I've heard for $/ is due to TheDamian [1]:

    $/ tells Perl when to stop reading.
    So, when $/ has its default value of newline, Perl stops reading when it sees the newline character. When it's not defined, Perl doesn't stop reading until the end of the stream.

    [1] I'm pretty sure of the source, but not 100%. If you have a correction, please let me know.
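
    The "when to stop reading" view can be tried out directly. In this sketch the helper first_record and the sample text are hypothetical; it reads the first record from the same data under three different values of $/.

    ```perl
    use strict;
    use warnings;

    my $text = "a,b\nc,d\n";    # hypothetical sample data

    # Return the first record of $text when $/ is set to $sep.
    sub first_record {
        my ($sep) = @_;
        open my $fh, '<', \$text or die "cannot open in-memory file: $!";
        local $/ = $sep;        # tells Perl when to stop reading
        my $record = <$fh>;
        return $record;
    }

    my $line  = first_record("\n");     # stops at the first newline
    my $field = first_record(",");      # stops at the first comma
    my $all   = first_record(undef);    # never stops before end of file

    print "line:  [$line]\nfield: [$field]\nall:   [$all]\n";
    ```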

Re: Perl Idioms Explained - my $string = do { local $/; <FILEHANDLE> };
by artist (Parson) on Aug 29, 2003 at 16:47 UTC

      But why get a module and a function just for this two expression idiom?

        But why get a module and a function just for this two expression idiom?

        I guess the main reason would be that it would be fairly self-evident, even to a Perl newbie, what File::Slurp does and how to use it. I think the first time someone comes across the code in the OP (and perhaps the second and third times as well), they will find it a bit mystifying. Remembering how to use it in their own code could also be a bit difficult.

        That's not to say that there's an overwhelming reason to use File::Slurp, just that it could certainly be justified for reasons of code clarity.

        Disclaimer: I've never used File::Slurp. I'm assuming that it doesn't suck. :-)

        Wally Hartshorn

        (Plug: Visit JavaJunkies, PerlMonks for Java)

Re: Perl Idioms Explained - my $string = do { local $/; <FILEHANDLE> };
by princepawn (Parson) on Aug 29, 2003 at 17:55 UTC
    I like:
    my $slurp = join '', <FH>;

    Carter's compass: I know I'm on the right track when by deleting something, I'm adding functionality... download and use The Emacs Code Browser

      A couple of things. First and foremost, don't you think that ought to be:

      my $slurp = join $/, <FH>;

      Otherwise you can't really distinguish between lines. Next, you're having perl read the file in, split the file on $/, and then rejoin everything and you end up with a string that was exactly what it was before the split. If you just localize $/ and then read the file in, you're done. Of course, I have some nice benchmarks that show the local $/ method to be a little under 5 times faster. As with any other benchmarks, YMMV.

      #!/usr/bin/perl -w
      use Benchmark qw(cmpthese);
      # file.txt is about 2200 lines each between 5 and 50 chars long
      open FH, "file.txt" or die $!;
      sub scalarcon { seek(FH,0,0); local $/; <FH>; }
      sub listjoin { seek(FH,0,0); join '',<FH>; }
      cmpthese(-5,{scalarcon=>\&scalarcon,listjoin=>\&listjoin});
      __DATA__
      Benchmark: running listjoin, scalarcon for at least 5 CPU seconds...
        listjoin:  5 wallclock secs ( 5.25 usr +  0.12 sys =  5.38 CPU) @ 30.70/s (n=165)
       scalarcon:  6 wallclock secs ( 1.95 usr +  3.36 sys =  5.30 CPU) @ 153.26/s (n=813)
                   Rate  listjoin scalarcon
      listjoin   30.7/s        --      -80%
      scalarcon   153/s      399%        --

      Update: As chromatic has pointed out, I'm a twit.

      Hope this helps.

      antirice    
      The first rule of Perl club is - use Perl
      The ith rule of Perl club is - follow rule i - 1 for i > 1

        Your first suggestion would seem to double the input record separator. Did you test it? Do you somehow have autochomp enabled?

        #!/usr/bin/perl -w
        use strict;
        my $line = join $/, <DATA>;
        print "<<$line>>\n";
        __DATA__
        one
        two
        tie my shoe
      I have been using
      my $slurp = join '', <FH>;
      as well, but having had the idiom explained to me, I realize that this approach is potentially doing more work than the idiom. It makes sense to me that reading from a file handle in list context causes a list to be returned, which is then collapsed by the join, and that this approach is probably not as efficient.

      I assume that the idiom is achieving this down in the (fast) IO layer, and the cost of the block, and the localized variable are lower.
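
      A quick way to check that the two approaches give the same string, and to see the doubled separators from join $/, is sketched below. The sample text and the fresh_handle helper are assumptions for illustration only.

      ```perl
      use strict;
      use warnings;

      my $text = "one\ntwo\n";    # hypothetical sample data

      # Return a fresh in-memory read handle (helper for this sketch only).
      sub fresh_handle {
          open my $fh, '<', \$text or die "cannot open in-memory file: $!";
          return $fh;
      }

      my $fh = fresh_handle();
      my $joined = join '', <$fh>;                 # list of lines, glued back together

      my $slurped = do { local $/; my $h = fresh_handle(); <$h> };    # one read

      $fh = fresh_handle();
      my $doubled = join $/, <$fh>;                # lines keep their "\n", so it doubles

      print $joined eq $slurped ? "join '' matches the slurp\n" : "mismatch\n";
      print "join \$/ adds ", length($doubled) - length($slurped), " extra character(s)\n";
      ```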

Re: Perl Idioms Explained - my $string = do { local $/; <FILEHANDLE> };
by BUU (Prior) on Aug 29, 2003 at 18:45 UTC
    What about the 'idiom' my $data = do{local(@ARGV,$/)='file.txt';<>}

      Check here (tye solution) for why this may cause problems in some code.

      Of course, you could shorten his solution slightly:

      my $data= do { local( *ARGV, $/ ) = [ $filename ]; <> };

      Globs are such fun. Hope this helps.

      antirice    
      The first rule of Perl club is - use Perl
      The ith rule of Perl club is - follow rule i - 1 for i > 1

Re: Perl Idioms Explained - my $string = do { local $/; <FILEHANDLE> };
by wirrwarr (Monk) on Aug 29, 2003 at 22:31 UTC
    Nice explanation, just one remark about
    Because we are inside a do block when we use local, the value of $/ is temporarily changed, and we can rest assured that it will not affect code outside of our block (or scope).
    This "(or scope)" should probably be worded more carefully. If you call another function from the block, then you are outside the scope of the block, but the value of $/ is still undef inside that function. This is one of the "features" of local that's hard to grasp, and information about it should therefore not be misleading.
    $/ = 'uga';
    sub a { print ":$/:\n"; }
    a();
    { local $/; a(); }
    a();

    The rest is, as already said, well done!

    daniel.
Re: Perl Idioms Explained - my $string = do { local $/; <FILEHANDLE> };
by Juerd (Abbot) on Aug 30, 2003 at 12:45 UTC
Re: Perl Idioms Explained - my $string = do { local $/; <FILEHANDLE> };
by skillet-thief (Friar) on Aug 31, 2003 at 06:22 UTC
    Maybe I'm missing something. Besides basic coolness, and getting it down to one line, I don't understand the specific advantage of using "do" as opposed to how I usually slurp:
    my $string; { local $/ = undef; $string = <FH>; }
    That said, I do like the way the proposed idiom looks.

    Cheers,
    s-t
Re: Perl Idioms Explained - my $string = do { local $/; <FILEHANDLE> };
by Anonymous Monk on Sep 05, 2003 at 08:07 UTC
    Isn't it simpler by doing...

    -f "file.txt" or die "File not found!\n"; $string=`cat file.txt`;

    This is also true for the external 'sort', which is much faster than perl's own sort.

    Roger.

      You are assuming that the current operating system has the "cat" command, and has it in the current $PATH (or whatever Windows uses).

      It also costs a fork/exec which seems very costly to me compared to opening and slurping in a file.

      It may be fewer characters to type, but it probably gets terrible performance (where that's a concern) and is not portable (where that's a concern).

      I distinctly remember some package having a simple file slurp method which makes this even simpler, but I don't remember if it was specific to an Apache related module or if it could be used in any environment. Not too hard to write one if you do this often, though.
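
      A hand-rolled version might look like the sketch below. The name slurp_file is made up for this example and is not taken from any module; the temporary file exists only so the sketch can run.

      ```perl
      use strict;
      use warnings;
      use File::Temp qw(tempfile);

      # Hypothetical helper: return the whole content of $path as one string.
      sub slurp_file {
          my ($path) = @_;
          open my $fh, '<', $path or die "cannot open '$path': $!\n";
          local $/;                 # undef for the rest of this sub only
          my $content = <$fh>;
          close $fh or die "cannot close '$path': $!\n";
          return $content;
      }

      # Throwaway file to read back (assumption).
      my ($out, $path) = tempfile(UNLINK => 1);
      print {$out} "hello\nworld\n";
      close $out or die "close failed: $!";

      my $content = slurp_file($path);
      print $content;
      ```

      Since the helper dies when open fails, a missing or unreadable file is reported with the reason in $! and no separate -f or -r test is needed.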

      Oh, yeah, just because a file exists (-f) doesn't mean that it is readable. You might want to change the code to use -r to decrease the risk that the cat command will fail.

      On the topic of "sort", I have been known to recommend running the external command instead of doing a sort inside of Perl. It mainly depends on how large the data set is and how complex the sorting criteria.

      -- Eric Hammond

      Unless you're not on a unix box of course :-)

      You also have to be careful that nasty characters don't get into that filename.
