
Cheap idioms

by Juerd (Abbot)
on Oct 13, 2002 at 09:08 UTC (#204874=perlmeditation)

There are some idioms that make life easier, because you'll have less to type. They're often used in one-liners and throw-away scripts. But there's one that I like more and more, every time I use it. I'm using it in production scripts now, and I'm starting to wonder if that's a good idea.

It's the cheap file slurp that uses the magical *ARGV and the fact that (@foo, $foo) = $bar will always set $foo to undef.

my $contents = do { local (@ARGV, $/) = $filename; <> };
Is this readable and maintainable enough, or do you think I should really stick to creating slurp routines?
sub slurp { my ($filename) = @_; local $/ = undef; open my $fh, $filename or die "$filename: $!"; return <$fh> }
I know File::Slurp exists, but don't like using modules for what can be done with a simple sub or regex. (Or maybe I would use a module if there was one module with a bunch of subs that I often use. Maybe I should release a hmmm :)
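A minimal sketch of why the one-liner works, assuming a throw-away temp file (the file and its contents are made up for the demonstration): in a list assignment the array swallows every value on the right, so the scalar after it is left undef, which for $/ means "read the whole file".

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# The list-assignment half of the trick: @foo takes everything,
# so nothing is left for $foo and it becomes undef.
my @foo;
my $foo = 'still set';
(@foo, $foo) = 'bar';

# A scratch file to slurp (hypothetical contents).
my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} "line 1\nline 2\n";
close $fh or die "close $filename: $!";

# The idiom itself: @ARGV gets the filename, $/ gets undef,
# and <> opens and reads the file named in @ARGV in one go.
my $contents = do { local (@ARGV, $/) = $filename; <> };

print defined $foo ? "foo is defined\n" : "foo is undef\n";
print $contents;
```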

Do you think using the short slurping idiom in production code is a problem?

- Yes, I reinvent wheels.
- Spam: Visit eurotraQ.

Replies are listed 'Best First'.
Re: Cheap idioms
by Aristotle (Chancellor) on Oct 13, 2002 at 11:13 UTC
    I like the idiom. It took me a moment to grok what was happening with @ARGV, but now I've seen it I won't need to think again. However I'd still tack a comment on it (just a # slurp). The sub on the other hand is needlessly verbose. Why not take the middle road: sub slurp { local (@ARGV, $/) = shift; <> }

    Now the sub's name is the comment, and its calls are self documenting.

    Putting this in an Idiom::Common along with similar snippets would be nice. :)

    Makeshifts last the longest.

(tye)Re: Cheap idioms
by tye (Sage) on Oct 13, 2002 at 17:34 UTC

    A more robust method is:

    my $data= do { local( *ARGV, $/ ); @ARGV= $filename; <> };
    because your method doesn't localize changes to $ARGV and <ARGV> and so, if used in a subroutine that is used by a program that is using <>, your method can break the outer program.
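A sketch of the breakage described above, assuming three scratch files and two hypothetical subs, slurp_naive and slurp_robust, that wrap the two variants. The outer program is halfway through its own <> loop when it calls each sub. As far as I understand the ARGV magic, the naive variant keeps reading from the handle that is already open, while the variant that localizes *ARGV starts from a clean handle and leaves the outer one alone.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write $text to a fresh temp file, return its name.
sub make_file {
    my ($text) = @_;
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} $text;
    close $fh or die "close $name: $!";
    return $name;
}

sub slurp_naive  { local (@ARGV, $/) = shift; <> }
sub slurp_robust { local (*ARGV, $/); @ARGV = shift; <> }

my $file_a = make_file("a1\na2\n");
my $file_b = make_file("b1\n");
my $target = make_file("target\n");

# Scenario 1: the outer program has read one line of file A,
# then slurps another file with the robust variant.
@ARGV = ($file_a, $file_b);
my $first  = <>;
my $robust = slurp_robust($target);
my $second = <>;          # outer loop carries on where it left off
my @rest   = <>;          # drain the remaining input so <> starts fresh

# Scenario 2: same situation, naive variant. ARGV is still open on
# file A, so <> does not get as far as opening the target file.
@ARGV = ($file_a, $file_b);
my $first_again = <>;
my $naive       = slurp_naive($target);

print "robust got the target file\n" if $robust eq "target\n";
print "naive got something else\n"   if $naive  ne "target\n";
```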

    Note that prior to Perl v5.6.0 (I think) this idiom didn't work correctly.

    And these are exactly the reasons why I much prefer to use a good module over some idiom. That way improvements can be centralized in one place.

            - tye (see one prior discussion)

      A more robust method is...

      Redundancy :) I don't like typing ARGV twice, but realise localizing *ARGV is necessary. Thanks for the pointer - I think I'm very fortunate to not have been bitten by this yet.

      Since you can assign a reference to a typeglob to set only one data type associated with it, simply adding brackets solves my problem of having to type the four uppercase letters twice. *foo = [ 1, 2, 3 ]; is like @foo = (1, 2, 3);, but with rather different semantics.

      my $contents = do { local (*ARGV, $/) = [$filename]; <> };
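The "rather different semantics" can be sketched as follows, with made-up variable names: assigning an array reference to a typeglob makes the package array an alias for the referenced array, whereas a plain list assignment copies the values.

```perl
use strict;
use warnings;

# Glob assignment: @foo becomes another name for the array behind $ref.
our @foo;
my $ref = [1, 2, 3];
*foo = $ref;
push @foo, 4;                        # also visible through $ref
my $aliased = (\@foo == $ref);       # same underlying array

# List assignment: @bar receives copies of the values.
our @bar;
my $src = [1, 2, 3];
@bar = @$src;
push @bar, 4;                        # $src is unaffected

print "alias: ", scalar(@$ref), " elements\n";
print "copy:  ", scalar(@$src), " elements\n";
```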

      Note that prior to Perl v5.6.0 (I think) this idiom didn't work correctly.

      A quick, possibly broken test (perl5.005 -e'print do { local (@ARGV, $/) = "/etc/passwd"; <> }') shows that the unaltered version works with perl 5.005. But that one is broken. Any version of this idiom localizing *ARGV doesn't work with perl5.005, so your version doesn't play nice with it.

      That is not at all a problem, though. I don't use perls older than 5.6.0 anymore, except when explicitly asked. It's time we moved on to newer versions. There's perl 5.8.0 already, and people are still using 5.005. The _03 release is over three years old now.

      I put use 5.006; in my code to make sure things break as soon as possible. I even use use v5.6; sometimes, just to nag :)

      - Yes, I reinvent wheels.
      - Spam: Visit eurotraQ.

        This won't work for me, although tye's version will.
        #!/usr/bin/perl
        use strict;
        use warnings;

        moose();
        sub moose {
            my $filename = '';
            my $contents = do { local (*ARGV, $/) = [ $filename ]; <> };
            print $contents;
        }
        sub loose {
            my $filename = '';
            my $data= do { local( *ARGV, $/ ); @ARGV= $filename; <> };
            print $data;
        }
        __END__
        readline() on unopened filehandle ARGV at line 8.
        Use of uninitialized value in print at line 9.
        it opens a script to exemplify ternary ops. Nothing special there. Using: This is perl, v5.6.1 built for MSWin32-x86-multi-thread.
        I have a little question: is that "trick" actually faster than going through the whole open, read, close method? Or is it exactly the same, only it fits in one line?
        I don't like typing ARGV twice, but realise localizing *ARGV is necessary. Thanks for the pointer - I think I'm very fortunate to not have been bitten by this yet.
        It's been a few years now; have you been bitten by not also localizing $^I yet?
Re: Cheap idioms
by ignatz (Vicar) on Oct 13, 2002 at 13:37 UTC
    For me this
    my $contents = do { local (@ARGV, $/) = $filename; <> };
    in production code is a real pain.

    This, on the other hand

    # Slurp file
    my $contents = do { local (@ARGV, $/) = $filename; <> };
    is just fine.

    Very cool, BTW.

    ADDED: --/me for not noticing Aristotle's post saying the same thing.

Re: Cheap idioms
by DapperDan (Pilgrim) on Oct 13, 2002 at 11:06 UTC
    Hi Juerd-

    Nice post.

    I think maybe the only problem with this is that it's not an idiom yet. The thing about idioms IMHO is that it matters less whether they're readable because you just recognise it and say "oh this is that idiom" and move on. That takes time because people have to start using the idiom. I guess as with every other new idea, you need some early adopters. :)

    But that shouldn't be a problem for long; I will always know what that piece of code means when I see it now.

    I too get tired of coding up little slurps all the time (generally using local $/). I presume I can use this for any handle and not just ARGV?

      Since @ARGV is localized, it doesn't destroy the main program's @ARGV contents. And by using @ARGV you get away with the magic diamond operator, thus avoiding the need to open the file yourself. It's a really beautiful idiom. Juerd++.
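A small sketch of that point, assuming a scratch file and some made-up command-line arguments: both @ARGV and $/ are back to their previous values once the do block ends.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} "hello\n";
close $fh or die "close $filename: $!";

# Pretend these came from the command line (hypothetical values).
@ARGV = qw(--verbose input.txt);

my $contents = do { local (@ARGV, $/) = $filename; <> };

# local() has undone both changes by now.
print "ARGV after slurp: @ARGV\n";
print "record separator restored\n" if $/ eq "\n";
```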

      Makeshifts last the longest.

        Thank you for the clarification.
Re: Cheap idioms
by demerphq (Chancellor) on Oct 14, 2002 at 11:07 UTC
    On the subject of cheap idioms...
    select( ( select(STDOUT), $| = 1 )[0] );
    Took me a few minutes to work out, but I like it, a lot actually.


    --- demerphq
    my friends call me, usually because I'm late....

      select( ( select(STDOUT), $| = 1 )[0] );

      This only works if the inner select is evaluated before the assignment is, and I can't find any specification of evaluation order (remember ++$a, $a++, ++$a?)

      That's why I don't dare to use this idiom, although I see it often. I still prefer STDOUT->autoflush(1) (using IO::Handle). It's shorter too :)

      - Yes, I reinvent wheels.
      - Spam: Visit eurotraQ.

        Ah, but you're wrong. In a simple list, as perlop explicitly states, operands are evaluated left-to-right. Otherwise, you couldn't write print("Done.\n"), exit if $done;

        either, but we all know that it works, right? Your example is problematic because it uses pre- and postincrement operators which mess with the order of side effects.

        IO::Handle loads over 1,000 lines of code - if it's just for a single autoflush, what's the point? Especially seeing as the select idiom is known and even delivered with the perldocs, it's safe to assume it isn't cryptic.
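For readers unpacking the select idiom, here is a sketch that relies on the left-to-right evaluation discussed above. STDERR is made the default handle first, purely so the restore step is visible; that setup is an assumption of the demo, not part of the idiom.

```perl
use strict;
use warnings;

# Demo setup: make STDERR the currently selected handle.
select(STDERR);

# The idiom. Read it inside out:
#   1. select(STDOUT) switches to STDOUT and returns the previous handle
#   2. $| = 1 turns on autoflush for the now-selected STDOUT
#   3. (...)[0] picks the value from step 1
#   4. the outer select() switches back to that previous handle
select( ( select(STDOUT), $| = 1 )[0] );

my $current = select();          # name of the selected handle

# Peek at STDOUT's autoflush flag without leaving it selected.
my $stdout_flush = do {
    my $old  = select(STDOUT);
    my $flag = $|;
    select($old);
    $flag;
};

select(STDOUT);                  # undo the demo setup
print "selected handle was $current, STDOUT autoflush is $stdout_flush\n";
```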

        Makeshifts last the longest.

Re: Cheap idioms
by Aragorn (Curate) on Oct 13, 2002 at 17:35 UTC
    Nicely small and concise. I'd only use this in stand-alone scripts, because there aren't any checks on file existence, etc. Burying this kind of code inside libraries is asking for trouble... :-)

      Actually, part of the beauty of this idiom is that you get Perl itself to check for file existence and to report better than average error messages.

                      - tye
        Oh, I like this idiom, but strictly in stand-alone scripts. Bombing out of a program deep inside some library doesn't sit well with me.

        And Perl exits with the following error message if the file doesn't exist using this idiom:
        Can't open /does/not/exist: No such file or directory.
        Uhm. I can do that :-)
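A sketch for library authors worried about this, assuming a path that does not exist inside a fresh temp directory. On the perls I am familiar with, a failed open through <> is reported as a warning and <> then returns undef, so the message can be trapped with a local __WARN__ handler and the caller can decide what to do.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# A path that is guaranteed not to exist (hypothetical name).
my $missing = tempdir(CLEANUP => 1) . '/does-not-exist';

my @warnings;
my $contents = do {
    # Collect warnings instead of letting them reach STDERR.
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    local (*ARGV, $/);
    @ARGV = $missing;
    <>;
};

if (!defined $contents) {
    print "slurp failed: $_" for @warnings;
}
```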

Re: Cheap idioms
by broquaint (Abbot) on Oct 14, 2002 at 08:43 UTC
    If I'm ever in the position where I need to slurp a file I'll go with IO::File's handy getcontents() method
    my $contents = IO::File->new($filename)->getcontents();
    I'm not sure how that benchmarks, but I'm of the opinion that it looks much more elegant ;)


      It looks very neat indeed, but it loads over 1,000 lines of Perl code and an XS module in dependencies.. in a short(!) CGI script I wouldn't want that - but that's always a difficult environment. It's too much to type in a oneliner also, but that too is a special case.

      For a large, "proper" script it is too brief - you run the risk of bombing out with a Can't call method "getcontents" on an undefined value since you don't check whether new() succeeded.. it would have to look maybe like this:

      my $contents = do {
          my $fh = IO::File->new($filename)
              or die "Failed to open $filename: $!";
          $fh->getcontents();
      };
      And now it doesn't look so neat anymore. :-( Juerd's idiom on the other hand has builtin error reporting.

      Makeshifts last the longest.

Re: Cheap idioms
by particle (Vicar) on Oct 20, 2002 at 22:46 UTC

    i've been localizing handles with anonymous subroutines for a while now.

    i think i'd write your snippet as an anonymous sub, giving it a little more muscle compared to a do block. something like:

    #!/usr/bin/perl
    require 5.006_001;
    use strict;
    use warnings;
    $|++;

    my %cool_funcs = (
        ## slurp a file to a scalar
        ## pass filename (as scalar)
        ## returns contents of file (scalar context)
        slurp_to_scalar => sub{ local( *ARGV, $/ ); @ARGV = @_; <> },

        ## slurp one or more files to an array
        ## pass filename(s) (as scalar)
        ## returns contents of file(s) (list context)
        ## returns number of lines (scalar context)
        slurp_to_array => sub{ local *ARGV; @ARGV = @_; <> },
    );

    ## to use it:
    ## create a list of test files
    my @files = @ARGV;

    ## get the contents to an array, and the number of lines to a scalar
    my $no_of_lines = ( my @contents ) = $cool_funcs{slurp_to_array}->( @files );

    ## these values match...
    print $no_of_lines, $/, scalar @contents, $/;

    ## here's the good stuff...
    print @contents;

    ## get file to scalar -- note extra args are ignored
    my $data = $cool_funcs{slurp_to_scalar}->( @files );

    ## here's the file's contents...
    print $data;

    ~Particle *accelerates*

Re: Cheap idioms
by John M. Dlugosz (Monsignor) on Oct 14, 2002 at 21:11 UTC
    I would write it in its own nice-named function, even if the function body contained the idiom.

    Cute, and I note that the module File::Slurp doesn't do it that way!

    I agree with tye. What if you were doing this in an older version of Perl and the problem was brought to your attention? You would have to find and fix every usage. But if using a module, fix it once. The module might grow and handle accommodations, nits, and other issues. Everyone benefits from that experience.

    BTW, I recall not using File::Slurp somewhere, because, like yours, it doesn't do binmode.
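A minimal sketch of a slurp sub that does call binmode, for anyone who needs the bytes untouched. The name slurp_raw and the sample bytes are made up for illustration; this is not how File::Slurp does it.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read a file as raw bytes.
sub slurp_raw {
    my ($filename) = @_;
    open my $fh, '<', $filename or die "$filename: $!";
    binmode $fh;             # no CRLF translation, no encoding layers
    local $/;                # undef: read to end of file
    return scalar <$fh>;
}

# Write a few bytes that text mode could mangle on some platforms.
my ($out, $name) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "a\r\nb\x00\xff";
close $out or die "close $name: $!";

my $bytes = slurp_raw($name);
printf "read %d bytes\n", length $bytes;
```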

Re: Cheap idioms
by Burak (Chaplain) on May 30, 2008 at 08:30 UTC
    "I'm using it in production scripts now ..."
    Well. That's a hack and a bad idea to use in some *real* code. I suggest adding a long comment before it to explain what the heck it does. But anyway I still think that this code is nasty.

    And here is another nasty one (multi dimensional hash emulation):

    C:\>perl -wle "use strict; my %h; $h{'foo','bar', 'baz'} = 'dumb'; print $h{qw/foo bar baz/}"
    dumb
    C:\>perl -V:version
    version='5.10.0';
    C:\>
    archaic but works and hey, who needs references :p
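What the emulation does under the hood can be sketched like this: a comma-separated list in a hash subscript is joined with the subscript separator $; into a single string key. The hash and key names are taken from the one-liner above; note that code running under the newer feature bundles (use v5.36 and later) has this syntax disabled.

```perl
use strict;
use warnings;

my %h;
$h{'foo', 'bar', 'baz'} = 'dumb';

# There is only one key, and it is the three parts joined with $;
my ($key) = keys %h;
my @parts = split /\Q$;\E/, $key;

# Building the key by hand reaches the same slot.
my $same = $h{ join $;, qw(foo bar baz) };

print "parts: @parts\n";
print "value: $same\n";
```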
