PerlMonks  

magic-diamond <> behavior -- WHAT?!

by repellent (Priest)
on Oct 29, 2008 at 21:50 UTC ( #720344=perlquestion )
repellent has asked for the wisdom of the Perl Monks concerning the following question:

I stumbled upon B::Lint's magic-diamond documentation which states that <> (also known as <ARGV>) internally uses perl's two-argument open.

This means that if <> encounters a filename "rm * |  " (just has to end with pipe "|" and optional whitespace), then it executes the shell command 'rm *'. Example:
    $ mkdir diamondtest
    $ cd diamondtest
    $ touch 'rm * | ' a b c d   # create 5 files
    $ ls                        # now you see it
    $ perl -pe 1 *
    $ ls                        # now you don't -- no files
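For readers wondering why that file name is special: two-argument open reads the mode out of the name itself. The sketch below only approximates those rules (it ignores rarer forms such as the dup modes like "<&"), and magic_kind is a made-up helper name, not something from core or B::Lint. It inspects the string and never opens or runs anything.

```perl
use strict;
use warnings;

# Hypothetical helper: roughly how two-argument open() would read a name.
# It only looks at the string; nothing is opened or executed.
sub magic_kind {
    my ($name) = @_;
    (my $trimmed = $name) =~ s/^\s+|\s+$//g;    # outer whitespace is ignored
    return 'stdin'      if $trimmed eq '-';     # "-" means STDIN
    return 'pipe-to'    if $trimmed =~ /^\|/;   # "| cmd": write to a command
    return 'read-write' if $trimmed =~ /^\+[<>]/;
    return 'append'     if $trimmed =~ /^>>/;
    return 'write'      if $trimmed =~ /^>/;    # "> file" clobbers the file
    return 'read'       if $trimmed =~ /^</;
    return 'pipe-from'  if $trimmed =~ /\|$/;   # "cmd |": read a command's output
    return 'plain';
}

# Show how a few names would be interpreted.
print "'$_' => ", magic_kind($_), "\n" for 'a', 'rm * | ', '>passwd', '-';
```

Anything other than 'plain' or 'read' is a name that <> would not treat as an ordinary input file.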

Shouldn't this be fixed with 3-argument open? I really like the magic-diamond for quick one-liners, but this just sounds all the security/robustness alarm bells.

Any recommended idioms to replace the following?

Update: An idiom would be to use ARGV::readonly
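For anything longer than a one-liner, one option is to skip <> and loop over @ARGV with three-argument open. This is a minimal sketch under that assumption, not a description of what ARGV::readonly does internally; for_each_line is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: call $callback on every line of every named file,
# treating each name literally (no pipes, no mode characters).
sub for_each_line {
    my ($callback, @files) = @_;
    for my $file (@files) {
        my $fh;
        # Three-argument open: the mode is fixed to read; $file is only a name.
        unless (open($fh, '<', $file)) {
            warn "Can't open '$file': $!\n";
            next;
        }
        while (my $line = <$fh>) {
            $callback->($line, $file);
        }
        close($fh);
    }
    return;
}

# Usage: a "cat"-like filter over the literal names in @ARGV.
for_each_line(sub { print $_[0] }, @ARGV);
```

Unlike <>, this sketch does not fall back to STDIN when @ARGV is empty and does not treat "-" specially. Perl 5.22 and later also provide a <<>> operator that opens each name with three-argument open.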

Update 2: If you're like me and like to write lots of one-line filters like:
    # strip blank lines
    perl -pe 's=^\s*$=='

as an idiom, add the taint switch -T:

I don't fully agree with it, but it's the least we've got to curb the <ARGV> magic, besides ARGV::readonly, and not compromise the terseness of the one-liner.

Re: magic-diamond <> behavior -- WHAT?!
by moritz (Cardinal) on Oct 29, 2008 at 21:55 UTC
    That's known, and afaict there is now a module on CPAN that fixes it.

    There have been a whole lot of threads about that on p5p recently, with the result (if any) that it won't be changed in core, because too much code (and too many hackers) rely on this feature.

      July 2008? That's very recent.

      Hey, as long as we're continuing down the hacker path, why not include ARGV::readonly in the core?

      Thanks for the sanity reference, moritz! :)

      And, I apologize in advance, but it is perhaps the perfect example of how p5p can produce the most inane decisions.

      There is a lot more code being used that relies on <> doing the sane thing. Code that uses -n or -p with a wildcard (very common) is clearly expecting sane behavior, not dangerous leaking of file names into the execution stream. Almost all of the code that I've seen use <> is expecting it to read from the files named in @ARGV. Duh!

      So fixing <> would break some rare hackish code and fix a ton of simple code. People who write hackish code are much better suited to adding -Margv (or whatever it gets called) to get the historic, magical behavior. That makes much better sense than hoping everybody who uses <> in the normal way will know to use some special module or trick just to make things safe and sane.

      Heck, it would even be fairly easy to have <> default to be safe and sane while also warning when fed a file name that starts with a filemode character or ends with '|' (and the warning could mention -Margv -- something that would end the warning since the type of behavior would be specified explicitly).

      And the story about it having been designed that way is beyond suspicious. If <> had been designed to be the way that it is, then -p would not work the way it does. It was an accident of implementation. And the documentation was simply a restating of that implementation so it was also an accident that it was "documented" to work that way.

      The documentation never (unless it was recently updated) said anything close to "beware of file names that start with '<' or start or end with '|' because ..." or even "note that 'perl -pex *' is unsafe" or even "And look how cool it is if you have a file named 'make test |' ...".

      The documentation does say lots of things like:

      -n
      causes Perl to assume the following loop around your program, which makes it iterate over filename arguments
      find . -mtime +7 -print | perl -nle unlink
      The @ARGV array is then processed as a list of filenames.

      There is a lot more documentation that <> shouldn't react badly to the file name I close this node with (compared to the so-called "documentation" of the magic behavior by virtue of "is equivalent to the following Perl-like pseudo code" that uses some 'open' which isn't clearly declared to be as magical as Perl's two-arg open).

      After hearing of people making noises like "Oh, sure, I've always known it was magic. Heck, everybody did. It is documented. Duh!" I did some searching trying to find evidence of all of these people having "known" this for so long. I only found evidence of people using <> like they expected it to iterate over the names of files in @ARGV.

      So, I loudly call "bull" on that decision and its justifications. Not that I (as I've said before) expect this to change anything. p5p has proved to be quite immune to persuasion from me over some years, so I gave it up years ago. It sounds like several people have tried on this point and it is clearly discussed as a fait accompli (if I'm not misusing that term too badly) so I suspect my prediction is pretty safe. Ugh. :)

      echo > 'echo "Perl is my bitch!" && rm -rf .. |'

      - tye        

        mistake or not, taint cures a lot of this
        I feel for you. I'm not happy with their decision either, and some discussions turn out rather frustrating on p5p.

        It's a feature so magic (and so little known) that it can be considered a security hazard. IMHO.

        Even though we can't convince them, we can still do something about it: propose documentation patches. I'd like to write some, but in the last two weeks I haven't got around to anything perlish, so I don't think I'll get around to it any time soon.

        If nobody gets around to it, maybe we should write a patch against pod/perltodo.pod in perl.git to mark it as a TODO item.

        (Update: patch submitted, and it has been applied already.)

        The behavior gets several paragraphs of explicit mention in a rather common reference book. Not to mention that the Camel itself explicitly covers the behavior in its discussion of <> as well (p82, 3rd ed). Considering that both of what would have to be considered the "standard reference books" on the language cover this behavior, one would grant plausibility to the "It is documented. Duh!" crowd.

        (Now that's not to say that I don't see where the "it shouldn't be on by default" crowd are coming from either, and agree that would be a "safer" default behavior; but it is doing just what it says on the tin . . .)

        The cake is a lie.
        The cake is a lie.
        The cake is a lie.

        p5p-the-list can endlessly debate issues like this, but don't mistake that for decision, justification, or anything like that. When it comes down to it, someone may produce a patch, and the blead pumpking may apply it. No one else matters except Larry.

        I have taken advantage of the misfeature, and probably will again, but would be happy to have it not be the default...except for one issue which was raised in the p5p noise: I think - should continue to indicate stdin. And once you have that one exception, you've already lost the battle for a "safe" *.

Re: magic-diamond <> behavior -- WHAT?!
by JavaFan (Canon) on Oct 29, 2008 at 23:19 UTC
    Shouldn't this be fixed with 3-argument open?
    Fixed implies broken. This feature is there by design, and predates perl5. I find it useful.
    I really like the magic-diamond for quick one-liners, but this just sounds all the security/robustness alarm bells.
    One-liners are one-liners. They are there for convenience. Writing secure and/or robust programs means you're going to put in more effort than a one-liner.

    One of Perl's mottos is to make "easy things easy, and hard things possible". Magic open is a form of easy. Writing secure and robust programs is a hard thing. For that, you taint your command line arguments, and use 3-arg open. Note that a simple -T flag prevents your example from doing any harm:

    $ perl -TwE '$ENV{PATH} = "/bin"; while (<>) {say}' '/bin/rm * |'
    Insecure dependency in piped open while running with -T switch at -e line 1.
    $
    Replacing magic open with 3-arg open means easy things are not so easy any more.
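To spell out the taint idea for scripts: under -T the elements of @ARGV are tainted, and perlsec's own examples show a read-only open of a tainted name being allowed while a piped open or an open for writing is refused. A script that needs to do more with a name usually launders it through a regex capture first. A minimal sketch, assuming a deliberately strict policy; untaint_filename and its pattern are hypothetical, not anything the posts above prescribe.

```perl
use strict;
use warnings;

# Hypothetical policy: accept only names made of word characters, dots
# and dashes, and never a name that starts with a dash.
sub untaint_filename {
    my ($name) = @_;
    # A regex capture is the standard way to launder tainted data.
    return $1 if $name =~ /\A((?!-)[\w.-]+)\z/;
    return;    # reject everything else
}

for my $arg (@ARGV) {
    my $safe = untaint_filename($arg);
    if (defined $safe) {
        print "accepted: $safe\n";
    }
    else {
        warn "rejected suspicious name: '$arg'\n";
    }
}
```

The pattern is intentionally narrow (no slashes, no spaces); a real script would widen it only as far as it actually needs.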
      Allowing arbitrary execution of shell commands is a cardinal sin in security. Not only that, the magic-diamond does it implicitly. This goes beyond the realm of making "easy things easy, and hard things possible". This is a security-hole, IMHO.

      Sure, the current magic makes it great for some useful (albeit uncommon) operation that you would induce by naming your ARGVs (most likely, filenames) in a certain special way. But consider the more common usage of the diamond: to write filters.

      For example, would you ever expect the following to execute shell commands? Currently, it can.
        # strip "#"-till-EOL
        perl -pe 's/#.*$//' *

      I certainly don't. I see this as a read-only operation that prints to STDOUT, and I'd like to be able to assume so.

      Does this mean I have to put in effort now to ensure * does not contain any magic, just because I'd like to do the common unmagical operation of reading the files? "Magic open" is too dangerous "a form of easy".

      Let's lessen that impact -- the security and robustness benefits will exceed the gains of the obscure magic. Just my opinion.

      P/S - Doesn't it seem ridiculous to have ARGV::readonly instead of the inverse-situation of having (the fictional) ARGV::magical?
        Allowing arbitrary execution of shell commands is a cardinal sin in security. Not only that, the magic-diamond does it implicitly. This goes beyond the realm of making "easy things easy, and hard things possible". This is a security-hole, IMHO.
        It's no more a security hole than "system" is. Or a kitchen knife a murder weapon. Magic open was there before the vast majority of the current Perl programmers even knew there was such a thing as Perl, and it has been documented that way.
        For example, would you ever expect the following to execute shell commands? Currently, it can.
        # strip "#"-till-EOL
        perl -pe 's/#.*$//' *
        I certainly don't. I see this as a read-only operation that prints to STDOUT, and I'd like to be able to assume so.
        Too bad. It isn't going to change. But with the addition of a single keystroke, that filter won't execute arbitrary shell commands. And IMO, it's always a good idea to enable tainting if you're running in an environment you cannot trust (but then, if you cannot trust the environment, is such a broad shell expansion a good idea in the first place?)
        Let's lessen that impact -- the security and robustness benefits will exceed the gains of the obscure magic. Just my opinion.
        Noted. But in my opinion, the gain doesn't justify fundamentally changing the behaviour of a feature that predates the existence of perl5, especially not if the gain can be had by running with tainting on. Which even predates 3-arg open.
        Doesn't it seem ridiculous to have ARGV::readonly instead of the inverse-situation of having (the fictional) ARGV::magical?
        Not to me.
      One of Perl's mottos is to make "easy things easy, and hard things possible".
      I think this motto needs to be tempered with a bit of "make dangerous things hard", or at least "make dangerous things look dangerous".
      Writing secure and robust programs is a hard thing.
      All the more reason to make it a bit easier!

      How about a new use directive:

      use insecure_features_no_one_really_needs;

      --
      .sig : File not found.

      Fixed implies broken. This feature is there by design, and predates perl5. I find it useful.

      After a quick look I've found that pod2html, pl2pm, and prove are vulnerable. And it's hard to assume that their authors didn't know about this "feature". I'm pretty sure that if I spend more time investigating /usr/bin I'll find more. Some of these scripts are run by root, who may not even know that they are written in Perl, and I don't think he is checking that there are no files with | or < in their names. So I only have to touch a file with the right name in the right place. That's what I call "things are broken".

      Isn't it easier to fix scripts that rely on magic open after they stop working than to fix scripts that work perfectly, except that they could ruin your system?

        Wait. You want to protect against a root who runs some-program-he-doesn't-really-know * in a directory with world write access, without looking at the content of the directory?
        Isn't it easier to fix scripts that rely on magic open after they stop working than to fix scripts that work perfectly, except that they could ruin your system?
        I'd say the person with root access is a way bigger problem to your system than magical open.
        You must have an old perl/pod2html :)

        ack "\<ARGV\>" C:\perl\5.10.1\bin\*bat

        ack "\<\>\s*\)" C:\perl\5.10.1\bin\*bat

        C:\perl\5.10.1\bin\brace-compress.bat:59: while ( <> ) {
        C:\perl\5.10.1\bin\c2ph.bat:488:STAB: while (<>) {
        C:\perl\5.10.1\bin\dbilogstrip.bat:53:while (<>) {
        C:\perl\5.10.1\bin\perlbug.bat:994: my $result = scalar(<>);
        C:\perl\5.10.1\bin\perlthanks.bat:994: my $result = scalar(<>);
        C:\perl\5.10.1\bin\pl2pm.bat:56:while (<>) {
        C:\perl\5.10.1\bin\podgrep.bat:51:while (<>) {
        C:\perl\5.10.1\bin\podtoc.bat:21:while (<>) {
        C:\perl\5.10.1\bin\ppm.bat:99: last unless defined ($_ = <> );
        C:\perl\5.10.1\bin\pstruct.bat:488:STAB: while (<>) {
        C:\perl\5.10.1\bin\scandeps.bat:45:while (<>) {
        C:\perl\5.10.1\bin\SOAPsh.bat:29:while (defined($_ = shift || <>)) {
        C:\perl\5.10.1\bin\splain.bat:451: while (defined (my $error = <>)) {
        C:\perl\5.10.1\bin\XMLRPCsh.bat:28:while (defined($_ = shift || <>)) {
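The same kind of audit can be done without ack. A rough sketch, assuming a plain text match is good enough: it will also flag <> inside strings and comments, as well as the newer <<>>. find_diamond_lines is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: return [line_number, text] pairs for lines that
# appear to read from the magic ARGV handle (<> or <ARGV>).
sub find_diamond_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Can't open '$path': $!\n";
    my @hits;
    while (my $line = <$fh>) {
        chomp $line;
        # Naive match; "<=>" and "<$fh>" do not match this pattern.
        push @hits, [$., $line] if $line =~ /<(?:ARGV)?>/;
    }
    close($fh);
    return @hits;
}

# Usage: pass script paths on the command line.
for my $script (@ARGV) {
    printf "%s:%d: %s\n", $script, @$_ for find_diamond_lines($script);
}
```

A hit only means the script reads through the magic handle; whether that is exploitable still depends on who controls the names it is given.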
