Re^2: magic-diamond <> behavior -- WHAT?! (sanity)

by tye (Cardinal)
on Oct 30, 2008 at 07:30 UTC


in reply to Re: magic-diamond <> behavior -- WHAT?!
in thread magic-diamond <> behavior -- WHAT?!

And, I apologize in advance, but it is perhaps the perfect example of how p5p can produce the most inane decisions.

There is a lot more code being used that relies on <> doing the sane thing. Code that uses -n or -p with a wildcard (very common) is clearly expecting sane behavior not dangerous leaking of file names into the execution stream. Almost all of the code that I've seen use <> is expecting it to read from the files named in @ARGV. Duh!

So fixing <> would break some rare hackish code and fix a ton of simple code. People who write hackish code are much better suited to adding -Margv (or whatever it gets called) to get the historic, magical behavior. That makes much better sense than hoping everybody who uses <> in the normal way will know to use some special module or trick just to make things safe and sane.

Heck, it would even be fairly easy to have <> default to be safe and sane while also warning when fed a file name that starts with a filemode character or ends with '|' (and the warning could mention -Margv -- something that would end the warning since the type of behavior would be specified explicitly).
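
The check described above could be sketched roughly as follows. This is only an illustration, not anything p5p has implemented: `looks_magic` and `warn_about_magic_names` are hypothetical names, `-Margv` is just the placeholder module name from this post, and the patterns assume that two-arg open ignores leading and trailing whitespace before it looks for mode characters.

```perl
use strict;
use warnings;

# Hypothetical helper: true if two-arg open would likely treat $name as
# something other than a plain file to read.
sub looks_magic {
    my ($name) = @_;
    return 1 if $name =~ /^\s*(?:\+?[<>]|\|)/;   # leading mode character
    return 1 if $name =~ /\|\s*$/;               # trailing pipe: run as a command
    return 0;
}

# Hypothetical wrapper: warn about each suspicious name and return the count.
sub warn_about_magic_names {
    my @suspect = grep { looks_magic($_) } @_;
    warn "suspicious file name '$_' (use -Margv to ask for the magic)\n"
        for @suspect;
    return scalar @suspect;
}
```

Calling `warn_about_magic_names(@ARGV)` before the first read from <> would flag a name such as 'make test |' while staying silent for 'notes.txt'.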

And the story about it having been designed that way is beyond suspicious. If <> had been designed to be the way that it is, then -p would not work the way it does. It was an accident of implementation. And the documentation was simply a restating of that implementation so it was also an accident that it was "documented" to work that way.

The documentation never (unless it was recently updated) said anything close to "beware of file names that start with '<' or start or end with '|' because ..." or even "note that 'perl -pex *' is unsafe" or even "And look how cool it is if you have a file named 'make test |' ...".

The documentation does say lots of things like:

-n
causes Perl to assume the following loop around your program, which makes it iterate over filename arguments
find . -mtime +7 -print | perl -nle unlink
The @ARGV array is then processed as a list of filenames.

There is a lot more documentation implying that <> shouldn't react badly to the file name I close this node with than there is so-called "documentation" of the magic behavior, which amounts to "is equivalent to the following Perl-like pseudo code" using some 'open' that isn't clearly declared to be as magical as Perl's two-arg open.

After hearing of people making noises like "Oh, sure, I've always known it was magic. Heck, everybody did. It is documented. Duh!" I did some searching trying to find evidence of all of these people having "known" this for so long. I only found evidence of people using <> like they expected it to iterate over the names of files in @ARGV.

So, I loudly call "bull" on that decision and its justifications. Not that I (as I've said before) expect this to change anything. p5p has proved to be quite immune to persuasion from me over some years, so I gave it up years ago. It sounds like several people have tried on this point and it is clearly discussed as a fait accompli (if I'm not misusing that term too badly) so I suspect my prediction is pretty safe. Ugh. :)

echo > 'echo "Perl is my bitch!" && rm -rf .. |'

- tye        


Re^3: magic-diamond <> behavior -- WHAT?! (sanity)
by Anonymous Monk on Oct 30, 2008 at 08:34 UTC
    mistake or not, taint cures a lot of this

      Not really. It prevents odd file names from being treated as shell commands, but it does so by killing your program instead of treating them as the names of files to read as intended.

      It's like fixing a flat tire by removing the car's battery. Sure, you won't ruin your car by driving with a flat. But you also won't be driving your car.

        It's more like pulling over when you get nailed, then getting out and fixing your tire.

      No, taint checking is a dang stupid idea of a "fix". It doesn't actually fix anything and it makes lots of parts of your program bring everything to a screeching halt if you don't get a bunch of extra work done just right. And proposing it as a "fix" is a pretty clear demonstration of "you just don't get it at all".

      An actual fix that is also not breaking tons of other parts of your code is simply $_= "< $_" for @ARGV; (done everywhere that @ARGV gets set for <> to be used, though).
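
A slightly more defensive variant of that one-liner might look like the sketch below. `literal_names` is a hypothetical helper name, and the details rest on my understanding of two-arg open: an explicit "<" mode skips the trailing-pipe check, and a "./" prefix keeps leading whitespace or a leading "&" from being interpreted. It assumes Unix-style paths and deliberately leaves "-" alone so that it still means standard input.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite names so that two-arg open (and therefore <>)
# reads them as plain files.
sub literal_names {
    my @names = @_;
    for my $name (@names) {                         # $name aliases each element
        next if $name eq '-';                       # keep "-" as standard input
        $name = "./$name" unless $name =~ m{^/};    # shield leading spaces and "&"
        $name = "< $name";                          # explicit read mode, no pipe check
    }
    return @names;
}

# Intended use, before the first read from <>:
#   @ARGV = literal_names(@ARGV);
```

Two caveats with this approach: trailing whitespace in a file name is still stripped by two-arg open, and $ARGV will hold the rewritten string (for example "< ./notes.txt") rather than the original name.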

      Now go fix just about every mention of <> in the documentation and hope that every person who ever uses <> non-hackishly jumps through your extra hoops and hope that the huge majority of them who won't (because it has been documented in dozens of places for decades that such hoops are not required) don't run into a truly evily-named file. And be happy that a few hackish programs don't require the slightest modification (even through a deprecation cycle) while every use of <> in the standard documentation is wrong.

      Oh, and have fun fixing the documentation for -i. That even more obviously puts the lie to "it was designed to work that way".

      - tye        

Re^3: magic-diamond <> behavior -- WHAT?! (sanity)
by moritz (Cardinal) on Oct 30, 2008 at 08:44 UTC
    I feel for you. I'm not happy with their decision either, and some discussions turn out rather frustrating on p5p.

    It's a feature so magic (and so little known) that it can be considered a security hazard. IMHO.

    Even though we can't convince them, we can still do something about it: propose documentation patches. I'd like to write some, but in the last two weeks I haven't got around to anything perlish, so I don't think I'll get around to it any time soon.

    If nobody gets around to it, maybe we should write a patch against pod/perltodo.pod in perl.git to mark it as a TODO item.

    (Update: patch submitted, and it has been applied already.)

Re^3: magic-diamond <> behavior -- WHAT?! (sanity)
by Fletch (Chancellor) on Oct 30, 2008 at 16:42 UTC

    The behavior gets several paragraphs of explicit mention in a rather common reference book. Not to mention the Camel itself explicitly covers the behavior in its discussion on <> as well (p82, 3rd ed). Considering both of what would have to be considered the "standard reference books" on the language cover this behavior one would grant plausibility to the "It is documented. Duh!" crowd.

    (Now that's not to say that I don't see where the "it shouldn't be on by default" crowd are coming from either, and agree that would be a "safer" default behavior; but it is doing just what it says on the tin . . .)

    The cake is a lie.
    The cake is a lie.
    The cake is a lie.

      I think you may be getting your carte blanc before your Camel. :)

      I read "the Camel" and I don't believe it mentioned any such thing (probably not the same revision of "the Camel" you refer to, of course). And at the time (quite a while ago) of the coming out party of the "It is documented. Duh!" proclaimers, I don't believe it was documented well in a popular book. In any case, I never saw mention of documentation of that in books in that time frame. I'm not at all surprised that it is documented in some books by now. But I also wouldn't be totally shocked if there was a book that covered it well way back then.

      But it is also true that bugs get documented in auxiliary reference material. The "It is documented" is more short-hand for the "We can't change it because the standard documentation has always said that it worked that way" claim, and that is the meaning that I call "bull" on.

      - tye        

        Let me quote from the Camel, second edition, dating from 1996, shortly after the release of perl 5.003. On page 54, while discussing the angle operator:
        Here's how it works: the first time <> is evaluated, the @ARGV array is checked, and if it is null, $ARGV[0] is set to "-", which when opened gives you standard input. The @ARGV array is then processed as a list of filenames. The loop:
            while (<>) {
                ...         # code for each line
            }
        is equivalent to the following Perl-like pseudocode:
            @ARGV = ('-') unless @ARGV;
            while ($ARGV = shift) {
                open(ARGV, $ARGV) or warn "Can't open $ARGV: $!\n";
                while (<ARGV>) {
                    ...         # code for each line
                }
            }
        except that it isn't so cumbersome to say, and will actually work. It really does shift @ARGV and put the current filename into variable $ARGV. It also uses filehandle ARGV internally -- <> is just a synonym for <ARGV>, which is magical. (The pseudocode doesn't work because it treats <ARGV> as non-magical.)
        If you then switch to pages 191-194, it discusses the open function, including the effects of leading and trailing pipes in the filenames.
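
To make the difference concrete, here is a small sketch that opens the same oddly named file once with two-arg open (what the pseudocode above does) and once with three-arg open. It assumes a Unix-like system with an `echo` command on the PATH; `compare_opens` and the file name are made up for the illustration.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Returns the first line seen through two-arg open and through three-arg
# open for the same oddly named file.
sub compare_opens {
    my $start = getcwd();
    my $dir   = tempdir(CLEANUP => 1);
    chdir $dir or die "chdir $dir: $!";

    my $name = 'echo from-the-shell |';     # a legal Unix file name
    open(my $out, '>', $name) or die "create: $!";
    print {$out} "from-the-file\n";
    close $out;

    # Two-arg open: the trailing "|" means "run this and read its output".
    open(my $magic, $name) or die "two-arg open: $!";
    my $via_two_arg = <$magic>;
    close $magic;

    # Three-arg open: the name is taken literally.
    open(my $plain, '<', $name) or die "three-arg open: $!";
    my $via_three_arg = <$plain>;
    close $plain;

    chdir $start or die "chdir $start: $!";
    return ($via_two_arg, $via_three_arg);
}

my ($two, $three) = compare_opens();
print "two-arg:   $two";      # expected: from-the-shell
print "three-arg: $three";    # expected: from-the-file
```

The two-arg form never touches the file's contents; it runs the command spelled out in the name, which is exactly what <> does with each element of @ARGV.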
Re^3: magic-diamond <> behavior -- WHAT?! (sanity)
by ysth (Canon) on Nov 04, 2008 at 05:46 UTC
    p5p-the-list can endlessly debate issues like this, but don't mistake that for decision, justification, or anything like that. When it comes down to it, someone may produce a patch, and the blead pumpking may apply it. No one else matters except Larry.

    I have taken advantage of the misfeature, and probably will again, but would be happy to have it not be the default...except for one issue which was raised in the p5p noise: I think - should continue to indicate stdin. And once you have that one exception, you've already lost the battle for a "safe" *.
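
A rough sketch of that compromise, for anyone who wants it today without waiting for a change to <>: iterate over the names with three-arg open and keep "-" as the single exception. `each_line` is a hypothetical helper, not an existing core function.

```perl
use strict;
use warnings;

# Hypothetical helper: call $callback->($line, $name) for every line of every
# named file. Only "-" is special (standard input); every other name is
# opened literally.
sub each_line {
    my ($callback, @names) = @_;
    @names = ('-') unless @names;
    for my $name (@names) {
        my $fh;
        if ($name eq '-') {
            $fh = \*STDIN;
        }
        elsif (!open($fh, '<', $name)) {
            warn "Can't open $name: $!\n";
            next;
        }
        while (my $line = <$fh>) {
            $callback->($line, $name);
        }
        close $fh unless $name eq '-';
    }
    return;
}

# Example use:
#   each_line(sub { print $_[0] }, @ARGV);
```

The exception shows the point made above: with this loop a file that really is named "-" can only be read by spelling it "./-", so a bare * in a directory containing such a file is still not entirely literal.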
