Re: Dangerous diamonds! (accident)

by tye (Sage)
on May 19, 2003 at 15:43 UTC (#259177=note)

in reply to Dangerous diamonds!

I boggle every time I see this defended as a feature. Why is it a feature that:

perl -ne '...' *
is broken?? It doesn't process the files matching * as any sane person would expect it to. It can't even handle files that have leading spaces in their names. It is quite simply stupid, dangerous, and counter-intuitive.

Sure, it is cute to have a feature where you can load up @ARGV with qw( >this >that >the >other ) and have the "read files" operator create a bunch of empty files for you. You can even go out of your way to come up with a useful invocation of it.
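
To make that concrete, here is a minimal sketch (not from the original post) of <> treating a leading > as an open mode. The temporary directory and the variable names are assumptions chosen for illustration, so that nothing real gets clobbered.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Throwaway directory so the demonstration cannot touch real files.
my $dir = tempdir(CLEANUP => 1);

# Entries that look like output redirections, as in qw( >this >that ).
my @names = map { ">$dir/$_" } qw(this that);

my @lines;
{
    local @ARGV = @names;    # stand-in for names arriving on the command line
    no warnings 'io';        # reading a write-only handle warns; silenced for the demo
    @lines = <>;             # two-argument open treats the leading > as a mode
}

# Check which of the "input files" now exist on disk.
my @created = grep { -e "$dir/$_" } qw(this that);
print "lines read: ", scalar(@lines), "\n";
print "files created: @created\n";
```

If this behaves as described above, no lines are read and both files exist afterwards, even though the code only ever asked to read.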

But that shouldn't come at the expense of breaking the default case of dealing with @ARGV coming from the command line that contains a list of file names! If you want to play such games, then you should be allowed to but you shouldn't break the basic functionality of perl -ne '...' * in order to support that! Especially not in a way that warrants a CERT advisory!

I'd patch this so that the "magic open" is applied to <> iff use open IN=>':magic'; were in effect. I started down that road when I started a patch to fix use open IN=>':raw'; to work on <> because it is currently impossible to use binmode with <>.

But the reaction from p5p made me think that my efforts would be wasted as such a patch would not be accepted.

Someone really should have CERT file a report on this. It is a serious security bug in Perl that should be fixed and should be advertised more.

How this works is clearly an accident of implementation and not an intentional design. The fact that people have come up with creative uses for this accident doesn't mean that the tons and tons of legitimate uses of <> (in an attempt to read the files named in @ARGV) should be left broken just because we didn't realize that they were broken when we were telling people to use <> to iterate over the lines in files named in @ARGV.

Those who really feel that this misfeature should continue to be the default behavior need to update tons of documentation that encourages the use of <> for iterating over a list of files matched by a wildcard.

The fact is that *both* magical and non-magical expectations for <> are documented. Any place that mentions perl -ne '...' * is documenting the sane behavior that was always expected/intended and this is by far the most common desired behavior when <> is used.

Making <> sane by default (instead of magical) would fix more existing code than it would break! And the code that would be broken would be simple to fix (add a single use open IN=>':magic';) and would be code that was written with awareness of how strange <> can behave, so its authors would be more likely to learn of the need for this change.

It isn't hard to find nodes by people who claim to not be surprised by this mis-feature that contain code that seems to clearly indicate that they don't expect magical behavior. I did a quick super search for nodes by merlyn that mention "local" and "*ARGV" and I found a bunch of uses of <> that don't set @ARGV= "< input.file" nor mention the dangers of not doing that.

And if you change all of your code so @ARGV is always populated with "< $filename", then $^I becomes useless! How $^I works clearly indicates that the writers of Perl did not take magical open into account during the design. We just need to fix it, not document how it has always been this way and shame on you for not realizing it (we didn't realize it either, but we refuse to admit that it was a mistake).

                - tye

Replies are listed 'Best First'.
Re: Re: Dangerous diamonds! (accident)
by Juerd (Abbot) on May 19, 2003 at 17:26 UTC

    It doesn't process the files matching * as any sane person would expect it to. It can't even handle files that have leading spaces in their names. It is quite simply stupid, dangerous, and counter-intuitive.

    All of these would be fixed if it used three-arg open with an explicit "<" as its second arg. It's a shame that some people think the current way of handling things is okay. :(

    Juerd # { site => '', plp_site => '', do_not_use => 'spamtrap' }
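
What Juerd describes can be sketched as a loop that does its own three-argument open instead of relying on <>. The helper name read_files_literally is hypothetical, not an existing function, and it only approximates <>: it does not set $ARGV, does not treat - as STDIN, and knows nothing about $^I.

```perl
use strict;
use warnings;

# Hypothetical helper: return every line of the named files, treating each
# name literally (no mode characters, no pipes, no whitespace trimming).
sub read_files_literally {
    my @names = @_;
    my @lines;
    for my $name (@names) {
        my $fh;
        # Three-argument open: the mode is fixed, so $name is only a filename.
        unless (open $fh, '<', $name) {
            warn "Can't open $name: $!\n";
            next;
        }
        push @lines, <$fh>;
        close $fh;
    }
    return @lines;
}

# Usage: behaves like a plain "cat" over the command-line arguments.
print read_files_literally(@ARGV);
```

With this shape, a file named ">that" or " leading space" is just a file to be read, which is the behavior the parent node argues perl -ne '...' * should have had all along.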

Re: Re: Dangerous diamonds! (accident)
by sauoq (Abbot) on May 20, 2003 at 21:58 UTC
    I did a quick super search for nodes by merlyn that mention "local" and "*ARGV" and I found a bunch of uses of <> that don't set @ARGV= "< input.file" nor mention the dangers of not doing that.

    Did merlyn go back and fix them? The super search turned up 9 nodes and he sets @ARGV explicitly in each one. (Not to "< input.file" but there is no danger in forgoing the '<' if the filename is explicit, right?)

    "My two cents aren't worth a dime.";

      Having heard merlyn complain many times about people promoting dangerous memes, I expected to see merlyn at least mention this danger of <> that so many seem to be saying "Oh, sure, I expected that all along; after all it *is* documented" about. I couldn't find a single one. Perhaps I just missed it.

      What I did find was what I described. I picked something I knew I could find with Super Search, merlyn doing local(*ARGV), setting @ARGV, then using <>. I found no use of "< filename" nor any mention of these dangers.

      I had expected that merlyn would realize that posting code that does @ARGV = "filename"; would invite someone to copy and modify his code and end up with @ARGV = $filename; and so realize he was promoting a dangerous meme and address this point somewhere.

      Especially in something like •Re: XML log files, which includes code meant to be copied and modified and was in reply to a node that used $logfn not some hard-coded log file name. So merlyn should have expected "mylogfile" to be replaced with $logfn and yet didn't even mention this risk.

      I didn't expect him to always mention this risk, I was just looking for any indication that he had realized this risk and couldn't find any despite finding several nodes where <> is used and @ARGV is set. That certainly doesn't prove that merlyn hasn't always been keenly aware of this risk. But I think it indicates that even merlyn probably usually thought about @ARGV containing filenames and (at least until the issue was raised recently) usually didn't worry about <> sending filenames to the shell. In any case, I think most users of Perl usually think about @ARGV and <> that way and I have yet to find any evidence of many (any) other people doing otherwise until quite recently.

      So I did some more searching looking for any places where someone has said "oh, and be careful because <> can pass your filenames to the shell for interpolation, of course (everyone knows that, it is spelled out explicitly in the documentation!)". I searched for nodes that contain both '"< ' and '<>' in hopes of finding nodes that use <> defensively. I looked at about half of the matches and none of them were using <> defensively.

      But several of them show evidence of the opposite, of people knowing full well that open FH, $filename is a bad idea and then doing the equivalent Bad Idea™ of @ARGV = $filename; then using <>. That is, nodes that do open FH, "< $file" and yet don't follow the same precaution when using @ARGV and <>.
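
The defensive form being searched for can be sketched as follows. lines_via_diamond is a hypothetical helper name used only for illustration. Prefixing the name with "< " fixes the mode, so a leading > or | in $filename is no longer treated as magic. As far as I know it does not protect names with leading or trailing whitespace, since two-argument open still trims those.

```perl
use strict;
use warnings;

# Hypothetical helper: read one file through <> while forcing read mode,
# using the "< $filename" form discussed in this thread.
sub lines_via_diamond {
    my ($filename) = @_;
    local @ARGV = ("< $filename");   # explicit mode: a leading > or | is just part of the name
    my @lines = <>;                  # list context, so ARGV is read to the end
    return @lines;
}
```

Compare that with the unguarded local @ARGV = ($filename), which hands whatever is in $filename straight to two-argument open.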

      I found Dominus (well-respected Perl author) doing this in How do I insert a line into a file?. And Adam (very careful Perl programmer that I respect) doing it (via the command line) in Re: Populating an array. And pjf doing it in Re: Searching a whole directory of databases.

      So I've got hard evidence that people have expected <> to interpret @ARGV as containing names of files to be read and not expressions to be interpreted by 2-argument open, yet still no hard evidence of anyone interpreting the vague documentation as "the above pseudo code used 2-argument open so <> will also behave like it used 2-argument open and do stupid things for files with names beginning with > or |, even though that would be dangerous and, well, stupid". q-:

                      - tye
