http://www.perlmonks.org?node_id=173574

Typically, monks will warn new programmers that they must use strict. This pragma is useful, but if you just don't understand programming, it gives a false sense of security. Today, I'm working on some code another (ex-)employee wrote. The code allows a user to fill out a form, add an attachment, and email the data to one of our users. How many problems can you find? Here's a blindingly obvious hint on one of them: the programmer never told the admin about the directory he was saving files to.

my $outputFile;
if( $_file_name !~ /^(\s*)$/ ) {
    use constant BUFFER_SIZE   => 16_384;    # Amount of upload file to read at one time
    use constant MAX_FILE_SIZE => 3_145_728; # This is the filesize upload limit
    $CGI::DISABLE_UPLOADS = 0;               # Temporarily reenable uploads
    $CGI::POST_MAX        = MAX_FILE_SIZE;

    # Path and Filename
    my $file_name = $_file_name;
    my $file_type = $query->uploadInfo($file_name)->{'Content-Type'};
    my $basename  = basename($file_name);

    if( $file_type =~ /octet-stream/ ) {
        $errors{ 'file_type' } = ["","","Unrecognize your submitted resume file format."];
        goto Print;
    }

    $outputFile = $UPLOAD_RESUME_DIRECTORY . $basename ;
    my $buffer = "";
    open(OUTPUT,">>$outputFile");
    my @stats;

    # Need binmode or Win32 systems will convert end-of-line chars
    binmode OUTPUT;
    {
        no strict 'refs';
        READ_FILE: while ( read( $file_name, $buffer, BUFFER_SIZE ) ) {
            print OUTPUT $buffer;
            @stats = stat $outputFile;
            last READ_FILE if ( $stats[7] > MAX_FILE_SIZE )
        }
    }
    close(OUTPUT);

    #check the file size
    if ( $stats[7] > MAX_FILE_SIZE || %errors ) {
        $errors{'file_size'} = ["","","Your submitted file's size is over 3MB."];
        unlink $outputFile;

I'll post my observations later. Be careful, there are some subtle bugs here.

Cheers,
Ovid

Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

Replies are listed 'Best First'.
Re: strict isn't everything
by Dog and Pony (Priest) on Jun 11, 2002 at 17:52 UTC
    Ok, I'm not too good at this, so I thought this might be a good time to practice, then. I found a few things that struck me as curious, although I can't be 100% sure they are wrong, so I'll just post them and see if I hit any marks. :)

    Here's one I am not sure about:

    open(OUTPUT,">>$outputFile");
    I would, at a wild guess, think that if uploading resumés, and with the same name, they would, if anything, replace the old one - or be disallowed. This appends to an earlier file, which just seems wrong. I can guess the file will be deleted when this script is done, but this allows for strange effects. Using a unique temporary name would have been better, if the file should not stay there, or to check for duplicates if it should stay. File locking could be yet another way to go.
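    A minimal sketch of the unique-temporary-name approach, assuming File::Temp is available (the 'resume' template and the directory choice are made up for illustration):

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Spec;

# tempfile() generates a name guaranteed not to collide, so two
# simultaneous uploads can never clobber or append to each other
my ($fh, $filename) = tempfile(
    'resumeXXXXXX',                 # the X's are replaced with random chars
    DIR    => File::Spec->tmpdir,   # stand-in for the real upload directory
    UNLINK => 0,                    # keep the file after the script exits
);
binmode $fh;                        # still needed for binary uploads
print "writing upload to $filename\n";
close $fh;
```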

    This, I wonder if it may have other implications elsewhere?

    $CGI::DISABLE_UPLOADS = 0; # Temporarily reenable uploads
    I'm not really sure how this works with CGI.pm and all, or how the rest of the code/system looks, but maybe a "local" would have helped?

    I don't get this one either:

    if ( $stats[7] > MAX_FILE_SIZE || %errors ) {
        $errors{'file_size'} = ["","","Your submitted file's size is over 3MB."];
    If there are errors, you automatically get a file-too-big error as well? That ought to confuse a few... :)

    This is an odd check:

    if( $file_type =~ /octet-stream/ ) {
        $errors{ 'file_type' } = ["","","Unrecognize your submitted resume file format."];
        goto Print;
    }
    There must be more non-allowed file types, or did he mean "!~"?

    I'm sure I missed all the real errors, and pointed out perfectly legitimate stuff, but if that is so, I hope to learn something from the grind-my-face-in-the-ground treatment I am gonna get. :)


    You have moved into a dark place.
    It is pitch black. You are likely to be eaten by a grue.
Re: strict isn't everything
by brianarn (Chaplain) on Jun 11, 2002 at 20:21 UTC
    I've written a few CGI upload scripts before, so I'll give it a shot - no guarantee of quality though. :)

    If the programmer never talked to the admin about where the file will be written, then the webserver may not have write permission to that directory, so every open call will fail - and nothing checks for success with something like
    open(OUTPUT,">>$outputFile") or die "Couldn't open $outputFile: $!\n";

    The regex run against $_file_name only checks that it isn't all whitespace - this opens a nice hole for someone to try and slip in a malicious system call with backticks to do God knows what.

    I'm guessing somewhere in the code above it instantiated the $query CGI object, and yet it's trying to redefine $CGI::DISABLE_UPLOADS and $CGI::POST_MAX after the fact - these would have to be redefined before the $query object is created. If not, one of two things will happen:
    • DISABLE_UPLOADS will be 1, which means the script will receive no file (not sure if this generates an error upon instantiation of the CGI object or just gives an empty file when trying to write to disk)
    • POST_MAX will be set to something not expected, which could potentially cause the object to error out (if, say, POST_MAX was only 1MB and the person's resume is 1.5MB, the script will error out when the CGI object is created, even though in the scripter's eyes, this is a legit resume in size)
    It creates a $file_name variable using just $_file_name, which isn't necessarily bad, just wasteful as far as I can see.

    There are checks in the file printout loop to ensure that the file isn't bigger than 3MB in size. If POST_MAX was set properly earlier, then this wouldn't have been a problem. In the odd instance that the admin has POST_MAX set higher than 3MB, this would at least keep the filesize under the POST_MAX. However, it also just stops printing out the file, so if this file is a friggin' huge Word doc, it'll be corrupted.

    The file size checks in both the loop and in the end check could be written much more simply with -s $outputFile rather than statting the file, saving the results to an array, and then checking the 7th element (well, the 8th if you count the 0th index as the first).
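    For instance, a self-contained sketch using a throwaway file (the real code would use $outputFile):

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# write 1024 bytes to a temp file, then compare both ways of sizing it
my ($fh, $tmp) = tempfile(UNLINK => 1);
print $fh 'x' x 1024;
close $fh;

my $via_stat = (stat $tmp)[7];   # element 7 of stat's list is the size in bytes
my $via_s    = -s $tmp;          # -s says the same thing, readably
print "stat: $via_stat, -s: $via_s\n";
```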

    It does another check after it has written the file to disk, and if the filesize exceeds 3MB, then it removes it. However, it doesn't ensure that the file was removed properly with something like
    unlink $outputFile or die "Couldn't unlink $outputFile: $!\n";

    Seeing as the file was opened in append mode, if someone uploads a 2MB resume, then tweaks it and uploads it again, we have problems. In the loop where the file is written to disk, it appends to the old resume anything it's received up until the resume file size hits 3MB, then it stops writing, which would pretty much corrupt any non-text resume, as well as really mess up any text resume. The contents would be the old one, plus however much of the new one it could fit up to 3MB.

    Seeing as it tries to open the file in append mode, never checking for existence of the file or overwriting it, once a person hits their 3MB resume limit, they can't do anything in terms of uploading a revised resume.

    The loop keeps the file size under 3MB, so really the check at the end that unlinks will never be called.

    It seems that most of the errors won't be encountered due to a prior error causing the script to crash. Reading through this script makes me feel better about my CGI-fu because now I know that I at least understand enough basics to not be destroying anything. :)

    ~Brian
Re: strict isn't everything
by Abigail-II (Bishop) on Jun 12, 2002 at 12:44 UTC

    Here are some of my remarks. I haven't used CGI.pm in half a dozen years so I won't comment much on proper use of its API. I'd also like to point out that some remarks will be subjective - things I would do differently aren't necessarily done wrong here.

    my $outputFile;
    if( $_file_name !~ /^(\s*)$/ ) {
    I would use $_file_name =~ /\S/ here, but it very well may be that something else should be used. Perhaps an if (defined $_file_name) was called for, but the context isn't known.
    use constant BUFFER_SIZE   => 16_384;    # Amount of upload file to read at one time
    use constant MAX_FILE_SIZE => 3_145_728; # This is the filesize upload limit
    Two things. First, the lines exceed 80 characters, which, IMO, is a big no-no. Second, there is no point in putting the use statements inside a block - the functions will be exported to the current package anyway. It suggests this has an effect only for this scope, or that it's a run time effect. Neither is true.
    $CGI::DISABLE_UPLOADS = 0; # Temporarily reenable uploads
    $CGI::POST_MAX = MAX_FILE_SIZE;
    Unfortunately, not the entire "then" part is given (is just the closing brace missing, or is there more code?), so it's hard to say if something is wrong here. $CGI::DISABLE_UPLOADS is never set back, but that may be in the code that is missing. However, the previous value isn't stored. I would prefer to use local here, to make sure it's set back.
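    A sketch of what local buys you, using a stand-in package variable rather than CGI.pm itself:

```perl
use strict;
use warnings;

our $DISABLE_UPLOADS = 1;    # stand-in for $CGI::DISABLE_UPLOADS

sub with_uploads_enabled {
    # local saves the current value and restores it when the
    # enclosing scope is left - even on die or early return
    local $DISABLE_UPLOADS = 0;
    return $DISABLE_UPLOADS;
}

print with_uploads_enabled(), "\n";   # 0 inside the dynamic scope
print $DISABLE_UPLOADS, "\n";         # 1 again afterwards
```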

    But there is another, more serious problem. CGI.pm processes its input when CGI -> new is called, and that's when it needs to know whether file uploads are enabled or not. Hence, this setting comes too late; it should be done before the $query object is created. And this is true of $CGI::POST_MAX as well.

    # Path and Filename
    my $file_name = $_file_name;
    Having two variables with almost identical names can be confusing, especially if one is a copy of the other. But why use two variables here? $file_name isn't being modified.
    my $file_type = $query->uploadInfo($file_name)->{'Content-Type'};
    my $basename = basename($file_name);
    I would retrieve the basename just before I use it, and certainly after the next block, the one that uses the file type.
    if( $file_type =~ /octet-stream/ ) {
        $errors{ 'file_type' } = ["","","Unrecognize your submitted resume file format."];
        goto Print;
    }
    I wonder whether the goto is the correct approach here, but since we don't see what's at the target, I'd give the author the benefit of the doubt. I also don't know the role of %errors here, so no comment on that.
    $outputFile = $UPLOAD_RESUME_DIRECTORY . $basename ;
    my $buffer = "";
    open(OUTPUT,">>$outputFile");
    Ouch. $basename is tainted. I don't see any direct security danger here (due to the basename and the >>) but it makes me feel uneasy. And submissions from different people can easily go into the same file - it all depends on what filename their browser submitted.
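    A hedged sketch of tightening that up - basename() plus an explicit whitelist match (the character class here is an illustrative choice, not the One True Pattern):

```perl
use strict;
use warnings;
use File::Basename;

my $submitted = "../../etc/passwd";      # hostile input
my $name      = basename($submitted);    # strips the directory part: "passwd"

# accept only a conservative character set; under -T the capture
# would also serve to untaint the value
my ($safe) = $name =~ /\A([\w.-]+)\z/
    or die "rejecting suspicious filename\n";
print "$safe\n";
```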

    That not checking the return value of open is a serious mistake should not come as a surprise.

    my @stats;
    # Need binmode or Win32 systems will convert end-of-line chars
    binmode OUTPUT;
    {
        no strict 'refs';
        READ_FILE: while ( read( $file_name, $buffer, BUFFER_SIZE ) ) {
            print OUTPUT $buffer;
            @stats = stat $outputFile;
            last READ_FILE if ( $stats[7] > MAX_FILE_SIZE )
        }
    }
    close(OUTPUT);
    Ok, lots to say about this. What's that no strict 'refs' doing there? I don't think it's doing any harm, but it's odd. And what's the deal with the label on the loop? Again, no harm, but odd. The read is more serious: its first argument should be a filehandle, not the name of a file. A filehandle to the file that contains whatever was uploaded could be gotten with the upload method.

    The OUTPUT handle hasn't been locked so if more than one request is dealt with simultaneously, uploaded data can become interleaved.

    Also, OUTPUT is a buffered filehandle. There might be more data written to it than returned by stat. Finally, the return value of close isn't checked. A full disk can cause a failure of the close.

    #check the file size
    if ( $stats[7] > MAX_FILE_SIZE || %errors ) {
        $errors{'file_size'} = ["","","Your submitted file's size is over 3MB."];
        unlink $outputFile;
    Most likely the author wanted keys %errors here, not %errors. Also, it looks like if there are already errors, the submitted file is also reported as being over 3MB. Furthermore, if someone else has submitted a file of 2.9MB, and we submit a file of 500kB which happens to have the same name, our error message will be that our file size exceeds 3MB, which doesn't seem correct. Also, the return value of the unlink isn't checked. And again we have timing problems: another instance might try to write to the file that's being unlinked.

    Abigail

      I guess I don't need to add too much commentary as most of the posts have it fairly well. However, you asked about the rest of the code:

      Print:
      my $template_data;
      if ( $params ) {
          my @required = qw/ file_size file_type /;
          $template_data = {
              errors          => \%errors,
              required_fields => \@required,
              # more stuff here
          }
          print $query->header;
          $template->process( 'info_request_emp.tmpl', $template_data )
              or die $template->error();
          exit;
      }

      The only use of the goto is to skip some code. An else would have been preferred.

      As for $file_name and $_file_name, this is a "sometimes" convention used here. When reading form parameters, the variable with the tainted data begins with an underscore and the untainted one drops the underscore. Obviously, if no untainting occurs, this is useless. Personally, I don't like this ad hoc approach.

      Cheers,
      Ovid

      Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

        The only use of the goto is to skip some code. An else would have been preferred.

        Hmmm. An else is a typical solution, but it's not something that makes me jump with joy either. It means there's a large block of code that will be indented. And would you have more such cases, the code crawls to the right hand margin.

        A continue might solve this:

        if (CONDITION) {{   # Note the *double* opening brace.
            ... some code ...
            next if SOME_CONDITION;
            ... more code ...
            next if OTHER_CONDITION;
            ... even more code ...
        } continue {
            ... end stuff ...
        } }
        And if you do last instead of next, the continue won't be executed.
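        A runnable toy of that shape (the conditions and the "code" are invented):

```perl
use strict;
use warnings;

my @ran;
my $condition = 1;
if ($condition) {{    # the inner bare block acts as a loop that runs once
    push @ran, 'some code';
    next if 1;                # 'next' skips the rest of the block...
    push @ran, 'more code';   # (never reached)
} continue {
    push @ran, 'end stuff';   # ...but the continue block still runs
}}
print join(', ', @ran), "\n";
```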

        Abigail

Re: strict isn't everything
by gav^ (Curate) on Jun 11, 2002 at 22:34 UTC
    /me picks up his red marker pen of destruction +42

    I'll skip what others have mentioned...

    • Using capturing parentheses and then not using what you captured
    • Not checking to see if $_file_name was undef
    • Not stripping out anything naughty from the filename
    • Creating an unnecessary variable $file_name
    • Not sure why the no strict 'refs' is used as nothing is done with symbolic refs in that block
    • It should be %errors > 0 rather than just using %errors in boolean context

    gav^

      Not checking to see if $_file_name was undef
      if( $_file_name !~ /^(\s*)$/ ) { ...}

      If $_file_name is undefined it will be treated as the empty string in this regular expression. The empty string will match the regular expression and the if statement's body will never be executed.

      Update: gav^'s very right that using warnings will complain about this. Unfortunately many people ignore their server error logs unless something has gone wrong. (Like me today, when I double-checked my above paragraph and it worked as I expected.) So this will work if $_file_name is undefined, but only at the expense of creating unnecessary lines in your server error logs. End update. :)

      Not sure why the no strict 'refs' is used as nothing is done with symbolic refs in that block.
      # Need binmode or Win32 systems will convert end-of-line chars
      binmode OUTPUT;
      {
          no strict 'refs';
          READ_FILE: while ( read( $file_name, $buffer, BUFFER_SIZE ) ) {
              print OUTPUT $buffer;
              @stats = stat $outputFile;
              last READ_FILE if ( $stats[7] > MAX_FILE_SIZE )
          }
      }
      We're using a string "$file_name" as a filehandle, and so, since it's a string, we are using symbolic references here. perldoc CGI warns us that turning off strict refs will be necessary:

            The filename returned is also a file handle.  You can read
            the contents of the file using standard Perl file reading
            calls:
      
                     # Copy a binary file to somewhere safe
                     open (OUTFILE,">>/usr/local/web/users/feedback");
                     while ($bytesread=read($filename,$buffer,1024)) {
                        print OUTFILE $buffer;
                     }
      
             However, there are problems with the dual nature of the
             upload fields.  If you "use strict", then Perl will
             complain when you try to use a string as a filehandle.  You
             can get around this by placing the file reading code in a
             block containing the "no strict" pragma.
      

      so the writer is doing this part correctly... except they should use -s instead of stat and check that the file opened and all of those good things that have been mentioned.

      It should be %errors > 0 rather than just using %errors in boolean context
      Using %errors in a boolean context forces a scalar context on %errors. Putting a hash in a scalar context returns the number of buckets used, or something like that. If this is non-zero then it will evaluate to true. If it is zero, then no buckets have been used, the hash is empty, and it will evaluate to false - no errors have been found.

      %errors > 0 is equivalent to %errors in a boolean context.
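      A quick demonstration (the error structure mirrors the one in the script):

```perl
use strict;
use warnings;

my %errors;
print %errors ? "has errors\n" : "no errors\n";   # empty hash is false

$errors{file_size} = ["", "", "Your submitted file's size is over 3MB."];
print %errors ? "has errors\n" : "no errors\n";   # any key makes it true
```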

      ++ to your other points though, especially not sanitising the filename before reusing it.

      Likewise, if more than one person calls their file "resume" then we'll be appending a mish-mash of resumes on to each other. Ew! And without any kind of file locking, two people could upload their resume.(doc|txt|ps|pdf) and have the files interleaved.

      Interesting problem. Too bad that code like this is all too common.

      jarich

        While if( $_file_name !~ /^(\s*)$/ ) { does handle the case of undef, it will cause a warning:
        use warnings;
        my $_file_name = undef;
        if ($_file_name !~ /^(\s*)$/ ) {
            print "ok\n";
        }
        __END__
        Use of uninitialized value in pattern match (m//) at test.pl line 6.

        gav^

(jeffa) Re: strict isn't everything
by jeffa (Bishop) on Jun 11, 2002 at 23:18 UTC
    I guess this is rather obvious, but why are the constants inside a block? Those constants are going to be used regardless of whether the file name is 'valid' or not. Example:
    if (0) {
        use constant FOO => 5;
    }
    print FOO, "\n";
    __END__
    prints 5

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)
    
Re: strict isn't everything
by crazyinsomniac (Prior) on Jun 12, 2002 at 03:20 UTC
Re: strict isn't everything
by smitz (Chaplain) on Jun 12, 2002 at 14:51 UTC
    Time for a lowly friar to risk his XP:

    GOTO's and LABELS?!?!

    /me spits

    SMiTZ
      GOTO's and LABELS?!?!

      Yeah, what about them? People easily balk at seeing a goto, but not all goto's are evil. There are two famous papers on the subject: Go To Statement Considered Harmful, by E. Dijkstra (although it was C. Hoare who gave the paper its title), which doesn't say one should never use a goto - it warns against improper use. And then there's of course Structured Programming with go to Statements, by D. E. Knuth - perhaps the best computer scientist that ever lived.

      Besides, aren't we all fond of next, last and redo? They are nothing but glorified gotos.

      Abigail

      Having coded in systems that don't have good looping constructs, I'd have to say that GOTO's are sometimes a necessary evil. In this script, I'm sure there was a better way than a Goto, but we didn't have enough code to tell. I try and avoid Goto's unless I'm programming on my TI-86, at which point they can be quite handy.

      However, Labels can be quite handy in Perl. A good example might be if you were iterating through an AOA simulating a matrix and wanted to find an element equal to 5 by using two nested loops. You could use a label to break out from the inner loop all the way out of the outer loop quite easily. If for some reason you have to have a loop in a loop in a loop in a loop in a loop but need a mechanism to break out of all of them at once, a label is the easiest way to do it. Labels can also add to the readability of code if you understand how they work.
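      Brian's matrix example might look something like this (the data is made up):

```perl
use strict;
use warnings;

# an array-of-arrays simulating a matrix
my @matrix = ( [1, 2, 3],
               [4, 5, 6],
               [7, 8, 9] );
my ($found_row, $found_col);

OUTER: for my $row ( 0 .. $#matrix ) {
    for my $col ( 0 .. $#{ $matrix[$row] } ) {
        if ( $matrix[$row][$col] == 5 ) {
            ($found_row, $found_col) = ($row, $col);
            last OUTER;    # leaves both loops in one step
        }
    }
}
print "found 5 at row $found_row, col $found_col\n";
```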

      All tools have their place - goto is one of those tools that should be used carefully, because it can easily create an infinite loop that isn't nearly as easy to spot as some other infinite loops.

      ~Brian
Re: strict isn't everything
by cybear (Monk) on Jun 14, 2002 at 18:39 UTC
    I have not dug in yet, but let me say: this is a cool post.
    I'm sad to only be able to vote ++ once on it.

    This type of post will be very useful to me, and to anyone who
    is trying to get a better understanding of perl.

    It is very difficult to understand how to troubleshoot a problem
    unless you have some experience troubleshooting problems.

    -Thanks for the post OVID