Beefy Boxes and Bandwidth Generously Provided by pair Networks
laziness, impatience, and hubris
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??
Handling PUT or PROPFIND methods with CGI turned out - at least for me - to be quite a challenge while doing it under mod_cgi with common helpers like CGI.pm or CGI::Simple.

First, they do not handle less-common methods like PROPFIND well, throw away message body content by pretending to read it or deny answering them at all.
Second, large message bodies are (commonly) slurped in! But PUT and POST requests can get large, even PROPFIND XML message bodies might get bloated - and all these requests might thus pose a DoS effectively.

Please find attached my take on fixing these issues. It's a WIP so please comment!

Further, I'd like a knowledgeable monk to tell me if my attempts to keep STDIN content unprocessed so my script can later do a buffered read are ineffective. Does it really make any difference to keep arriving content on STDIN than copying it to a variable (I think this answers itself, didn't it?)
Or is a closely knit-in Apache handler the only way to process message body data *as it arrives*? Or wouldn't even this do the trick?
In reality it seems to work as intended: with the code snippet at the bottom of this post I can see an uploading file growing in filesystem as data arrives.
package REST::CGI::Simple; # # patched version of CGI::Simple # - PROPFIND method with message body possible, see keyword "patch1" # - no PUT data value parsing for zeros (may also be set by NO_NULL = +0), but hardcoded, see keyword "patch2" # - differentiate handling of medium and very large message bodies, se +e keyword "patch3" ... sub _read_parse { my $self = shift; my $handle = shift || \*STDIN; my $data = ''; my $type = $ENV{'CONTENT_TYPE'} || 'No CONTENT_TYPE received'; my $length = $ENV{'CONTENT_LENGTH'} || 0; my $method = $ENV{'REQUEST_METHOD'} || 'No REQUEST_METHOD received'; # first check POST_MAX Steve Purkis pointed out the previous bug if ( ( $method eq 'POST' or $method eq 'PUT' or $method eq 'PROPFIN +D' ) # patch1 and $self->{'.globals'}->{'POST_MAX'} != -1 and $length > $self->{'.globals'}->{'POST_MAX'} ) { $self->cgi_error( "413 Request entity too large: $length bytes on STDIN exceeds \$ +POST_MAX!" ); # silently discard data ??? better to just close the socket ??? while ( $length > 0 ) { last unless _internal_read( $self, $handle, my $buffer ); $length -= length( $buffer ); } return; } if ( $length and $type =~ m|^multipart/form-data|i ) { my $got_length = $self->_parse_multipart( $handle ); if ( $length != $got_length ) { $self->cgi_error( "500 Bad read on multipart/form-data! wanted $length, got $got +_length" ); } return; } # patch3 elsif ( ( $method eq 'POST' or $method eq 'PUT' or $method eq 'PROP +FIND' ) and $self->{'.globals'}->{'HANDLE_ONLY_ON_LARGE_CONTENT'} == 1 and $length > $self->{'.globals'}->{'HANDLE_ONLY_WHEN_LARGER_AS'} +) { # reading data from STDIN here would mean taking it away from a # script which might process it on its own, so we just signal that # it is still there, untouched $self->_add_param( 'HANDLE_ONLY', 1 ); $ENV{STDIN_WAITING} = 1; } elsif ( $method eq 'POST' or $method eq 'PUT' or $method eq 'PROPFIN +D' ) { # patch1 if ( $length ) { # we may not get all the data we want with a single read on larg +e # POSTs as it may not be here yet! Credit Jason Luther for patch # CGI.pm < 2.99 suffers from same bug _internal_read( $self, $handle, $data, $length ); while ( length( $data ) < $length ) { last unless _internal_read( $self, $handle, my $buffer ); $data .= $buffer; } unless ( $length == length $data ) { $self->cgi_error( "500 Bad read on POST! wanted $length, got " . length( $data ) ); return; } if ( $type !~ m|^application/x-www-form-urlencoded| ) { $self->_add_param( $method . "DATA", $data ); } else { $self->_parse_params( $data ); } } } elsif ( $method eq 'GET' or $method eq 'HEAD' ) { $data = $self->{'.mod_perl'} ? $self->_mod_perl_request()->args() : $ENV{'QUERY_STRING'} || $ENV{'REDIRECT_QUERY_STRING'} || ''; $self->_parse_params( $data ); } else { unless ( $self->{'.globals'}->{'DEBUG'} and $data = $self->read_from_cmdline() ) { $self->cgi_error( "400 Unknown method $method" ); return; } unless ( $data ) { # I liked this reporting but CGI.pm does not behave like this so # out it goes...... # $self->cgi_error("400 No data received via method: $method, type: $t +ype"); return; } $self->_parse_params( $data ); } } ... sub _add_param { my ( $self, $param, $value, $overwrite ) = @_; return () unless defined $param and defined $value; $param =~ tr/\000//d if $self->{'.globals'}->{'NO_NULL'}; @{ $self->{$param} } = () if $overwrite; @{ $self->{$param} } = () unless exists $self->{$param}; my @values = ref $value ? @{$value} : ( $value ); for my $value ( @values ) { next if $value eq '' and $self->{'.globals'}->{'NO_UNDEF_PARAMS'}; # $value =~ tr/\000//d if $self->{'.globals'}->{'NO_NULL'}; # pa +tch2: we could ask for NO_NULL, but this was such a headache # that I hardcoded it as commented out $value = Encode::decode( utf8 => $value ) if $self->{'.globals'}->{PARAM_UTF8}; push @{ $self->{$param} }, $value; unless ( $self->{'.fieldnames'}->{$param} ) { push @{ $self->{'.parameters'} }, $param; $self->{'.fieldnames'}->{$param}++; } } return scalar @values; # for compatibility with CGI.pm request.t }
(Only excerpts, something prevented me from posting the whole..)

With this CGI::Simple variant in effect, I later in my code do:
... my $fh = $fs->open_write( $path ); # Filesys::Virtual::Plain if( $fh ){ if($ENV{STDIN_WAITING}){ File::Copy::copy(\*STDIN,$fh); # rely on File::Copy's + robust buffered GLOB to HANDLE copy }else{ print $fh $self->query->param('PUTDATA'); } $fs->close_write($fh); return 1; }else{ $fs->close_write($fh); return 0; }

In reply to A patched CGI::Simple for REST applications - comments please by isync

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others scrutinizing the Monastery: (5)
As of 2024-04-19 23:04 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found