PerlMonks  

Re: On-disk multipart/form-data part extraction

by blue_cowdawg (Prior)
on Dec 24, 2012 at 16:47 UTC (#1010209)


in reply to On-disk multipart/form-data part extraction

      Today my quest is for an approach to extract body "parts" from an HTTP request body of type multipart/form-data -- but to do it all on-disk and in-place without loading any entire "part" into memory at any time.

Eh? Not too sure what you mean by that. As soon as you read from any file, a piece of it is in memory...

What have you tried? Is this part of an email? Is it CGI input? Is it alpaca wool?

Have you looked at CGI which is pretty much standard stuff?


Peter L. Berghold -- Unix Professional
Peter -at- Berghold -dot- Net; AOL IM redcowdawg Yahoo IM: blue_cowdawg


Re^2: On-disk multipart/form-data part extraction
by Conquistadog (Novice) on Dec 24, 2012 at 17:06 UTC

    I suppose I should have been clearer!

    Of course it's fine to have pieces of things in memory during processing. I expected that to be taken as given.

    We just can't hold the entirety of any given "part" in memory, since the parts can be (and regularly are) too large for that.

    Clearer? Thanks!
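To make the constraint concrete, here is a minimal sketch (core Perl only) of locating the multipart delimiters in an on-disk body while holding no more than one chunk plus a short overlap in memory. The sub name `boundary_offsets`, the boundary `XYZ`, and the tiny chunk size are illustrative assumptions, not anything from nginx or CPAN. It is also simplified: it does not check that a delimiter starts a line, and it does not parse the part headers.

```perl
use strict;
use warnings;

# Return the byte offset of every "--$boundary" delimiter in $fh,
# holding at most one chunk plus a short overlap in memory.
sub boundary_offsets {
    my ($fh, $boundary, $chunk_size) = @_;
    $chunk_size ||= 8192;
    my $delim   = "--$boundary";
    my $overlap = length($delim) - 1;    # longest partial delimiter at a chunk edge
    my ($window, $base, @offsets) = ('', 0);

    while (1) {
        my $got = read($fh, my $chunk, $chunk_size);
        die "read failed: $!" unless defined $got;
        last if $got == 0;               # EOF
        $window .= $chunk;

        # Record every complete delimiter in the current window.
        my $pos = 0;
        while ((my $hit = index($window, $delim, $pos)) >= 0) {
            push @offsets, $base + $hit;
            $pos = $hit + length $delim;
        }

        # Drop everything that can no longer be part of a delimiter.
        my $keep_from = length($window) - $overlap;
        $keep_from = $pos if $keep_from < $pos;
        if ($keep_from > 0) {
            substr($window, 0, $keep_from, '');
            $base += $keep_from;
        }
    }
    return @offsets;
}

# Demo on an in-memory handle; a real body would be opened from disk.
my $body = join "\r\n",
    '--XYZ', 'Content-Disposition: form-data; name="a"', '', 'hello',
    '--XYZ', 'Content-Disposition: form-data; name="b"', '', 'world',
    '--XYZ--', '';
open my $fh, '<', \$body or die "open: $!";
my @offsets = boundary_offsets($fh, 'XYZ', 7);    # tiny chunks force straddling
close $fh;
print "delimiter at byte $_\n" for @offsets;
```

With the offsets in hand, each part can be copied out with seek() and bounded read() calls, so memory use depends on the chunk size rather than on the size of any part.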

Re^2: On-disk multipart/form-data part extraction
by Conquistadog (Novice) on Dec 24, 2012 at 17:13 UTC

    Also, since you helpfully mentioned it: I have had no success with CGI:: either.

    You see, CGI:: can parse multipart bodies with its upload() member, but seemingly only when they come from STDIN or via CGI::Fast -- neither of which are available to me (nor desirable to me, indeed) in an nginx-embedded perl module.

    Hence my search for another way. The perl "staples" do not seem to cover my use case.

      nginx-embedded perl module

      never heard of it :)

      So http://wiki.nginx.org/HttpPerlModule? In that case,

      ;P

      local *STDIN;
      open STDIN, '<', $r->request_body_file or die $!;
      binmode STDIN;
      my $q = CGI->new( \*STDIN );

      :)

      use HTTP::Body;

      my $body    = Nginx_Body( $r );
      my $uploads = $body->upload;    # hashref

      ## cleanup temp files
      undef $uploads;
      undef $body;

      sub Nginx_Body {
          my $r      = shift;
          my $tmpdir = shift;

          my $content_type   = $r->headers_in('Content-Type');
          my $content_length = $r->headers_in('Content-Length');

          my $body = HTTP::Body->new( $content_type, $content_length );
          $body->tmpdir( $tmpdir ) if $tmpdir;

          # Feed the on-disk body to the parser in chunks of at most 8192 bytes
          my $length = $content_length;
          open my($bodyfh), '<:raw', $r->request_body_file or die $!;
          while ( $length ) {
              my $read = read( $bodyfh, my $buffer, ( $length < 8192 ) ? $length : 8192 );
              die "read failed: $!" unless defined $read;
              # Without this check a short file would loop forever
              die "unexpected EOF with $length bytes still expected" unless $read;
              my $bufferlength = length($buffer);
              die "IMPOSSIBLE $read != $bufferlength " if $read != $bufferlength;
              $length -= $bufferlength;
              $body->add( $buffer );
          }
          return $body;
      }

      Although, after reading a bit of "Nginx - full-featured perl support for nginx", nginx might have this feature already, or at least it should :)

        Your afterthought speculation is correct. I probably should have pointed out straight away that the Nginx:: module and its relatives on CPAN are not the same as the interface module distributed with nginx (called simply nginx::, note the lower case). This confused me greatly at first.

        Indeed, if I gather right, the Nginx (capitalized) module is a relic from early nginx distribution days. Although it is now (apparently) deprecated, it unfortunately contained API features not present in the module shipped with the current nginx distribution. So, for example, there is no $r->upload() method. There's also no $r->headers_in() method (note plural) to enumerate all headers; one must anticipate them in order to use $r->header_in() for each.
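As a rough sketch of "anticipating" headers, the loop below asks for each expected name via header_in() and keeps the ones that were sent. The FakeRequest package is a hypothetical stand-in that mimics only header_in() so the example runs outside nginx, and wanted_headers is an illustrative helper name, not part of any module.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the nginx request object so the sketch runs
# outside nginx; only header_in() is mimicked.
package FakeRequest;

sub new {
    my ($class, %headers) = @_;
    my %lower = map { (lc($_), $headers{$_}) } keys %headers;
    return bless \%lower, $class;
}

sub header_in {
    my ($self, $name) = @_;
    return $self->{ lc $name };
}

package main;

# Collect the headers we anticipate; headers that were not sent are skipped.
sub wanted_headers {
    my ($r, @names) = @_;
    my %found;
    for my $name (@names) {
        my $value = $r->header_in($name);
        $found{$name} = $value if defined $value;
    }
    return \%found;
}

my $r = FakeRequest->new(
    'Content-Type'   => 'multipart/form-data; boundary=XYZ',
    'Content-Length' => 1234,
);
my $headers = wanted_headers($r, qw(Content-Type Content-Length X-Not-Sent));
print "$_: $headers->{$_}\n" for sort keys %$headers;
```

Inside a real handler, the $r passed in by nginx would take the place of the stand-in object.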

        In any case, after seeing your code, and to my optimistic delight, HTTP::Body does indeed look like it will put the uploaded "parts" in temporary files, with metadata available through the $body->upload() hash accessor. So thanks for that suggestion! I'll try that one too before reporting back.
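For what it's worth, here is a hedged sketch of what consuming that hash might look like. It assumes each entry carries filename, tempname and size keys, and that a field with several uploads holds an arrayref of such entries; the hash below is built by hand to stand in for $body->upload, and collect_uploads is an illustrative name. Since the parts never leave the disk, moving the temp file is all that is needed.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Copy qw(move);
use File::Basename qw(basename);
use File::Spec;

# Move each uploaded temp file into $dest_dir; returns the new paths.
# A field may hold one upload (hashref) or several (arrayref of hashrefs).
sub collect_uploads {
    my ($uploads, $dest_dir) = @_;
    my @moved;
    for my $field (sort keys %$uploads) {
        my $entry = $uploads->{$field};
        for my $up (ref $entry eq 'ARRAY' ? @$entry : $entry) {
            # Never trust the client-supplied name as a path.
            my $safe = basename($up->{filename});
            my $dest = File::Spec->catfile($dest_dir, "$field-$safe");
            move($up->{tempname}, $dest) or die "move $up->{tempname}: $!";
            push @moved, $dest;
        }
    }
    return @moved;
}

# Hand-built stand-in for the upload hash (assumed shape, see above).
my $tmp      = tempdir(CLEANUP => 1);
my $dest     = tempdir(CLEANUP => 1);
my $tempname = File::Spec->catfile($tmp, 'part0001.bin');
open my $out, '>', $tempname or die "open: $!";
binmode $out;
print {$out} 'x' x 10;
close $out or die "close: $!";

my $uploads = {
    photo => { name => 'photo', filename => 'cat.jpg', tempname => $tempname, size => 10 },
};
my @moved = collect_uploads($uploads, $dest);
print "$_\n" for @moved;
```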

          STDIN or via CGI::Fast -- neither of which are available to me

      STDIN not available? EH?? I've never seen an environment where STDIN wasn't available. You can even reopen STDIN directly on a file, like so:

      close STDIN;
      open STDIN, "< myfile.txt" or die "myfile.txt: $!";
      # ... do stuff

      Is this a CGI script or what are you up to here?
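A small self-contained sketch of the same idea, using local so that the original STDIN comes back once the block ends. The temp file merely stands in for a request body that already sits on disk; nothing here is specific to nginx.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in for a request body that already sits on disk.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "first line\nsecond line\n";
close $tmp_fh or die "close: $!";

my @lines;
{
    # local restores the original STDIN when this block exits.
    local *STDIN;
    open STDIN, '<', $path or die "$path: $!";
    @lines = <STDIN>;
    close STDIN;
}
print "read ", scalar(@lines), " lines via STDIN\n";
```

Note that slurping all lines is only for the demo; anything that reads STDIN incrementally would keep memory use bounded.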


      Peter L. Berghold -- Unix Professional
      Peter -at- Berghold -dot- Net; AOL IM redcowdawg Yahoo IM: blue_cowdawg
