Creating Zip Archives on the fly

by fulmar2 (Initiate)
on Oct 04, 2011 at 20:17 UTC ( #929668=perlquestion )
fulmar2 has asked for the wisdom of the Perl Monks concerning the following question:

Hi - I am trying to generate fairly large zip archives "on the fly." I'd like the user to be able to download a single zip file containing thousands of files. I have got this working with two methods, but each has a problem. With the first method, users get timeout errors in their browsers (my guess is that "addTreeMatching" takes too long - there are thousands of files). With the second method, when you unarchive the zip file, only the first file appears (I suspect a problem with the central directory). Ideally, I'd be able to use some form of the second method, as that gives me the flexibility of renaming the files prior to archiving them. Any help would be greatly appreciated!

First Method
    use Archive::Zip;

    my $zip = Archive::Zip->new();    # new instance
    $zip->addTreeMatching( "/$NewHomePath/Results/$InvoiceNumber",
                           "$InvoiceNumber", '\.(ab1$|seq$)' );

    print "Content-Type:application/zip\n";
    print "Content-Disposition:attachment;filename=$FileNameToWrite\n\n";
    $zip->writeToFileHandle(*STDOUT);

Second Method
    use Archive::Zip;

    my $zip = Archive::Zip->new();          # new instance
    chdir("/$NewHomePath/Results");
    @files  = (<$InvoiceNumber/*.ab1>);     # files to store
    @filesA = (<$InvoiceNumber/*.seq>);     # files to store
    push (@files,@filesA);

    #PRINT HEADER
    print "Content-Type:application/zip\n";
    print "Content-Disposition:attachment;filename=$FileNameToWrite\n\n";

    #NOW PRINT TO STDOUT
    foreach $file (@files) {
        if ($DLWOSC eq "checked"){
            $TranslatedName=(split('/',$file))[-1];
            $TranslatedName=~s/\;/\_/g;
            $zip->addFile($file,$TranslatedName);    # add files
            $zip->writeToFileHandle(*STDOUT);
        }else{
            $zip->addFile($file);
            $zip->writeToFileHandle(*STDOUT);
        }
    }

Re: Creating Zip Archives on the fly
by pvaldes (Chaplain) on Oct 04, 2011 at 22:53 UTC
    mmh... push (@files,@filesA);? What occurs if you try  print @files;?
    while (@files){ chomp; push @filesA, $_; }
Re: Creating Zip Archives on the fly
by pmqs (Monk) on Oct 05, 2011 at 15:42 UTC

    See the thread "keeping connection alive while spending time building a zip file", where a similar problem with generating large zip files on the fly was discussed.

    Having the zip files already available is the best way to go with this. If that isn't an option, and you can't use one of the other suggestions in that thread, you will have to stream the zip file as it is created to the client.

    That means two things:

    1. You need to use HTTP chunked transfer encoding to send the content to the device as you write it. How you get that working will depend on which web server you are using.
    2. The zip implementation you use needs to support streaming output.
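
For point 1, here is a minimal sketch of what chunked framing looks like on the wire: each chunk is the payload length in hexadecimal, a CRLF, the payload, and another CRLF, and a zero-length chunk ends the body. The helper names chunk_frame and chunk_end are made up for this illustration and do not come from any module.

```perl
use strict;
use warnings;

# Hypothetical helpers (not part of any module) showing the wire format
# of HTTP/1.1 chunked transfer encoding.

# Frame one non-empty block of bytes as a single chunk.
sub chunk_frame {
    my ($data) = @_;
    return '' unless length $data;   # an empty chunk would end the body early
    return sprintf("%x\r\n", length $data) . $data . "\r\n";
}

# The terminating zero-length chunk, with no trailer fields.
sub chunk_end {
    return "0\r\n\r\n";
}

# Two data chunks followed by the terminator.
my $body = chunk_frame("hello") . chunk_frame("zip data") . chunk_end();
```

In a CGI script each framed chunk would be printed to STDOUT as soon as it is produced, after a response header that includes Transfer-Encoding: chunked. Whether the web server passes that framing through untouched depends on the server.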

    I don't think Archive::Zip supports streaming output. Both the command line zip and IO::Compress::Zip can stream a zip file as it is being created.
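
As a rough sketch of what streaming output means with IO::Compress::Zip, the object interface can write members one after another to a handle or buffer without ever holding a finished archive. The function name stream_zip and the member names are assumptions for this example; it writes to an in-memory buffer so that it runs on its own, and a CGI script would pass \*STDOUT instead.

```perl
use strict;
use warnings;
use IO::Compress::Zip qw($ZipError);

# Hypothetical helper: write each [name, content] pair as its own member
# of a single zip stream going to $out (a filehandle or a scalar ref).
sub stream_zip {
    my ($out, @members) = @_;
    my $z;
    for my $member (@members) {
        my ($name, $content) = @$member;
        if ($z) {
            # Finish the current member and start the next one.
            $z->newStream(Name => $name)
                or die "newStream failed: $ZipError\n";
        }
        else {
            # Stream => 1 asks for streaming mode, so sizes and CRC are
            # written after each member's data, not patched into its header.
            $z = IO::Compress::Zip->new($out, Name => $name, Stream => 1)
                or die "zip failed: $ZipError\n";
        }
        $z->print($content);
    }
    $z->close if $z;    # writes the central directory
    return;
}

my $buffer = '';
stream_zip(\$buffer, ['a.txt', "alpha\n"], ['b.txt', "beta\n"]);
```

Because the central directory is only written by close, the output is not a complete archive until the last member has been added.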

    Below is the proof-of-concept code I posted in the other thread. It streams a zip file and chunks it at the same time using IO::Compress::Zip.

        use IO::Compress::Zip qw(:all) ;

        select STDOUT;
        $| = 1;
        my $OUT = \*STDOUT;

        print <<EOM;
        Status: 200 OK
        Content-Type: application/zip
        Transfer-Encoding: chunked

        EOM

        my @files = qw(/tmp/file1 /tmp/file2) ;

        zip [@files] => '-',
            FilterEnvelope => sub {
                # Chunk the output
                my $length = length($_);
                $_ = sprintf("%x", $length) . "\r\n" . $_ . "\r\n";
                $_ .= "\r\n" unless $length;
                1;
            } ;

    One thing from your original requirements that is missing here is the ability to rename the zip file members as you create the zip file. That feature is under development now for IO::Compress::Zip (I'm the author of the module). I can get you an early release if you want.

      THIS, is exactly what I was looking for! Thank you. I might be able to "let go" of the renaming (not as critical as the streaming for our application). Nevertheless, I may eventually be interested in the next release of IO::Compress::Zip. Thanks again. Also, I agree about having the zip file already available... I considered that, but it's a pretty dynamic environment, so the best option will really be to build them on the fly.

        Just a quick followup on this thread. The latest version of IO::Compress::Zip now has the ability to rename zip members on the fly using the FilterName option.

        Here is the proof-of-concept from before, updated to include FilterName:

            use IO::Compress::Zip qw(:all) ;

            select STDOUT;
            $| = 1;
            my $OUT = \*STDOUT;

            print <<EOM;
            Status: 200 OK
            Content-Type: application/zip
            Transfer-Encoding: chunked

            EOM

            my @files = qw(/tmp/file1 /tmp/file2) ;

            zip [@files] => '-',
                FilterName => sub {
                    if ($DLWOSC eq "checked"){
                        $_=(split('/',$_))[-1];
                        s/\;/\_/g;
                    }
                },
                FilterEnvelope => sub {
                    # Chunk the output
                    my $length = length($_);
                    $_ = sprintf("%x", $length) . "\r\n" . $_ . "\r\n";
                    $_ .= "\r\n" unless $length;
                    1;
                } ;
      For anyone looking at this comment: FilterEnvelope is now called FilterContainer.
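
Going by that note, a sketch of the chunking filter under the newer option name might look like the following. It assumes an IO::Compress::Zip release recent enough to accept FilterContainer, and it zips an in-memory string into a buffer so it can be tried without a web server; the member name example.txt is a placeholder. Unlike the code earlier in the thread, it leaves empty output untouched in the filter and appends the terminating chunk itself once zip has returned.

```perl
use strict;
use warnings;
use IO::Compress::Zip qw(zip $ZipError);

my $input   = "some file contents\n";   # stand-in for real file data
my $chunked = '';

# Stream => 1 is set because, as far as I understand it, non-streaming mode
# goes back and patches sizes into output that has already been written,
# which would not survive the chunk framing.
zip \$input => \$chunked,
    Name            => 'example.txt',   # placeholder member name
    Stream          => 1,
    FilterContainer => sub {
        # $_ holds the next block of zip output; wrap it as one chunk.
        my $length = length($_);
        $_ = sprintf("%x", $length) . "\r\n" . $_ . "\r\n" if $length;
        1;
    }
    or die "zip failed: $ZipError\n";

# End the chunked body with the zero-length chunk.
$chunked .= "0\r\n\r\n";
```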
