PerlMonks  

Create zip files for each sub-directory under a main directory

by mrd1019 (Novice)
on Nov 29, 2016 at 13:52 UTC (id://1176813, perlquestion)

mrd1019 has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I have a directory with the following structure:

nas\data\nonops\common\engine\ingest\sim\mkv\ap0000->files
                                            \ap0001->files
                                            …………..
                                            \ap1335->files

For archiving purposes, I need to create a zip file at the end of tests (when the apxxxx folders are populated). This is my current code. It copies all the directories under nas\data\nonops\common\engine\ingest\sim\mkv into the destination, then creates one zip file out of all of that. The issue is that the sub-directories together hold too much data to fit into one zip file. I need to be able to create a separate zip file for each apxxxx folder, ideally naming each zip file apxxxx.zip.

use strict;
use warnings ‘all’;
use File::Copy::Recursive;
use Archive::Zip;
use constant AZ_OK =>0;

my $mkvingestdir = “//nas/data/nonops/common/engine/ingest/sim/mkv”;
my $mkvingestdest = “//nas/shared/group/test/mkv/ingest”;

File::Copy::Recursive::dircopy $mkvingestdir, $mkvingestdest or die “Copy failed: $!\n”;

my $mkvingestzip = Archive::Zip->new();
my $mkvingestzipdest = “//nas/shared/group/test/mkv”;

$mkvingestzip ->addTree($mkvingestdest);

if ($mkvingestzip->writeToFileNamed(‘ingest.zip’) != AZ_OK) {
    print “Error in archive creation!\n”;
} else {
    print “Archive created successfully!\n”;
}

I’m thinking a foreach loop would work? I’m not 100% sure how to go about getting to my end goal. The script above works, but will not work when a full test is run and there are 10 – 20 GB of data in total in the /mkv directory. Any help would be greatly appreciated!

Replies are listed 'Best First'.
Re: Create zip files for each sub-directory under a main directory
by stevieb (Canon) on Nov 29, 2016 at 14:49 UTC

    I believe something like the following should get you close to where you want to be.

    Given this directory structure:

    orig
    |-a
       |- a.txt
    |-b
       |- b.txt
    |-c
       |- c.txt

    ...and running this code:

    use strict;
    use warnings;

    use Archive::Zip;
    use File::Basename;
    use File::Copy::Recursive;
    use File::Find::Rule;

    use constant AZ_OK => 0;

    my $mkvingestdir = 'orig';
    my $mkvingestdest = 'new';
    my $zip_dest = 'zipped';

    File::Copy::Recursive::dircopy $mkvingestdir, $mkvingestdest
        or die "copy failed: $!\n";

    my @dirs = File::Find::Rule->directory()
                               ->in($mkvingestdest);

    for my $dir (@dirs){
        next if $dir =~ /(?:\.|\.\.)/;
        next if $dir eq $mkvingestdest;

        my $zip = Archive::Zip->new;
        $zip->addDirectory($dir);

        my $name = basename $dir;

        if ($zip->writeToFileNamed("$zip_dest/${name}.zip") != AZ_OK){
            print "error in archive creation\n";
            next;
        }
        print "archive created successfully\n";
    }

    I get the following zip files in the zipped zip destination directory:

    $ ls zipped/
    a.zip  b.zip  c.zip
Re: Create zip files for each sub-directory under a main directory
by kcott (Archbishop) on Nov 29, 2016 at 15:23 UTC

    G'day mrd1019,

    Welcome to the Monastery.

    "I’m thinking a foreach loop would work?"

    Did you try that? If so, what difficulties did you encounter? If not, then give it a go and, if you encounter difficulties, tell us what they are.

    "I’m not 100% sure how to go about getting to my end goal."

    Again, without knowing what your problem is, it's not really possible to offer a solution.

    I'll take a complete guess that you don't know how to generate the "apxxxx.zip" filenames. For this, you'll probably want the sprintf function. Here's an example with printf (which uses the same formats as sprintf):

    $ perl -e 'printf "ap%04d.zip\n", $_ for (0, 1, 234, 1335)'
    ap0000.zip
    ap0001.zip
    ap0234.zip
    ap1335.zip

    — Ken

      So, I initially tried Stevieb's suggestion. It created the zip files, but there were no files inside the zip. So, I started playing. Again, I'm getting the zip files created, with the correct names, but there are no files inside the zips. Here is the latest attempt that I made at this:

      use strict;
      use warnings ‘all’;
      use File::Copy::Recursive;
      use Archive::Zip;
      use constant AZ_OK =>0;
      use File::Basename;
      use File::Find::Rule;

      my $mkvingestdir = “//nas/data/nonops/common/engine/ingest/sim/mkv”;
      my $mkvingestdest = “//nas/shared/group/test/mkv/ingest”;

      my @dirs = File::Find::Rule->directory()
                                 ->in($mkvingestdir);
      my @files;

      foreach my $dir (@dirs) {
          @files = glob "*.eap";
          my $mkvingestzip = Archive::Zip->new();
          foreach $_ (@files) {
              $mkvingestzip->addFile($_);
          }
          my $name = basename $dir;
          if ($mkvingestzip->writeToFileNamed("$mkvingestdest/${name}.zip") != AZ_OK) {
              print "Error in archive creation\n";
              next;
          }
          print "Archive created successfully!\n";
      }

        foreach my $dir (@dirs) {
            @files = glob "*.eap";
            my $mkvingestzip = Archive::Zip->new();
            foreach $_ (@files) {
                $mkvingestzip->addFile($_);
            }

        I think that your problem here is that you are performing the glob in the $CWD which is not $dir and therefore it isn't matching. You could confirm this by adding diagnostic print statements like this:

        foreach my $dir (@dirs) {
            print "Now processing dir $dir\n";
            @files = glob "*.eap";
            my $mkvingestzip = Archive::Zip->new();
            foreach $_ (@files) {
                print "Adding file $_\n";
                $mkvingestzip->addFile($_);
            }

        You will likely see the "dir" diagnostics but no "file" ones. In that case you'll need either to chdir inside the outer loop or else prepend the path to the glob argument.
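The second option (prepending the path to the glob argument) can be sketched as follows. This is only an illustration under stated assumptions: the helper name eap_files_in and the demo file names are made up, and bsd_glob from the core File::Glob module is used instead of plain glob so that a path containing spaces is not split into several patterns.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);
use File::Temp qw(tempdir);

# Hypothetical helper: return the .eap files directly inside $dir.
# The directory is prepended to the pattern, so the result does not
# depend on the current working directory.
sub eap_files_in {
    my ($dir) = @_;
    my @files = sort { $a cmp $b } bsd_glob("$dir/*.eap");
    return @files;
}

# Demo with a throwaway directory (the file names are made up).
my $demo_dir = tempdir( CLEANUP => 1 );
for my $name (qw(one.eap two.eap notes.txt)) {
    open my $fh, '>', "$demo_dir/$name" or die "Cannot create $name: $!";
    close $fh;
}

# Only the two .eap files are reported; notes.txt is not matched.
print "Found: $_\n" for eap_files_in($demo_dir);
```

The returned names include the directory part, which is what addFile needs in order to find the files on disk.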

        Printing diagnostics like this is item number 2 on the Basic debugging checklist.

        You should check what's actually in @files. Not just for filenames but also whether they exist and (perhaps also) are of the expected size: see the built-in "File Test Operators".
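As a rough sketch of that check, the file test operators -e (exists), -f (plain file) and -s (size in bytes) can be combined into a small report. The helper name file_report is hypothetical; the operators themselves are Perl built-ins.

```perl
use strict;
use warnings;

# Hypothetical helper: summarise what the file tests say about one path.
#   -e  the path exists
#   -f  it is a plain file (not a directory, for example)
#   -s  its size in bytes (false when the file is empty)
sub file_report {
    my ($path) = @_;
    return "$path: missing"          unless -e $path;
    return "$path: not a plain file" unless -f $path;
    my $size = -s $path;
    return $size ? "$path: $size bytes" : "$path: empty";
}

# Usage: report on every path given on the command line.
print file_report($_), "\n" for @ARGV;
```

Inside the loop from the question, the same helper could be called on each element of @files before it is passed to addFile.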

        I suspect you probably want to glob "$dir/*.eap". With your posted code, you're only looking for "*.eap" in the current directory. Look for other places where you may be referencing the current directory instead of $dir.

        Two other minor points: (1) I wouldn't declare @files globally; (2) foreach and for are synonymous.

        Putting all those points together, instead of:

        my @files;
        foreach my $dir (@dirs) {
            @files = glob "*.eap";
            ...

        I probably would have written something closer to:

        for my $dir (@dirs) {
            my @files = glob "$dir/*.eap";
            ...

        For the inner loop, $_ is the default. Adding it the way you have (foreach $_ (@files) { ...) makes me, and quite possibly others, wonder if you perhaps had some other intention, e.g. a lexical 'my $_' (which should be avoided anyway - see below). One of these two forms would be more normal (and won't raise eyebrows):

        for (@files) { ...
        for my $file (@files) { ...

        [Use foreach, instead of for, if you want to. It's really just extra (and unnecessary) typing. As stated earlier, the two are synonymous.]

        Using a consistent style of indentation will make your code easier to read and less prone to errors. The choice of coding style is often a very personal one: choose whatever you want. See perlstyle if you want some tips on this.

        Finally, and only because I mentioned it in passing above, lexical $_ should be avoided. Here's its history:

        — Ken

        As hippo and kcott have pointed out, I was using glob incorrectly.

        That's what I get for not fully following through to ensure everything did what it was supposed to ;)

Re: Create zip files for each sub-directory under a main directory
by Anonymous Monk on Nov 29, 2016 at 23:40 UTC
    Why do you keep using smart quotes? Both ‘single’ and “double”? Those are syntax errors.

      Thanks for all the help guys! I'm new to perl, been trying to teach myself via websites like this and books. Nothing compares to having experienced users help out though!

      I've updated the code, based on a lot of the advice I've been given. The code now creates the zip files correctly (correct name with all the files inside). There are still a couple of things I'd like to change though. First, it creates a zip file of the top level directory (mkv in the $mkvingestdir). I don’t need that zip file. Additionally, the apxxxx.zip files don’t contain just the files, but also contain the directory path (i.e., inside the zip it is not just the .eap files, but mine/MKV/apxxxx/files).

      How would I go about eliminating the MKV zip file that is created, and limiting the zips to only have the files, and not the directory structure?

      As for the single vs. double quotes that Anonymous mentioned ... as I said, I’ve been learning via websites for the most part, and I see all these different examples. I feel like I’m starting to understand the basics of perl, but there is always so much to learn. I know this code is not perfect, but I’m continually reading/studying to try and make it, and myself, better. I really want to thank everyone for all the help with this.

      Having said all that, here is my code as it stands now

      use strict;
      use warnings ‘all’;
      use File::Copy::Recursive;
      use Archive::Zip;
      use constant AZ_OK =>0;
      use File::Basename;
      use File::Find::Rule;

      my $mkvingestdir = “//nas/shared/mine/MKV”
      my $mkvingestdest = “//nas/shared/group/test/mkv/ingest”;

      my @dirs = File::Find::Rule->directory()
                                 ->in($mkvingestdir);

      foreach my $dir (@dirs) {
          chdir $dir;
          print “Now processing directory $dir\n”;
          my @files = glob "$dir/*.eap";
          my $mkvingestzip = Archive::Zip->new();
          foreach $file (@files) {
              $mkvingestzip->addFile($file);
          }
          my $name = basename $dir;
          if ($mkvingestzip->writeToFileNamed("$mkvingestdest/${name}.zip") != AZ_OK) {
              print "Error in archive creation\n";
              next;
          }
          print "Archive created successfully!\n";
      }

        As for the single vs. double quotes that Anonymous mentioned…..as I said, I’ve been learning via websites for the most part, and I see all these different examples.

        :) syntax errors are syntax errors

        $ perl -MPath::Tiny -MData::Dump -e " dd( path( 'smartquotes.pl' )->lines_utf8 ) "
        (
          "\x{FEFF}#!/usr/bin/perl --\r\n",
          "use warnings \x{2018}all\x{2019};\r\n",
          "my \$mkvingestdir = \x{201C}//nas/shared/mine/MKV\x{201D};\r\n",
        )

        $ perl smartquotes.pl
        Unrecognized character \xE2; marked by <-- HERE after warnings <-- HERE near column 14 at smartquotes.pl line 2.

        $ perl -MPath::Tiny -MData::Dump -e " dd( path( 'smartquotes.pl' )->lines_raw ) "
        (
          "\xEF\xBB\xBF#!/usr/bin/perl --\r\n",
          "use warnings \xE2\x80\x98all\xE2\x80\x99;\r\n",
          "my \$mkvingestdir = \xE2\x80\x9C//nas/shared/mine/MKV\xE2\x80\x9D;\r\n",
        )

        $ perl -MPath::Tiny -MData::Dump -e " dd( path( 'not-smartquotes.pl' )->lines_raw ) "
        (
          "\xEF\xBB\xBF#!/usr/bin/perl --\r\n",
          "use warnings 'all';\r\n",
          "my \$mkvingestdir = '//nas/shared/mine/MKV';\r\n",
        )

        "U+2018" "U+2019" are not valid syntax like "U+0027"

        "U+201C" "U+201D" are not valid syntax like "U+0022"

        You're posting code that doesn't use '' and "" but the smart quotes ... whatever editor you're using is helping corrupt the code you post
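One way to catch this before posting or running a script is to scan the source for the four code points named above. The following is a minimal sketch, not an established tool: the helper name find_smart_quotes is made up, and it assumes the text has already been decoded to characters (for a file, that would mean opening it with the '<:encoding(UTF-8)' layer).

```perl
use strict;
use warnings;

# Hypothetical helper: return "line N: U+XXXX" for every curly quote
# (U+2018, U+2019, U+201C, U+201D) found in $text.
# $text is expected to hold decoded characters, not raw UTF-8 bytes.
sub find_smart_quotes {
    my ($text) = @_;
    my @hits;
    my $line_no = 0;
    for my $line ( split /\n/, $text ) {
        $line_no++;
        while ( $line =~ /([\x{2018}\x{2019}\x{201C}\x{201D}])/g ) {
            push @hits, sprintf( 'line %d: U+%04X', $line_no, ord $1 );
        }
    }
    return @hits;
}

# Demo on a two-line sample; the second line uses curly single quotes.
my $sample = "use strict;\nuse warnings \x{2018}all\x{2019};\n";
print "$_\n" for find_smart_quotes($sample);
# prints:
# line 2: U+2018
# line 2: U+2019
```

An empty result means none of those four characters are present; it says nothing about other syntax problems.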

        I was able to get rid of the MKV zip file. Forgot a couple of lines that stevieb had suggested.

        New code is:

        use strict;
        use warnings ‘all’;
        use File::Copy::Recursive;
        use Archive::Zip;
        use constant AZ_OK =>0;
        use File::Basename;
        use File::Find::Rule;

        my $mkvingestdir = “//nas/shared/mine/MKV”
        my $mkvingestdest = “//nas/shared/group/test/mkv/ingest”;

        my @dirs = File::Find::Rule->directory()
                                   ->in($mkvingestdir);

        foreach my $dir (@dirs) {
            chdir $dir;
            next if $dir =~ /(?:\.|\.\.)/;
            next if $dir eq $mkvingestdir;
            print “Now processing directory $dir\n”;
            my @files = glob "$dir/*.eap";
            my $mkvingestzip = Archive::Zip->new();
            foreach $file (@files) {
                $mkvingestzip->addFile($file);
            }
            my $name = basename $dir;
            if ($mkvingestzip->writeToFileNamed("$mkvingestdest/${name}.zip") != AZ_OK) {
                print "Error in archive creation\n";
                next;
            }
            print "Archive created successfully!\n";
        }

        Still trying to figure out how to remove the partial directory structure within the apxxxx.zip files.
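One possible approach to that remaining question, offered as a sketch rather than the thread's agreed answer: Archive::Zip's addFile accepts an optional second argument giving the name the member should have inside the archive, so passing basename($file) stores each file without its path. The helper name zip_flat and the demo directory and file names below are made up for illustration.

```perl
use strict;
use warnings;
use Archive::Zip qw(:ERROR_CODES);
use File::Basename qw(basename);
use File::Glob qw(bsd_glob);
use File::Temp qw(tempdir);

# Hypothetical helper: zip @files into $zip_path, storing each one under
# its bare file name so the archive has no directory structure inside.
# Returns true on success.
sub zip_flat {
    my ( $zip_path, @files ) = @_;
    my $zip = Archive::Zip->new();
    for my $file (@files) {
        # The second argument is the member name used inside the archive.
        $zip->addFile( $file, basename($file) )
            or die "Could not add $file\n";
    }
    return $zip->writeToFileNamed($zip_path) == AZ_OK;
}

# Demo: build a throwaway ap0000 directory (made-up contents) and zip it.
my $work = tempdir( CLEANUP => 1 );
mkdir "$work/ap0000" or die "mkdir failed: $!";
for my $name (qw(first.eap second.eap)) {
    open my $fh, '>', "$work/ap0000/$name" or die "Cannot create $name: $!";
    print {$fh} "sample data\n";
    close $fh;
}
my @files = sort { $a cmp $b } bsd_glob("$work/ap0000/*.eap");
zip_flat( "$work/ap0000.zip", @files ) or die "Error in archive creation\n";

# Read the archive back and list the member names to check for paths.
my $check = Archive::Zip->new();
$check->read("$work/ap0000.zip") == AZ_OK or die "Cannot read archive back\n";
print "$_\n" for $check->memberNames();
```

In the loop from the post above, this would amount to changing the addFile call to pass basename($file) as a second argument. Note that two files with the same base name would then collide inside one archive, which should not happen when each archive covers a single directory.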
