http://www.perlmonks.org?node_id=588386

neilwatson has asked for the wisdom of the Perl Monks concerning the following question:

I have a directory where all files, except .txt, are gzipped. I want to have Perl gunzip them, process them and then gzip them again. I've stripped out all of the processing in the middle to try to debug the problem.
#!/usr/bin/perl

use warnings;
use strict;
use Date::Calc qw(:all);

my $PRINTFILE_DIRECTORY = "/home/lprjob1/filest";

# Directories to work in
#my @dirs = ('dpi', 'dsc', 'dscpprod');
my @dirs = ('dscpprod');
my $dir;
my ($year, $month, $day, $control_file, $pday, $pdatestr);

# Form date string with argument or default to today.
my $dateform;
if (scalar(@ARGV) eq 1 && $ARGV[0] =~ /\d\d\d\d-\d\d-\d\d/) {
    $dateform = $ARGV[0];
} else {
    ($year, $month, $day) = Today();
    $dateform = sprintf("%d-%02d-%02d",$year,$month,$day);

    # Create current previous day's date string for later use
    # This will be needed for the control file.
    $pday = $day - 1;
    $year =~ s/^\d{2}(\d{2})/$1/;
    $pdatestr = sprintf("%02d%02d%02d",$year,$month,$pday);
    $control_file = "CONTROL_$pdatestr.txt";
}

foreach (@dirs){
    $dir = $_;
    print "Should $dir be processed?\n";

    #If the control file does not exist then skip
    next unless ( -e "$PRINTFILE_DIRECTORY/$dir.$dateform/$control_file" );
    print "Yes, $dir should be processed?\n";

    # Unzip files
    gzip('unzip', "$PRINTFILE_DIRECTORY/$dir.$dateform");

    # Zip files again
    gzip('zip', "$PRINTFILE_DIRECTORY/$dir.$dateform");
}

sub gzip {
    my $cmd = shift;
    my $dir = shift;
    my $file;
    my $fullfile;

    print "$cmd files in $dir...\n";

    # Make sure command is secure and predictable.
    if ( $cmd eq 'zip' ){
        $cmd = 'gzip';
    } elsif ( $cmd eq 'unzip' ){
        $cmd = 'gunzip';
    } else {
        warn "zip or unzip command not given $!";
    }

    opendir DIR, $dir or warn "Cannot open $dir $!";
    while ( defined ( $file = readdir(DIR) ) ){
        $fullfile = "$dir/$file";
        if ( !-f $fullfile ||
            $file =~ m/\.txt$/ ||
            $file !~ m/^[\w-]+(\.[\w-]+)*$/ ){
            print "skipping $fullfile\n";
            next;
        }
        #print "$cmd $fullfile";
        system ( "$cmd $fullfile") == 0
            or warn "Cannot $cmd $fullfile : $!";
    }
    closedir DIR;
}
I consistently get seemingly random errors.
[lprjob1@tor-lx-sftp report-printing]$ ./test.pl
Should dscpprod be processed?
Yes, dscpprod should be processed?
unzip files in /home/lprjob1/filest/dscpprod.2006-12-07...
skipping /home/lprjob1/filest/dscpprod.2006-12-07/.
skipping /home/lprjob1/filest/dscpprod.2006-12-07/..
skipping /home/lprjob1/filest/dscpprod.2006-12-07/CONTROL_061206.txt
gunzip: /home/lprjob1/filest/dscpprod.2006-12-07/cns061206: unknown suffix -- ignored
Cannot gunzip /home/lprjob1/filest/dscpprod.2006-12-07/cns061206 : at ./test.pl line 75.
skipping /home/lprjob1/filest/dscpprod.2006-12-07/print.files.txt
skipping /home/lprjob1/filest/dscpprod.2006-12-07/print.archive.txt
zip files in /home/lprjob1/filest/dscpprod.2006-12-07...
skipping /home/lprjob1/filest/dscpprod.2006-12-07/.
skipping /home/lprjob1/filest/dscpprod.2006-12-07/..
skipping /home/lprjob1/filest/dscpprod.2006-12-07/CONTROL_061206.txt
gzip: /home/lprjob1/filest/dscpprod.2006-12-07/dd061206.gz already has .gz suffix -- unchanged
Cannot gzip /home/lprjob1/filest/dscpprod.2006-12-07/dd061206.gz : at ./test.pl line 75.
skipping /home/lprjob1/filest/dscpprod.2006-12-07/print.files.txt
skipping /home/lprjob1/filest/dscpprod.2006-12-07/print.archive.txt
gzip: /home/lprjob1/filest/dscpprod.2006-12-07/dl061206.GGDHSETD.gz already has .gz suffix -- unchanged
Cannot gzip /home/lprjob1/filest/dscpprod.2006-12-07/dl061206.GGDHSETD.gz : at ./test.pl line 75.
gzip: /home/lprjob1/filest/dscpprod.2006-12-07/dq061206.new.gz already has .gz suffix -- unchanged
Cannot gzip /home/lprjob1/filest/dscpprod.2006-12-07/dq061206.new.gz : at ./test.pl line 75.
gzip: /home/lprjob1/filest/dscpprod.2006-12-07/hs061206.v4.gz already has .gz suffix -- unchanged
Cannot gzip /home/lprjob1/filest/dscpprod.2006-12-07/hs061206.v4.gz : at ./test.pl line 75.
gzip: /home/lprjob1/filest/dscpprod.2006-12-07/ncdsc061206.gz already has .gz suffix -- unchanged
I see no pattern to the errors. The files associated with the errors are different each time. What have I missed?
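
One thing that may be worth ruling out (an assumption, not something confirmed above): gzip and gunzip rename each file while the directory handle is still being read. POSIX leaves it unspecified whether readdir returns entries added or removed after opendir, so a file that was just renamed might be handed back again under its new name. That would fit the "already has .gz suffix" and "unknown suffix" messages. A minimal sketch of reading the whole listing first and only then running the commands follows; the names snapshot_files and gzip_dir are hypothetical helpers, not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: read the complete directory listing up front, so
# files renamed later by gzip/gunzip cannot reappear during the loop.
sub snapshot_files {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @names = readdir $dh;    # list context returns every entry at once
    closedir $dh;

    # Same filter as the original loop: plain files only, no .txt,
    # and only names built from word characters, hyphens and dots.
    my @files = sort grep {
        -f "$dir/$_" && !/\.txt$/ && /^[\w-]+(\.[\w-]+)*$/
    } @names;
    return @files;
}

# Hypothetical wrapper: run gzip or gunzip on each snapshotted file.
sub gzip_dir {
    my ($mode, $dir) = @_;
    my %cmd_for = (zip => 'gzip', unzip => 'gunzip');
    my $cmd = $cmd_for{$mode} or die "zip or unzip command not given";

    for my $file (snapshot_files($dir)) {
        # The list form of system() bypasses the shell.
        next if system($cmd, "$dir/$file") == 0;

        # $! is only meaningful when the command could not be started
        # ($? == -1); otherwise the exit status is in the high byte of $?.
        my $why = $? == -1 ? "failed to start: $!"
                           : "exit status " . ($? >> 8);
        warn "Cannot $cmd $dir/$file: $why\n";
    }
    return;
}
```

With these in place, the two calls in the foreach loop would become gzip_dir('unzip', ...) and gzip_dir('zip', ...). Reporting $? rather than $! also appears to explain why the warnings above end in a bare colon: the command did run, so $! was empty.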

Neil Watson
watson-wilson.ca