Beefy Boxes and Bandwidth Generously Provided by pair Networks
Just another Perl shrine
 
PerlMonks  

Iterate Recursively Through Directories

by efaden (Initiate)
on Feb 09, 2014 at 16:21 UTC ( #1074140=perlquestion: print w/ replies, xml ) Need Help??
efaden has asked for the wisdom of the Perl Monks concerning the following question:

Hey All,

So I am trying to write a script to iterate over a set of files and run some commands. The directory/file structure is

Dir1/001.jpg Dir2/002.jpg Dir2/001.jpg Dir2/001.cr2

etc. Basically each folder will have files from 001.ext, ... , n.ext. I would like to iterate over the files and run a program on each of them to make a thumbnail. I, however, only want to run 1 time per file independent of the ext... e.g. I would like to know that 001 has a cr2 and a jpg, but only want to make one thumbnail.

Does this make sense?

Right now I am using File::Find::Rule; and then getting all cr2, jpg, etc.. within the directory.

What is the best way to do this?

This is my current script that I want to modify so it only prints one thumbnail per image (even if there is a JPEG and a RAW)

#!/usr/bin/perl # Install ffmpeg, ufraw, ImageMagick print "Thumnailer, pix2tn\n\n"; my $filename = 'index.html'; $start_dir = shift || '.'; use File::Find::Rule; # find all the .pm files in @INC my @files = File::Find::Rule->file() ->name( '*.jpg', '*.avi', '*.raw', '*.cr +2', '*.jpeg', '*.nef', '*.mov') ->in( @INC ); @nfiles = grep(!/AppleDouble/, @files); my $tnperrow=4; # thumbnails per row my $tnsize=200; # size of thumbnails my $tnquality=40; # quality of thumbnails [0..100] # use small thumbnails with poor quality to speed up your index-page if (-e $filename) { rename $filename, $filename.".bak"; print "I saved the old $filename as $filename.bak\n"; } open (PAGE, ">$filename") || die "Problem: Can't write to filename\n"; # create a directory for the thumbnails system ("mkdir tn") if (!-d "tn"); #system ("mkdir med") if (!-d "med"); # create the index page print PAGE qq* <html><head><title>$title</title></head> <body bgcolor=white><h1>$title</h1> <table cellspacing=10 width="100%"> *; my $counter=0; foreach $_ (@nfiles) { $in = $_; $out = $_; $out =~ s/\//-/g; $out =~ s/\.avi$/.jpg/g; $out =~ s/\.cr2/.jpg/g; $out =~ s/\.nef/.jpg/g; $out =~ s/\.mov/.jpg/g; print $in; if ($in =~ /\.avi$/) { system ('convert', '-resize', $tnsize."x".$tnsize, '-quality', $tn +quality, $in.'[1]', 'tn/'.$out) == 0 || die "Problems with convert: $?\n"; #system ('convert', '-geometry', $medsize."x".$medsize, '-qualit +y', $medquality, $in.'[1]', 'med/'.$out) == 0 #|| die "Problems with convert: $?\n"; } elsif ($in =~ /\.mov/) { system ('convert', '-resize', $tnsize."x".$tnsize, '-quality', $t +nquality, $in.'[1]', 'tn/'.$out) == 0 || die "Problems with convert: $?\n"; } else { system ('convert', '-resize', $tnsize."x".$tnsize, '-quality', $ +tnquality, $in, 'tn/'.$out) == 0 || die "Problems with convert: $?\n"; } print PAGE "<tr valign=bottom>" if (!($counter++%$tnperrow)); print PAGE "<td>"; #<a href="med/$out"><img src="tn/$out" alt="click to enlarge"></a> +<br> @stat = stat $_; print PAGE qq*<center> <img src="tn/$out" alt="click to enlarge"><br>*; if ($in =~ /\.avi$/) { print PAGE "<b>AVI</b><br>"; } elsif ($in =~ /\.mov/) { print PAGE "<b>MOV</b><br>"; } elsif ($in =~ /\.cr2/) { print PAGE "<b>CR2 (RAW)</b><br>"; } elsif ($in =~ /\.jpg/) { print PAGE "<b>JPG</b><br>"; } print PAGE qq* <small><b>$_</b><br>*. localtime($stat[9]).<br>. qq*</small>\n*; print " ... done\n"; } print PAGE qq* </table><hr> Index created on *. localtime(time) .qq* </body></html> *; close PAGE; exit;

Comment on Iterate Recursively Through Directories
Select or Download Code
Re: Iterate Recursively Through Directories
by Kenosis (Priest) on Feb 09, 2014 at 18:04 UTC

    Consider using a hash of arrays (HoA) to track the partial paths (that includes all up to the file extension) and their associated types. For example, given your dataset, the hash would be:

    'Dir1/001' => ['jpg'], 'Dir2/002' => ['jpg'], 'Dir2/001' => ['jpg', 'cr2']

    This way you can track 'uniqueness' and the file types of the images:

    use strict; use warnings; use File::Find::Rule; use File::Basename; my %hash; my $dir = '.'; my @files = File::Find::Rule->file()->in($dir); for my $file (@files) { if ( my ( $partialPath, $type ) = $file =~ /(.+)\.([^.]+)/ ) { push @{ $hash{$partialPath} }, $type; } } for my $partialPath ( keys %hash ) { #my @fileTypes = @{ $hash{$partialPath} }; my $fileTypes = $hash{$partialPath}->[0] ? "@{ $hash{$partialPath} + }" : 'None'; my $baseFilename = basename $partialPath; # make thumbnail # $partialPath contains all up to trailing dot and extension # @fileTypes contains the file type(s) # $baseFilename contains the base file name w/o the extension print "Partial Path: $partialPath\n"; print "File Type(s): $fileTypes\nBasename: $baseFilename\n\n"; }

    Hope this helps!

      I was thinking about something like that... but I want to do one other thing that doing it that way makes it hard to do... I want to mark the image as having a CR2, JPEG, etc so I know that the image had a RAW... I just don't need two thumbnails. Make sense?

      -Eric

        Have updated my answer.

Re: Iterate Recursively Through Directories
by moritz (Cardinal) on Feb 09, 2014 at 18:17 UTC

    FWIW if the directory structure is always DIR$N/001.$EXT, it has a fixed depth and you don't even need File::Find(::Rule), a simple glob is enough:

    use strict; use warnings; my %seen; while (my $file = glob 'Dir*/*') { (my $base = $file) =~ s/\.\w+$//; next if $seen{$base}; # work with $file here }

      I don't know if this helps for anything, this doesn't Iterate Recursively as shown, but I use readdir in the following code to create an array $dots of filenames from a directory, excluding any "." prefix in unix. Then I can grep out anything I don't want to operate on, etc. Or in this case put the contents of all the files into a single array $foo, then grep out or split out whatever I'm looking for. I think it can be modified to do something useful without a module.

      $sd = "../data/data_forthis"; opendir( DIR, $sd) || die; while( ($filename = readdir(DIR))){ next if "$sd\/$filename" =~ /\/\./; push @dots, "$sd\/$filename"; } ## end of while @dots = sort @dots; closedir(DIR); for(my $a=0;$a<@dots;$a++){ open (FILE, $dots[$a]); push @foo, <FILE>; if ($a+1 eq @dots) { close FILE; open (FILE, $dots[$a]); push @foo2, <FILE>; } ## end of if close FILE; } ## end of for ### At this point @foo has everything and $foo2 has just the last file + in the directory...
Re: Iterate Recursively Through Directories
by kevbot (Chaplain) on Feb 09, 2014 at 18:18 UTC
    I threw together a solution that is very similar to what Kenosis has suggested. This code creates a %file_info hash. In the code below, the last file name encountered for a given prefix gets stored in the hash. If you want to do other things (like keep track of file type, etc.) then you could modify the %file_info hash. Basically, you want to create a data structure that contains all the information you want to track.
    #!/usr/bin/env perl use strict; use warnings; use Path::Tiny; use File::Find::Rule; my $dir = shift or die 'No directory given.'; my $dir_path = Path::Tiny->new($dir); my @files = File::Find::Rule->file()->in( $dir_path ); my %file_info; foreach my $file (@files) { my $file_name = path($file)->basename; my $fn_without_ext = $file_name; $fn_without_ext =~ s/\..{3}$//; $file_info{ path($file)->dirname }{ $fn_without_ext } = path($file +)->basename; } foreach my $found_dir (keys(%file_info)) { my $file_name = (values %{$file_info{$found_dir}})[0]; my $file_to_process = path($found_dir, $file_name); #process the file here (e.g. use the 'system' command) print "Processing this file: $file_to_process\n"; } exit;

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://1074140]
Front-paged by Arunbear
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (11)
As of 2014-12-28 03:08 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Is guessing a good strategy for surviving in the IT business?





    Results (178 votes), past polls