Recursive image processing (with ImageMagic)

by wvick (Novice)
on Nov 22, 2012 at 20:50 UTC (#1005185, perlquestion)
wvick has asked for the wisdom of the Perl Monks concerning the following question:

Newbie in the house! It's been about 5 years since I did any Perl but I've just been handed a problem which I suspect can be solved elegantly using a relatively short script.

I have a directory which contains other directories, some of which contain a large (millions!) number of small PNG images. I need to make a copy of this directory structure in a new location, with all the images converted to greyscale. Yes, grey. I'm British and so am allowed to spell it like that. :-) The utility needs to run on a Windows platform.

I've already pre-selected ImageMagick to do the image processing and was originally going to write a C/C++ program to do the job. However, when I noticed that ImageMagick has a Perl module, I started to recall that Perl has some nice built-in file system management which can be expressed concisely. I'm also aware there are a lot of modules out there which can do some clever things and save a lot of effort.

So, I'm looking at a utility which accepts a source directory and target directory as parameters. It will then scan through the source directory, recursing into sub-directories, and scan for PNG images. When it finds one, let PerlMagick do the hard work and write the output. Of course, I also need to create sub-directories on the output folder.

I wondered if anyone had created (something like) a recursive file copier which I could adapt to a recursive image processor? Alternatively, any help/guidance on the directory scanning and recursion would be appreciated, especially given the number of PNG images being processed (in each folder and overall). TIA.

Regards,
Warren

Re: Recursive image processing (with ImageMagic)
by jethro (Monsignor) on Nov 22, 2012 at 22:30 UTC

    The standard module for traversing directories is File::Find. You just need to provide a function detailing what to do with each file. The find function of File::Find also calls directories before files so you can just create the target dirs without worrying about the sequence of calls.

      My only problem with using something like File::Find is that I think it's better for finding file "needles in a haystack" than for processing ALL the files (they're all PNGs in my case). It also makes the directory handling more complicated, as I need to check or create (regardless) the output folder for each and every file created. Unless, I suppose, File::Find returns files in blocks so I can check the last folder created and only mkdir when needed.

      Do you also feel that, since every file and sub-directory is being processed, a recursive search would be best? That is, every time a new directory is handled, I mkdir the same in the destination.
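The hand-rolled recursion described in this question could be sketched roughly as follows. This is only an illustration, not code from the thread: `mirror_recursive` and the `$convert` callback are hypothetical names, the parent of the target directory is assumed to exist, and symlinked directories are followed blindly. It reads each directory entry by entry, so no list of all files is ever built.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: walk $src recursively, recreate each directory under
# $dst, and hand every PNG to $convert->($in, $out). Returns the file count.
sub mirror_recursive {
    my ($src, $dst, $convert) = @_;
    -d $dst or mkdir $dst or die "Cannot create $dst: $!";

    opendir my $dh, $src or die "Cannot open $src: $!";
    my $count = 0;
    my @subdirs;
    while (defined(my $entry = readdir $dh)) {
        next if $entry eq '.' or $entry eq '..';
        my $in = File::Spec->catfile($src, $entry);
        if (-d $in) {
            push @subdirs, $entry;    # recurse after this handle is closed
        }
        elsif ($entry =~ /\.png\z/i) {
            $convert->($in, File::Spec->catfile($dst, $entry));
            $count++;
        }
    }
    closedir $dh;

    # Each sub-directory is created exactly once, when it is entered.
    $count += mirror_recursive(
        File::Spec->catdir($src, $_), File::Spec->catdir($dst, $_), $convert
    ) for @subdirs;
    return $count;
}

# Stand-in converter that just copies; run as: perl script.pl SRC DST
if (@ARGV == 2) {
    require File::Copy;
    my $total = mirror_recursive(@ARGV,
        sub { File::Copy::copy($_[0], $_[1]) or die "copy failed: $!" });
    print "Processed $total PNG files\n";
}
```

Passing the conversion step in as a callback keeps the directory walk independent of whichever image library ends up doing the work.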

        File::Find is that I think it's better for finding file "needles in a haystack" than for processing ALL the files

        I don't think so. Without fine tuning, find({wanted => \&wanted, ...}) invokes wanted for each file and directory found, so wanted sees the entire haystack including all needles, piece by piece.

        I need to check or create (regardless) the output folder for each and every file created

        No. File::Find also calls wanted for the directories found during the file tree traversal. You need to check and create directories only in that case.
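A minimal sketch of that idea, assuming a hypothetical helper `mirror_with_find` and a caller-supplied `$convert` callback (neither is from the thread): the target directory is created when `wanted` is called for a directory, and files are only converted.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Spec;

# Hypothetical helper: mirror $src into $dst with File::Find. Directories are
# created when wanted() sees them; PNG files go to $convert->($in, $out).
sub mirror_with_find {
    my ($src, $dst, $convert) = @_;
    # Use absolute paths, because find() changes directory while it walks.
    ($src, $dst) = map { File::Spec->rel2abs($_) } $src, $dst;
    my $count = 0;

    find(sub {
        my $rel = File::Spec->abs2rel($File::Find::name, $src);
        my $out = $rel eq '.' ? $dst : File::Spec->catfile($dst, $rel);
        if (-d $_) {
            make_path($out);    # find() reports a directory before its contents
        }
        elsif (-f _ && /\.png\z/i) {
            $convert->($File::Find::name, $out);
            $count++;
        }
    }, $src);
    return $count;
}

# Stand-in converter that just copies; run as: perl script.pl SRC DST
if (@ARGV == 2) {
    require File::Copy;
    my $total = mirror_with_find(@ARGV,
        sub { File::Copy::copy($_[0], $_[1]) or die "copy failed: $!" });
    print "Processed $total PNG files\n";
}
```

Note that this relies on `find`, not `finddepth`: with `finddepth` a directory is reported after its contents, so the mkdir would come too late.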

        You could also use the preprocess option, it is invoked exactly when you want to create the target directory:

        The value should be a code reference. This code reference is used to preprocess the current directory. The name of the currently processed directory is in $File::Find::dir. Your preprocessing function is called after readdir(), but before the loop that calls the wanted() function. It is called with a list of strings (actually file/directory names) and is expected to return a list of strings. The code can be used to sort the file/directory names alphabetically, numerically, or to filter out directory entries based on their name alone. When follow or follow_fast are in effect, preprocess is a no-op.
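The preprocess hook quoted above might be used like this. It is a sketch under assumptions: `mirror_with_preprocess` and `$convert` are hypothetical names, and the target path is recomputed from `$File::Find::name` in `wanted` rather than shared between the two callbacks.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Spec;

# Hypothetical helper: create each mirrored directory from the preprocess
# hook, so wanted() only has to deal with files.
sub mirror_with_preprocess {
    my ($src, $dst, $convert) = @_;
    ($src, $dst) = map { File::Spec->rel2abs($_) } $src, $dst;
    my $count = 0;

    find({
        preprocess => sub {
            # Called once per directory, after readdir() and before wanted().
            my $rel = File::Spec->abs2rel($File::Find::dir, $src);
            make_path($rel eq '.' ? $dst : File::Spec->catdir($dst, $rel));
            return @_;    # must return the names that find() should visit
        },
        wanted => sub {
            return unless -f $_ && /\.png\z/i;
            my $rel = File::Spec->abs2rel($File::Find::name, $src);
            $convert->($File::Find::name, File::Spec->catfile($dst, $rel));
            $count++;
        },
    }, $src);
    return $count;
}

# Stand-in converter that just copies; run as: perl script.pl SRC DST
if (@ARGV == 2) {
    require File::Copy;
    my $total = mirror_with_preprocess(@ARGV,
        sub { File::Copy::copy($_[0], $_[1]) or die "copy failed: $!" });
    print "Processed $total PNG files\n";
}
```

The hook runs exactly once per source directory, which is the "only mkdir when needed" behaviour asked about earlier in the thread.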

        Alexander

        --
        Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re: Recursive image processing (with ImageMagic)
by Kenosis (Priest) on Nov 23, 2012 at 03:11 UTC

    I have a vested interest in your question as I have a B&W photography background, and frequently convert color images to grayscale (I'm not British... :). Given this, perhaps the following will assist you--at least to start:

    use strict;
    use warnings;
    use File::Find::Rule;
    use File::Basename;
    use File::Path qw/mkpath/;
    use List::Util qw/sum/;
    use GD;

    my $sourceDir = 'A';
    my $destinDir = 'B';

    -d $sourceDir  or die 'Error: Source directory does not exist.';
    !-d $destinDir or die 'Error: Destination directory already exists.';

    my @files = File::Find::Rule->file()->name(qr/\.png$/i)->in($sourceDir)
      or die 'No png files found in source directory.';

    print "\nNumber of png files to convert to grayscale: ", scalar @files, "\n\n";

    for my $pngFile (@files) {
        # Map the source path onto the destination tree (the /r flag needs Perl 5.14+).
        my $destinPath = $pngFile =~ s{^\Q$sourceDir\E(?=/)}{$destinDir}r;
        my $dir = dirname($destinPath);

        if ( !-d $dir ) {
            mkpath( $dir, { error => \my $err } );
            !@$err or die qq{Unable to create directory "$dir"};
        }

        print "Processing: $pngFile";

        my $image = GD::Image->new($pngFile);

        # Replace each palette entry with the average of its RGB components.
        for my $i ( 0 .. $image->colorsTotal() - 1 ) {
            my @gray = ( ( sum $image->rgb($i) ) / 3 ) x 3;
            $image->colorDeallocate($i);
            $image->colorAllocate(@gray);
        }

        open my $fh, '>', $destinPath or die $!;
        binmode $fh;
        print $fh $image->png;
        close $fh;

        print " - Done!\n";
    }

    print "\nJob completed.\n";

    Sample output when running:


    Number of png files to convert to grayscale: 6

    Processing: A/adelaide-rosella.png - Done!
    Processing: A/frog.png - Done!
    Processing: A/C/chicken_profile.png - Done!
    Processing: A/C/tux.png - Done!
    Processing: A/C/D/crowned_crane.png - Done!
    Processing: A/C/D/cuckoo.png - Done!

    Job completed.

    There's likely a more efficient way to do this, but it worked well in my tests--although I'm not processing millions of images. You'll notice that I used GD for the actual image processing. If you're keen on using ImageMagick, you could just modify this script for it.

    It will preserve the structure of the source directory, mirroring it in the destination directory, and convert all found png files to grayscale, and then write them into their destination directory. There's an initial check for the destination directory already existing, since files might otherwise be overwritten.
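For anyone who would rather stay with ImageMagick, the conversion step could be swapped for something like the sketch below. The helper names `dest_path_for` and `grey_with_magick` are hypothetical, and the sketch assumes the installed PerlMagick accepts `Quantize(colorspace => 'gray')` and returns an error message only on failure, as its documentation describes.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Path qw(make_path);

# Hypothetical helper: map a source file to its place under the target tree.
sub dest_path_for {
    my ($src_root, $dst_root, $file) = @_;
    (my $out = $file) =~ s{^\Q$src_root\E(?=/)}{$dst_root}
        or die "$file is not under $src_root";
    return $out;
}

# Hypothetical helper: write a greyscale copy of $in to $out with PerlMagick.
sub grey_with_magick {
    my ($in, $out) = @_;
    require Image::Magick;    # loaded lazily; the module must be installed
    my $image = Image::Magick->new;

    # PerlMagick methods hand back a message on failure, so check each call.
    my $check = sub {
        my ($err, $what) = @_;
        die "$what failed: $err" if defined $err && length "$err";
    };
    $check->($image->Read($in), "Read $in");
    $check->($image->Quantize(colorspace => 'gray'), "Quantize $in");
    make_path(dirname($out));
    $check->($image->Write($out), "Write $out");
    return $out;
}

# Convert one file; run as: perl script.pl SRC_ROOT DST_ROOT FILE
if (@ARGV == 3) {
    my ($src_root, $dst_root, $file) = @ARGV;
    print grey_with_magick($file, dest_path_for($src_root, $dst_root, $file)), "\n";
}
```

With the script above, `grey_with_magick($pngFile, $destinPath)` would replace the GD block and the explicit `open`/`print`/`close`.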

    Hope this helps!

      This looks almost like "job done"! Thank you very much, Kenosis!

        You're most welcome, wvick!

Re: Recursive image processing (with ImageMagic)
by wrinkles (Pilgrim) on Nov 23, 2012 at 07:37 UTC
    I've published a script which uses Image::Magick, File::Find, and Spreadsheet::WriteExcel to convert a list of directories containing images into spreadsheets of thumbnails with dirified filenames, including a link to a larger thumbnail. It is of course a bit different from what you need, but some of the code may be useful.

      A small critique of your code: instead of if (@ARGV == 0), you could write

      if( not @ARGV )

      if( ! @ARGV )

      unless( @ARGV )

      or

      @ARGV or exit print Usage();

        There is nothing unclear about if (@ARGV == 0) which quite intuitively translates to "if the number of arguments is zero." Your alternatives are all fine, too, but there is no reason other than personal preference to prefer one over the other.


        When's the last time you used duct tape on a duct? --Larry Wall
Re: Recursive image processing (with ImageMagic)
by zentara (Archbishop) on Nov 23, 2012 at 08:08 UTC
    Depending on how you process your images, it may be wise to try to reuse a single IM object. This code shows a possible IM glitch which you may run into: the reusable IM object needs to be cleared out after every use. However, you may get away with a creation/undef usage within a limited scope, where you create a new IM object for every transformation, then undef it. I think that would slow you down with a lot of images, so think about a reusable IM object.

    #!/usr/bin/perl
    use warnings;
    use strict;
    use Image::Magick;

    my $image = Image::Magick->new;
    umask 0022;

    my @pics = <*.jpg>;
    #my @pics = <*.jpg *.gif *.png>;    # add all your extensions here

    foreach my $pic (@pics) {
        my ($picbasename) = $pic =~ /^(.*)\.jpg$/;
        my $ok;
        $ok = $image->Read($pic) and warn($ok);
        my $thumb = $picbasename . '-t.jpg';
        $image->Scale( geometry => '100x100' );
        $ok = $image->Write($thumb) and warn($ok);
        undef @$image;    # needed if $image is created outside loop
        print "$pic -> $thumb\n";
    }

    And here is a very basic File::Find script to process files.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use File::Find;
    $|++;

    my $path  = '.';
    my $cmd   = 'file';
    my $regex = qr/\.png$/i;    # set before finddepth() runs so wanted() can use it

    finddepth( \&wanted, $path );

    sub wanted {
        return unless -f;    # -d for dir ops or comment out for both
        if (/$regex/) {
            print "$File::Find::name\n";
            # do your ImageMagick processing here
        }
    }
    __END__

    I'm not really a human, but I play one on earth.
    Old Perl Programmer Haiku ................... flash japh

      Thanks very much for the code. Please see my earlier reply about a reservation in my mind about using File::Find for "all files" processing, and the difficulties it causes for mkdir in the output. What do you think?

        It also makes the directory handling more complicated, as I need to check or create (regardless) the output folder for each and every file created. Unless, I suppose, File::Find returns files in blocks so I can check the last folder created and only mkdir when needed.

        Well, if you think about it, you have to test each file and directory to find out whether you are in the right one, and then whether each file in that subdir has a .png ending.

        I think to save cpu cycles, in your case, I would run File::Find with a directory test before the file test. $File::Find::prune = 1 will skip processing any files in that subdir. Something like:

        find sub {
            # first filter out your directories
            if ( -d && $_ !~ /my_target_dirs_regex/ ) {
                $File::Find::prune = 1;
                return;
            }

            # if you make it to here, you are in your target subdir
            # now do your ImageMagick processing and mkdir
            # it has to check each file for a .png ending
            if ( -f && $_ =~ /(.*)\.png$/ ) {
                # do mkdir stuff
                # do ImageMagick stuff
            }

            # or you could just push the files into an array for later processing
            # push @goners, $File::Find::name;
        }, @ARGV;

        I'm not really a human, but I play one on earth.
        Old Perl Programmer Haiku ................... flash japh
Re: Recursive image processing (with ImageMagic)
by Cody Fendant (Pilgrim) on Nov 23, 2012 at 10:50 UTC

    I have to admit, if I'd already installed ImageMagick the binary and could do

    convert --whatever --options path/to/file path/to/otherfile
    from the command line, I'd just do this with File::Find and a call to system() in the found sub.

    Ugly, and scary, but quick.
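A rough sketch of that approach, assuming ImageMagick's command-line converter is on the PATH and accepts `-colorspace Gray`. The helper names `grey_command` and `grey_via_system` are hypothetical, and the program name is a parameter because on Windows a bare `convert` may resolve to an unrelated system utility.

```perl
use strict;
use warnings;

# Hypothetical helper: build the argument list for the command-line converter.
sub grey_command {
    my ($in, $out, $program) = @_;
    $program = 'convert' unless defined $program;
    return ($program, $in, '-colorspace', 'Gray', $out);
}

# Hypothetical helper: run the converter for one file and die on failure.
sub grey_via_system {
    my @cmd = grey_command(@_);
    # The list form of system() passes each argument separately instead of
    # building one command string, so spaces in file names need no quoting.
    system(@cmd) == 0 or die "@cmd failed (\$? = $?)";
    return 1;
}

# Convert one file; run as: perl script.pl IN.png OUT.png
grey_via_system(@ARGV) if @ARGV == 2;
```

Inside a File::Find callback this would be called once per PNG, which means one process start per file.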

      The easy solution is always worth a try, but in this case I'd be worried that the overhead of calling and starting up the external program would be prohibitive for millions of files.


      When's the last time you used duct tape on a duct? --Larry Wall

      Ugly yes, and I'm not sure it would be quick given millions of PNG images. Also, on Windows, would system() pop up a new console window? If so, my PC would be pretty unusable during the processing with literally millions of windows popping up! I just get the feeling that memory would leak somewhere and the process would not complete.

Re: Recursive image processing (with ImageMagic)
by SuicideJunkie (Priest) on Nov 23, 2012 at 14:43 UTC

    Irfanview should do what you want already.

    i_view32.exe c:\*.png /gray /convert=d:\temp\*.png

    I don't see an obvious option for recursive directories, but the directory diving can be done with a very small script that makes the calls to convert everything in the directory.
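Such a directory-diving script might look something like this. It is only a sketch: `all_dirs` and `irfanview_commands` are hypothetical names, the only IrfanView options used are the ones shown above, and it is an untested assumption that IrfanView accepts these paths and that the output directories already exist.

```perl
use strict;
use warnings;
use File::Find;
use File::Spec;

# Hypothetical helper: list every directory under $src, including $src itself.
sub all_dirs {
    my ($src) = @_;
    my @dirs;
    find({
        no_chdir => 1,
        wanted   => sub { push @dirs, $File::Find::name if -d $File::Find::name },
    }, $src);
    return @dirs;
}

# Hypothetical helper: build one i_view32.exe argument list per directory.
sub irfanview_commands {
    my ($src, $dst) = @_;
    return map {
        my $rel = File::Spec->abs2rel($_, $src);
        my $out = $rel eq '.' ? $dst : File::Spec->catdir($dst, $rel);
        [ 'i_view32.exe',
          File::Spec->catfile($_, '*.png'),
          '/gray',
          '/convert=' . File::Spec->catfile($out, '*.png') ];
    } all_dirs($src);
}

# Print the commands instead of running them; run as: perl script.pl SRC DST
if (@ARGV == 2) {
    print join(' ', @$_), "\n" for irfanview_commands(@ARGV);
}
```

Printing the commands first makes it easy to check the paths before anything is converted.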

      I know Irfanview well, but have never tried it on the command line. Within the Windows UI, attempting to wildcard the folders chokes the app. I assume there's a hardcoded practical batch limit in there somewhere. Better to use something like Perl to discover the PNG files one by one, process each and move on to the next.
