Beefy Boxes and Bandwidth Generously Provided by pair Networks
The stupid question is the question not asked
 
PerlMonks  

Group file per time interval

by raiten (Acolyte)
on Feb 07, 2013 at 22:37 UTC ( #1017732=perlquestion: print w/ replies, xml ) Need Help??
raiten has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks,

I'm looking for a way to parse a directory and group files which have a create date (or exif create date) at less than n seconds interval between each other (consecutive files in time) I'm trying to preprocess files to generate a list for hugin panoramic generation.

date is in filename (yyyymmdd_hhMM--name) or in exif create date (exiftool). I can extract a primary list because I tagged my files with a keyword but it's not sufficient to get a good output by panostart. So in this list of filenames, I want to find a way to generate sublist of files which have a maximum time interval of n seconds between their datetime (typically 5-10 sec)

Any ideas how to do that ?

Thanks a lot. Cheers

Comment on Group file per time interval
Re: Group file per time interval
by roboticus (Canon) on Feb 08, 2013 at 00:57 UTC

    raiten:

    I'd convert the times to seconds, and then do something like this:

    $ cat t.pl #!/usr/bin/perl use strict; use warnings; use Data::Dumper; ### Get a list of picture times, in seconds my @times; push @times, int 150*rand for 0 .. 30; ### Sort them @times = sort { $a<=>$b } @times; ### Group together the pix whose time is less than MAX_TIME_DIFF secon +ds ### apart my $MAX_TIME_DIFF=10; # Minimum time between photos my $MIN_GRP_SIZE=3; # Minimum "interesting" group size my @groups; my $cur_group = [ shift @times ]; while (@times) { if ($$cur_group[-1]+$MAX_TIME_DIFF >= $times[0]) { # small interval, add to current group push @$cur_group, shift @times; } else { # store last group (if interesting) and start # a new one. push @groups, $cur_group if @$cur_group >= $MIN_GRP_SIZE; $cur_group = [ shift @times ]; } } print Dumper(\@groups); $ perl t.pl $VAR1 = [ [ 32, 36, 39, 39, 40, 42, 48, 53, 55, 56, 57, 58, 59 ], [ 72, 73, 77, 82 ], [ 93, 103, 104, 105, 108, 111, 113, 113, 114, 116 ] ];

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

Re: Group file per time interval
by Kenosis (Priest) on Feb 08, 2013 at 04:47 UTC

    This literally builds upon the excellent solution offered by roboticus. The next step is to get the EXIF date/time info from the image files, and convert that into seconds (the directory read is non-recursive):

    use strict; use warnings; use Image::ExifTool qw(:Public); use Date::Calc qw/Date_to_Time/; my $imageDir = './images'; my %timeFiles; ### Read each file's EXIF info and convert to seconds ### A Hash of Arrays (HoA) is used in case images have the same time for my $image (<$imageDir/*>) { # Skip image if no DateTimeOriginal info (highly unlikely) next unless my $dateTime = ImageInfo($image)->{DateTimeOriginal}; # Convert 'YYYY:MM:DD HH:MM:SS' to seconds my $time = Date_to_Time( split /[:\s]/, $dateTime ); push @{ $timeFiles{$time} }, $image; } ### <roboticus code> ### Sort the times from the hash my @times = sort { $a<=>$b } keys %timeFiles; ### Group together the pix whose time is less than MAX_TIME_DIFF secon +ds ### apart my $MAX_TIME_DIFF=10; # Minimum time between photos my $MIN_GRP_SIZE=3; # Minimum "interesting" group size my @groups; my $cur_group = [ shift @times ]; while (@times) { if ($$cur_group[-1]+$MAX_TIME_DIFF >= $times[0]) { # small interval, add to current group push @$cur_group, shift @times; } else { # store last group (if interesting) and start # a new one. push @groups, $cur_group if @$cur_group >= $MIN_GRP_SIZE; $cur_group = [ shift @times ]; } } ### </roboticus code> ### Get each group for my $group(@groups){ # The time in each group for my $time(@$group){ # The hash value is an array ref # $_ takes the value of each array element, viz., a file name print "$_: $time\n" for @{ $timeFiles{$time} }; } # Print blank lines between groups print "\n\n"; }

    Partial output:

    ./images/IMG_7139.CR2: 1336823075 ./images/IMG_7140.CR2: 1336823077 ./images/IMG_7141.CR2: 1336823079 ./images/IMG_7142.CR2: 1336823082 ./images/IMG_7143.CR2: 1336823084 ./images/IMG_7144.CR2: 1336823086 ./images/IMG_7145.CR2: 1336823088 ./images/IMG_7152.CR2: 1336823437 ./images/IMG_7153.CR2: 1336823442 ./images/IMG_7154.CR2: 1336823446 ./images/IMG_7155.CR2: 1336823451 ./images/IMG_7156.CR2: 1336823455 ./images/IMG_7157.CR2: 1336823467 ./images/IMG_7158.CR2: 1336823470 ./images/IMG_7159.CR2: 1336823473 ./images/IMG_7160.CR2: 1336823653 ./images/IMG_7161.CR2: 1336823657 ./images/IMG_7162.CR2: 1336823661 ./images/IMG_7163.CR2: 1336823664 ./images/IMG_7164.CR2: 1336823667

    I actually use Hugin (excellent pano app!) and this directory read shows names of images that Hugin eventually stitched into a wonderful pano.

      Kenosis:

      Nice! I was a bit surprised to see my little bit untouched though. I was expecting someone to have to expand the data structure into something like [ time, 'Filename' ] and tweak the sort and such. It hadn't occurred to me to stuff the filenames into a lookup table by time.

      ...roboticus

      When your only tool is a hammer, all problems look like your thumb.

      Hello,

      sorry for the long-delay answer. Pretty simple but powerful code and a lot of shortcuts I didn't know. I try to make it more advanced w exif analysis. It's still a work in progress.

      Any help appreciated :-)

      #!/usr/bin/perl -w ## $Id: group_panofiles.pl 1702 2013-02-28 18:34:47Z <ME> $ ## script to extract from a group of image list of images which cou +ld be pano ## ## https://groups.google.com/forum/?fromgroups=#!topic/hugin-ptx/hnPFG +y7EyQI ## http://perlmonks.org/?node_id=1017732 ## http://search.cpan.org/~exiftool/Image-ExifTool/ ## http://owl.phy.queensu.ca/~phil/exiftool/ExifTool.html ## need libimage-exiftool-perl libdate-calc-perl/libdatetime-format-st +rptime-perl ## FIXME! how to detect when taking 3 same images (eg: took 3 pictures + believing w bracketing but not): similarity coefficient > 95% ? ## http://www.perlmonks.org/?node_id=907337 ## http://www.simon-cozens.org/content/seeking-images-perl ## http://search.cpan.org/~xern/Image-Signature-0.01/Signature.pm ## ImageMagick compare: http://www.imagemagick.org/Usage/compare/#comp +are ## tag/rename dupe file before w 'findimagedupes' ## FIXME! many false positive if in rafale shooting (animal, road, ... +) ## test on 20130109, 20130209, 20130220 use strict; use warnings; use Image::ExifTool qw(:Public); use POSIX qw(strftime); use Getopt::Long; use Data::Dumper; ## Can either use one of those libraries (strptime seems a little fast +er) use Date::Calc qw/Date_to_Time/; #use DateTime::Format::Strptime; #my $analyzer = DateTime::Format::Strptime->new( pattern => '%Y:%m:%d +%H:%M:%S', on_error => 'croak' ) or die $DateTime::Format::Strptime: +:errmsg; use vars qw($opt_timediff $opt_groupsize $opt_tagsinclude $opt_tagsexc +lude $opt_extension $opt_autotagged $opt_recursive $opt_extended $opt +_ignorebracketing $opt_help $opt_verbose $opt_debug $opt_path); Getopt::Long::Configure("bundling"); GetOptions ("t=i" => \$opt_timediff, "timediff=i" => \$opt_timediff, "m=i" => \$opt_groupsize, "groupsize=i" => \$opt_groupsize, "p=s" => \$opt_path, "path=s" => \$opt_path, "I=s@" => \$opt_tagsinclude, "tagsinclude=s@" => \$opt_tagsinclud +e, "E=s@" => \$opt_tagsexclude, "tagsexclude=s@" => \$opt_tagsexclud +e, "r=s@" => \$opt_extension, "extension=s@" => \$opt_extension, "a=s@" => \$opt_autotagged, "autotagged=s@" => \$opt_autotagg +ed, "R" => \$opt_recursive, "recursive" => \$opt_recursive, "e" => \$opt_extended, "extended" => \$opt_extended, "i" => \$opt_ignorebracketing, "ignorebracketing" => \$opt_ig +norebracketing, "h" => \$opt_help, "help" => \$opt_help, "v" => \$opt_verbose, "verbose" => \$opt_verbose, "d" => \$opt_debug, "debug" => \$opt_debug, ); sub print_help() { print " This script allows you to parse a directory of pictures and try to gro +up files which could be used to form a panoramic picture (with Hugin for example). It is mainly using time difference between picture (normal m +ode) and exif metadata (extended check mode).\n\n"; print_usage(); } sub print_usage () { print "Required arguments: -p directory path to analyze Optional arguments: -t=n Maximum time difference allowed between pictures of a group [d +efault=10] -m=n Minimum group size [default=3] -R Analyze directory recursively [default=no] -e Extended check [default=no] -h Help -v verbose mode -d debug mode If extended check is enabled, -I=tag1,tag2 Include keyword tags: those picture are included (supe +rseded exclusion) -E=tag1,tag2 Exclude keyword tags: those picture are excluded -r=extension File type/extension to parse [default=jpg] -a=autotagged Files which are selected in panoramic group will be a +dded those tags -i Ignore bracketing images [default=select only pictures w exp +osure compensation 0>=n>-1]\n"; } ## FIXME! error 'Unable to read image data from file './20130209_11361 +1--DSC_0828.JPG': 'format 'jpeg' not supported - formats bmp, ico, pn +g, pnm, raw, sgi, tga available for reading - Can't locate Imager/Fil +e/JPEG.pm' at /usr/local/share/perl/5.14.2/Image/Compare.pm line 162. +' sub CompareImages0 { ## from http://www.perlmonks.org/?node_id=907337 ## http://search.cpan.org/~avif/Image-Compare-0.3/Compare.pm use Image::Compare; my $file1 = shift; #some jpegs, or png my $file2 = shift; my $filetype = lc($opt_extension); $filetype =~ s/jpg/jpeg/; my ( $cmp ) = Image::Compare->new(); $cmp->set_image1( img => $file1, type => $filetype ); $cmp->set_image2( img => $file2, type => $filetype ); $cmp->set_method( method => &Image::Compare::THRESHOLD, args => 25, ); #$cmp->set_method( # method => &Image::Compare::EXACT, # ); if ( $cmp->compare() ) { # The images are the same, within the threshold print "same\n"; return 1; } else { # The images differ beyond the threshold print "not same\n"; return 0; } } ## using ImageMagick compare sub CompareImages { my $file1 = shift; #some jpegs, or png my $file2 = shift; if (-f $file1 && -f $file2) { return `compare -metric AE -fuzz 10% $file1 $file2 null 2>&1`; } } my %timeFiles; my $count = 0; if (!$opt_timediff) { $opt_timediff=10; # Minimum time between photos } if (!$opt_groupsize) { $opt_groupsize=3; # Minimum "interesting" group size } if (!$opt_extended) { $opt_extended=0; } if (!$opt_extension) { $opt_extension='JPG'; } if (!$opt_ignorebracketing) { $opt_ignorebracketing=0; } if (!$opt_verbose) { $opt_verbose=0; } if (!$opt_debug) { $opt_debug=0; } if ( $opt_help || (!$opt_help && !$opt_path)) { print "FATAL! missing required arguments.\n"; print_help(); exit 1; } ## TEST #$opt_extended = 1; $opt_verbose = 1; $opt_ignorebracketing= 1; # exclude pictures w ExpComp not in 0<>-1 + to avoid conflict w bracketed images #$opt_path= '.'; ## TEST $opt_tagsinclude = ''; # include only pictures w those tags (supers +eded exclude) ## OK #$opt_tagsinclude = 'pano'; # include only pictures w those tags (s +uperseded exclude) ## NOK #$opt_tagsexclude = 'Cuisine'; # exclude pictures w those tags $opt_tagsexclude = 'plan-explications,Cuisine,ticket,Logement,flore'; + # exclude pictures w those tags $opt_autotagged = '_autotagged, _pano'; ## transform list into regexp, probably not good w very long list ... if ($opt_tagsinclude) { $opt_tagsinclude =~ s/([,\s*]+)/|/g; $opt_tagsinclude =~ s/^(.*)$/($1)/g; } if ($opt_tagsexclude) { $opt_tagsexclude =~ s/([,\s*]+)/|/g; $opt_tagsexclude =~ s/^(.*)$/($1)/g; } my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(t +ime); my ($start, $end); $start = time(); my $perflog = "Time: start at $hour:$min:$sec"; ### Read each file's EXIF info and convert to seconds ### A Hash of Arrays (HoA) is used in case images have the same time ## Note: in perl, exiftool relies on classical perl functions to parse + files/directories print "Building file list" if ($opt_verbose == 1); for my $image (<$opt_path/*$opt_extension>) { ## alternate file listing, faster? ## http://perlmeme.org/faqs/file_io/directory_listing.html ## http://stackoverflow.com/questions/5241692/perl-script-to-recurs +ively-list-all-filename-in-directroy #foreach my $image(glob("*$opt_extension)) { #File::Find->find({ \&process_file, no_chdir => 1 }, @dir); # Skip exiftool backup files next if ($image =~ m/_original$/i); # Skip image if no DateTimeOriginal info (highly unlikely) next unless my $dateTime = ImageInfo($image)->{DateTimeOriginal}; # Convert 'YYYY:MM:DD HH:MM:SS' to seconds my $time = Date_to_Time( split /[:\s]/, $dateTime ); ## another date conversion # my $dt = $analyzer->parse_datetime($dateTime); # my $time = $dt->epoch; if ($opt_extended == 0) { push @{ $timeFiles{$time} }, $image; } else { ## check $tags and IL constraint my ($cur_expcomp,$cur_keywords) = (0, ''); $cur_expcomp = ImageInfo($image)->{ExposureCompensation}; ## IL $cur_keywords = ImageInfo($image)->{Keywords}; if (!$cur_keywords) { $cur_keywords = ''; } ## how to check opt_tagsinclude/opt_tagsexclude in/out cur_keyword +s if ($cur_expcomp<=0 && $cur_expcomp>-1) { # print "DEBUG: current $cur_expcomp, $cur_keywords vs $opt_tag +sinclude / $opt_tagsexclude.\n" if ($opt_debug==1); if ($opt_tagsinclude) { if ($cur_keywords =~ m/$opt_tagsinclude/i) { print "DEBUG: include picture '$image' cf opt_tagsincl +ude '$opt_tagsinclude'\n" if ($opt_debug==1); push @{ $timeFiles{$time} }, $image; } else { print "DEBUG: exclude picture '$image' cf opt_tagsincl +ude\n" if ($opt_debug==1); } } elsif ($opt_tagsexclude) { if (!$cur_keywords || ($cur_keywords && $cur_keywords !~ m +/$opt_tagsexclude/i)) { print "DEBUG: include picture '$image' cf opt_tagsexcl +ude '$opt_tagsexclude'\n" if ($opt_debug==1); push @{ $timeFiles{$time} }, $image; } } else { print "DEBUG: include picture '$image' cf expcomp\n" if ($ +opt_debug==1); push @{ $timeFiles{$time} }, $image; } } else { print "DEBUG: excluding picture '$image' cf '$cur_expcomp', '$ +cur_keywords'\n" if ($opt_debug==1); } } print "." if ($opt_verbose == 1); $count++; } print " End.\n" if ($opt_verbose == 1); $perflog .= ", list files+".(time()-$start); ### <roboticus code> ### Sort the times from the hash my @times = sort { $a<=>$b } keys %timeFiles; print Dumper(\@times) if ($opt_debug==1); ### Group together the pix whose time is less than opt_timediff second +s ### apart my @groups; ## FIXME! losing first file (test 20130209 w or w/o extended check) => + why this shift ??? disabled #my $cur_group = [ shift @times ]; my $cur_group = [ @times ]; my (@cur_ref, @cur_file); sub set_cur_exif { my $image = shift; if ($image && -f $image) { my @cur = [ $image, ImageInfo($image)->{ExposureCompensation}, + ## IL ImageInfo($image)->{ExposureProgram}, ImageInfo($image)->{FocusMode}, ImageInfo($image)->{Lens}, ImageInfo($image)->{FocalLength}, ImageInfo($image)->{ISO}, ImageInfo($image)->{Orientation}, ImageInfo($image)->{Quality} ]; $cur[0][5] =~ s/ mm//; return @cur; } else { print "DEBUG: Warning! no image filename '$image'.\n"; } } sub log_exif { my @cur = shift; my $opt_debug = shift; my $txt = shift; my $extra_txt = shift; if (@cur) { print "DEBUG: setting $txt exif settings w '$cur[0][0]'.\n" i +f ($opt_debug==1); print " (".$cur_ref[0][0].', '.$cur_ref[0][1].', '.$cur_ref[0] +[2].', '. $cur_ref[0][3].', '.$cur_ref[0][4].', '.$cur_ref[0][5].', +'. $cur_ref[0][6].', '.$cur_ref[0][7].', '.$cur_ref[0][8].")$ +extra_txt.\n" if ($opt_debug==1); #print Dumper(\@cur_ref) if ($opt_debug==1); } else { print "DEBUG: Warning! no exif data.\n"; } } print "Group per time" if ($opt_verbose == 1); while (@times) { print "---> $times[0]\n" if ($opt_debug==1); if ($$cur_group[-1]+$opt_timediff >= $times[0]) { print "--> $times[0]\n" if ($opt_debug==1); if ($opt_extended == 0) { # small interval, add to current group, no extended check push @$cur_group, shift @times; } else { my $image = ${ $timeFiles{$times[0]} }[0]; print "-> $times[0], $image\n" if ($opt_debug==1); print "DEBUG: '$image' image file\n" if ($opt_debug==1); if (!@cur_ref) { @cur_ref = set_cur_exif($image); push @$cur_group, shift @times; log_exif(@cur_ref, $opt_debug, 'cur_ref', ''); } @cur_file = set_cur_exif($image); #log_exif(@cur_file, $opt_debug, 'cur_file', " & distance=".ab +s(1-$cur_ref[0][5]/$cur_file[0][5].' & similar='.CompareImages($cur_r +ef[0][0], $cur_file[0][0]))); log_exif(@cur_file, $opt_debug, 'cur_file', " & distance=".abs +(1-$cur_ref[0][5]/$cur_file[0][5])); if ( $cur_ref[0][0] eq $cur_file[0][0] ) { print "DEBUG: same ref & current_file. skipping.\n" if ($o +pt_debug==1); ## Note: we don't add to current group but no shifting els +e loose valid group. ##shift @times; } elsif ( ## meta should be same ## Note: NOK comparing full @ NOK => switch to separate # @cur_ref == @cur_file ## Note: ?OK comparing each value separately NOK (ex: 4, 6 +) ## FIXME! still some matching and things not equal. ex: 20 +130109_083736--DSC_0908.JPG/ExpProgram or FocalLength distance or Ori +entation $cur_ref[0][1] == $cur_file[0][1] && ## Exposure Compen +sation $cur_ref[0][2] eq $cur_file[0][2] && ## Exposure Progra +m $cur_ref[0][3] eq $cur_file[0][3] && ## FocusMode $cur_ref[0][4] eq $cur_file[0][4] && ## Lens abs(1-$cur_ref[0][5]/$cur_file[0][5]) < 0.11 && # same +focal length at 10% margin $cur_ref[0][7] eq $cur_file[0][7] && ## Orientation $cur_ref[0][8] eq $cur_file[0][8] ## Quality ## meta could be different: aperture, shutterspeed, ISO (auto +mode) ) { push @$cur_group, shift @times; print "DEBUG: image '$image' matchs same exif settings tha +n '$cur_ref[0][0]'.\n" if ($opt_debug==1); } else { print "DEBUG: image '$image' does not match same exif sett +ings than '$cur_ref[0][0]'.\n" if ($opt_debug==1); ## new group print "DEBUG: -> new group w '$image'.\n" if ($opt_debug== +1); push @groups, $cur_group if @$cur_group >= $opt_groups +ize; $cur_group = [ shift @times ]; if ($opt_extended == 1) { undef @cur_ref; } @cur_ref = set_cur_exif($image); } undef @cur_file; } } else { # store last group (if interesting) and start # a new one. print "DEBUG: -> save group, start a new one as time difference (" +.($$cur_group[-1]+$opt_timediff)." < $times[0]).\n" if ($opt_debug==1 +); push @groups, $cur_group if @$cur_group >= $opt_groupsize; $cur_group = [ shift @times ]; if ($opt_extended == 1) { undef @cur_ref; } } print "." if ($opt_verbose == 1); } print " End.\n" if ($opt_verbose == 1); $perflog .= ", grouping files+".(time()-$start); ### </roboticus code> ### Get each group my $gcount = 0; my ($filefirst,$filelast,$groupname); for my $group(@groups){ # print Dumper(\$group); $filefirst = ${ $timeFiles{ @{$group}[0] } }[0]; $filefirst =~ s/.$opt_extension$//i; $filefirst =~ s/^.*\///i; $filelast = ${ $timeFiles{ @{$group}[ (@$group-1) ] } }[0]; $filelast =~ s/.$opt_extension$//i; $filelast =~ s/^.*\///i; $groupname = "$filefirst--$filelast-pano.panolist"; print "FILENAME: $groupname, ".(@{$group})."\n" if ($opt_verbose== +1); open (FOUTPUT, ">$groupname") or die "ERROR: Can't open output fil +e '$!'\n"; # The time in each group for my $time(@$group){ # The hash value is an array ref # $_ takes the value of each array element, viz., a file name for my $image (@{ $timeFiles{$time} }) { my $exifdata = "Exif: ".ImageInfo($image)->{ExposureCompensati +on}.', '. ImageInfo($image)->{ExposureProgram}.', '. ImageInfo($image)->{FocusMode}.', '. (ImageInfo($image)->{Lens}?ImageInfo($image)->{Lens}:' +undef').', '. ## not always initialized? ImageInfo($image)->{FocalLength}.', '. ImageInfo($image)->{ISO}.', '. ImageInfo($image)->{Orientation}.', '. ImageInfo($image)->{Quality}; if (ImageInfo($image)->{Keywords}) { $exifdata .= ', '.ImageInfo($image)->{Keywords}; } print "$image: $time. $exifdata\n" if ($opt_verbose==1); ## FIXME/TEST! add autotagged pano to files consider as panora +mic source ## add a keyword without replacing existing keywords in the fi +le ## http://www.sno.phy.queensu.ca/~phil/exiftool/ExifTool.html# +SetNewValue #($success, $errStr) = $exifTool->SetNewValue(Keywords => $opt +_autotagged, AddValue => 1); } print FOUTPUT "$_\n" for @{ $timeFiles{$time} }; } # Print blank lines between groups print "\n\n" if ($opt_verbose==1); close(FOUTPUT); $gcount++; } $end = time(); $perflog .= ", output+".(time()-$start); print "Image count: $count, Group count: $gcount.\n" if ($opt_verbose= +=1); print "$perflog.\n" if ($opt_verbose==1);

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://1017732]
Approved by davido
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others surveying the Monastery: (7)
As of 2014-09-17 22:04 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    How do you remember the number of days in each month?











    Results (100 votes), past polls