http://www.perlmonks.org?node_id=1014365


in reply to Detect file sequences in File:Find results

What does the preprocess sub do? This is my solution without it:
#!/usr/bin/perl
use warnings;
use strict;

use File::Find;
use File::stat;

my $dir = shift;
my %result;

find({
    wanted => sub {
        my $file = $File::Find::name;
        # Plain, non-hidden files only ($_ is the basename here).
        if (-f && /^[^.]/) {
            # Split "prefix.NUMBER.suffix"; group numbers by prefix, suffix and owner.
            if (my ($pre, $num, $suff) = $file =~ /(.*)\.([0-9]+)\.(.*)/) {
                # find() has chdir'ed into the directory, so stat the basename.
                # Fields 0 and 6 are the owner's login name and GECOS field.
                my ($n, $fn) = (getpwuid(stat($_)->uid))[0, 6];
                push @{ $result{$pre}{$suff}{"$n:$fn"} }, $num;
            }
        }
    },
}, $dir);

for my $pre (keys %result) {
    for my $suff (keys %{ $result{$pre} }) {
        for my $user (keys %{ $result{$pre}{$suff} }) {
            my @nums = sort { $a <=> $b } @{ $result{$pre}{$suff}{$user} };
            my $first = shift @nums;
            my ($from, $to, @ranges) = ($first, $first);
            # Extend the current range while the numbers are consecutive.
            for (@nums) {
                if ($_ == $to + 1) {
                    $to = $_;
                } else {
                    push @ranges, [$from, $to];
                    ($from, $to) = ($_, $_);
                }
            }
            push @ranges, [$from, $to];
            for my $r (@ranges) {
                print "$pre."
                    . ($r->[0] == $r->[1] ? $r->[0] : "[$r->[0]-$r->[1]]")
                    . ".$suff:$user\n";
            }
        }
    }
}
لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

Re^2: Detect file sequences in File:Find results
by aes1972 (Initiate) on Jan 21, 2013 at 01:33 UTC
    --->What does the preprocess sub do?

    It sorts the directory results so that the unpadded sequences come back in the correct order. Without it, the sequence comes back looking like this:
    image_sequenceA.1.tif
    image_sequenceA.11.tif
    image_sequenceA.12.tif
    image_sequenceA.13.tif
    image_sequenceA.14.tif
    image_sequenceA.15.tif
    image_sequenceA.16.tif
    image_sequenceA.17.tif
    image_sequenceA.18.tif
    image_sequenceA.19.tif
    image_sequenceA.2.tif
    image_sequenceA.20.tif
    image_sequenceA.3.tif
    That would make the sequence detection incorrect. It also handles cases where files have the same name but different extensions, like this:

    image_sequenceA.1.tif
    image_sequenceA.2.tif
    image_sequenceA.3.tif
    image_sequenceA.1.exr
    image_sequenceA.2.exr
    image_sequenceA.3.exr
    So the preprocess sub is necessary in my case, unless I can do the same sort somewhere else.
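The preprocess sub under discussion is not shown in this part of the thread, so the following is only a sketch of what such a callback might look like, not the original code. It assumes names of the form prefix.NUMBER.ext; the helpers `seq_parts` and `by_sequence` are hypothetical names, and grouping by extension before prefix is just one possible choice. It builds a throwaway directory so it can be run as-is.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical helper: split "prefix.NUMBER.ext" into its three parts.
# Names that do not match sort before numbered ones (number -1, empty ext).
sub seq_parts {
    my ($name) = @_;
    return $name =~ /^(.*)\.([0-9]+)\.([^.]+)$/ ? ($1, $2, $3) : ($name, -1, '');
}

# Hypothetical comparator: extension, then prefix, then number (numerically).
sub by_sequence {
    my @x = seq_parts($a);
    my @y = seq_parts($b);
    return $x[2] cmp $y[2] || $x[0] cmp $y[0] || $x[1] <=> $y[1];
}

# Throwaway directory holding unpadded frames of two sequences.
my $dir = tempdir(CLEANUP => 1);
for my $name (map { ("image_sequenceA.$_.tif", "image_sequenceA.$_.exr") } 1 .. 12) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $name: $!";
    close $fh;
}

my @seen;
find({
    # preprocess receives the names in one directory and returns them reordered.
    preprocess => sub { sort by_sequence @_ },
    wanted     => sub { push @seen, $_ if -f },
}, $dir);

print "$_\n" for @seen;
```

With this ordering each extension's frames arrive together and in numeric order, so frame 2 comes before frame 11 rather than after frame 19.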
      so that the unpadded sequences come back in the correct order

      You may have a reason for the order you chose, but I would have thought it made more sense to order things so that the number part of the filename is sorted numerically, not lexically. That way you get two runs rather than one run and four singletons from the example you posted.

      image_sequenceA.1.tif
      image_sequenceA.2.tif
      image_sequenceA.3.tif
      image_sequenceA.11.tif
      image_sequenceA.12.tif
      image_sequenceA.13.tif
      image_sequenceA.14.tif
      image_sequenceA.15.tif
      image_sequenceA.16.tif
      image_sequenceA.17.tif
      image_sequenceA.18.tif
      image_sequenceA.19.tif
      image_sequenceA.20.tif
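To make the "two runs" point concrete, here is a small sketch that sorts the frame numbers numerically and then collapses consecutive numbers into ranges. The `runs` function is a hypothetical helper written for this illustration, and the frame list is simply the thirteen numbers from the example in their lexical order.

```perl
use strict;
use warnings;

# Hypothetical helper: sort integers numerically and collapse them
# into a list of [from, to] runs of consecutive values.
sub runs {
    my @nums = sort { $a <=> $b } @_;
    return () unless @nums;
    my @ranges = ([ $nums[0], $nums[0] ]);
    for my $n (@nums[1 .. $#nums]) {
        if ($n == $ranges[-1][1] + 1) {
            $ranges[-1][1] = $n;          # extend the current run
        } else {
            push @ranges, [ $n, $n ];     # start a new run
        }
    }
    return @ranges;
}

# Frame numbers from the example, in the lexical order shown earlier.
my @frames = (1, 11 .. 19, 2, 20, 3);

# Render a single frame as "N" and a longer run as "[from-to]".
my @labels = map {
    $_->[0] == $_->[1] ? $_->[0] : sprintf('[%d-%d]', @$_)
} runs(@frames);

print "image_sequenceA.$_.tif\n" for @labels;
```

Because the numeric sort happens inside `runs`, the order in which the names were originally read no longer matters for detecting the runs.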

      I hope this is of interest.

      Cheers,

      JohnGG