
Build a list of files that are NOT in a list

by amelinda (Friar)
on Jul 03, 2002 at 19:11 UTC (#179270)

amelinda has asked for the wisdom of the Perl Monks concerning the following question:

So, I have a big directory full of images. The naming pattern is <Number>.<Counter>.jpg (where counter is 000, 001, 002...). I have a list of numbers which match up with Numbers that I want to keep (so for a given Number in the list, keep Number.000.jpg, Number.001.jpg, etc).

As far as I can tell, the following code should work, but the numbers it reports don't match up. Any idea what's going on?

The current output is:

33600 images in photos (8321 unique). 6571 numbers in numbers_only.txt. 5999 numbers slated for deletion.
    #!/usr/bin/perl -w
    use strict;

    my $num_file = "numbers_only.txt";
    my %currentlist;
    my $i = 0;
    open(CURRFILE, "<$num_file") or die "couldn't open the num file: $!\n";
    while (<CURRFILE>) {
        chomp;
        $currentlist{$_}++;
        $i++;
    }
    close CURRFILE;

    my $dir = "images/photos";
    opendir FILEDIR, $dir or die "Couldn't open $dir: $!\n";
    my @filelist = grep !/^\.\.?$/, readdir FILEDIR;
    closedir FILEDIR;

    my %uniquelist;
    my %deletelist;
    my $h = 0;
    foreach my $pic (@filelist) {
        next unless ($pic =~ /jpg$/);
        $pic =~ s/^(\d+).*/$1/;
        $h++;
        $uniquelist{$pic}++;
        next if ($currentlist{$pic});
        $deletelist{$pic}++;
    }

    my $output = "delete_list.txt";
    open OUT, ">$output" or die "Couldn't open output: $!\n";
    my $j = 0;
    foreach my $del (sort keys %deletelist) {
        print OUT "$del\n";
        $j++;
    }
    close OUT;

    my $k = 0 + keys %uniquelist;
    print "$h images in photos ($k unique). $i numbers in numbers_only.txt. $j numbers slated for deletion.\n";
    exit;

Replies are listed 'Best First'.
Re: Build a list of files that are NOT in a list
by dws (Chancellor) on Jul 03, 2002 at 19:22 UTC
    In a big directory, it's easy to miss that a malformed filename has slipped in. Were this my code, I'd be more defensive about
        $pic =~ s/^(\d+).*/$1/;
        $h++;
        $uniquelist{$pic}++;
        next if ($currentlist{$pic});
        $deletelist{$pic}++;
    and would test whether the regexp actually matched before doing the bookkeeping.
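
    One way the defensive check described above might look, as a minimal sketch. The helper name number_of, the @skipped array, and the sample data are assumptions made for illustration; none of them come from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: return the leading Number of a name such as
# "1234.000.jpg", or undef when the name does not fit the pattern.
sub number_of {
    my ($name) = @_;
    return $name =~ /^(\d+)\.\d+\.jpg$/ ? $1 : undef;
}

# Sample data standing in for numbers_only.txt and the directory listing.
my %currentlist = (1234 => 1);
my @filelist    = qw(1234.000.jpg 1234.001.jpg 5678.000.jpg notes.jpg);

my (%uniquelist, %deletelist, @skipped);
my $h = 0;
foreach my $pic (@filelist) {
    my $num = number_of($pic);
    unless (defined $num) {
        # The regexp did not match: record the name and skip the bookkeeping.
        push @skipped, $pic;
        next;
    }
    $h++;
    $uniquelist{$num}++;
    $deletelist{$num}++ unless $currentlist{$num};
}

print "$h images, ", scalar(keys %uniquelist), " unique, ",
      scalar(keys %deletelist), " to delete, ",
      scalar(@skipped), " skipped\n";
```

    Because the counters are only touched after a successful match, a stray file such as notes.jpg ends up in @skipped instead of silently inflating the totals.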

Re: Build a list of files that are NOT in a list
by stajich (Chaplain) on Jul 03, 2002 at 20:00 UTC
    Are you sure you want to do
    If I understand your naming scheme correctly, you expect NNNN.NNNN.jpg.

    I would write your first foreach loop slightly more carefully; note the \. in the patterns.

        my %uniquelist;
        my %deletelist;
        my $h = 0;
        foreach my $pic (@filelist) {
            next unless ($pic =~ /\.jpg$/);
            if( $pic =~ /^(\d+)\.(\d+)/) {
                my ($num,$counter) = ( $1,$2);
                $h++;
                $uniquelist{$num}++;
                next if ($currentlist{$num});
                $deletelist{$pic}++;
            } else {
                print STDERR "saw a photo file ($pic) that did not match the expected naming pattern!\n";
            }
        }
    A code suggestion: you might want to replace my $k = 0 + keys %uniquelist; with my $k = scalar keys %uniquelist;, as some may feel this is more explicit.
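
    To illustrate that the two spellings give the same count, here is a small sketch. The hash contents are made-up sample data, and the variable names $k_numeric, $k_scalar and $k_assign are only for this example.

```perl
use strict;
use warnings;

# Sample data, not taken from the thread.
my %uniquelist = (1234 => 2, 5678 => 1);

my $k_numeric = 0 + keys %uniquelist;     # numeric context forces a key count
my $k_scalar  = scalar keys %uniquelist;  # explicit scalar context
my $k_assign  = keys %uniquelist;         # plain scalar assignment also counts

print "$k_numeric $k_scalar $k_assign\n";
```

    All three forms evaluate keys in scalar context, so the choice between them is a matter of readability.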

Node Type: perlquestion [id://179270]
Approved by VSarkiss