Hi again,
When I announced in my reply to Anonymous Monk that the solution to all my problems had been found, I was too hasty.
The script worked well in the setting I had built to test it. When I ran it under real conditions, I was surprised that many files had been copied whose filenames did not include the IDs listed in the text file. At least that was what I thought at first glance. Then I realized that they did: the operation "$_ =~ /^\Q$file/" on the ID "I C 17" finds not only "I C 17.jpg" and "I C 17 -A.jpg" but also "I C 170.jpg" and "I C 1778 - A.jpg", etc. I call this "greedy" here, although strictly speaking greediness describes quantifiers; what happens is that the pattern is anchored only at the start, so it matches every filename that begins with the ID.
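To make the over-matching concrete, here is a minimal sketch using only the filenames mentioned above. The negative lookahead `(?!\d)` is one possible way to stop the ID from running into a longer number; it is an illustration, not what the main script currently does.

```perl
use strict;
use warnings;

# Sample data taken from the filenames mentioned above.
my @files = ("I C 17.jpg", "I C 17 -A.jpg", "I C 170.jpg", "I C 1778 - A.jpg");
my $id    = "I C 17";

# Prefix match: anchored at the start only, so longer numbers match too.
my @prefix = grep { /^\Q$id\E/ } @files;

# Same match, but the ID must not be followed directly by another digit.
my @strict = grep { /^\Q$id\E(?!\d)/ } @files;

print "prefix: ", join(", ", @prefix), "\n";
print "strict: ", join(", ", @strict), "\n";
```

The first list contains all four files, the second only the two that really belong to "I C 17".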
Now here is my next step: a snippet that deals exclusively with the "greediness" of the regex:
# Script tests matching of IDs against a list of filenames.
# Should find matches without being too greedy.
# For testing, input is given as two arrays defined within the script.
# The real input will be a list of IDs in a text file and a scan of a
# directory containing the image files.
use strict;
use warnings;

my @dir = ("I C 17.jpg", "I C 17 a.jpg", "I C 17 a,b -A x.jpg",
           "I C 170.jpg", "I C 171 a,b -A x.jpg", "I C 171 a,b -B x.jpg");
my @ids = ("I C 17", "I C 171");

foreach my $id (@ids) {
    # \Q...\E quotes any metacharacters in the ID. After the ID only
    # non-digits may follow, then a literal ".jpg" at the end of the name.
    # (Inside a double-quoted string "\." would collapse to a plain ".",
    # which matches any character, so the pattern is built with qr//.)
    my $re = qr/^\Q$id\E[^0-9]*\.jpg$/;
    foreach my $file (@dir) {
        if ($file =~ $re) {
            print "Found file: $file\n";
        }
    }
}
All I have to do now is implement this in the main script.
I hope that, once this is done, the routine for importing files will work
better (annoying, this dull play on words, isn't it?).
Update: I implemented this not-too-greedy matching in the main script and it works better than before. But it is getting even more complicated, because IDs named "I C 17 <1>" refer to image files named "I C 17 _1_ -A.jpg". So I have to replace the angle brackets with underscores before matching.
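A rough sketch of how that replacement could look, under the assumption that every "<" and ">" in an ID simply becomes "_" in the filename. The helper name id_to_regex is made up for this example, and the second filename in the loop is invented to show a near miss.

```perl
use strict;
use warnings;

# Turn an ID into an anchored filename pattern.
# Assumption: "<" and ">" in an ID both correspond to "_" in the filename.
sub id_to_regex {
    my ($id) = @_;
    (my $name = $id) =~ tr/<>/__/;    # "I C 17 <1>" becomes "I C 17 _1_"
    # After the ID only non-digits may follow, then a literal ".jpg".
    return qr/^\Q$name\E[^0-9]*\.jpg$/;
}

my $re = id_to_regex("I C 17 <1>");
for my $file ("I C 17 _1_ -A.jpg", "I C 17 _10_ -A.jpg", "I C 17.jpg") {
    print "Found file: $file\n" if $file =~ $re;
}
```

One consequence to keep in mind: because only non-digits are allowed after the ID, the plain ID "I C 17" will not pick up "I C 17 _1_ -A.jpg". Whether that is wanted depends on how the IDs in the text file are meant to relate to each other.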