First, thank you all for your suggestions. The problem has been one of algorithm: I am iterating @select files from an @allfile loop and parsing for equality conditions.
As the actual code is over 300 lines, I have included an edited snippet. This code is derived from an earlier script where I needed to parse for regexes in files, not necessarily exact matches, and not necessarily at the beginning. Parsing @selectexpr against @allfiles made sense there.
Hashes are an excellent idea. With them I can parse foo-bar-baz.doc as a hash directly against all foo keys, with proper splitting and filtering, of course. This would allow me to scale up more efficiently.
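A minimal sketch of the hash idea, assuming duplicates are files whose names are identical once the extension is stripped and case is ignored. The file names and the normalize helper are hypothetical stand-ins for the real directory listing and for extfilter.

```perl
use strict;
use warnings;

# Hypothetical sample listing; the real script reads these with readdir.
my @files = ('Foo-Bar-Baz.epub', 'foo-bar-baz.mobi', 'Other-Title.mobi');

# Strip the extension, trim whitespace, and lowercase (same idea as extfilter).
sub normalize {
    my ($name) = @_;
    $name =~ s/\.(?:mobi|epub|azw3)$//i;
    $name =~ s/^\s+|\s+$//g;
    return lc $name;
}

# Map each normalized name to the list of files that share it.
my %by_name;
push @{ $by_name{ normalize($_) } }, $_ for @files;

# Any key holding more than one file is a duplicate candidate.
my @dups = grep { scalar(@{ $by_name{$_} }) > 1 } sort keys %by_name;
print "DUPLICATE FOUND: @{ $by_name{$_} }\n" for @dups;
```

Each file is visited once, so the nested loop over both arrays goes away. The trade-off is that a hash lookup is an exact match on the key, not a regex match.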
I would, however, prefer, if possible, to keep the matching to a regexp rather than an equality operator.
Please ignore syntax errors in the code below; it has been abbreviated.
if ($mymobi =~ m/($myepub)/) {
    print "DUPLICATE FOUND !\n";
    &movetodir($myfilt, $dupdir);
}
# Does NOT work
if ($mymobi eq $myepub) {
    print "DUPLICATE FOUND !\n";
    &movetodir($myfilt, $dupdir);
}
# Works
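One possible reason the regex version fails while eq succeeds, assuming some file names contain regex metacharacters such as parentheses, brackets, or plus signs: an interpolated $myepub is compiled as a pattern, not as literal text. The sketch below uses hypothetical names to show the difference that \Q...\E (quotemeta) makes.

```perl
use strict;
use warnings;

# Hypothetical normalized names; the parentheses are regex metacharacters.
my $myepub = 'some author - some title (vol. 1)';
my $mymobi = 'some author - some title (vol. 1)';

# Unquoted: "(vol. 1)" is compiled as a capture group, so the literal
# parentheses in $mymobi are never matched.
my $raw = ($mymobi =~ m/($myepub)/) ? 1 : 0;

# Quoted: \Q...\E escapes the metacharacters so $myepub matches literally.
my $quoted = ($mymobi =~ m/(\Q$myepub\E)/) ? 1 : 0;

print "raw=$raw quoted=$quoted\n";
```

Anchoring the quoted pattern as m/^\Q$myepub\E$/ would behave like eq while still leaving room to loosen the match later.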
For an author-title pair, the matching would be done in the title (value) portion rather than the key, which would be expected to be identical (though there might be exceptions).
I need to hit the books on hashes here, as I haven't really dealt much with them outside of a 20,000+ listing database with about two dozen hash fields.
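A rough sketch of the author-title layout described above, assuming file names follow an "Author - Title.ext" pattern (that layout and the sample names are assumptions, not taken from the real listing). The author is the hash key, the titles are stored per extension as the value, and the title comparison stays a regex.

```perl
use strict;
use warnings;

# Hypothetical names, assuming an "Author - Title.ext" layout.
my @files = (
    'Jane Doe - First Book.epub',
    'Jane Doe - First Book (Revised).mobi',
    'John Roe - Other Book.mobi',
);

# Key on the lowercased author; keep titles per extension as the value.
my %lib;
for my $file (@files) {
    my ($author, $title, $ext) = $file =~ /^\s*(.+?)\s+-\s+(.+?)\.(\w+)\s*$/
        or next;    # skip names that do not fit the assumed layout
    push @{ $lib{ lc $author }{ lc $ext } }, $title;
}

# Compare titles only within the same author, using a regex rather than eq.
my @dups;
for my $author (sort keys %lib) {
    for my $epub (@{ $lib{$author}{epub} || [] }) {
        push @dups, grep { /\Q$epub\E/i } @{ $lib{$author}{mobi} || [] };
    }
}
print "DUPLICATE FOUND: $_\n" for @dups;
```

Because titles are only compared within one author, the inner loop stays small even when the full listing is large.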
opendir(DIR, $dir2) or die "Cannot open $dir2: $!";
while ($file = readdir(DIR)) {
    # readdir returns bare names, so test them with the directory prefix
    if (-f "$dir2/$file") {    # read only files
        chomp($file);
        $file =~ s/^\s+|\s+$//g;
        $filenam = "";
        push(@srcarray, $file);
        if ($file =~ m/\.mobi$/i) {
            &typefiles($file, "mobifile");
        }
        if ($file =~ m/\.azw3$/i) {
            &typefiles($file, "azw3file");
        }
    }
}
closedir(DIR);
sub typefiles {
    my ($tfile, $filetype) = @_;
    if ($filetype eq "mobifile") {
        push(@mobiarray, $tfile);    # use the argument, not the global $file
    }    # End mobifiles
}
# Main body - parsing directory listing and performing actions
foreach $authf (@srcarray) {
    if ($authf =~ m/\.pl$/) {
        next;
    }
    if ($authf =~ m/\.epub/i) {
        our $authf2 = $authf;
        foreach my $myfilt (@mobiarray) {
            my $mymobi = $myfilt;
            my $myepub = $authf2;
            $mymobi = &extfilter($mymobi);
            $myepub = &extfilter($myepub);
            # ... duplicate test on $mymobi and $myepub goes here (abbreviated)
        }
    }
}
sub extfilter {
    my ($line) = @_;
    $line =~ s/\.(?:mobi|epub)$//i;    # strip the extension at the end of the name only
    $line =~ s/^\s+|\s+$//g;           # trim leading and trailing whitespace
    $line = lc $line;
    return $line;
}