PerlMonks  

Re^5: ignore list of files using readdir function

by hdb (Prior)
on Jul 23, 2013 at 09:07 UTC (#1045802)


in reply to Re^4: ignore list of files using readdir function
in thread ignore list of files using readdir function

Your code can be optimized a lot. Here is a proposal, though I cannot test it as I do not have your directories at hand. I added comments to explain what I am doing, so I hope it helps:

    my @infiles = &GetINDirFiles($inpath);

    # instead of having an array with the outfiles use a hash for faster lookup
    # also remove suffix at this stage already, no need to do it again and again in the loop
    # you need to escape your suffix variable in \Q...\E for special characters such as the dot
    # only remove the suffix at the end, no need for (.*)
    my %outfiles = map { s/\Q$outsuffix\E$//; $_ => 1 } &GetOUTDirFiles($outpath);

    my $index = 0; # index used to get string position in array
    foreach my $infile (@infiles) {
        # see above re the replacement
        $infile =~ s/\Q$insuffix\E$//; # remove suffix to do comparison # Added by me
        # instead of loop through array of outfiles do hash lookup
        push (@delindex, $index) if exists $outfiles{$infile};
        $index += 1;
    }
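To illustrate the point about \Q...\E, here is a minimal sketch with made-up file names (they are assumptions for the demo, not taken from your directories). Without the quoting, the dot in the suffix acts as a regex wildcard and can match names you did not intend:

```perl
use strict;
use warnings;

# Hypothetical suffix and file names, for illustration only.
my $suffix = ".xml";
my @names  = ("report.xml", "reportaxml");

# Unquoted: the dot is a regex metacharacter and matches any character,
# so "reportaxml" is accepted as well.
my @loose = grep { /$suffix$/ } @names;

# Quoted with \Q...\E: the dot only matches a literal dot.
my @strict = grep { /\Q$suffix\E$/ } @names;

print "unquoted: @loose\n";
print "quoted:   @strict\n";
```

With these two names the unquoted pattern keeps both entries, while the quoted one keeps only report.xml.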

UPDATE: Forget my code above. You can write this as:

    my %outfiles = map { /(.*)\Q$outsuffix\E$/; $1 => 1 } &GetOUTDirFiles($outpath);
    my @infiles = grep { /(.*)\Q$insuffix\E$/; not exists $outfiles{$1} } &GetINDirFiles($inpath);
    print "@infiles\n";

and it should be fast.
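For trying the hash-lookup idea without touching any directories, here is a small sketch that uses in-memory lists. The lists @in_list and @out_list and the name @pending are hypothetical stand-ins for what GetINDirFiles and GetOUTDirFiles would return. This variant tests the result of the match directly, so it does not depend on what $1 holds after a failed match:

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the two directory listings.
my @in_list  = ("a.xml", "b.xml", "c.xml", "notes.txt");
my @out_list = ("a_ACK.xml", "c_ACK.xml");

my $insuffix  = ".xml";
my $outsuffix = "_ACK.xml";

# Build a lookup hash keyed by the base name of each output file.
my %outfiles;
for my $name (@out_list) {
    $outfiles{$1} = 1 if $name =~ /^(.*)\Q$outsuffix\E$/;
}

# Keep input files that carry the suffix and have no matching output file.
# A name that does not match the suffix is dropped by the short-circuit.
my @pending = grep {
    /^(.*)\Q$insuffix\E$/ and not exists $outfiles{$1}
} @in_list;

print "@pending\n";   # prints "b.xml"
```

Each hash lookup takes roughly constant time, so the work grows with the number of input files plus the number of output files, instead of with their product as in a nested loop.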

UPDATE 2: Here is the full story.

    use strict;
    use warnings;

    sub GetINDirFiles {
        my ($path) = @_;
        opendir my $dir, $path or die $!;
        return grep {!/\_ACK_/} readdir $dir;
    }

    sub GetOUTDirFiles {
        my ($path) = @_;
        opendir my $dir, $path or die $!;
        # dot escaped so that it only matches a literal dot
        return grep {/\_ACK\.xml$/} readdir $dir;
    }

    # Main
    my $inpath = "./IN";
    my $outpath = "./OUT";
    my $outsuffix = "_ACK.xml";
    my $insuffix = ".xml";
    my $timethreshold = 900; # set time threshold in seconds (900 seconds equal 15 minutes)

    my %outfiles = map { /(.*)\Q$outsuffix\E$/; $1 => 1 } &GetOUTDirFiles($outpath);
    my @infiles = grep { /(.*)\Q$insuffix\E$/; $1 and not exists $outfiles{$1} } &GetINDirFiles($inpath);

    my $currenttime = time; # get current time from system (epoch time)
    # (stat)[9] is the mtime, i.e. the time of last modification
    @infiles = grep { -f "$inpath/$_" and ( $currenttime - (stat "$inpath/$_" )[9] ) > $timethreshold } @infiles;

    # now you have all input files w/o corresponding output file that are older than 15 minutes
    for (@infiles) {
        print "File $_ in $inpath directory was last modified ". ( $currenttime - (stat "$inpath/$_" )[9] )/60.0 ." minutes ago.\n";
        # put your action here
    }
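If you want to check the age filter on its own, the following sketch creates two throwaway files in a temporary directory and backdates them with utime. The file names, the ages and the variable @stale are assumptions made up for the demo; only the stat-based comparison mirrors the code above.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Temporary directory that is removed again when the script exits.
my $dir       = tempdir(CLEANUP => 1);
my $threshold = 900;    # seconds, as in the script above
my $now       = time;

# Hypothetical files and their ages in seconds.
my %age = ("old.xml" => 3600, "new.xml" => 60);

for my $name (keys %age) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $dir/$name: $!";
    close $fh;
    # Backdate access and modification time to simulate an older file.
    my $mtime = $now - $age{$name};
    utime $mtime, $mtime, "$dir/$name" or die "utime failed: $!";
}

# Same test as in the main script: plain file and mtime older than threshold.
my @stale = grep {
    -f "$dir/$_" and ( $now - (stat "$dir/$_")[9] ) > $threshold
} sort keys %age;

print "@stale\n";   # prints "old.xml"
```

Only old.xml passes, since its modification time lies an hour in the past while new.xml is just one minute old.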

