Beefy Boxes and Bandwidth Generously Provided by pair Networks
The stupid question is the question not asked
 
PerlMonks  

Re^5: ignore list of files using readdir function

by hdb (Prior)
on Jul 23, 2013 at 09:07 UTC ( #1045802=note: print w/ replies, xml ) Need Help??


in reply to Re^4: ignore list of files using readdir function
in thread ignore list of files using readdir function

Your code can be optimized a lot. Here is some proposal but I cannot test it as I do not have your directories at hand. I added comments to explain what I am doing so I hope it helps:

my @infiles = &GetINDirFiles($inpath); # instead of having an array with the outfiles use a hash for faster l +ookup # also remove suffix at this stage already, no need to do it again and + again in the loop # you need to escape your suffix variable in \Q...\E for special chara +cters such as the dot # only remove the suffix at the end, no need for (.*) my %outfiles = map { s/\Q$outsuffix\E$//; $_ => 1 } &GetOUTDirFiles($o +utpath); my $index = 0; # index used to get string position in array foreach my $infile (@infiles) { # see above re the replacement $infile =~ s/\Q$insuffix\E$//; # remove suffix to do comp +aration # Added by me # instead of loop through array of outfiles do hash lookup push (@delindex, $index) if exists $outfiles{$infile}; $index += 1; }

UPDATE: Forget my code above. You can write this as:

my %outfiles = map { /(.*)\Q$outsuffix\E$/; $1 => 1 } &GetOUTDirFiles( +$outpath); my @infiles = grep { /(.*)\Q$insuffix\E$/; not exists $outfiles{$1} } +&GetINDirFiles($inpath); print "@infiles\n";

and it should be fast.

UPDATE 2: Here is the full story.

use strict; use warnings; sub GetINDirFiles { my ($path) = @_; opendir my $dir, $path or die $!; return grep {!/\_ACK_/} readdir $dir; } sub GetOUTDirFiles { my ($path) = @_; opendir my $dir, $path or die $!; return grep {/\_ACK.xml$/} readdir $dir; } # Main my $inpath = "./IN"; my $outpath = "./OUT"; my $outsuffix = "_ACK.xml"; my $insuffix = ".xml"; my $timethreshold = 900; # set time threshold in seconds (900 se +conds equal 15 minutes) my %outfiles = map { /(.*)\Q$outsuffix\E$/; $1 => 1 } &GetOUTDirFiles( +$outpath); my @infiles = grep { /(.*)\Q$insuffix\E$/; $1 and not exists $outfiles +{$1} } &GetINDirFiles($inpath); my $currenttime = time; # get current time from system (epoch t +ime) @infiles = grep { -f "$inpath/$_" and ( $currenttime - (stat "$inpath/ +$_" )[9] ) > $timethreshold } @infiles; # now you have all input files w/o corresponding output file that are +older than 15 minutes for (@infiles) { print "File $_ in $inpath directory was created ". ( $currentt +ime - (stat "$inpath/$_" )[9] )/60.0 ."minutes ago.\n"; # put your action here }


Comment on Re^5: ignore list of files using readdir function
Select or Download Code

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1045802]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others romping around the Monastery: (11)
As of 2014-12-28 02:48 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Is guessing a good strategy for surviving in the IT business?





    Results (178 votes), past polls