I am looking for either suggestions for improvement, or ways to use existing modules like File::Find::*.
I wrote a utility to search through file archives organized according to a direction/date/topic structure. I usually know which direction and topic to search, but the transaction may have been archived on a range of days. I wrote my own very limited File::Find (code below) in order to implement this search.
For instance, say I am looking for a transaction we sent containing the string "12345678". I know it's for CustomerD, and I'm pretty sure we sent it this week, so it could be in any of:
outbound/20151027/CustomerD
outbound/20151026/CustomerD
...
outbound/20151021/CustomerD
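To make the layout concrete, here is a minimal sketch of how that list of candidate directories could be generated with core modules only. The helper name candidate_dirs and its argument order are hypothetical, chosen for illustration; my actual script (below) uses Date::Calc instead.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use Time::Local qw(timelocal);

# Hypothetical helper: list the archive directories for one direction and
# topic, from the given date back $days_back days (newest first).
sub candidate_dirs {
    my ($direction, $topic, $days_back, $year, $month, $day) = @_;
    # Anchor at local noon so a DST shift cannot push us across a date boundary.
    my $noon = timelocal(0, 0, 12, $day, $month - 1, $year);
    return map {
        my $stamp = strftime('%Y%m%d', localtime($noon - $_ * 86400));
        "$direction/$stamp/$topic";
    } 0 .. $days_back;
}

print "$_\n" for candidate_dirs('outbound', 'CustomerD', 6, 2015, 10, 27);
```

Called with a start date of 2015-10-27 and 6 days back, this prints the seven paths from outbound/20151027/CustomerD down to outbound/20151021/CustomerD.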
outbound/2015nnnn/ will have many subdirectories, and some of them will have hundreds of files. As a result, if I can't supply the topic, I run the search in the background and work on something else. But if I can, the response is quick enough.
So why explore modules if I have a working solution? Learning what's in CPAN, and how to better use it, is to my benefit.
Source code:
#!/home/edi/perl/perl
use strict;
use warnings;
use Getopt::Std;
use Date::Calc qw/Today Add_Delta_Days/;

our ($opt_i, $opt_o, $opt_r, $opt_s, $opt_d, $opt_b);
getopts('ior:s:d:b:');

die "Usage: search_si_archive.pl [-[io]] (-s searchstring | -r regex) [-d daysback] [-b bpname]\n"
    unless (defined($opt_s) || defined($opt_r));

# Inbound is the default direction; -o switches to outbound.
my $mode = $opt_o ? 'outbound' : 'inbound';
# A literal string (-s) takes precedence over a regex (-r) if both are given.
my $search_regex = defined($opt_s) ? qr/\Q$opt_s\E/ : qr/$opt_r/;
my $days = defined($opt_d) ? $opt_d : 7;
# Without -b, every business process directory matches.
my $business_process = defined($opt_b) ? '*' . $opt_b . '*' : '*';

my ($year, $month, $day) = Today();

# For each day, from $days days ago up to today
for my $offset (reverse 0 .. $days) {
    my ($y, $m, $d) = Add_Delta_Days($year, $month, $day, -$offset);
    my $datestring = sprintf("%d%02d%02d", $y, $m, $d);
    my $pattern = sprintf("/edi_store/archive/%s/%s/%s", $mode, $datestring, $business_process);
    foreach my $dir (grep { -d } glob($pattern)) {
        my $dh;
        unless (opendir $dh, $dir) {
            warn "Cannot open directory $dir: $!\n";
            next;
        }
        # Search plain files only; subdirectories are not descended into.
        search_file($dir, $_) for grep { -f "$dir/$_" } readdir $dh;
        closedir $dh;
    }
}

# Print the file's path if any of its lines matches the search pattern.
sub search_file {
    my $fname = sprintf("%s/%s", @_);
    my $fh;
    unless (open $fh, '<', $fname) {
        warn "Cannot open $fname: $!\n";
        return;
    }
    while (<$fh>) {
        if (m/$search_regex/) {
            print "$fname\n";
            last;
        }
    }
    close($fh);
}
__END__

=pod

=head1 Search SI Archive

Search through SI archive directories for a string or regex, restricted by age and/or BP.

=head1 USAGE

    search_si_archive.pl [-i|-o] (-s STRING | -r REGEX) [-d DAYS] [-b BPNAME]

=over

=item -i

INBOUND - search will start in /edi_store/archive/inbound/ directory tree.
If neither -i nor -o is indicated, this will be the default.

=item -o

OUTBOUND - search will start in /edi_store/archive/outbound/ directory tree.

=item -s STRING

SEARCH - files will be searched for this literal string.
Either this or -r must be specified.

=item -r REGEX

REGEX - files will be searched for this regular expression.
Either this or -s must be specified.

=item -d DAYS

DAYS BACK - search will start in today's tree. If this value is specified, the search
will be repeated this number of times, moving backward in time one day with each iteration.
If today is Monday, 3 would search today, Sunday, Saturday, and Friday. If no value is
specified, it will search 7 days back.

=item -b NAME

BUSINESS PROCESS - only directories whose name contains this string will be searched.
If no value is specified, all directories will be searched.

=back

=head1 Examples

=over

=item 1.

    search_si_archive.pl -i -s DEPOT -d 0 -b AS2

Files in subdirectories of /edi_store/archive/inbound/YYYYMMDD whose name contains the string
AS2 will be searched for the string DEPOT.

=back

=head1 Author

Howard Parks

=cut