PerlMonks  

Looking up elements of an array in another array!

by better (Acolyte)
on Mar 14, 2013 at 21:39 UTC (#1023562=perlquestion)
better has asked for the wisdom of the Perl Monks concerning the following question:

Hello, this script is meant to be the first step in building a file-import routine, but it is not (entirely) working. I have a list of IDs in a text file. The script should scan a directory and find all the filenames that contain one of the IDs listed in the text file. It works if I define the array of IDs directly in the script, but not if I open and read the text file. I'm working with Cygwin on WinXP.

#!/usr/local/bin/perl -w
# Script searches for filenames in a spec. directory
# referring to a text file which contains a list of IDs

$dir = $ARGV[0];

# Scan a directory and write filenames into an array
opendir (SOURCE, $dir) or die "Cannot open dir: $!\n";
foreach $file (readdir SOURCE) {
    push (@dirFile, $file);    # more than 6000 filenames
}
closedir SOURCE;

# Open a filehandle and read a list from a text file
$ref = "indexDS.txt";
open(FH, $ref) or die "Cannot open file: \n$!";
@ds = <FH>;
close (FH);

chomp @dirFile;
chomp @ds;

# Test: it works if array is defined like this:
# @ds = ("I C 7702", "I C 7710", "I C 7713");

# Search for element of @ds as regex in @dirFile
foreach $line (@ds) {
    @result = grep (/$line/, @dirFile);
    foreach (@result) {
        print "$_\n";
    }
}

Re: Looking up elements of an array in another array!
by Anonymous Monk on Mar 14, 2013 at 22:00 UTC

    #!/usr/bin/perl --
    use strict; use warnings;
    use autodie;
    use File::Find::Rule;
    use File::Slurp;
    my $startdir = shift or die "Usage: $0 startdir\n";    # no Usage() sub is defined, so die with the message directly
    my $fnames = join '|', map quotemeta, read_file('indexDS.txt' , qw/ chomp 1 /);
    my @fnames = find( file => name => qr{$fnames}, in => $startdir );
    print "$_\n" for @fnames;

      Hi Anonymous Monk, thank you very much for your fast reply!

      As you might have guessed already, your script gives me a really tough nut to crack. Anyway, I tried copy and paste and got an error message saying that File/Find/Rule.pm could not be found in @INC. So I will find out how to fix this problem. Thanks again. I'll come back and report.

      better

        got an error message saying that File/Find/Rule.pm could not be found in @INC. So I will find out how to fix this problem.
        If you want to use File::Find::Rule, you need to install it first. The cpan client is one way (cpan File::Find::Rule).

      Hi Anonymous Monk,

      As none of the proposals worked for me, I started wondering why. The clue seems to be the text file that serves as the reference. I created it by copying a column out of an Excel worksheet into a Notepad file. After I created a new text file with Perl, your script worked perfectly well. It seems that copy and paste from Microsoft Excel caused the trouble.

      Now I will add the next step, which is copying the matching files into another directory.

      Thanks again

      better
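The thread never pins down what exactly was wrong with the pasted file, so the following is only a guess: a list that travels from Excel through Notepad often carries Windows line endings (CRLF), a UTF-8 byte-order mark, or trailing blanks, and chomp under Cygwin may strip only the "\n". A minimal sketch of cleaning such lines before using them as patterns, where the sub name clean_ids and the sample data are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: normalise lines read from an ID list.
# Assumption: the stray characters are a UTF-8 BOM, carriage returns,
# or leading/trailing whitespace; the thread does not confirm which.
sub clean_ids {
    my @lines = @_;
    my @ids;
    for my $line (@lines) {
        $line =~ s/^\xEF\xBB\xBF//;          # UTF-8 BOM as raw bytes
        $line =~ s/^\s+|\s+$//g;             # surrounding whitespace, incl. \r and \n
        push @ids, $line if length $line;    # skip lines that end up empty
    }
    return @ids;
}

# Simulated file content with CRLF endings, a trailing blank and an empty line.
my @raw = ( "I C 7702\r\n", "I C 7710 \r\n", "\r\n", "I C 7713\r\n" );
my @ids = clean_ids(@raw);
print "[$_]\n" for @ids;
```

Printing each ID between brackets makes any leftover invisible character easy to spot.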

Re: Looking up elements of an array in another array!
by Kenosis (Priest) on Mar 14, 2013 at 22:36 UTC

    Here's another option:

    use strict;
    use warnings;

    my $dir = shift;
    my $IDs = join '|', map { /(.+)/; "\Q$1\E" } <>;
    my @results = grep /$IDs/, <$dir/*>;

    print "$_\n" for @results;

    Usage: perl script.pl '/the/dir/to/scan' indexDS.txt [>outFile]

    The target directory is (implicitly) shifted off @ARGV and saved for later. The first <> notation reads the index file. map takes each line from the file, and the regex in it matches all characters except the newline. The captured line is surrounded by \Q ... \E to quote any meta-characters in the ID. The results, e.g., "\Q$1\E", are joined with the alternation symbol |, effectively creating an "or" type regex that's used in the grep. A file glob's used to read the directory files and only those names which contain one of the IDs are passed to @results.

    Hope this helps!
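The alternation idea described above can be tried without touching the file system. This is only a sketch under stated assumptions: the sub name names_matching_ids and the sample IDs and file names are invented here, and quotemeta is used as the function form of \Q ... \E.

```perl
use strict;
use warnings;

# Hypothetical helper: return the names that contain any of the IDs.
sub names_matching_ids {
    my ( $ids, $names ) = @_;
    return () unless @$ids;    # an empty alternation would match every name

    # quotemeta escapes regex metacharacters, like \Q ... \E does.
    my $alternation = join '|', map { quotemeta($_) } @$ids;
    my $re = qr/$alternation/;

    return grep { /$re/ } @$names;
}

# Sample data, invented for illustration.
my @ids   = ( 'I C 7702', 'I C 7710' );
my @names = ( 'I C 7702 -A.jpg', 'I C 7710 -B.jpg', 'I C 7799 -A.jpg', 'readme.txt' );

print "$_\n" for names_matching_ids( \@ids, \@names );
```

Compiling the pattern once with qr// keeps the grep cheap even with several thousand file names.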

      Hello Kenosis,

      Thanks a lot for your help. Unfortunately it doesn't work for me. The last argument ([>outFile]) causes an error, and if I leave it out, nothing is printed, even though the IDs in indexDS.txt do match filenames in the scanned directory. Any suggestions?

      better

        Replace the [>outFile] with >outFile. This redirects stdout to the file outFile in the current directory.

        kielstirling ++ is right on target, and my apologies for not explaining the [>outFile] parameter. The square brackets mark an optional argument, which directs output to a file instead of the screen. So, for example, you could do the following on the command line:

        perl script.pl '/the/dir/to/scan' indexDS.txt >results.txt

        Thus, instead of printing to the screen, output's directed to a file. Omitting the last parameter will just print the results to the screen.

Re: Looking up elements of an array in another array!
by 2teez (Priest) on Mar 15, 2013 at 10:55 UTC

    You can try this:

    I modified the code from your original post, following your description:

    use warnings;
    use strict;
    use Cwd qw(abs_path);

    die "Please give the directory to the file on CLI" unless @ARGV == 2;
    my ( $dir, $file_to_reference ) = @ARGV;

    my @ds;
    open my $fh, '<', $file_to_reference or die "can't open file: $!";
    while (<$fh>) {
        chomp;
        push @ds, $_;
    }
    close $fh or die "can't close file: $!";

    $dir = abs_path($dir);
    chdir $dir or die "No such directory: $!";

    my @result;
    opendir my $dh, $dir or die "can't open dir: $!";
    while ( my $file = readdir $dh ) {
        for (@ds) {
            push @result, $_ if $_ eq $file;
        }
    }
    closedir $dh or die "can't close directory: $!";

    print $_, $/ for @result;

    Usage: your_perl_script.pl directory_to_be_checked indexDS.txt > output_file.txt

    If you tell me, I'll forget.
    If you show me, I'll remember.
    if you involve me, I'll understand.
    --- Author unknown to me

      Hi,

      thanks for that proposal.

      I think "push @result, $_ if $_ eq $file;" doesn't do the job, because the filenames read from $dh are not identical to the IDs given in @ds.

      The ID is only a part of a filename (ID: I C 7710; filename: I C 7710 -A.jpg)

      best

      better

        From your last reply:
        ..Scan a directory and return all filenames containing the IDS I C 7700 -A.jpg I C 7700 -B.jpg I C 7700 a,b -KK
        Then change this line:
        push @result, $_ if $_ eq $file;
        to
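        push @result, $file if $file =~ /^\Q$_\E/;
        so that each filename is tested against the ID (not the other way round) and the filename itself is collected.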

        If you tell me, I'll forget.
        If you show me, I'll remember.
        if you involve me, I'll understand.
        --- Author unknown to me
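To make the difference between the eq test and a prefix match concrete, here is a small sketch that runs on in-memory lists instead of a real directory. The sub name names_starting_with_ids and the sample names are assumptions for illustration, not part of the scripts above.

```perl
use strict;
use warnings;

# Hypothetical helper: keep file names that begin with one of the IDs.
sub names_starting_with_ids {
    my ( $ids, $names ) = @_;
    my @result;
    for my $file (@$names) {
        for my $id (@$ids) {
            # \Q ... \E quotes any regex metacharacters in the ID.
            if ( $file =~ /^\Q$id\E/ ) {
                push @result, $file;
                last;    # one hit per file is enough
            }
        }
    }
    return @result;
}

# Sample data, invented for illustration ('.' and '..' mimic readdir output).
my @ds    = ( 'I C 7700', 'I C 7710' );
my @files = ( '.', '..', 'I C 7700 -A.jpg', 'I C 7700 a,b -KK', 'I C 7710 -A.jpg', 'X I C 7700.jpg' );

print "$_\n" for names_starting_with_ids( \@ds, \@files );
```

Note that a pure prefix match also accepts longer IDs that share the same start, so an ID such as "I C 77" would pick up every file in this sample except the last one and the dot entries.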
