PerlMonks  

Re: Looking up elements of an array in another array!

by 2teez (Priest)
on Mar 15, 2013 at 10:55 UTC


in reply to Looking up elements of an array in another array!

You can try this:

I modified your OP using your description below:

use warnings;
use strict;
use Cwd qw(abs_path);

die "Please give the directory to the file on CLI" unless @ARGV == 2;
my ( $dir, $file_to_reference ) = @ARGV;

my @ds;
open my $fh, '<', $file_to_reference or die "can't open file: $!";
while (<$fh>) {
    chomp;
    push @ds, $_;
}
close $fh or die "can't close file: $!";

$dir = abs_path($dir);
chdir $dir or die "No such directory: $!";

my @result;
opendir my $dh, $dir or die "can't open dir: $!";
while ( my $file = readdir $dh ) {
    for (@ds) {
        push @result, $_ if $_ eq $file;
    }
}
closedir $dh or die "can't close directory: $!";

print $_, $/ for @result;

Usage: your_perl_script.pl directory_to_be_checked indexDS.txt > output_file.txt
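
The nested loop above compares every directory entry with every ID. For exact matches, a hash lookup would probably give the same result with one check per file. This is only a minimal sketch: the sample lists are made up, and `@files` and `%wanted` are hypothetical names standing in for the directory scan and the ID file.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the ID file and the directory scan.
my @ds    = ( 'I C 7710.jpg', 'I C 7711.jpg', 'I C 9999.jpg' );
my @files = ( '.', '..', 'I C 7710.jpg', 'I C 7711.jpg', 'other.txt' );

# Build a lookup table once: each wanted name becomes a hash key.
my %wanted = map { $_ => 1 } @ds;

# Keep only the directory entries that appear in the table.
my @result = grep { $wanted{$_} } @files;

print "$_\n" for @result;
```

This only helps while the comparison is plain string equality; once IDs are merely part of a filename, a pattern match per ID is needed again.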

If you tell me, I'll forget.
If you show me, I'll remember.
If you involve me, I'll understand.
--- Author unknown to me


Re^2: Looking up elements of an array in another array!
by better (Acolyte) on Mar 15, 2013 at 22:00 UTC

    Hi,

    thanks for that proposal.

    I think "push @result, $_ if $_ eq $file;" doesn't do the job, because the filenames read from $dh are not identical to the IDs given in @ds.

    The ID is only part of a filename (ID: I C 7710; filename: I C 7710 -A.jpg).

    best

    better

      From your last reply:
      ..Scan a directory and return all filenames containing the IDS I C 7700 -A.jpg I C 7700 -B.jpg I C 7700 a,b -KK
      Then change this line:
      push @result, $_ if $_ eq $file;
      to
      push @result, $_ if $_=~/^\Q$file/;
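
In the loop as posted, `$_` holds an ID and `$file` a directory entry, so a prefix test most likely wants the filename on the left of `=~` and the ID inside the pattern, and it would collect the filename rather than the ID. A small sketch under that assumption; the sample data and the names `@files` and `$id` are illustrative only:

```perl
use strict;
use warnings;

# Hypothetical sample data; in the script these come from the ID file
# and from readdir.
my @ds    = ( 'I C 7710', 'I C 7700' );
my @files = ( 'I C 7710 -A.jpg', 'I C 7700 -B.jpg', 'I C 8000.jpg' );

my @result;
for my $file (@files) {
    for my $id (@ds) {
        # \Q...\E quotes regex metacharacters in the ID; ^ anchors the
        # match at the start of the filename.
        push @result, $file if $file =~ /^\Q$id\E/;
    }
}

print "$_\n" for @result;
```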


        Hi again,

        when I announced in my reply to Anonymous Monk that the solution to all my problems had been found, I was too hasty.

        The script worked well within the setting I developed to test it. When I ran it under real conditions I was surprised that many files had been copied whose filenames didn't include the IDs listed in the text file. At least, that was my impression at first glance.

        Then I realized that they did: the operation "$_ =~ /^\Q$file/" on ID "I C 17" finds not only "I C 17.jpg" and "I C 17 -A.jpg" but also "I C 170.jpg" and "I C 1778 - A.jpg" etc. I first put this down to regex "greediness", but strictly that term describes quantifiers; the actual cause is that the pattern sets no boundary after the ID, so any filename that merely begins with the ID matches.

        Now here is my next step: a snippet which deals exclusively with this over-matching:

        # Script tests matching of IDs against a list of filenames.
        # Should find matches without matching too much.
        # For testing, input is given as two arrays defined within the script.
        # The real input will be a list of IDs in a text file and a scan of a
        # directory containing the image files.

        use strict;
        use warnings;

        my @dir = (
            "I C 17.jpg", "I C 17 a.jpg", "I C 17 a,b -A x.jpg",
            "I C 170.jpg", "I C 171 a,b -A x.jpg", "I C 171 a,b -B x.jpg",
        );
        my @ids = ("I C 17", "I C 171");

        # $id and $file replace $a and $b, which Perl reserves for sort.
        foreach my $id (@ids) {
            # ^ anchors at the start of the filename; \Q...\E quotes any regex
            # metacharacters in the ID; [^0-9]* allows a non-digit suffix;
            # \. is a literal dot (inside a double-quoted string the
            # backslash would be lost).
            my $pattern = qr/^\Q$id\E[^0-9]*\.jpg/;
            foreach my $file (@dir) {
                if ( $file =~ $pattern ) {
                    print "Found file: $file\n";
                }
            }
        }
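
As an alternative to spelling out the allowed suffix, the boundary after the ID could be expressed with a negative lookahead, `(?!\d)`, which only requires that no digit follows the ID. This is a sketch of a swapped-in technique, not the approach from the post; the `%found` hash is a hypothetical name used to collect results per ID.

```perl
use strict;
use warnings;

my @dir = ( 'I C 17.jpg', 'I C 17 a.jpg', 'I C 170.jpg', 'I C 171 a,b -A x.jpg' );
my @ids = ( 'I C 17', 'I C 171' );

my %found;
for my $id (@ids) {
    # (?!\d) is a negative lookahead: the ID must not be followed by a digit.
    my $pattern = qr/^\Q$id\E(?!\d)/;
    $found{$id} = [ grep { $_ =~ $pattern } @dir ];
    print "$id: $_\n" for @{ $found{$id} };
}
```

With this pattern "I C 17" no longer picks up "I C 170.jpg" or the "I C 171" files, while any suffix that does not start with a digit is still accepted.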

        All I have to do now is implement this in the main script.

        I hope that once this is done, the routine for importing files will work

        better    (annoying this dull play on words, isn't it?)

        update: I implemented this not-too-greedy matching in the main script and it works better than before. But it's getting even more complicated, because IDs named "I C 17 <1>" refer to image files named "I C 17 _1_ -A.jpg". So I have to replace the angle brackets before matching.
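
The bracket replacement described in the update could be done with `tr///` before the pattern is built. A minimal sketch, assuming both `<` and `>` always map to an underscore; `normalize_id` is a hypothetical helper and the filenames are made-up samples:

```perl
use strict;
use warnings;

# Hypothetical helper: turn the angle brackets of an ID into underscores,
# so "I C 17 <1>" becomes "I C 17 _1_".
sub normalize_id {
    my ($id) = @_;
    ( my $name = $id ) =~ tr/<>/__/;
    return $name;
}

my @dir    = ( 'I C 17 _1_ -A.jpg', 'I C 17 _12_ -A.jpg', 'I C 17.jpg' );
my $prefix = normalize_id('I C 17 <1>');

# Match the normalized ID at the start of the filename, not followed by a digit.
my @matches = grep { /^\Q$prefix\E(?!\d)/ } @dir;
print "$_\n" for @matches;
```

Working on a copy of the ID keeps the original string available, for example for reporting which IDs found no file.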
