PerlMonks  

Re^3: Looking up elements of an array in another array!

by 2teez (Priest)
on Mar 16, 2013 at 05:40 UTC (#1023812)


in reply to Re^2: Looking up elements of an array in another array!
in thread Looking up elements of an array in another array!

From your last reply:
    ..Scan a directory and return all filenames containing the IDS
    I C 7700 -A.jpg
    I C 7700 -B.jpg
    I C 7700 a,b -KK
Then change this line:
push @result, $_ if $_ eq $file;
to
push @result, $_ if $_=~/^\Q$file/;
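
To make the difference concrete, here is a minimal sketch of the prefix match, assuming the IDs and file names are already in memory. The sample file names and the helper name files_for_id are hypothetical, made up for illustration; they are not from the original script.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real script scans a directory.
my @files = ('I C 7700 -A.jpg', 'I C 7700 -B.jpg', 'I C 7701.jpg');

# Return every name that starts with the given ID.
# \Q...\E keeps regex metacharacters in the ID literal; ^ anchors at the start.
sub files_for_id {
    my ($id, @names) = @_;
    return grep { /^\Q$id\E/ } @names;
}

# An exact comparison with eq finds nothing, because no file is named
# exactly "I C 7700"; the prefix match finds both variants.
my @exact  = grep { $_ eq 'I C 7700' } @files;
my @result = files_for_id('I C 7700', @files);

print "Exact matches: ", scalar(@exact), "\n";
print "Found file: $_\n" for @result;
```

Note that a prefix test of this kind would also accept a name such as "I C 77001.jpg", since that name starts with the same characters.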

If you tell me, I'll forget.
If you show me, I'll remember.
If you involve me, I'll understand.
--- Author unknown to me


Re^4: Looking up elements of an array in another array!
by better (Acolyte) on Mar 23, 2013 at 06:33 UTC

    Hi again,

    When I announced in my reply to Anonymous Monk that the solution to all my problems had been found, I was too hasty.

    The script worked well within the setting I developed to test it. When I ran it under real conditions, I was surprised that many files had been copied whose filenames didn't include the IDs listed in the text file. At least, that was my impression at first glance.

    Then I realized that they did: the operation "$_ =~ /^\Q$file/" on ID "I C 17" finds not only "I C 17.jpg" and "I C 17 -A.jpg" but also "I C 170.jpg" and "I C 1778 - A.jpg" etc. The pattern only requires the filename to start with the ID, so it matches more than intended. I'll loosely call this being too "greedy", although in regex terms greediness strictly refers to quantifiers.

    Now here is my next step: a snippet which deals exclusively with the greediness of the regex:

        # Script tests matching of IDs with a list of filenames
        # Should find matches without being too greedy
        # For testing, input is given as two arrays, defined within the script
        # Input will be a list of IDs in a text file and a scan of a directory
        # containing the image files

        use strict;
        use warnings;

        my @dir = ("I C 17.jpg", "I C 17 a.jpg", "I C 17 a,b -A x.jpg",
                   "I C 170.jpg", "I C 171 a,b -A x.jpg", "I C 171 a,b -B x.jpg");
        my @ids = ("I C 17", "I C 171");

        foreach my $a (@ids) {
            # Single quotes, so the backslash before the dot reaches the regex
            my $ext = '[^0-9]*\.jpg';
            my $a_ext=$a.$ext;
            foreach my $b (@dir) {
                if ($b =~ m/($a_ext)/) {
                    print "Found file: $b\n";
                }
            }
        }

    All I have to do now is to implement this in the main script.

    I hope that, once this is done, the routine for importing files will work.

    better    (annoying, this dull play on words, isn't it?)

    update: I implemented this not-too-greedy matching in the main script and it works better than before. But it's getting even more complicated, because IDs named "I C 17 <1>" refer to image files named "I C 17 _1_ -A.jpg". So I have to replace the angle brackets before matching.
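
One possible way to do the bracket replacement described in the update is a simple tr/// before building the pattern. This is only a sketch: the helper name id_to_name_prefix and the sample file names are hypothetical, and it assumes that "<" and ">" are the only characters that differ between the ID and the file name.

```perl
use strict;
use warnings;

# Hypothetical helper: turn an ID as written in the list into the form
# used in the file names, e.g. "I C 17 <1>" becomes "I C 17 _1_".
sub id_to_name_prefix {
    my ($id) = @_;
    (my $prefix = $id) =~ tr/<>/__/;    # "<" and ">" both become "_"
    return $prefix;
}

# Hypothetical sample data standing in for the directory scan.
my @dir    = ('I C 17 _1_ -A.jpg', 'I C 17 _2_ -A.jpg', 'I C 17.jpg');
my $prefix = id_to_name_prefix('I C 17 <1>');

# \Q...\E keeps any remaining special characters in the prefix literal;
# [^0-9]* allows a non-numeric suffix such as " -A" before ".jpg".
my @found = grep { /^\Q$prefix\E[^0-9]*\.jpg$/ } @dir;
print "Found file: $_\n" for @found;
```

Working on a copy (the "my $prefix = $id" inside the parentheses) leaves the original ID untouched, so it can still be reported as it appears in the text file.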

      I'd suggest:

      my $a_ext=quotemeta($a).$ext;

      ... because this will cope better when $a contains "special" characters such as "[" or "+".
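
A small sketch of what quotemeta changes, using made-up file names and a hypothetical helper match_id. Anchoring the pattern with \A and \z is an extra assumption here, not part of the suggestion above.

```perl
use strict;
use warnings;

# Hypothetical sample data; one ID contains square brackets.
my @dir = ('I C 17.jpg', 'I C 17 [a] -A.jpg', 'I C 17 a -A.jpg', 'I C 170.jpg');

# Return the names that consist of the literal ID, an optional
# non-numeric suffix, and ".jpg".
sub match_id {
    my ($id, @names) = @_;
    my $ext     = '[^0-9]*\.jpg';           # single quotes keep the backslash
    my $pattern = quotemeta($id) . $ext;    # ID is matched literally
    return grep { /\A$pattern\z/ } @names;
}

print "Found file: $_\n" for match_id('I C 17 [a]', @dir);
```

Without quotemeta, the "[a]" in the ID would be read as a character class, so the pattern would match "I C 17 a -A.jpg" and miss "I C 17 [a] -A.jpg".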

      package Cow { use Moo; has name => (is => 'lazy', default => sub { 'Mooington' }) } say Cow->new->name

        Thanks, this might help with the problem I reported in the last update. I'll go and try it.

        better
