PerlMonks  

Re^4: Looking up elements of an array in another array!

by better (Acolyte)
on Mar 23, 2013 at 06:33 UTC (#1025000)


in reply to Re^3: Looking up elements of an array in another array!
in thread Looking up elements of an array in another array!

Hi again,

When I announced in my reply to Anonymous Monk that the solution to all my problems had been found, I was too hasty.

The script worked well in the setup I built to test it. When I ran it under real conditions, I was surprised that many files were copied whose filenames didn't include the IDs listed in the text file. At least, that was what I thought at first glance.

Then I realized that they did: the match "$_ =~ /^\Q$file/" on the ID "I C 17" finds not only "I C 17.jpg" and "I C 17 -A.jpg" but also "I C 170.jpg" and "I C 1778 - A.jpg", etc. The pattern is anchored only at the start, so every filename that begins with the ID matches. I'll call this behaviour "greedy" here, although strictly speaking that term describes how regex quantifiers work.
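
To make the over-matching concrete, here is a minimal sketch that reproduces it with the four filenames mentioned above. The sub name prefix_matches is made up for illustration; only the /^\Q$id/ match is taken from the script.

```perl
use strict;
use warnings;

# Return the file names that start with the given ID (prefix match only).
# prefix_matches is a hypothetical helper, not part of the original script.
sub prefix_matches {
    my ($id, @files) = @_;
    return grep { /^\Q$id/ } @files;
}

my @files = ("I C 17.jpg", "I C 17 -A.jpg", "I C 170.jpg", "I C 1778 - A.jpg");

# All four names begin with "I C 17", so all four are reported.
print "$_\n" for prefix_matches("I C 17", @files);
```

Nothing in the pattern says what may follow the ID, so "I C 170.jpg" is as good a match as "I C 17.jpg".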

Now here is my next step: a snippet that deals exclusively with the greediness of the regex:

# Script tests matching of IDs with a list of filenames
# Should find matches without being too greedy
# For testing, input is given as two arrays defined within the script
# Real input will be a list of IDs in a text file and a scan of a directory
# containing the image files

use strict;
use warnings;

my @dir = ("I C 17.jpg", "I C 17 a.jpg", "I C 17 a,b -A x.jpg", "I C 170.jpg", "I C 171 a,b -A x.jpg", "I C 171 a,b -B x.jpg");
my @ids = ("I C 17", "I C 171");

foreach my $a (@ids) {
    # Single quotes keep the backslash, so \. matches a literal dot
    my $ext = '[^0-9]*\.jpg';
    my $a_ext = $a.$ext;
    foreach my $b (@dir) {
        if ($b =~ m/($a_ext)/) {
            print "Found file: $b\n";
        }
    }
}

All I have to do now is implement this in the main script.

I hope that once this is done, the routine for importing files will work.

better    (annoying this dull play on words, isn't it?)

update: I implemented this not-too-greedy matching in the main script and it works better than before. But it's getting even more complicated, because IDs named "I C 17 <1>" refer to image files named "I C 17 _1_ -A.jpg". So I have to replace the angle brackets before matching.
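
One way the bracket replacement might look is sketched below. It assumes that every "<" and ">" in an ID simply becomes "_" in the filename, which is all the single example above shows. The sub names are made up for illustration, and anchoring the pattern at both ends is an extra assumption not present in the snippet above.

```perl
use strict;
use warnings;

# Hypothetical helper: turn an ID such as "I C 17 <1>" into the form used
# in the file names ("I C 17 _1_"), assuming "<" and ">" both become "_".
sub id_to_filename_prefix {
    my ($id) = @_;
    (my $prefix = $id) =~ tr/<>/__/;
    return $prefix;
}

# Hypothetical helper: build a pattern for one ID. quotemeta escapes any
# regex-special characters; the tail allows non-digits before ".jpg".
sub pattern_for_id {
    my ($id) = @_;
    my $re = quotemeta(id_to_filename_prefix($id)) . '[^0-9]*\.jpg';
    return qr/\A$re\z/;
}

my $pattern = pattern_for_id("I C 17 <1>");
for my $file ("I C 17 _1_ -A.jpg", "I C 17 _10_ -A.jpg") {
    print "Found file: $file\n" if $file =~ $pattern;
}
```

The pattern is built by string concatenation before it is compiled, which avoids Perl reading a variable followed by "[" as an array element inside the regex.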


Re^5: Looking up elements of an array in another array!
by tobyink (Abbot) on Mar 23, 2013 at 07:05 UTC

    I'd suggest:

    my $a_ext=quotemeta($a).$ext;

    ... because this will cope better when $a contains "special" characters such as "[" or "+".
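
    As a rough illustration of the difference, here is a small sketch. The ID and filename are invented to contain "[" and "+"; they are not taken from the real data, and the sub name id_matches is hypothetical.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: does $file match the pattern string $pat?
    sub id_matches {
        my ($pat, $file) = @_;
        return $file =~ /$pat/ ? 1 : 0;
    }

    # Invented example data containing regex metacharacters.
    my $id   = "I C 17 [a+b]";
    my $file = "I C 17 [a+b] -A.jpg";
    my $ext  = '[^0-9]*\.jpg';

    # Without quotemeta, "[a+b]" is read as a character class.
    print "raw:     ", (id_matches($id . $ext, $file) ? "match" : "no match"), "\n";

    # With quotemeta, the brackets and plus sign are matched literally.
    print "escaped: ", (id_matches(quotemeta($id) . $ext, $file) ? "match" : "no match"), "\n";
    ```

    The unescaped pattern fails on this file because the class "[a+b]" stands for a single character, while the filename has a literal "[" at that position.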

    package Cow { use Moo; has name => (is => 'lazy', default => sub { 'Mooington' }) } say Cow->new->name

      Thanks, this might help with the problem reported in my last update. I'll go and try.

      better

        No, it doesn't. I will have to replace the brackets.
