Re: Pattern Matching in Arrays

by ciderpunx (Vicar)
in reply to Pattern Matching in Arrays

One way, which would be slow for directories with lots of files, would be something like this: sort the array of filenames, then use the parts before the _ as hash keys, so that each element simply overwrites the previous value stored under its key.
#!/usr/bin/perl
use strict;
use warnings;

my @data = qw/12345_-v1.pdf 12345_Av1.pdf 123456_-v1.pdf 123456_Av1.pdf 123456_Bv1.pdf/;
my %hash;
for my $f (sort @data) {
    my ($key, $version) = split /_/, $f;
    $hash{$key} = $version;
}
for my $key (keys %hash) {
    print $key . '_' . $hash{$key} . "\n";
}
Re^2: Pattern Matching in Arrays
by aj.kohler (Initiate) on May 12, 2009 at 13:49 UTC

    Thank you for your help.

    I came across an issue after reviewing the data

    The core number is the 12345 portion, as assumed. However, there are some filenames that have multiple underscore characters in the name, before it gets to the version value.



    The -v1, -v2, -v1, -1, A2 are the version keys, and everything before that is the core document number. Any thoughts on how to address these?

      In that case, you have to know where the filename ends in order to determine where the last `_' is...

      I'll assume that the filename ends with .pdf:

      if ($filename =~ /^(.*)_(.*)\.pdf$/) {
          # do something with $1 and $2 here
      }

      Because the first .* is greedy (Perl tries the longest match first), the underscore in the pattern will match the last underscore in the filename.
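
To make the greedy-match idea concrete, here is a minimal sketch that combines it with the sort-and-overwrite approach from the first reply. The sub name `latest_versions` and the multi-underscore filenames are made up for illustration (the real names from the thread are not shown), and it assumes that plain string sort order puts the newest version last.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return a hash ref mapping core document number => highest-sorting version.
# (latest_versions is a hypothetical helper name, not from the thread.)
sub latest_versions {
    my @files = @_;
    my %latest;
    for my $filename (sort @files) {
        # The first (.*) is greedy, so the literal _ lands on the LAST underscore.
        next unless $filename =~ /^(.*)_(.*)\.pdf$/;
        $latest{$1} = $2;    # later names in sort order overwrite earlier ones
    }
    return \%latest;
}

# Hypothetical sample data; the 12_345 names are invented for illustration.
my @data = qw/12345_-v1.pdf 12345_Av1.pdf 12_345_-v1.pdf 12_345_Bv2.pdf/;
my $latest = latest_versions(@data);
print "${_}_$latest->{$_}.pdf\n" for sort keys %$latest;
```

With this sample data it prints 12345_Av1.pdf and 12_345_Bv2.pdf. Names that do not match the pattern (for example, ones with no underscore) are silently skipped.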

        I liked the split method recommended above.
        Can you use an OR (the | alternation) in the split pattern, so that the pattern match happens at the time of the split?

        There is a new wrinkle in the mix as well.
        There are filenames without "_": 12345.pdf

        All comments are welcome!
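
Since split takes a regular expression, alternation inside it is allowed. A possible sketch, under some assumptions: every name ends in .pdf, a name without "_" counts as older than any versioned name, and plain string sort order puts the newest version last. The helper name `split_name` and the sample data are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a name into (core, version); version is '' when there is no underscore.
# (split_name is a hypothetical helper name, not from the thread.)
sub split_name {
    my ($filename) = @_;
    # Alternation inside the split pattern: break at the last underscore
    # (one followed by no further underscores) OR at a trailing ".pdf".
    my ($core, $version) = split /_(?=[^_]*$)|\.pdf$/, $filename;
    return ($core, defined $version ? $version : '');
}

# Hypothetical sample data, including names with no underscore at all.
my @data = qw/12345.pdf 12345_-v1.pdf 12345_Av1.pdf 67890.pdf/;

my %hash;
for my $f (sort @data) {
    my ($key, $version) = split_name($f);
    $hash{$key} = $version;   # assumes a bare name is older than any versioned one
}

for my $key (sort keys %hash) {
    print $key, (length $hash{$key} ? "_$hash{$key}" : ''), "\n";
}
```

With this sample data it prints 12345_Av1 and 67890. Unlike the original split /_/, the ".pdf" suffix is stripped from the version here, so add it back when rebuilding the filename.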

