http://www.perlmonks.org?node_id=763586


in reply to Re^3: Pattern Matching in Arrays
in thread Pattern Matching in Arrays

I liked the split method recommended above.
Can you use the OR in the split command to pattern match at the time of the split?

There is a new wrinkle in the mix as well.
There are filenames without "_": 12345.pdf

All comments are welcome!

Re^5: Pattern Matching in Arrays
by przemo (Scribe) on May 12, 2009 at 22:44 UTC
    I liked the split method recommended above.

    Unfortunately it doesn't work with multiple underscores.

    If you want to stick with regexp, then try this:

    if (/^(.*?)(?:_([^_]*))?\.pdf$/) {
        # do something with $1 and $2
    }
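To see what that pattern captures, here is a small sketch run against file names quoted in this thread. The sub name `parse_name` is made up for illustration and is not from anyone's post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns (root, revision). The revision is undef
# when the name contains no underscore.
sub parse_name {
    my ($name) = @_;
    return unless $name =~ /^(.*?)(?:_([^_]*))?\.pdf$/;
    return ($1, $2);
}

for my $f (qw/12345.pdf 12345_-v1.pdf 21586_rework_bore_tool_-v2.pdf/) {
    my ($root, $rev) = parse_name($f);
    printf "%-32s root=%s rev=%s\n", $f, $root, defined $rev ? $rev : '(none)';
}
```

Because `(.*?)` is lazy and the revision group forbids underscores, `$1` ends up holding everything before the last underscore, and `$2` is undefined for names such as 12345.pdf.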

      Let me try explaining the overall goal of the program:

      1. Read in directory of file names

        1. Most file names have a root number, a revision, and a pdf extension: 12345_-v1.pdf
        2. Most names have an underscore as the separator between root number and revision: 3873215a_-v1.pdf
        3. Many names have more than one underscore used as a text separator: 21586_rework_bore_tool_-v2.pdf
        4. Some names have no underscores in them: 4148-t-2.pdf
      2. I need to display only the latest revision of the file, so if the following exist in the directory:
        21586_rework_bore_tool_-v1.pdf
        21586_rework_bore_tool_-v2.pdf

        Only display 21586_rework_bore_tool_-v2.pdf
      3. I am unable to capture all of the combinations listed above...

      I appreciate all recommendations...

      AJ

Re^5: Pattern Matching in Arrays
by ciderpunx (Vicar) on May 13, 2009 at 17:01 UTC
    I liked the split method recommended above.
    ty
    Can you use the OR in the split command to pattern match at the time of the split?
    You want the version number to be the last element of the list that split returns, so you could reverse the list, grab the first element (which is now the version), slurp the rest into an array, reverse that array back the right way round again and join the elements with underscores.
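The steps just described can be tried on a single name. This is only a sketch: the sub name `split_name` is invented for illustration, and it uses `pop` as a shortcut that should give the same result as reversing twice.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split on "_" and peel the last field off as the version.
sub split_name {
    my ($name) = @_;
    my @parts   = split /_/, $name;
    my $version = pop @parts;    # last field, i.e. the first one after a reverse
    return (join('_', @parts), $version);
}

my ($key, $version) = split_name('21586_rework_bore_tool_-v2.pdf');
print "key=$key version=$version\n";
```

Note that the version field still carries the .pdf extension, and that a name with no underscore comes back with an empty key, which is how it can be told apart from the versioned names.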
    There are filenames without "_": 12345.pdf
    Yuk, make an @unversioned array and push them into that if you only get one element from split?
    #!/usr/bin/perl
    use strict;
    use warnings;

    my @data = qw/
        12345.pdf
        12345_-v1.pdf
        12345_Av1.pdf
        123456_-v1.pdf
        123456_Av1.pdf
        123456_Bv1.pdf
        g05495_1_-v1.pdf
        zprt0019548_wiper_die-nc_-1.pdf
        zprt0019548_wiper_die-nc_-2.pdf
        zprt0016809_fg_tooling_A2.pdf
        zprt0016809_fg_tooling_A3.pdf
    /;

    my %hash;
    my @unversioned;

    # Sorting first means a later revision overwrites an earlier one for the same key.
    for my $f (sort @data) {
        # The version is the last "_"-separated field, so it comes first after reverse.
        my ($version, @key) = reverse split /_/, $f;
        my $key = join "_", reverse @key;
        if ($key) {
            $hash{$key} = $version;
        }
        else {
            # No underscore at all: the whole file name ended up in $version.
            push @unversioned, $version;
        }
    }

    for my $key (keys %hash) {
        print $key . '_' . $hash{$key} . "\n";
    }
    print join "\n", @unversioned, q{};
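One caveat with relying on a plain string sort: a name ending in -v10.pdf sorts before one ending in -v2.pdf, so the older file would win. Below is a rough sketch of a numeric comparison instead. It assumes the revision always ends in digits right before .pdf and that only that number matters (letter prefixes such as A or B are ignored). The sub names `rev_number` and `latest_revisions` and the -v10 file are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: trailing digits of the version field as a number (0 if none).
sub rev_number {
    my ($version) = @_;
    return $version =~ /(\d+)\.pdf$/ ? $1 : 0;
}

# Hypothetical helper: keep, per key, the file with the highest revision number.
sub latest_revisions {
    my @files = @_;
    my %latest;
    for my $f (@files) {
        my @parts = split /_/, $f;
        next if @parts < 2;    # names without "_" are skipped in this sketch
        my $version = pop @parts;
        my $key     = join '_', @parts;
        if (!exists $latest{$key}
            || rev_number($version) > rev_number($latest{$key})) {
            $latest{$key} = $version;
        }
    }
    return map { $_ . '_' . $latest{$_} } sort keys %latest;
}

print "$_\n" for latest_revisions(qw/
    21586_rework_bore_tool_-v1.pdf
    21586_rework_bore_tool_-v10.pdf
    21586_rework_bore_tool_-v2.pdf
/);
```

If letter revisions also need ordering, the comparison would have to be extended; that rule isn't stated in the thread, so it is left out here.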
    --
    Linux, perl, punk rock, cider: charlieharvey.org.uk.