Matching array elements against file

by Mark.Allan (Sexton)
on Jan 14, 2014 at 10:51 UTC ( [id://1070535]=perlquestion )

Mark.Allan has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I wondered if you could guide me towards a more accurate and better way of matching array elements against file content in the scenario below. I have an array with data as follows:

ARRAY

$VAR1 = [
    'server=bar116aix,bar117aix:FS=/tmp',
    'server=cvmfsser1, cvmfsser2:FS=/opt/apps',
    'server=cvmfsser3::FS=/opt/apps'
    'FS=/bar/cut/data03:NO',
    'server=baraix665, baraix666, baraix667:FS=/data/hp/feeds',
    'server=vmdprd:FS=/opt/tuxedo',
    'server=testserver1:FS=/data/repository'
];

I also have a file in the following format:

FILE

<server>::<fs>::<team>

vmaprd::/opt/vmaprd/application::intel
vmclprd::/opt/vmclprd/apps/oracle::db_support
vmdprd::/opt/vmdprd/db2::db_support
cvmfsser1::/opt/apps/::app_support
cvmfsser3::/opt/apps/::app_support
d87ser1::/opt/db2ese::db_support
aix5server::/bar/cut/data03::aix_sup
linuxvm001::/bar/cut/data04::linux_sup

What I am looking for is an accurate match: if both the server and the FS exist in a line of the file, this is a MATCH; if only the server or only the file system exists in a line, this is a PART MATCH; and if neither the server nor the FS exists, then NO MATCH.

Example

So, looking through the array, the following would match, part-match or not match against the file:

'server=bar116aix,bar117aix:FS=/tmp',         NO MATCH
'server=cvmfsser1, cvmfsser2:FS=/opt/apps',   PART MATCH
'server=cvmfsser3::FS=/opt/apps'              MATCH
'FS=/bar/cut/data03:NO',                      MATCH
'server=baraix665:FS=/data/hp/feeds',         NO MATCH
'server=vmdprd:FS=/opt/tuxedo',               PART MATCH
'server=testserver1:FS=/data/repository'      NO MATCH

Here's where I am at present, but it's not working as I want, because the bind operator matches on, for example, just the /opt of a file system even if the fs /opt/db2ese does not exist in the array.

#OPEN FILE
open(FH,"<$fs") || die ("cannot open file");
while (<FH>) {
    ($host,$cfg,$sup) = split /::/,$_;
    $cfg =~ s/\*//g;
    push @FS,$cfg
};

#LOOP THROUGH ARRAY and look for MATCH
foreach my $item (@config){
    print "$item\n";
    foreach my $item2 (@FS){
        if ($item =~ /$item2/){
            print "MATCH = $item ------> $item2\n";
        }
    }
}
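The over-matching described above can be reproduced in isolation. The following is a minimal sketch, not a fix for the whole script: the sample element and the helper names `substring_match` and `exact_fs_match` are hypothetical, and the pattern is quoted with `\Q...\E` (which the posted code does not do) so that only the substring behaviour is being shown.

```perl
use strict;
use warnings;

# Hypothetical sample element, chosen only to show the effect.
my $check = 'server=d87ser1:FS=/opt/db2ese';

# Unanchored regex: true whenever $fs occurs anywhere inside $item.
sub substring_match {
    my ($item, $fs) = @_;
    return $item =~ /\Q$fs\E/ ? 1 : 0;
}

# Extract the FS value first, then compare the whole strings.
sub exact_fs_match {
    my ($item, $fs) = @_;
    my ($item_fs) = $item =~ /FS=([^:]+)/;
    return 0 unless defined $item_fs;
    return $item_fs eq $fs ? 1 : 0;
}

for my $fs ('/opt', '/opt/db2ese') {
    printf "%-12s substring=%d exact=%d\n",
        $fs, substring_match($check, $fs), exact_fs_match($check, $fs);
}
```

With these values, '/opt' matches as a substring but fails the exact comparison, while '/opt/db2ese' passes both.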

Replies are listed 'Best First'.
Re: Matching array elements against file
by kcott (Archbishop) on Jan 14, 2014 at 14:29 UTC

    G'day Mark.Allan,

    There are a number of issues with what you posted. One or more of these may be the cause of your problems. You'll need to adjust your problem description and/or data and/or expected output as described here:

    • In $VAR1, you have "'server=cvmfsser3::FS=/opt/apps'" which has no terminal comma. This generates a syntax error ("String found where operator expected"). I've made the assumption that this is a typo and added the comma in my code below; however, see the next point which may indicate some other problem here.
    • In $VAR1, you have "'FS=/bar/cut/data03:NO',". Note the issue with the previous element (from the last point). You have no "server=..." in this element: that may be correct or there could be missing data. Regardless, that element cannot be a "MATCH" (in your expected output) unless the problem description "... complete match (ie server and fs) ..." is wrong. Either change the data, the problem description, or the expected output.
    • In $VAR1, you've used a single colon as a separator throughout except in "'server=cvmfsser3::FS=/opt/apps'". This raises yet another question mark regarding the accuracy of this data element. I've coded around this discrepancy but you'll need to look at this and fix if necessary.
    • In some places you show pathnames with a trailing slash ('/'), in others this is absent. Again, I don't know whether this is a real issue or a typo. I've coded around this by canonicalising the filesystem value.
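The separator and trailing-slash discrepancies listed above can be absorbed at parse time. This is a minimal sketch under assumptions: `parse_check` is a hypothetical helper name, and it strips trailing slashes rather than appending one, so a bare '/' would become an empty string. It needs Perl 5.10 or later for the `//` operator.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the server list and filesystem out of one element.
# Either value is undef when its key is absent from the element.
sub parse_check {
    my ($check) = @_;
    my ($server) = $check =~ /server=([^:]+)/;   # stops at ':' or '::'
    my ($fs)     = $check =~ /FS=([^:]+)/;
    $fs =~ s{/+\z}{} if defined $fs;             # '/opt/apps/' becomes '/opt/apps'
    return ($server, $fs);
}

for my $check (
    'server=cvmfsser3::FS=/opt/apps',
    'server=cvmfsser3:FS=/opt/apps/',
    'FS=/bar/cut/data03:NO',
) {
    my ($server, $fs) = parse_check($check);
    printf "%s => server=%s fs=%s\n",
        $check, $server // '(none)', $fs // '(none)';
}
```

The first two elements parse to the same server and filesystem despite the different separators and the trailing slash; the third has no server value at all.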

    If you copy and paste your code and data, we will see exactly what you're seeing. Trying to make multiple guesses as to what is and isn't a typo is both annoying and error-prone. Please bear this in mind with any future posts.

    I believe this code should be close to what you want. You may need to make changes where the assumptions and guesses I've documented are incorrect.

    #!/usr/bin/env perl

    use strict;
    use warnings;

    my @checks = (
        'server=bar116aix,bar117aix:FS=/tmp',
        'server=cvmfsser1, cvmfsser2:FS=/opt/apps',
        'server=cvmfsser3::FS=/opt/apps',
        'FS=/bar/cut/data03:NO',
        'server=baraix665, baraix666, baraix667:FS=/data/hp/feeds',
        'server=vmdprd:FS=/opt/tuxedo',
        'server=testserver1:FS=/data/repository',
    );

    my %data;

    while (<DATA>) {
        my ($server, $fs) = (split /::/)[0, 1];
        $fs = canonicalise_fs($fs);
        $data{both}{join '::' => $server, $fs} = undef;
        $data{server}{$server} = undef;
        $data{fs}{$fs} = undef;
    }

    for (@checks) {
        print "'$_',\t";

        my $server = /server=([^:]+)/ ? $1 : '';
        my $fs = /FS=([^:]+)/ ? $1 : '';
        $fs = canonicalise_fs($fs);

        if (exists $data{both}{join '::' => $server, $fs}) {
            print '';
        }
        elsif (exists $data{server}{$server} or exists $data{fs}{$fs}) {
            print 'PART ';
        }
        else {
            print 'NO ';
        }

        print "MATCH\n";
    }

    sub canonicalise_fs {
        my $fs = shift;
        $fs .= '/' unless length $fs and $fs =~ /\/$/;
        return $fs;
    }

    __DATA__
    vmaprd::/opt/vmaprd/application::intel
    vmclprd::/opt/vmclprd/apps/oracle::db_support
    vmdprd::/opt/vmdprd/db2::db_support
    cvmfsser1::/opt/apps/::app_support
    cvmfsser3::/opt/apps/::app_support
    d87ser1::/opt/db2ese::db_support
    aix5server::/bar/cut/data03::aix_sup
    linuxvm001::/bar/cut/data04::linux_sup

    Output:

    'server=bar116aix,bar117aix:FS=/tmp',   NO MATCH
    'server=cvmfsser1, cvmfsser2:FS=/opt/apps',     PART MATCH
    'server=cvmfsser3::FS=/opt/apps',       MATCH
    'FS=/bar/cut/data03:NO',        PART MATCH
    'server=baraix665, baraix666, baraix667:FS=/data/hp/feeds',     NO MATCH
    'server=vmdprd:FS=/opt/tuxedo', PART MATCH
    'server=testserver1:FS=/data/repository',       NO MATCH

    -- Ken

      Hi Ken, cutting and pasting the whole array and file would have been too large, so I just tried to create snippets of the data which included all the possible scenarios. You are correct, the data source was sloppy; I apologise.

      Having looked at your code (thanks for the assistance) and digested your points: Point 1 is a typo, correct. Point 2: the data was correct; there can be cases where the server= attribute is missing.

      Point 3: you're right.

      'server=cvmfsser3::FS=/opt/apps'

      This should be a single colon (:).

      Your code is near enough what I need, but one thing seems to be throwing it out at present:

      'server=cvmfsser1, cvmfsser2:FS=/opt/apps',

      Even when two servers appear comma-separated, this should indeed part match, but against both FS and server (server because one of the two in the comma-separated list exists in the file). Your code part matches against the filesystem but would not part match against the server if the filesystem didn't exist.

      cvmfsser1::/opt/apps/::app_support
      cvmfsser3::/opt/apps/::app_support

      As you can see in the data output, it is only showing up as a part match against the filesystem. If the filesystem didn't exist, I'd need it to part match against

      'server=cvmfsser1

      I hope I have explained myself well enough. Thanks in advance

        This seems to be yet another problem with your description.

        Perhaps 'server=' would be better as 'servers='. You could then split its value on '/,\s*/' and then treat these 3 elements:

        'server=bar116aix,bar117aix:FS=/tmp',
        'server=cvmfsser1, cvmfsser2:FS=/opt/apps',
        'server=baraix665, baraix666, baraix667:FS=/data/hp/feeds',

        as if they were these 7 elements:

        'server=bar116aix:FS=/tmp',
        'server=bar117aix:FS=/tmp',
        'server=cvmfsser1:FS=/opt/apps',
        'server=cvmfsser2:FS=/opt/apps',
        'server=baraix665:FS=/data/hp/feeds',
        'server=baraix666:FS=/data/hp/feeds',
        'server=baraix667:FS=/data/hp/feeds',

        Here I'm making further assumptions:

        • "server" in ARRAY has a different meaning to "server" in FILE:
          • ARRAY: one or more servers
          • FILE: exactly one server
        • Records in FILE never look like: "bar116aix,bar117aix::/tmp::team_name"
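A minimal sketch of that expansion, under the assumptions listed above: every element either starts with 'server=' followed by a comma-separated list and a colon, or has no 'server=' at all. The helper name `expand_servers` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one "server=a, b:FS=/path" element into one
# element per server. Elements without a leading "server=" pass through unchanged.
sub expand_servers {
    my ($check) = @_;
    my ($servers, $rest) = $check =~ /\Aserver=([^:]+)(:.*)\z/
        or return ($check);
    return map { 'server=' . $_ . $rest } split /,\s*/, $servers;
}

my @checks = (
    'server=bar116aix,bar117aix:FS=/tmp',
    'server=cvmfsser1, cvmfsser2:FS=/opt/apps',
    'server=baraix665, baraix666, baraix667:FS=/data/hp/feeds',
);

# Print each single-server element on its own line.
print "'$_',\n" for map { expand_servers($_) } @checks;
```

Each expanded element can then be checked individually, and the original element reported as a part match when any of its servers is found in the file.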

        -- Ken

Re: Matching array elements against file
by Anonymous Monk on Jan 14, 2014 at 11:07 UTC

Node Type: perlquestion [id://1070535]
Approved by hdb