http://www.perlmonks.org?node_id=1004981


in reply to Regular expression help

From what I can tell, you want to check whether everything after the first field matches between the two files. The best way to do this is to load the rows of the first file into a hash, keyed on everything after the first field, and then look up each row of the second file to see whether a matching hash entry exists.

The following code is untested (I had a long night last night!) but should show the general idea:

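#!/usr/bin/perl
use strict;
use warnings;

open(my $in_a, '<', $ARGV[0]) or die "cannot open gene file: $!";
open(my $in_b, '<', $ARGV[1]) or die "cannot open coding_annotated.var file: $!";

my @sample1 = <$in_a>;
my @sample2 = <$in_b>;

# Index the first file: everything after the first field => id
# use map for this maybe?
my %hash1;
foreach my $line (@sample1) {
    chomp $line;
    my ($id, $rest) = split /\t/, $line, 2;
    next unless defined $rest;    # skip lines without a tab
    $hash1{$rest} = $id;
}

# Print each row of the second file whose remainder was seen in the first
foreach my $line (@sample2) {
    chomp $line;
    my ($id, $rest) = split /\t/, $line, 2;
    next unless defined $rest;
    if (exists $hash1{$rest}) {
        print "Match: $line\n";
    }
}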
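As for the "use map for this maybe?" comment in the code: here is a rough sketch of how that could look. The sub names index_by_rest and matching_lines are made up for illustration, and it assumes tab-separated lines where everything after the first field is the comparison key. Lines without a tab are silently skipped, which may or may not be what you want.

```perl
use strict;
use warnings;

# Hypothetical helper: build a lookup of
# "everything after the first field" => id.
sub index_by_rest {
    my @lines = @_;          # copy, so the caller's data is not chomped
    chomp @lines;
    return map  { ( $_->[1], $_->[0] ) }       # rest => id
           grep { defined $_->[1] }            # drop lines without a tab
           map  { [ split /\t/, $_, 2 ] } @lines;
}

# Hypothetical helper: return the (chomped) lines whose remainder
# after the first field is a key in the lookup hash.
sub matching_lines {
    my ($seen, @lines) = @_;
    chomp @lines;
    return grep {
        my (undef, $rest) = split /\t/, $_, 2;
        defined $rest && exists $seen->{$rest};
    } @lines;
}

# Made-up sample rows, just to show the calls.
my %hash1 = index_by_rest("g1\tchr1\t100\n", "g2\tchr2\t200\n");
print "Match: $_\n"
    for matching_lines(\%hash1, "x9\tchr2\t200\n", "x7\tchr3\t5\n");
```

With real files you would pass the slurped arrays instead, e.g. index_by_rest(@sample1). Whether this reads better than the plain foreach loops is a matter of taste; the lookup itself is the same.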
A Monk aims to give answers to those who have none, and to learn from those who know more.