The following code works on my current version of Perl. The array @taxR could have 4,000 entries, and there will be about 5 million different values of $curEntry. The purpose is to check whether the taxonomic code of the protein currently being read is included in an array of allowed taxonomic codes.
my @taxR = ("PLRV1", "PMTVS", "PVXHB");
my $curEntry = "PMTVS";
if ($curEntry ~~ @taxR) {
    print "do rest of stuff";
}
With this code, my entire program takes about 20 seconds to run on my test data set and 30 minutes on the real thing.
I've tried this instead:
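use List::Util qw(first);   # first() is provided by the core List::Util module

my @taxR = ("PLRV1", "PMTVS", "PVXHB");
my $curEntry = "PMTVS";
# first() returns the first element for which the block is true, or undef
if ( first { $_ eq $curEntry } @taxR ) {
    print "do rest of stuff";
}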
but then the test data takes 3 minutes to run, so the real data set would take unusably long.
I have draconian IT guys who will never agree to upgrade Perl on the Macs, which run version 5.8.8, so I was hoping you could help me find a replacement method that doesn't take a million years to run.
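For reference, the kind of replacement being asked about could look like the sketch below. It assumes the allowed codes are plain strings and that @taxR does not change while the entries are being read. The hash %allowed and the helper is_allowed are hypothetical names that are not part of the program above. The idea is to build a hash from the array once, so that each of the 5 million checks becomes a single hash lookup instead of a scan over up to 4,000 elements. It uses only features that should be available in Perl 5.8.8.

```perl
use strict;
use warnings;

my @taxR = ("PLRV1", "PMTVS", "PVXHB");

# Build the lookup table once, before reading any entries.
# Hash slice: every allowed code becomes a key (the values stay undef).
my %allowed;
@allowed{@taxR} = ();

# Hypothetical helper: true if the given code is one of the allowed codes.
sub is_allowed {
    my ($code) = @_;
    return exists $allowed{$code};
}

my $curEntry = "PMTVS";
if ( is_allowed($curEntry) ) {
    print "do rest of stuff\n";
}
```

Inside the main read loop, only the is_allowed($curEntry) call (or the exists test itself) would need to run per entry. The hash has to be rebuilt only if the list of allowed codes changes.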