Using a hash is probably a better way to do it. Something like this may be what you want:
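use warnings;
use strict;

# Load the wanted IDs (one per line) as hash keys.
# A value stays undef until the ID is seen in the gene file.
my %Ids;
open (my $ids_fh, '<', $ARGV[0]) or die "unable to open $ARGV[0]: $!\n";
while (<$ids_fh>)
{
    chomp;
    $Ids{$_} = undef;
}
close $ids_fh;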
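
# Collect the gene records: a '>' header line followed by its sequence lines.
open (my $genes_fh, '<', $ARGV[1]) or die "unable to open $ARGV[1]: $!\n";
my @genes;
while (<$genes_fh>)
{
    if (/^>/)
    {
        # Start a new record at the front, dropping the leading '>'.
        # The appended space guarantees the ID regex finds a delimiter.
        unshift @genes, substr "$_ ", 1;
    }
    else
    {
        # Sequence line: append it to the most recent record.
        $genes[0] .= $_;
    }
}
close $genes_fh;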
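
# The ID is everything up to the first comma or whitespace in the header.
foreach my $gene (@genes)
{
    my ($id) = $gene =~ /^(.*?)[,\s]/;
    next if ! defined $id;
    # Count only IDs that were asked for; a bare ++ would autovivify
    # every ID in the gene file and print them all.
    ++$Ids{$id} if exists $Ids{$id};
}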

# Print each wanted ID that turned up in the gene file.
foreach (sort keys %Ids)
{
    print "$_\n" if defined $Ids{$_};
}
With the sample data given in the original node, this prints:
dbj|BA000040|:2701685-2702539
gi|11995001:156374-156649
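
If it helps to see the ID matching in isolation, here is a minimal sketch of just that step. The extract_id sub and the header lines are made up for illustration (they are not the data from the original node); the regex is the one used in the loop above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the ID at the start of a header line
# (leading '>' already removed), i.e. everything up to the first comma
# or whitespace. Returns undef when no such delimiter is present.
sub extract_id
{
    my ($header) = @_;
    my ($id) = $header =~ /^(.*?)[,\s]/;
    return $id;
}

# Made-up header lines, for illustration only.
my @headers = (
    "dbj|BA000040|:2701685-2702539 hypothetical description\n",
    "gi|11995001:156374-156649, hypothetical description\n",
    "no-delimiter-here",
);

foreach my $header (@headers)
{
    my $id = extract_id($header);
    print defined $id ? "$id\n" : "(no ID found)\n";
}
```

The last header has no comma or whitespace, so the match fails and no ID comes back. That is the case the appended space in the main script guards against, for example a header on the final line of a file with no trailing newline.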