There is a lot of repetition: most of the if blocks do the same thing, and each block repeats its list of headers. That list of headers is also pretty hard to read. I'd suggest factoring out much of that duplication. Something like:
my %inputfiles = (
    gene2accession => join( "\t",
        qw(
            Taxon
            GeneID
            Status
            RNA_Nucleotide_Accession
            RNA_Nucleotide_gi
            Protein_Accession
            Protein_gi
            Genomic_Nucleotide_Accession
            Genomic_Nucleotide_gi
            Genomic_Accession_Start_Pos
            Genomic_Accession_End_Pos
            Orientation
            Assembly
        )
    ),
    gene2go => join( "\t",
        qw(
            Taxon
            GeneID
            GO_ID
            Evidence
            Qualifier
            GO_term
            PubMedID
            Category
        )
    ),
    ...
);
foreach my $file ( keys %inputfiles ) {
    ...
    while ( my $line = <INPUT> ) {
        if ( $linecount == 0 ) {
            print OUTFILE "$file\n";
            print SUMMARY "Field lengths for file $file\n";
            print SUMMARY "$inputfiles{$file}\n";
            if ( $file eq 'hiv_interactions' ) {
                # do special processing
            }
        }
    }
    ...
}
This seems easier to read. It is obvious that the header is the same in both the summary and output files and that hiv_interactions gets special treatment. The headers are also easier to read. At least something to consider for your next script ;)
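If it helps, here is a small self-contained sketch of the same idea that you can run on its own. It is only an illustration, not your script: the names %headers_for and header_report are made up for the example, it prints to STDOUT rather than to your SUMMARY handle, and it stores the column names as array references instead of pre-joined strings so the column count stays available.

```perl
use strict;
use warnings;

# Hypothetical lookup table: file name => array ref of column names.
# Only gene2go is filled in here; other files would get their own entry.
my %headers_for = (
    gene2go => [
        qw(
            Taxon GeneID GO_ID Evidence
            Qualifier GO_term PubMedID Category
        )
    ],
);

# Build the summary text for one file in a single place, so no
# per-file if block has to repeat the header list.
sub header_report {
    my ($file) = @_;
    my $header = join "\t", @{ $headers_for{$file} };
    return "Field lengths for file $file\n$header\n";
}

# Sorting the keys makes the output order predictable.
for my $file ( sort keys %headers_for ) {
    print header_report($file);
}
```

Adding another input file then means adding one entry to the hash, and any special case such as hiv_interactions can stay as a single test on $file inside the loop.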