The input is a numerical string such as GO:007983 and a text string, e.g. 'transport'. Both can occur more than once in the input array (so I can't just swap the keys and values round).
The desired output is a listing of all of the duplicated numerical strings, along with their associated text strings, which I will print to a file. I will probably also try to generate some simple statistics based on these, but that comes later.
if ($overlap_cluster_terms_resplit_line =~ /^\s*(GO:\d+)\s\w+/)
{
    push (@overlap_cluster_terms_nothashed_keysarray, $overlap_cluster_desc);
    $overlap_cluster_terms = $1;
    # Test for a duplicate BEFORE storing the term: if the hash entry is
    # assigned first, exists() is true for every term, duplicate or not.
    if (exists $overlap_cluster_terms_hash{$overlap_cluster_terms})
    {
        push (@overlap_debug, $overlap_cluster_desc);
        print OVERLAP_OUTPUT $overlap_cluster_terms;
        print OVERLAP_OUTPUT "\t";
        print OVERLAP_OUTPUT $overlap_cluster_desc;
        print OVERLAP_OUTPUT "\n\n";
    }
    # Record the term (and its latest description) after the check.
    $overlap_cluster_terms_hash{$overlap_cluster_terms} = $overlap_cluster_desc;
}
BTW: $overlap_cluster_desc comes from another if statement within the foreach loop. Neither is shown.
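For the listing described at the top (every duplicated GO string together with all of its text strings), one possible approach is a hash of arrays: collect every description under its GO ID, then keep only the IDs seen more than once. This is only a rough sketch, not the code from my script. The sample data, the names @lines, %descs_for and @duplicates, and the assumption that the ID and its description sit on the same line are all made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample input: one "GO:id description" pair per line.
my @lines = (
    'GO:0006810 transport',
    'GO:0008152 metabolism',
    'GO:0006810 transport',
    'GO:0006810 ion transport',
);

# Map each GO ID to an array of every description seen with it.
my %descs_for;
for my $line (@lines) {
    # Skip lines that do not look like "GO:digits description".
    next unless $line =~ /^\s*(GO:\d+)\s+(\w.*?)\s*$/;
    push @{ $descs_for{$1} }, $2;
}

# Keep only the IDs that occurred more than once.
my @duplicates = grep { @{ $descs_for{$_} } > 1 } sort keys %descs_for;

# One tab-separated line per duplicated ID: the ID, then its descriptions.
for my $id (@duplicates) {
    print join("\t", $id, @{ $descs_for{$id} }), "\n";
}
```

Because every description is kept, the array length per ID would also give the occurrence counts needed for simple statistics later on.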