in reply to Finding Overlapping Regions on Genome
Hi all,
I am so sorry. I won't repeat the mistake. I do understand the policy.
The question is: how do I find the overlapping coordinate pairs, if there are any? Where pairs overlap, I want to report the smallest start position and the largest stop position among the overlapping coordinates; where a pair overlaps nothing, I want to report it as it is.
The example I gave was a small part of a huge file. I was doing it manually, which is why I missed some overlapping coordinates.
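To make the expected behavior concrete, here is a minimal sketch of the merging rule on a toy list of coordinates. It is not taken from my real data, `merge_intervals` is a hypothetical helper name, and it assumes two pairs overlap when the next start is less than or equal to the current stop.

```perl
use strict;
use warnings;

# Hypothetical helper: merge overlapping [start, stop] pairs.
# Assumption: pairs are closed intervals and overlap when the next
# start is <= the current stop. Use "<= stop + 1" instead if directly
# adjacent pairs such as [1,5] and [6,9] should be merged too.
sub merge_intervals {
    # Sort by start, then by stop, so overlaps are always neighbours.
    my @sorted = sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] } @_;
    my @merged;
    for my $iv (@sorted) {
        if (@merged && $iv->[0] <= $merged[-1][1]) {
            # Overlap: the start is already the smallest (sorted input),
            # so only the stop may need to grow.
            $merged[-1][1] = $iv->[1] if $iv->[1] > $merged[-1][1];
        }
        else {
            # No overlap with the last merged pair: keep it as it is.
            push @merged, [ @$iv ];
        }
    }
    return @merged;
}

# Toy coordinates, made up for illustration only.
my @pairs = ( [100, 200], [150, 300], [400, 500], [250, 350] );
print join("\t", @$_), "\n" for merge_intervals(@pairs);
```

For these toy pairs the result I would expect is two lines: 100 350 (the three overlapping pairs merged) and 400 500 (reported unchanged).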
I did write some code for it. Here is what I wrote:
    open $fh,"genome_pred_exons.bed";
    use List::Util qw( min max );
    while(<$fh>) {
        chomp;
        @lh = split /\t/,$_;
        $ph{$lh[0]."\t".$lh[1]."\t".$lh[2]} .= $lh[3]."\t".$lh[4]."NNNNNN";
    }
    close($fh);
    foreach $k(sort keys %ph) {
        @c = sort { $a <=> $b } split ("NNNNNN",$ph{$k});
        @g = ();
        foreach $m(@c) {
            @r = split /\t/,$m;
            if (@g == 0) {
                push @g, [$r[0]+0, $r[1]+0];
                next;
            }
            else {
                for ($i = 0;$i <= $#g;$i++) {
                    if(($r[0] + 0) <= ($g[$i][1] + 1)) {
                        $g[$i][0] = min($g[$i][0]+0, $r[0]+0);
                        $g[$i][1] = max($g[$i][1]+0, $r[1]+0);
                        last;
                    }
                    elsif ( $i == $#g ) {
                        push @g, [$r[0]+0, $r[1]+0];
                        last;
                    }
                }
            }
        }
    }
    print Dumper \@g;
This script was adapted from one I found on the internet. I changed it, but it did not give me any output.
I hope this is a better way to ask than the last one.
Thank you, Deepak