Comparing 2-D co-ordinates
by aging acolyte (Pilgrim)
on Jul 31, 2003 at 14:20 UTC
aging acolyte has asked for the
wisdom of the Perl Monks concerning the following question:
More of a conceptual problem than a Perl problem per se.
I am parsing the output of a program (non-perl and not editable) that compares two DNA sequences. The results are given as sets of 2-D co-ordinates.
where field 1 is the name, field 2 is the start coordinate, and field 3 is the end coordinate.
A simple sort lets me order the above and calculate the coverage.
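For the distinct-hit case, the sort-and-sum step might look something like the sketch below. The hit names and coordinates are invented for illustration, the `[name, start, end]` array-ref layout is an assumption, and the `+ 1` assumes the end coordinate is inclusive.

```perl
use strict;
use warnings;

# Hypothetical hits as [name, start, end]; values are invented for illustration.
my @hits = (
    [ 'hitB', 500, 650 ],
    [ 'hitA', 100, 250 ],
    [ 'hitC', 300, 400 ],
);

# Sort by start coordinate, breaking ties on the end coordinate.
my @sorted = sort { $a->[1] <=> $b->[1] or $a->[2] <=> $b->[2] } @hits;

# For distinct (non-overlapping) hits, coverage is the sum of their lengths.
# Assumes inclusive coordinates, hence the + 1.
my $coverage = 0;
$coverage += $_->[2] - $_->[1] + 1 for @sorted;

print join("\t", @$_), "\n" for @sorted;
print "coverage: $coverage\n";
```

This only gives the right total while no two hits share any positions; overlapping hits would be counted twice.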
All well and good for the above case. But parts of my data overlap, e.g.
Here my code breaks down, and I cannot for the life of me think of a data structure that will let me deal adequately with both cases (i.e. a collection of distinct hits, a collection of overlapping hits) or, worse, a combination of the two.
When they overlap, I need to pick the longest hit (i.e. the first array above).
Am I just having a brain meltdown today? Can someone point me in the direction of an obvious solution? Thanks a lot.
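One possible way to handle distinct hits, overlapping hits, and a mixture of the two with the same code is a greedy pass: consider the hits longest first and keep a hit only if it does not overlap anything already kept. The sketch below assumes the same `[name, start, end]` layout with inclusive coordinates; the data and the helper names `hit_length` and `overlaps` are hypothetical. It follows the "longest hit wins" rule stated above, which is not guaranteed to give the maximum total coverage.

```perl
use strict;
use warnings;

# Hypothetical hits as [name, start, end]; values are invented for illustration.
my @hits = (
    [ 'hitA', 100, 250 ],
    [ 'hitB', 200, 500 ],   # overlaps hitA and hitC, and is the longest
    [ 'hitC', 450, 520 ],
    [ 'hitD', 600, 700 ],   # distinct from all the others
);

# Length of a hit, assuming inclusive coordinates.
sub hit_length {
    my ($h) = @_;
    return $h->[2] - $h->[1] + 1;
}

# Two closed intervals overlap when each one starts before the other ends.
sub overlaps {
    my ($x, $y) = @_;
    return $x->[1] <= $y->[2] && $y->[1] <= $x->[2];
}

# Greedy selection: longest hits first; keep a hit only if it does not
# overlap any hit that has already been kept.
my @kept;
for my $hit (sort { hit_length($b) <=> hit_length($a) } @hits) {
    push @kept, $hit unless grep { overlaps($hit, $_) } @kept;
}

# Put the survivors back in coordinate order and total their lengths.
@kept = sort { $a->[1] <=> $b->[1] } @kept;
my $coverage = 0;
$coverage += hit_length($_) for @kept;

print join("\t", @$_), "\n" for @kept;
print "coverage: $coverage\n";
```

With purely distinct hits nothing is discarded, so the result matches the plain sort-and-sum. The `grep` over `@kept` makes this quadratic in the worst case, which is probably fine for modest numbers of hits.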