Re^3: Find overlap
by Anonymous Monk on Oct 14, 2012 at 02:47 UTC
I'm not sure I understand the problem correctly, but this script finds the minimum of all values in column 2 and the maximum of all values in column 3 across all the given files. Just call it with the file names as arguments.
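The script itself is not shown here, so the following is only a minimal sketch of the approach described, not the poster's code. It assumes whitespace-separated columns with numeric values in columns 2 and 3; the sub name `min_max_of_files` is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return (min of column 2, max of column 3) over all lines of all files.
# Assumption: columns are whitespace-separated and columns 2/3 are numeric.
sub min_max_of_files {
    my @files = @_;
    my ($min, $max);
    for my $file (@files) {
        open my $fh, '<', $file or die "Cannot open $file: $!";
        while (my $line = <$fh>) {
            my (undef, $start, $end) = split ' ', $line;
            next unless defined $end;    # skip blank or short lines
            $min = $start if !defined $min || $start < $min;
            $max = $end   if !defined $max || $end   > $max;
        }
        close $fh;
    }
    return ($min, $max);
}

# When file names are given on the command line, report the result.
if (@ARGV) {
    my ($min, $max) = min_max_of_files(@ARGV);
    print "min = $min, max = $max\n" if defined $min;
}
```

Saved as, say, minmax.pl (a hypothetical name), it would be run as `perl minmax.pl file1 file2 file3 file4`.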
Re^3: Find overlap
by Anonymous Monk on Oct 14, 2012 at 03:37 UTC
For the given input: file1: file2: file3: file4: it would output:
by Kenosis (Priest) on Oct 14, 2012 at 07:53 UTC
Your solution is an excellent use of a HoH! Here are a few items to consider within your foreach:
Hope this is helpful.
by Anonymous Monk on Oct 14, 2012 at 16:28 UTC
The line in the first file does not overlap with the lines in the other three files, so I don't want it to count towards the minimum and maximum. I only want the minimum and maximum of the regions in which the lines of the files overlap each other, which here means the regions in the other three files. Minimum = 10, maximum = 80.
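One possible reading of this requirement, sketched below: drop every region that overlaps no other region, then take the smallest start and largest end of what remains. This is an interpretation, not a confirmed solution. The sub names `overlaps` and `overlap_extent` are made up, regions are treated as closed `[start, end]` pairs, and the sample values are invented for illustration (they are not the poster's files).

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Each region is [start, end]. Two closed regions overlap when each one
# starts no later than the other one ends.
sub overlaps {
    my ($x, $y) = @_;
    return $x->[0] <= $y->[1] && $y->[0] <= $x->[1];
}

# Keep only regions that overlap at least one other region, then return
# (smallest start, largest end), or an empty list if nothing overlaps.
sub overlap_extent {
    my @regions = @_;
    my @kept;
    for my $i (0 .. $#regions) {
        my $hits = grep { $_ != $i && overlaps($regions[$i], $regions[$_]) }
                   0 .. $#regions;
        push @kept, $regions[$i] if $hits;
    }
    return unless @kept;
    return (min(map { $_->[0] } @kept), max(map { $_->[1] } @kept));
}

# Invented data: the first region stands alone, the other three overlap.
my @regions = ([1, 4], [20, 50], [40, 70], [60, 90]);
my ($lo, $hi) = overlap_extent(@regions);
print "min = $lo, max = $hi\n";    # prints "min = 20, max = 90"
```

Note that this reports a single extent even if the overlapping regions form several separate clusters; if each cluster should be reported on its own, the regions would have to be merged cluster by cluster instead.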
by Cristoforo (Curate) on Oct 17, 2012 at 15:42 UTC
It also prints out the number of overlapping records ("merged 3" means that 3 records overlapped this range). You can change the print statement to omit this count, which you probably want to do.

For each chromosome, the records are sorted from the smallest beginning position to the largest. Once the records are in that order, a record can only overlap the range currently being built up or start a new one, which simplifies the logic and makes a solution possible. The code that does this is: my ($first, @in_order) = sort {$a->[0] <=> $b->[0]} @{ $data{$chr} };

And, this is the program.

Update: Just a note: you don't need the 'nsort' function from the Sort::Naturally module for the program to work correctly. It merely orders the chromosomes in your output. And if you can't use List::Util for the max function, you could easily define it yourself. I used the module just to save some additional code.
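Since the program is not reproduced here, the following is only a rough sketch of the sort-then-merge idea described above, not the original code. It assumes `%data` maps a chromosome name to a list of `[start, end]` pairs; the sub names `max2` and `merge_ranges`, the sample values, and the exact output format are all made up for illustration. It uses a plain `sort` instead of `nsort` and a hand-written `max2` instead of List::Util, as the update suggests is possible.

```perl
use strict;
use warnings;

# Hand-rolled stand-in for List::Util's max, for two numbers.
sub max2 { my ($x, $y) = @_; return $x > $y ? $x : $y }

# Merge overlapping [start, end] records for one chromosome.
# Returns a list of [start, end, count]; count = number of records merged.
sub merge_ranges {
    my @records = @_;
    return unless @records;
    my ($first, @in_order) = sort { $a->[0] <=> $b->[0] } @records;
    my @merged = ([ @$first[0, 1], 1 ]);
    for my $rec (@in_order) {
        my $cur = $merged[-1];
        if ($rec->[0] <= $cur->[1]) {    # starts inside the current range
            $cur->[1] = max2($cur->[1], $rec->[1]);
            $cur->[2]++;
        }
        else {                           # gap: start a new range
            push @merged, [ @$rec[0, 1], 1 ];
        }
    }
    return @merged;
}

# Invented data keyed by chromosome name.
my %data = (
    chr1 => [ [40, 70], [1, 4], [20, 50], [60, 90] ],
    chr2 => [ [5, 9] ],
);

for my $chr (sort keys %data) {
    for my $range (merge_ranges(@{ $data{$chr} })) {
        my ($start, $end, $count) = @$range;
        print "$chr $start $end merged $count\n";
    }
}
```

With the invented data, chr1 yields one stand-alone range (1 to 4) and one range of 20 to 90 built from 3 records. Dropping `merged $count` from the print statement removes the count from the output.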