#! perl -slw
use strict;
my @data = map{ [ split "','", substr( $_, 1, -2 ) ] } <DATA>;
my $mask = ' ' x 10_000;
my $max = 0;
# Sort the data to put longest ranges first (could be more efficient:)
@data = sort{
$b->[2] - $b->[1] <=> $a->[2] - $a->[1]
} @data;
for( @data ) {
# Calc range
my $len = $_->[2] - $_->[1] +1;
# Test to see if this range is already covered?
if( substr( $mask, $_->[1], $len ) !~ m[ ] ) {
$_ = undef; # if so, delete
next;
}
# Other wise add it to the mask
substr( $mask, $_->[1], $len ) = 'x' x $len;
# Remeber the highest pos.
$max = $_->[2] if $_->[2] > $max;
}
# Trim the string
$mask = substr( $mask, 0, $max );
# Count coverage
my $coverage = $mask =~ tr[x][x];
# Calc. percentage
printf "cover = %.1f%% \n", $coverage / length($mask) *100;
print 'Using:';
# Display contributing ranges having discarded deleted elements.
print "@$_" for grep{ $_ } @data;
__DATA__
'NM_176827','618','692'
'NM_176827','621','710'
'NM_176827','622','692'
'NM_176827','629','710'
Output P:\test>279587-3
cover = 13.0%
Using:
NM_176827 621 710
NM_176827 618 692
I've also got a version that does this using vec instead of substr, which saves 7/8 of the space, but runs much more slowly.
Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
|