G'day ccelt09,
Here's my take on a solution.
I've collected the output in a hash (which holds some additional, possibly unneeded, data); it's easily modified for printing.
I've kept the interval data unchanged.
I've made slight changes to the line data so that most lines match multiple ranges and some match none.
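#!/usr/bin/env perl

use strict;
use warnings;

use Data::Dumper;
use Inline::Files;

my (@range, %capture);

# Capture the fourth whitespace-separated numeric field of a data line.
my $line_re = qr[^(?:\d+\s+){3}(\d+)];

# Read the intervals: columns 2 and 3 hold the start and end of each range.
while (<INTERVALS>) {
    my ($min, $max) = (split)[1,2];
    push @range => [ $min, $max ];
    $capture{range}{$min} = $max;
}

while (<FRAME>) {
    chomp;
    collate($_);
}

{
    local $Data::Dumper::Indent = 1;
    local $Data::Dumper::Sortkeys = 1;
    print Dumper \%capture;
}

# File a line under the start value of every range that contains its key.
# Relies on @range being in ascending order of start value.
sub collate {
    my $line = shift;
    my ($key) = $line =~ $line_re;
    return unless defined $key;    # skip lines that don't match the expected format

    for (@range) {
        last if $_->[0] > $key;    # this and all later ranges start after the key
        next if $_->[1] < $key;    # this range ends before the key
        push @{$capture{data}{$_->[0]}} => $line;
    }
}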
__INTERVALS__
chrX 1 1000001
chrX 100001 1100001
chrX 200001 1200001
chrX 300001 1300001
chrX 400001 1400001
chrX 500001 1500001
chrX 600001 1600001
chrX 700001 1700001
chrX 800001 1800001
__FRAME__
0 25 27 260692 2 2 3 2 2 3 3 3 2 3 1 1 2 1 2 2 3 3 2 1 2 1 1 1 2 3
0 19 33 160466 2 2 3 2 2 2 3 3 3 3 1 1 2 1 2 3 3 3 2 1 2 3 2 2 3 3
0 25 27 60454 2 2 3 2 2 3 3 3 2 3 1 1 2 1 2 2 3 3 2 1 2 1 1 1 2 3
0 25 27 3260882 2 2 3 2 2 3 3 3 2 3 1 1 2 1 2 2 3 3 2 1 2 1 1 1 2 3
0 50 2 460727 1 1 1 1 1 2 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 48 4 860814 1 1 1 1 1 1 2 1 1 3 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1
0 46 6 1660866 1 1 1 1 1 1 1 3 1 1 1 1 1 1 1 2 2 2 1 1 1 1 1 1 1 2
0 48 4 6460888 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 2 2 1 1 1 1 1 1 1 1 2
0 50 2 60909 1 1 1 1 1 2 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Output: