This will produce blocks where consecutive values are within range, but not necessarily where all values are within range - for example, the second block in the output ranges from 1 to 2 for a required maximum deviation of 0.5.
To produce the desired result, you need to keep track of the maximum and minimum values within the candidate set, and test against these.
Also, if you drop the whole candidate set as soon as the latest value doesn't fit, you will miss sequences which start in the middle of the current set but don't include the early values - you need to add the latest value to the end and then chop values off the beginning until you get back to a qualifying set. This may end up being just the latest value, but it could be larger.
The following code shows one way to do this. It uses the same changed block size and deviation as tangent's example, and adds some more data to illustrate overlapping qualifying sets
use strict;
use warnings;
# Set block size and deviation
my $block_size = 4;
my $deviation = 0.5;
# Initialise the candidates, maximum and minimum values
my $candidates = [[split(' ', <DATA>)]];
my $maxval = my $minval = $candidates->[0][3];
# Loop through the rest of the DATA
while (<DATA>) {
my $current = [split];
if (within_range($current->[3], $maxval, $minval)) {
push(@$candidates, $current);
$maxval = $current->[3] if $maxval < $current->[3];
$minval = $current->[3] if $minval > $current->[3];
}
elsif (@$candidates >= $block_size) {
print_block($candidates);
push(@$candidates, $current);
($candidates, $maxval, $minval) = trim_candidates($candidates);
}
else {
push(@$candidates, $current);
($candidates, $maxval, $minval) = trim_candidates($candidates);
}
}
# deal with stragglers
if ( @$candidates >= $block_size ) {
print_block( $candidates );
}
sub within_range {
my ($testval, $testmax, $testmin) = @_;
return 0 if $testmax - $testval > $deviation;
return 0 if $testval - $testmin > $deviation;
return 1;
}
sub print_block {
my ( $lines ) = @_;
print "BLOCK\n";
for my $line ( @$lines ) {
print join(' ', @$line ), "\n";
}
}
sub trim_candidates {
my $worklist = shift;
# drop the first entry
shift @$worklist;
# Check if the remaining worklist qualifies
my $workmax = my $workmin = $worklist->[0][3];
foreach my $item (@$worklist) {
return trim_candidates($worklist)
unless within_range($item->[3], $workmax, $workmin);
$workmax = $item->[3] if $workmax < $item->[3];
$workmin = $item->[3] if $workmin > $item->[3];
}
return ($worklist, $workmax, $workmin);
}
0;
__DATA__
1 10492 rs55998931 0.272727272727273 0.4375
1 13418 . 0.25 0.0625
1 13752 . 0.153846153846154 0.25
1 13813 . 0.0357142857142857 0.2
1 13838 . 0.0357142857142857 0.2
1 14907 rs79585140 0.5 0.555555555555556
1 14930 rs75454623 0.535714285714286 0.611111111111111
1 14933 rs199856693 0.0357142857142857 0.0555555555555556
1 14948 rs201855936 0.107142857142857 0
1 10492 rs55998931 1 0.4375
1 10492 rs55998931 1.5 0.4375
1 10492 rs55998931 1.9 0.4375
1 10492 rs55998931 2 0.4375
1 10492 rs55998931 2.6 0.4375
1 13418 blah 20.0 blah
1 13418 blah 20.1 blah
1 13418 blah 20.2 blah
1 13418 blah 20.3 blah
1 13418 blah 20.4 blah
1 13418 blah 20.5 blah
1 13418 blah 20.6 blah
1 13418 blah 20.7 blah
1 13418 blah 30.5 blah
1 13418 blah 30.0 blah
1 13418 blah 30.0 blah
1 13418 blah 30.5 blah
1 13418 blah 30.6 blah
1 13418 blah 30.9 blah
1 13418 blah 30.6 blah
1 13418 blah 30.9 blah
Output:
BLOCK
1 10492 rs55998931 0.272727272727273 0.4375
1 13418 . 0.25 0.0625
1 13752 . 0.153846153846154 0.25
1 13813 . 0.0357142857142857 0.2
1 13838 . 0.0357142857142857 0.2
1 14907 rs79585140 0.5 0.555555555555556
BLOCK
1 13418 blah 20.0 blah
1 13418 blah 20.1 blah
1 13418 blah 20.2 blah
1 13418 blah 20.3 blah
1 13418 blah 20.4 blah
1 13418 blah 20.5 blah
BLOCK
1 13418 blah 20.1 blah
1 13418 blah 20.2 blah
1 13418 blah 20.3 blah
1 13418 blah 20.4 blah
1 13418 blah 20.5 blah
1 13418 blah 20.6 blah
BLOCK
1 13418 blah 20.2 blah
1 13418 blah 20.3 blah
1 13418 blah 20.4 blah
1 13418 blah 20.5 blah
1 13418 blah 20.6 blah
1 13418 blah 20.7 blah
BLOCK
1 13418 blah 30.5 blah
1 13418 blah 30.0 blah
1 13418 blah 30.0 blah
1 13418 blah 30.5 blah
BLOCK
1 13418 blah 30.5 blah
1 13418 blah 30.6 blah
1 13418 blah 30.9 blah
1 13418 blah 30.6 blah
1 13418 blah 30.9 blah