Benchmarking left to someone who cares :)
#!/usr/bin/perl
# https://perlmonks.org/?node_id=11101225
use strict;
use warnings;
my $sentence = "this is the text to play with";
my $ngramWindow_MIN = 2;
my $ngramWindow_MAX = 3;
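# {$low,$high} counts the words that follow the first one, hence the "- 1".
my ($low, $high) = ($ngramWindow_MIN - 1, $ngramWindow_MAX - 1);
# The (?{ }) block prints each candidate match, then (*FAIL) forces the engine
# to backtrack and try the next one, so every n-gram gets visited.
# The start index is the number of spaces in the prematch.
$sentence =~ /(?<!\S)\S+(?: \S+){$low,$high}?(?!\S)(?{
print "START INDEX: @{[$` =~ tr| || ]} : $&\n"
})(*FAIL)/;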
Output (same lines, slightly different order):
START INDEX: 0 : this is
START INDEX: 0 : this is the
START INDEX: 1 : is the
START INDEX: 1 : is the text
START INDEX: 2 : the text
START INDEX: 2 : the text to
START INDEX: 3 : text to
START INDEX: 3 : text to play
START INDEX: 4 : to play
START INDEX: 4 : to play with
START INDEX: 5 : play with
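
For anyone who does want to benchmark, here is a rough non-regex baseline to compare against. It is only a sketch: the sub name ngrams and its argument order are made up for illustration and are not from the thread. It should print the same lines in the same order as the regex above. One difference to be aware of: split ' ' tolerates runs of whitespace, while the regex expects words separated by exactly one space.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns one
# "START INDEX: i : words" string per window of $min..$max words.
sub ngrams {
    my ($sentence, $min, $max) = @_;
    my @words = split ' ', $sentence;
    my @out;
    for my $start (0 .. $#words) {
        for my $len ($min .. $max) {
            my $end = $start + $len - 1;
            last if $end > $#words;    # window would run past the last word
            push @out, "START INDEX: $start : " . join(' ', @words[$start .. $end]);
        }
    }
    return @out;
}

print "$_\n" for ngrams("this is the text to play with", 2, 3);
```

Both versions could then be wrapped in subs and handed to cmpthese from the core Benchmark module, with the print replaced by something that collects the results so that I/O does not dominate the timing.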