Keep It Simple, Stupid PerlMonks

### Re^2: Split range 0 to M into N non-overlapping (roughly equal) ranges.

by ELISHEVA (Prior)
 on Mar 13, 2011 at 05:22 UTC ( #892900=note: print w/replies, xml ) Need Help??

FYI, there are occasional one-off rounding error issues with this algorithm (possibly platform dependent), 2.5% of the time partitioning ranges of 1..100 or less. Depending on your goals that may or may not matter, but here are some examples :

```\$M,\$N         final array element (should be \$M)
(14,11) ==> final=13
(14,13) ==> final=13

(29,11) ==> final=28
(29,13) ==> final=28
(29,22) ==> final=28
(29,26) ==> final=28

(59,11) ==> final=58
(59,13) ==> final=58
(59,22) ==> final=58
(59,26) ==> final=58
(59,44) ==> final=58
(59,52) ==> final=58
(59,55) ==> final=58

(100,49) ==> final=99
(100,98) ==> final=99
(100,99) ==> final=99

Test script

```use strict;
use warnings;

#========================================
# Algorithms
#========================================

sub ranges_javafan {
my (\$M, \$N) = @_;
return [ map {[int(\$_*(\$M+1)/\$N)
, int((\$_+1)*((\$M+1)/\$N))-1]
} 0..\$N - 1];
}

#----------------------------------------

sub ranges_javafan_buk {
my (\$M, \$N) = @_;
my \$STEP = ( \$M + 1 ) / \$N;
return [ map [int( \$_ * \$STEP ), int( ( \$_+1 ) * \$STEP ) -1]
, 0 .. \$N - 1];
}

#----------------------------------------

sub ranges_elisheva_ikegami {
my (\$M, \$N) = @_;
my \$r = (\$M+1)%\$N;
my \$q = (\$M+1-\$r)/\$N;
my \$iEnd=0;
return [ map { my \$iStart=\$iEnd; [\$iStart, (\$iEnd+=\$_) - 1]
} ((\$q+1) x \$r, (\$q) x (\$N-\$r)) ];
}

#========================================
# Check of final element of range
#========================================

sub roundCheck {
my (\$cr, \$M) = @_;
my \$iTotal=0;
my \$iErrors=0;
for my \$m (1..\$M) {
for my \$n (1..\$m) {
\$iTotal++;
my \$aRanges = \$cr->(\$m, \$n);
my \$iFinal = \$aRanges->[-1][1];
if (\$iFinal != \$m) {
\$iErrors++;
# Uncomment here to see individual errors
# ---------------------------------------
# print "(\$m,\$n) ==> final=\$iFinal\n";
}
}
}
printf "Error rate: %s\n", (\$iErrors*100/\$iTotal);
}
#-------------------------------------------------------

print "Javafan:\n";
roundCheck(\&ranges_javafan, 100);
print "Javafan_buk:\n";
roundCheck(\&ranges_javafan_buk, 100);
print "elisheva_ikegami\n";
roundCheck(\&ranges_elisheva_ikegami, 100);

#outputs (Debian Linux - Lenny, perl 5.10.0,  32-bit)
Javafan:
Error rate: 2.47524752475248
Javafan_buk:
Error rate: 2.47524752475248
elisheva_ikegami
Error rate: 0

Replies are listed 'Best First'.
Re^3: Split range 0 to M into N non-overlapping (roughly equal) ranges.
by BrowserUk (Pope) on Mar 13, 2011 at 15:42 UTC

Good catch! As originally coded, the error rate across all M/N combinations (< 2**32), seems to come out at ~ 1 in 15 (6.66%).

```#! perl -slw
use strict;
use List::Util qw[ min ];
use Math::Random::MT qw[ rand ];

\$|++;

sub check {
my( \$m, \$n ) = @_;
my \$step = ( \$m +1 ) / \$n;
my \$f = int( \$n * \$step ) -1;
return if  \$f != \$m;
return 1;
}

my \$trials = 0;
my \$fails = 0;
for ( 1 .. 1e6 ) {
my \$m = int( rand 2**32 );
for ( 1 .. min( \$m, 1000 ) ) {
++\$trials;
my \$n = 1+int( rand \$m );
check( \$m, \$n ) or ++\$fails;
}
printf "\r\$_ : %f%%", \$fails *100 / \$trials;
}
__END__
C:\test>ranges
76977645 : 6.620262%

However, a simple fudge floating point rounding correction factor of 0.000001 seems to sort things out nicely:

```#! perl -slw
use strict;
use List::Util qw[ min ];
use Math::Random::MT qw[ rand ];

\$|++;

sub check {
my( \$m, \$n ) = @_;
my \$step = ( \$m +1.000001 ) / \$n;
my \$f = int( \$n * \$step ) -1;
#    warn( "m:\$m n:\$n f:\$f\n" )
return if  \$f != \$m;
return 1;
}

my \$trials = 0;
my \$fails = 0;
for ( 1 .. 1e6 ) {
my \$m = int( rand 2**32 );
for ( 1 .. min( \$m, 1000 ) ) {
++\$trials;
my \$n = 1+int( rand \$m );
check( \$m, \$n ) or ++\$fails;
}
printf "\r\$_ : %f%%", \$fails *100 / \$trials;
}

__END__
C:\test>ranges
6783635 : 0.000000%

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

For the record, although developed independently my version at Re: Split range 0 to M into N non-overlapping (roughly equal) ranges. has the same sort of bug, triggered by at least one of the same input pairs. I guess when two people hit the same flaw, we at least know we're close to the right solution.

Since the fix is already discussed for JavaFan's map version, I'm sure the translation of that fix to the for loop can be left as an exercise. Personally, my fix would probably be to pull in bignum, which works well since it's a rounding error.

Broken:

Fixed:

```[chris@pineapple ranges-892828]\$ perl -Mbignum ranges -M=14 -N=11
0 : from       0 to       0 (      1)
1 : from       1 to       1 (      1)
2 : from       2 to       3 (      2)
3 : from       4 to       4 (      1)
4 : from       5 to       5 (      1)
5 : from       6 to       7 (      2)
6 : from       8 to       8 (      1)
7 : from       9 to       9 (      1)
8 : from      10 to      11 (      2)
9 : from      12 to      12 (      1)
10 : from      13 to      14 (      2)
[chris@pineapple ranges-892828]\$ perl -Mbignum ranges -M=13 -N=12
0 : from       0 to       0 (      1)
1 : from       1 to       1 (      1)
2 : from       2 to       2 (      1)
3 : from       3 to       3 (      1)
4 : from       4 to       4 (      1)
5 : from       5 to       6 (      2)
6 : from       7 to       7 (      1)
7 : from       8 to       8 (      1)
8 : from       9 to       9 (      1)
9 : from      10 to      10 (      1)
10 : from      11 to      11 (      1)
11 : from      12 to      13 (      2)

It's maybe slower than fudging, but I have more faith in it. It's easy enough to add a line at the top of the program to let someone else deal with the intricacies since this isn't going to be the resource-hungry part of your code anyway.

It's maybe slower than fudging, but I have more faith in it.

Having called it "fudging", there is actually very sound math sitting behind it. I'm not competent to explain the math, but I can explain my testing. I've run all permutations of M & N from 1 through 10,000; and 500 million random selections of M & N = 1 .. 2**32 without seeing a single error.

Given that in the original form, there was a 1 in 15 error rate, the odds that if there were any errors with the "fudged" algorithm, that I have not detected 1 of them after 600 million trials, are so vanishingly small as to be ... well, very, very unlikely.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Re^3: Split range 0 to M into N non-overlapping (roughly equal) ranges.
by JavaFan (Canon) on Mar 13, 2011 at 11:08 UTC
Yeah, they are rounding errors. Rearranging the calculation fixes it (at least, on my platform):
```sub ranges_javafan {
my (\$M, \$N) = @_;
return [ map {[int(\$_/\$N*(\$M+1))
, int(((\$_+1)/\$N)*(\$M+1))-1]
} 0..\$N - 1];
}
The most important is the final (\$_+1)/\$N*(\$M+1) - that should return \$M+1 for \$_ == \$N-1. The impact of other rounding errors is that a range is 1 shorter than expected (but making the next one one larger - it shouldn't skip numbers).
Re^3: Split range 0 to M into N non-overlapping (roughly equal) ranges.
by BrowserUk (Pope) on Mar 13, 2011 at 06:19 UTC

What?

```[ 6:16:11.35] C:\test>ranges -M=14 -N=11
0 : from       0 to       0 (      1)
1 : from       1 to       1 (      1)
2 : from       2 to       3 (      2)
3 : from       4 to       4 (      1)
4 : from       5 to       5 (      1)
5 : from       6 to       7 (      2)
6 : from       8 to       8 (      1)
7 : from       9 to       9 (      1)
8 : from      10 to      11 (      2)
9 : from      12 to      12 (      1)
10 : from      13 to      13 (      1)
========================================
14

[ 6:16:42.76] C:\test>ranges -M=14 -N=13
0 : from       0 to       0 (      1)
1 : from       1 to       1 (      1)
2 : from       2 to       2 (      1)
3 : from       3 to       3 (      1)
4 : from       4 to       4 (      1)
5 : from       5 to       5 (      1)
6 : from       6 to       7 (      2)
7 : from       8 to       8 (      1)
8 : from       9 to       9 (      1)
9 : from      10 to      10 (      1)
10 : from      11 to      11 (      1)
11 : from      12 to      12 (      1)
12 : from      13 to      13 (      1)
========================================
14

[ 6:17:01.29] C:\test>ranges -M=29 -N=11
0 : from       0 to       1 (      2)
1 : from       2 to       4 (      3)
2 : from       5 to       7 (      3)
3 : from       8 to       9 (      2)
4 : from      10 to      12 (      3)
5 : from      13 to      15 (      3)
6 : from      16 to      18 (      3)
7 : from      19 to      20 (      2)
8 : from      21 to      23 (      3)
9 : from      24 to      26 (      3)
10 : from      27 to      28 (      2)
========================================
29

[ 6:17:30.04] C:\test>ranges -M=100 -N=49
0 : from       0 to       1 (      2)
1 : from       2 to       3 (      2)
2 : from       4 to       5 (      2)
3 : from       6 to       7 (      2)
4 : from       8 to       9 (      2)
5 : from      10 to      11 (      2)
6 : from      12 to      13 (      2)
7 : from      14 to      15 (      2)
8 : from      16 to      17 (      2)
9 : from      18 to      19 (      2)
10 : from      20 to      21 (      2)
11 : from      22 to      23 (      2)
12 : from      24 to      25 (      2)
13 : from      26 to      27 (      2)
14 : from      28 to      29 (      2)
15 : from      30 to      31 (      2)
16 : from      32 to      34 (      3)
17 : from      35 to      36 (      2)
18 : from      37 to      38 (      2)
19 : from      39 to      40 (      2)
20 : from      41 to      42 (      2)
21 : from      43 to      44 (      2)
22 : from      45 to      46 (      2)
23 : from      47 to      48 (      2)
24 : from      49 to      50 (      2)
25 : from      51 to      52 (      2)
26 : from      53 to      54 (      2)
27 : from      55 to      56 (      2)
28 : from      57 to      58 (      2)
29 : from      59 to      60 (      2)
30 : from      61 to      62 (      2)
31 : from      63 to      64 (      2)
32 : from      65 to      67 (      3)
33 : from      68 to      69 (      2)
34 : from      70 to      71 (      2)
35 : from      72 to      73 (      2)
36 : from      74 to      75 (      2)
37 : from      76 to      77 (      2)
38 : from      78 to      79 (      2)
39 : from      80 to      81 (      2)
40 : from      82 to      83 (      2)
41 : from      84 to      85 (      2)
42 : from      86 to      87 (      2)
43 : from      88 to      89 (      2)
44 : from      90 to      91 (      2)
45 : from      92 to      93 (      2)
46 : from      94 to      95 (      2)
47 : from      96 to      97 (      2)
48 : from      98 to      99 (      2)
========================================
100

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
I'm seeing the same numbers as you. But what is your output for: ranges -M=20 -N=3 ?

I get: 0-6, 7-13, 14-20.

I expect the last index to be 19.
I expect the last index to be 19.

Why? The range is 0 .. 20 inclusive:

```[ 6:17:58.02] C:\test>ranges -M=20 -N=3
0 : from       0 to       6 (      7)
1 : from       7 to      13 (      7)
2 : from      14 to      20 (      7)
========================================
21

[ 7:10:34.53] C:\test>ranges -M=20 -N=2
0 : from       0 to       9 (     10)
1 : from      10 to      20 (     11)
========================================
21

[ 7:11:36.69] C:\test>ranges -M=20 -N=1
0 : from       0 to      20 (     21)
========================================
21

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Create A New User
Node Status?
node history
Node Type: note [id://892900]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others lurking in the Monastery: (4)
As of 2020-05-24 21:57 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
If programming languages were movie genres, Perl would be:

Results (141 votes). Check out past polls.

Notices?