Hi, TIMTOWTDI.
This approach uses two phases:
Phase-1: Find a combination of bins with at maximum two capacity classes.
Ideally use minimum number of bins with maximum capacity. If that doesn't work,
find greatest number of bins with maximum capacity and the rest of the bins will
have minimum capacity (choose min-capacity as big as possible - by configuration).
E.g., 31 bins: 31 bins a 6 capacity = 186 elements. 32 bins: 26 bins a 6 + 6 bins a 5 = 186 elements.
(I understand that it is the OP's intention to keep the bin capacity
maximised and homogeneous.)
After this phase the bin geometry is fixed (e.g. use bins with capacity 5-6) and the bins
will be filled naively - without respecting the second constraint. That will be optimised in ...
Phase-2: Compute an error measure that identifies how good the current solution is.
Now select groups of bins that have a sum that is too big or too small. Pick random elements from
these groups and cross-exchange items iff the error measure will decrease. Fall back to any
group if there is no group that satisfies the constraint (i.e. sum too big/small). Stop if all restrictions
are satisfied. Periodically, check how far we are. Stop, if the solution has not improved. Instead of stopping the simulation, one could introduce a little mutation and increase the error by randomly swapping elements. However, I noticed that it is sufficient to restart the simulation with a new set of
freshly shuffled items...
Concerning the target value to optimse for (e.g. sum of a bin shall be 852), I used something along
bin-size * mean value. When there are two capacity classes, the mean capacity is used. This leads to an evenly distributed sum per bin even when two bin-capacities are in use
(e.g. smaller bins get less but bigger items).
I hope, these assumptions were made in the OP's intention...
Here's a sample output:
partitions: 31x6 0x6
mean-cap. : 6.000
sum : 26403
target-sum: 852
mean : 141.951612903226
inital error: 67.0183766996175
=== SOLUTION FOUND ===
runtime : 0s, loops:144, improv.:57
ranges : low: 840, target: 852, high: 900
capacity: min: 6, meancap: 6, max: 6
error. : 0.593394689997349 bins: 31
1) 858 OK (sz=6, trgt= 852.0, [840-900]) 123, 131, 132, 137
+, 137, 198
2) 856 OK (sz=6, trgt= 852.0, [840-900]) 121, 123, 127, 133
+, 157, 195
3) 842 OK (sz=6, trgt= 852.0, [840-900]) 107, 109, 134, 157
+, 164, 171
4) 855 OK (sz=6, trgt= 852.0, [840-900]) 126, 128, 136, 141
+, 147, 177
5) 848 OK (sz=6, trgt= 852.0, [840-900]) 98, 121, 122, 147
+, 173, 187
6) 844 OK (sz=6, trgt= 852.0, [840-900]) 112, 129, 129, 152
+, 154, 168
7) 849 OK (sz=6, trgt= 852.0, [840-900]) 117, 129, 139, 144
+, 160, 160
8) 858 OK (sz=6, trgt= 852.0, [840-900]) 88, 91, 99, 141
+, 143, 296
9) 845 OK (sz=6, trgt= 852.0, [840-900]) 95, 117, 136, 139
+, 142, 216
10) 844 OK (sz=6, trgt= 852.0, [840-900]) 105, 121, 133, 148
+, 154, 183
11) 845 OK (sz=6, trgt= 852.0, [840-900]) 113, 114, 132, 151
+, 153, 182
12) 848 OK (sz=6, trgt= 852.0, [840-900]) 99, 122, 124, 157
+, 158, 188
13) 855 OK (sz=6, trgt= 852.0, [840-900]) 105, 116, 150, 151
+, 161, 172
14) 876 OK (sz=6, trgt= 852.0, [840-900]) 130, 135, 138, 143
+, 149, 181
15) 856 OK (sz=6, trgt= 852.0, [840-900]) 92, 126, 139, 140
+, 167, 192
16) 843 OK (sz=6, trgt= 852.0, [840-900]) 99, 111, 112, 116
+, 144, 261
17) 871 OK (sz=6, trgt= 852.0, [840-900]) 103, 109, 113, 131
+, 158, 257
18) 857 OK (sz=6, trgt= 852.0, [840-900]) 120, 122, 143, 152
+, 154, 166
19) 851 OK (sz=6, trgt= 852.0, [840-900]) 103, 105, 124, 166
+, 173, 180
20) 852 OK (sz=6, trgt= 852.0, [840-900]) 113, 122, 131, 135
+, 141, 210
21) 850 OK (sz=6, trgt= 852.0, [840-900]) 103, 116, 126, 161
+, 167, 177
22) 858 OK (sz=6, trgt= 852.0, [840-900]) 96, 132, 134, 160
+, 162, 174
23) 850 OK (sz=6, trgt= 852.0, [840-900]) 98, 130, 145, 150
+, 150, 177
24) 844 OK (sz=6, trgt= 852.0, [840-900]) 112, 121, 124, 145
+, 146, 196
25) 841 OK (sz=6, trgt= 852.0, [840-900]) 115, 118, 122, 124
+, 165, 197
26) 841 OK (sz=6, trgt= 852.0, [840-900]) 114, 130, 135, 144
+, 156, 162
27) 840 OK (sz=6, trgt= 852.0, [840-900]) 97, 111, 118, 126
+, 193, 195
28) 852 OK (sz=6, trgt= 852.0, [840-900]) 99, 122, 126, 160
+, 163, 182
29) 844 OK (sz=6, trgt= 852.0, [840-900]) 86, 105, 115, 141
+, 198, 199
30) 872 OK (sz=6, trgt= 852.0, [840-900]) 103, 120, 135, 155
+, 156, 203
31) 858 OK (sz=6, trgt= 852.0, [840-900]) 89, 111, 114, 115
+, 118, 311
use strict;
use warnings;
use List::Util 'shuffle';
# A lot could be done better ... (less globals, more structure, etc.)
# - of course, I am in a hurry! ;-)
my @a = qw(
121 182 111 160 105 113 121 97 123 157 133 161 141 135 137 145
133 137 151 118 126 141 174 181 154 109 198 114 122 162 91 99
116 122 195 199 150 192 163 88 112 157 182 210 124 105 144 166
144 257 164 156 173 154 193 142 143 126 118 130 107 86 131 154
131 147 134 118 115 135 141 158 129 143 126 128 134 129 167 130
135 117 127 146 96 117 99 99 139 152 149 136 105 124 136 160
160 139 177 115 123 103 150 183 132 171 121 114 111 113 131
144 122 141 111 139 145 109 114 122 103 160 153 147 172 155 122
296 124 112 161 124 311 99 157 122 120 198 152 140 162 177 98
138 156 177 103 180 187 173 150 135 168 132 196 112 195 126 113
116 105 116 151 216 188 158 121 166 148 132 89 197 92 115 98
130 103 120 261 143 126 167 203 95 165 129
);
#-- model data
# my ($bin_no, $min_capacity, $max_capacity, $low, $high) = (38, 4,5,
+ 694,750);
# my ($bin_no, $min_capacity, $max_capacity, $low, $high) = (32, 5,6,
+ 824,900);
my ($bin_no, $min_capacity, $max_capacity, $low, $high) = (31, 6,6,
+ 840,900);
# my ($bin_no, $min_capacity, $max_capacity, $low, $high) = (30, 6,7,
+ 840,900);
#-- simulation control
my $Improvements; # count how often we could improve (reduce error) t
+he
# current model state during the last watchdog inte
+rval
my $maxloops = 250_000; # give up after that many tries
my $watchdog = 50_000; # check current simulation state
my $loops = 0; # current number of simulation itera
+tions
my $sum; $sum += $_ for @a;
my $mean = $sum / @a;
#-- find biggest partition of bins with max. capacity
# assumption: only two partitions with bins of max- and min-capacity
my $j_max = find_biggest_partition();
print "partitions: ${j_max}x${max_capacity} ", $bin_no - $j_max, "x$mi
+n_capacity\n";
#-- balance partitions -> mean capacity:
my $mean_cap = ($max_capacity * $j_max + $min_capacity * ($bin_no - $j
+_max)) / $bin_no;
printf "mean-cap. : %.3f\n", $mean_cap;
my $target_sum = int( $mean * $mean_cap + 0.99 );
die "reduce low ($low) below $target_sum" if $low > $target_sum;
print "sum : $sum\n";
print "target-sum: $target_sum\n";
print "mean : $mean\n";
#-- create bins
my @bins = map { [ 0, [] , $target_sum, $low, $high ] } (0..$bin_no-1)
+;
# bin-element-legend:
# [0]: sum of [1]
# [1]: [ array of items ]
# [2]: individual target / used for error-computation
# [3]: individual low -limit for [0]
# [4]: individual high-limit for [0]
# - note: individual fields [2..4] might be tweaked - currently it is
+overkill
# shuffle input - allows another try if a simulation fails
@a = shuffle @a;
#-- PHASE 1 - satisfy first constraint (bin-"geometry")
# first, fill biggest partition with $max_capacity items
my $idx = 0;
foreach my $i (0..$j_max-1) {
push @{$bins[$i]->[1]}, @a[ $idx .. $idx + $max_capacity - 1 ];
$bins[$i]->[0] += $_ for (@{$bins[$i]->[1]}); # sum up
$idx += $max_capacity; # advance pointer
}
# fill rest (if any) with $min_capacity items
foreach my $i ($j_max..$bin_no-1) {
push @{$bins[$i]->[1]}, @a[ $idx .. $idx + $min_capacity - 1 ];
$bins[$i]->[0] += $_ for (@{$bins[$i]->[1]}); # sum up
$idx += $min_capacity; # advance pointer
# $bins[$i]->[2] = $indi_sum; # we could assign an indiv. target su
+m
# $bins[$i]->[3] = $indi_lo; # we could assign an indiv. low-limit
# $bins[$i]->[4] = $indi_hi; # we could assign an indiv. high-limi
+t
}
die "assertion: failed to partition \@a" if $idx != @a;
print "inital err.: ", error_ind(), "\n";
# PHASE 2 - try to optimise bins by swapping elements
while (1) {
my @wrongs = find_invalid();
last unless @wrongs; # ready!
# simulation-control
$loops++;
unless ($loops % $watchdog) {
print_result();
die "no solution - tune and re-start\n" if ($Improvements == 0 or
+$loops >= $maxloops);
$Improvements = 0;
# here, one could start a little mutation and see if that helps -
+feel free to experiment
# for (1..5) {
# my ($i,$j) = (pick_a_bin(\@bins), pick_a_bin(\@bins));
# swap_bin_element($bins[$j], $bins[$i]);
# }
}
#-- ideal: take it from the rich and give it to the poor ;-)
# (aka Robin Hood distribution)
my @poor = grep { $_->[0] < $_->[3] } @bins; # bins where sum is to
+o low
my @rich = grep { $_->[0] > $_->[4] } @bins; # bins where sum is to
+o high
if (!@rich || !@poor) {
# ok: a list of elements that fulfill size requirements
my @ok = grep { $_->[0] >= $_->[3] and $_->[0]<=$_->[4]} @bins;
@rich = @ok unless @rich;
@poor = @ok unless @poor;
}
# fall-back to any (can this happen?)
@rich = @bins unless @rich;
@poor = @bins unless @poor;
# finally try a swap that might lead to an improvement
my ($i,$j) = (pick_a_bin(\@poor), pick_a_bin(\@rich));
swap_bin_element($rich[$j], $poor[$i]);
}
print "=== SOLUTION FOUND ===\n";
print print_result();
exit;
sub find_invalid {
# check: bins out of [low <= sum <= high] range
my @inv_bins = grep { $_->[0] < $_->[3] || $_->[0]>$_->[4] } @bins;
return @inv_bins if @inv_bins;
# check: bins out of capacity-range
@inv_bins = grep { @{$_->[1]} < $min_capacity || @{$_->[1]}>$max_
+capacity } @bins;
return @inv_bins;
}
sub pick { # pick a random element from a given bin-ref
return int rand( @{$_[0]->[1]} );
}
sub pick_a_bin { # pick a random element from an array
return int rand( @{$_[0]} );
}
#-- compute some error measure / indication
# (strictly, this is not really a variation-coefficient)
sub error_ind {
my ($deltasum2, $d);
for (@bins) {
$d = ($_->[0] - $_->[2]); # deviation from target
$deltasum2 += ($d * $d); # deviation squared
}
return $deltasum2 / ( $mean * $#bins );
}
#-- swap two bin elements if the swap yields to an improvement
sub swap_bin_element {
my ($b1, $b2) = @_;
return if $b1 == $b2;
my ($i1, $i2) = (pick($b1), pick($b2)); # random strategy
my $last_err = error_ind(); # (todo: candidate for caching)
my $val1 = $b1->[1]->[$i1];
my $val2 = $b2->[1]->[$i2];
$b1->[0] -= $val1;
$b2->[0] -= $val2;
$b1->[0] += $val2;
$b2->[0] += $val1;
if (error_ind() < $last_err) { # improvement --> move (*)
$Improvements++; # we're getting better
($val1) = splice @{$b1->[1]}, $i1 , 1;
($val2) = splice @{$b2->[1]}, $i2 , 1;
push @{$b1->[1]}, $val2;
push @{$b2->[1]}, $val1;
} else { # undo
$b1->[0] -= $val2;
$b2->[0] -= $val1;
$b1->[0] += $val1;
$b2->[0] += $val2;
}
# (*) - nice location to introduce some mutations
# (e.g. ... or rand(1000)<$mutation_threshold ...)
}
sub find_biggest_partition {
## generate / check partitioning / assume exactly two partitions
my $j_max = -1; # max no. of bins with max. capacity
for my $i (0..int( @a/$min_capacity )) {
for my $j (0..int(@a/$max_capacity)) {
$j_max = $j if ( $i * $min_capacity + $j * $max_capacity == @a
+ # covers @a
and $i + $j == $bin_no # sums to bins
and $j > $j_max); # best solution
}
}
die "cannot find a partition using min-/max-capacites "
. "$min_capacity/$max_capacity\n" if $j_max < 0;
return $j_max;
}
sub print_result {
my $runtime = time() - $^T;
my $bin;
print "runtime : ${runtime}s, loops:$loops, improv.:$Improvements\n
+";
print "ranges : low: $low, target: $target_sum, high: $high\n";
print "capacity: min: $min_capacity, meancap: $mean_cap, max: $max
+_capacity\n";
print "error. : ", error_ind()," bins: $bin_no\n";
for (@bins) {
my $sum = $_->[0];
my $err = ( $sum >= $low and $sum <= $high) ? "OK" : "ERR";
printf("%2d) %4d %-4s (sz=%d, trgt=%6.1f, [%3d-%3d]) %s\n",
++$bin,
$sum,
$err,
0+@{$_->[1]},
$_->[2],
$_->[3],
$_->[4],
join(", ", map { sprintf("%4d", $_) } sort { $a <=> $b } @{$_->
+[1]} )
);
}
}
# Update: spelling/grammar/comments
-
Are you posting in the right place? Check out Where do I post X? to know for sure.
-
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big>
<blockquote> <br /> <dd>
<dl> <dt> <em> <font>
<h1> <h2> <h3> <h4>
<h5> <h6> <hr /> <i>
<li> <nbsp> <ol> <p>
<small> <strike> <strong>
<sub> <sup> <table>
<td> <th> <tr> <tt>
<u> <ul>
-
Snippets of code should be wrapped in
<code> tags not
<pre> tags. In fact, <pre>
tags should generally be avoided. If they must
be used, extreme care should be
taken to ensure that their contents do not
have long lines (<70 chars), in order to prevent
horizontal scrolling (and possible janitor
intervention).
-
Want more info? How to link
or How to display code and escape characters
are good places to start.