http://www.perlmonks.org?node_id=834698

BrowserUk has asked for the wisdom of the Perl Monks concerning the following question:

gskoli posted Help Me in following code ::, but the question wasn't very clear due to english not being his first language. And with the posted code being rather long, it didn't receive much attention. We've now clarified what he needs, and I'm posting this on his behalf to try and get him an answer.

He has the following array of values:

my @a = qw( 121 182 111 160 105 113 121 97 123 157 133 161 141 135 137 145 133 137 151 118 126 141 174 181 154 109 198 114 122 162 91 99 116 122 195 199 150 192 163 88 112 157 182 210 124 105 144 166 144 257 164 156 173 154 193 142 143 126 118 130 107 86 131 154 131 147 134 118 115 135 141 158 129 143 126 128 134 129 167 130 135 117 127 146 96 117 99 99 139 152 149 136 105 124 136 160 160 139 177 115 123 103 150 183 132 171 121 114 111 113 131 144 122 141 111 139 145 109 114 122 103 160 153 147 172 155 122 296 124 112 161 124 311 99 157 122 120 198 152 140 162 177 98 138 156 177 103 180 187 173 150 135 168 132 196 112 195 126 113 116 105 116 151 216 188 158 121 166 148 132 89 197 92 115 98 130 103 120 261 143 126 167 203 95 165 129 );

And he wishes to pack them into groups of 6, (or 5 or 4 at a pinch), such that each group totals to between 840 and 900 inclusive.

It is a variation on the bin packing problem, that is not catered for by either of the modules I've looked at. Algorithm::BinPack & Algorithm::Bucketize.

Can any of you CS guys offer him a solution? Canned or otherwise.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re: Bin packing problem variation repost (see[834245])
by jethro (Monsignor) on Apr 14, 2010 at 16:22 UTC

    I assume a solution should have no leftovers, i.e. every number should be in exactly one group and only valid groups (with totals in the range 840 to 900) should exist.

    One question is whether it can be assumed that such a solution always exists? In any case there should be a time limit after which the program stops and declares failure. My guess is that the more numbers are given the less likely it is that no solution exists. But you can always construct a not working data sample of arbitrary length, for example the array (101)x $n for any value of $n

    It sounds to me like a genetic algorithm or at least a stochastic algorithm (or whatever it is called in CS) might be good at this:

    Basically you create an initial packing (without looking really for a correct solution), then adapt that packing randomly but sensibly until you find a solution. A real genetic algorithm would work with many packings and cross the "best" of them. A simpler stochastic algorithm would only work on one packing but switch between incremental optimization (to get to a local "best" packing) and random destruction (to get out of local paths without a solution).

    I began with the programming but now I have to do something else, maybe I can finish the script tonight. There are "only" a few subs to fill out if anyone wants to do it. I post it here to show how it can be done, read the comments in the main loop

    #!/usr/bin/perl use strict; use warnings; use Data::Dumper; my @a = qw( 121 182 111 160 105 113 121 97 123 157 133 161 141 135 137 145 133 137 151 118 126 141 174 181 154 109 198 114 122 162 91 99 116 122 195 199 150 192 163 88 112 157 182 210 124 105 144 166 144 257 164 156 173 154 193 142 143 126 118 130 107 86 131 154 131 147 134 118 115 135 141 158 129 143 126 128 134 129 167 130 135 117 127 146 96 117 99 99 139 152 149 136 105 124 136 160 160 139 177 115 123 103 150 183 132 171 121 114 111 113 131 144 122 141 111 139 145 109 114 122 103 160 153 147 172 155 122 296 124 112 161 124 311 99 157 122 120 198 152 140 162 177 98 138 156 177 103 180 187 173 150 135 168 132 196 112 195 126 113 116 105 116 151 216 188 158 121 166 148 132 89 197 92 115 98 130 103 120 261 143 126 167 203 95 165 129 ); # count the sum of all numbers in a and derive from that the minimum a +nd maximum number of packs my $sum; $sum+= $_ for @a; my $minimum= int(($sum-1)/900)+1; my $maximum= int($sum/840); print "Sum is $sum, minimal $minimum packs, maximal $maximum packs\n"; die "No solution possible\n" if ($maximum<1); # put numbers into packs (an array of arrays) without looking much at +correctness my @packs=[]; my @packsum; while (@a) { $_= shift @a; push @{$packs[$#packs]}, $_; $packsum[$#packs]+= $_; if ($packsum[$#packs]>840 and @a) { push @packs, []; } } # print Dumper(\@packs,\@packsum); # now the loop with optimization and destruction my $steps; my $wanttogrow=0; while (1) { # if not enough packs, remove values from other packs to create a +new one while (@packs<$minimum) { grow_another_pack(\@packs,\@packsum); } # too much packs can't happen, we avoid this #switch two numbers, if possible between an overfilled pack and an + underfilled pack optimized_switching(\@packs,\@packsum); if (finished(\@packsum)) { last }; # if we did lots of incremental optimization without success, turn + to destruction $steps++; if ($wanttogrow) { if (@packs==$maximum) { $wanttogrow= 0; } else { grow_another_pack(\@packs,\@packsum); } } else { if (@packs==$minimum) { $wanttogrow= 1; } else { remove_a_pack(\@packs,\@packsum); } } #just randomly switch some numbers, check for success even though +unlikely if (random_switching(\@packs,\@packsum)) { last }; if (finished(\@packsum)) { last }; }
      Thanks
      Also try to show index of picked number
        And what will be left for you? :) You can even parse the solution and compute the indexes back.
Re: Bin packing problem variation repost (see[834245])
by MidLifeXis (Monsignor) on Apr 14, 2010 at 15:24 UTC

    Which is most important? Having 6 bins, or being within 840 and 900 inclusive (allowing 4 or 5 bins)?

    This brings up a few couple solutions:

    • Optimum solution: a minimum of ceil(scalar(@a) / 6) bins if possible - potentially exhaustive search - minimizing the empty slots across all bins
    • "Easy" solution: add items to bucket until 840 <= sum(@bucket) <= 900 and 4 <= scalar(@bucket) <= 6

    There may be a couple of other interpretations of this as well, but I think that these are the two that make sense, with the first, as I read it, being the most likely expensive.

    My wife is walking for a cure for MS. Please consider supporting her.

      My interpretation (gskoli can correct me if I got it wrong, when he is around), is that the OP would prefer 31 groups of 6, but if that is impossible--or perhaps if it would take/is taking too long to find--he would accept a fw groups of 5 or 4 to use up the remaining numbers.

      At this point, it is not entirely clear to me if the dataset is a one off? Or if it will vary in size etc.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Bin packing problem variation repost (see[834245])
by Fletch (Bishop) on Apr 14, 2010 at 15:25 UTC

    For some reason the partitioning example from the first chapter of Higher Order Perl (section 1.8.2, p35 in the PDF version) comes to mind.

    The cake is a lie.
    The cake is a lie.
    The cake is a lie.

      It not clear to me how to extend the partitioning (2 equal) solution to handle 31 partitions?

      The 3 partition problem is already much harder than the 2 partition. I suspect that extending that to 31 partitions would be impractical.

        Oop, you're quite right. I'd remembered it was about splitting lists of items but wasn't thinking that it's only into two groupings.

        Never mind.

        The cake is a lie.
        The cake is a lie.
        The cake is a lie.

Re: Bin packing problem variation repost (see[834245])
by SuicideJunkie (Vicar) on Apr 14, 2010 at 16:01 UTC

    Given that you are allowed a reasonable range on the total values, a looser algorithm may be appropriate.

    Try shuffling the input values, then assign them to groups arbitrarily. Sort the groups by total. Then make some trades between the highest groups and the lowest groups. (If most/all of the groups are too high, add a group with 6 zeroes in it to bring the average down and create some 4/5 item groups after the trading is done)

    If there are often many solutions, this should find one of them pretty quickly. Worst case behavior is horrible, but depending on the input you may never see such a case. If you do end up with a bad problem, you can always bail out to try something more rigorous if it fails to make progress.

Re: Bin packing problem variation repost (see[834245])
by Old_Gray_Bear (Bishop) on Apr 14, 2010 at 18:05 UTC
    This sounds like a variation on one of Dr. R. E.D. Woolsey's scheduling problems -- given a list of tasks requiring a constrained resource, what is an optimal schedule to get the most bang for the buck.

    The original is set in terms of a machine-shop with five heavy drill-presses: build the proper order of the incoming work so as to keep the presses always busy during the shift (down-tune is to be minimized). But, the presses can not be run for more than seven hours per shift with out maintenance.

    The approach (if memory serves) is:

    1. Sort the list in descending order of the amount of resource needed.
    2. Remove the first N items and assign them to the each of the N buckets.
    3. Iteration Loop:
      1. The top work-item is removed from the list, checked against the existing list of buckets, and added to the first bucket that can contain it, subject to the constraint.
      2. If the work-item won't fit any bucket in the current bucket set, then the item starts a new starts a new bucket.
    4. Repeat until the work-list is exhausted.
    At the end of the process, you have a list of K buckets with M items in each bucket.

    If K is greater than N, the number of servers (machines), then you a) have more work than you can handle and b) a measure of the value of having another server available.

    For each bucket, the sum of the resource consumed by the M items gives you that amount of additional work you could perform before running out of gas (the 'head-room').

    In addition to giving you an optimal schedule for your servers, this gives you ammunition when Management asks pointed questions. ("Do you really need so many servers?", "Do we have enough capacity?", "How many more servers will we need to add if we double the work load?", etc)

    ----
    I Go Back to Sleep, Now.

    OGB

      Simple greedy algorithms like that have a lot of ways to fail:

      Example 1: 5,3,3,3,2,2 with capacity 9

      Greedy algorithm gets: [5+3] [3+3+2] [2] Optimal is instead: [3+3+3] [5+2+2]

      The problem in general is not easy!

      Additionally, the OP problem involves different requirements; having a small number of large items in the first few bins and a whole lot of small items in the last few bins is undesirable.

Re: Bin packing problem variation repost (see[834245])
by choroba (Cardinal) on Apr 17, 2010 at 15:22 UTC

    After reading jethro's reply, I tried kind of a random solution. It is far from complete, but at least finds a solution to this particular problem :)

    And one of the solutions:

    840 [103,118,130,143,158,188] 840 [88,261,154,123,116,98] 841 [156,181,139,126,112,127] 842 [165,134,135,124,116,168] 842 [174,112,162,107,146,141] 842 [89,124,137,152,173,167] 844 [182,155,122,137,126,122] 846 [103,118,130,143,160,192] 846 [113,124,136,151,171,151] 847 [103,118,130,143,160,193] 847 [177,141,136,97,139,157] 848 [203,117,149,109,129,141] 849 [115,140,156,180,92,166] 850 [142,177,134,111,122,164] 850 [95,126,139,177,147,166] 853 [103,120,131,144,160,195] 853 [216,96,147,99,141,154] 854 [99,210,114,111,133,187] 855 [105,120,131,144,160,195] 855 [128,199,133,158,138,99] 856 [113,124,152,150,167,150] 856 [296,99,117,113,115,116] 858 [105,121,131,144,161,196] 858 [112,123,135,311,86,91] 860 [154,111,257,114,98,126] 860 [182,114,135,157,115,157] 861 [105,121,132,145,161,197] 862 [153,173,122,122,129,163] 862 [198,150,126,132,135,121] 863 [105,121,132,145,162,198] 863 [183,109,172,148,122,129]

    Update 1: The code sometimes gives no output, entering an endless loop. Most of the times, though, it finds a solution.

    Update 2: To overcome the deadlock, you can randomly pick some other group instead of the largest/smallest, i.e. change the code at the end of the loop to something like this:

    ($minsum,$maxsum) = ($max,$min); foreach my $gid (keys %groups){ if($groups{$gid}{sum} < $minsum and rand(100)<99){ $minsum = $groups{$gid}{sum}; $smallest_gid = $gid; } if($groups{$gid}{sum} > $maxsum and rand(100)<99){ $maxsum = $groups{$gid}{sum}; $largest_gid = $gid; } }

    Update 3: Code updated to handle the random escapes from the deadlock.

Re: Bin packing problem variation repost (see[834245])
by Perlbotics (Archbishop) on Apr 17, 2010 at 19:02 UTC

    Hi, TIMTOWTDI.
    This approach uses two phases:

    Phase-1: Find a combination of bins with at maximum two capacity classes. Ideally use minimum number of bins with maximum capacity. If that doesn't work, find greatest number of bins with maximum capacity and the rest of the bins will have minimum capacity (choose min-capacity as big as possible - by configuration).
    E.g., 31 bins: 31 bins a 6 capacity = 186 elements. 32 bins: 26 bins a 6 + 6 bins a 5 = 186 elements.
    (I understand that it is the OP's intention to keep the bin capacity maximised and homogeneous.)
    After this phase the bin geometry is fixed (e.g. use bins with capacity 5-6) and the bins will be filled naively - without respecting the second constraint. That will be optimised in ...

    Phase-2: Compute an error measure that identifies how good the current solution is. Now select groups of bins that have a sum that is too big or too small. Pick random elements from these groups and cross-exchange items iff the error measure will decrease. Fall back to any group if there is no group that satisfies the constraint (i.e. sum too big/small). Stop if all restrictions are satisfied. Periodically, check how far we are. Stop, if the solution has not improved. Instead of stopping the simulation, one could introduce a little mutation and increase the error by randomly swapping elements. However, I noticed that it is sufficient to restart the simulation with a new set of freshly shuffled items...
    Concerning the target value to optimse for (e.g. sum of a bin shall be 852), I used something along bin-size * mean value. When there are two capacity classes, the mean capacity is used. This leads to an evenly distributed sum per bin even when two bin-capacities are in use (e.g. smaller bins get less but bigger items).
    I hope, these assumptions were made in the OP's intention...

    Here's a sample output:
    partitions: 31x6 0x6 mean-cap. : 6.000 sum : 26403 target-sum: 852 mean : 141.951612903226 inital error: 67.0183766996175 === SOLUTION FOUND === runtime : 0s, loops:144, improv.:57 ranges : low: 840, target: 852, high: 900 capacity: min: 6, meancap: 6, max: 6 error. : 0.593394689997349 bins: 31 1) 858 OK (sz=6, trgt= 852.0, [840-900]) 123, 131, 132, 137 +, 137, 198 2) 856 OK (sz=6, trgt= 852.0, [840-900]) 121, 123, 127, 133 +, 157, 195 3) 842 OK (sz=6, trgt= 852.0, [840-900]) 107, 109, 134, 157 +, 164, 171 4) 855 OK (sz=6, trgt= 852.0, [840-900]) 126, 128, 136, 141 +, 147, 177 5) 848 OK (sz=6, trgt= 852.0, [840-900]) 98, 121, 122, 147 +, 173, 187 6) 844 OK (sz=6, trgt= 852.0, [840-900]) 112, 129, 129, 152 +, 154, 168 7) 849 OK (sz=6, trgt= 852.0, [840-900]) 117, 129, 139, 144 +, 160, 160 8) 858 OK (sz=6, trgt= 852.0, [840-900]) 88, 91, 99, 141 +, 143, 296 9) 845 OK (sz=6, trgt= 852.0, [840-900]) 95, 117, 136, 139 +, 142, 216 10) 844 OK (sz=6, trgt= 852.0, [840-900]) 105, 121, 133, 148 +, 154, 183 11) 845 OK (sz=6, trgt= 852.0, [840-900]) 113, 114, 132, 151 +, 153, 182 12) 848 OK (sz=6, trgt= 852.0, [840-900]) 99, 122, 124, 157 +, 158, 188 13) 855 OK (sz=6, trgt= 852.0, [840-900]) 105, 116, 150, 151 +, 161, 172 14) 876 OK (sz=6, trgt= 852.0, [840-900]) 130, 135, 138, 143 +, 149, 181 15) 856 OK (sz=6, trgt= 852.0, [840-900]) 92, 126, 139, 140 +, 167, 192 16) 843 OK (sz=6, trgt= 852.0, [840-900]) 99, 111, 112, 116 +, 144, 261 17) 871 OK (sz=6, trgt= 852.0, [840-900]) 103, 109, 113, 131 +, 158, 257 18) 857 OK (sz=6, trgt= 852.0, [840-900]) 120, 122, 143, 152 +, 154, 166 19) 851 OK (sz=6, trgt= 852.0, [840-900]) 103, 105, 124, 166 +, 173, 180 20) 852 OK (sz=6, trgt= 852.0, [840-900]) 113, 122, 131, 135 +, 141, 210 21) 850 OK (sz=6, trgt= 852.0, [840-900]) 103, 116, 126, 161 +, 167, 177 22) 858 OK (sz=6, trgt= 852.0, [840-900]) 96, 132, 134, 160 +, 162, 174 23) 850 OK (sz=6, trgt= 852.0, [840-900]) 98, 130, 145, 150 +, 150, 177 24) 844 OK (sz=6, trgt= 852.0, [840-900]) 112, 121, 124, 145 +, 146, 196 25) 841 OK (sz=6, trgt= 852.0, [840-900]) 115, 118, 122, 124 +, 165, 197 26) 841 OK (sz=6, trgt= 852.0, [840-900]) 114, 130, 135, 144 +, 156, 162 27) 840 OK (sz=6, trgt= 852.0, [840-900]) 97, 111, 118, 126 +, 193, 195 28) 852 OK (sz=6, trgt= 852.0, [840-900]) 99, 122, 126, 160 +, 163, 182 29) 844 OK (sz=6, trgt= 852.0, [840-900]) 86, 105, 115, 141 +, 198, 199 30) 872 OK (sz=6, trgt= 852.0, [840-900]) 103, 120, 135, 155 +, 156, 203 31) 858 OK (sz=6, trgt= 852.0, [840-900]) 89, 111, 114, 115 +, 118, 311
Re: Bin packing problem variation repost (see[834245])
by duelafn (Parson) on Apr 15, 2010 at 12:04 UTC

    Not sure if the naïve approach is "Good Enough", but here it is for reference. I sort the list; pack one item into each bin (of 31); then reverse the bins for the next pass, thus hoping to balance out the weight.

    Results:

    Bin 1 (946): 311 151 151 124 123 86 Bin 2 (933): 296 152 150 124 123 88 Bin 3 (898): 261 152 150 124 122 89 Bin 4 (897): 257 153 150 124 122 91 Bin 5 (859): 216 154 149 126 122 92 Bin 6 (855): 210 154 148 126 122 95 Bin 7 (848): 203 154 147 126 122 96 Bin 8 (846): 199 155 147 126 122 97 Bin 9 (845): 198 156 146 126 121 98 Bin 10 (845): 198 156 145 127 121 98 Bin 11 (847): 197 157 145 128 121 99 Bin 12 (846): 196 157 144 129 121 99 Bin 13 (844): 195 157 144 129 120 99 Bin 14 (845): 195 158 144 129 120 99 Bin 15 (845): 193 158 143 130 118 103 Bin 16 (846): 192 160 143 130 118 103 Bin 17 (842): 188 160 143 130 118 103 Bin 18 (840): 187 160 142 131 117 103 Bin 19 (837): 183 160 141 131 117 105 Bin 20 (836): 182 161 141 131 116 105 Bin 21 (837): 182 161 141 132 116 105 Bin 22 (837): 181 162 141 132 116 105 Bin 23 (836): 180 162 140 132 115 107 Bin 24 (836): 177 163 139 133 115 109 Bin 25 (837): 177 164 139 133 115 109 Bin 26 (840): 177 165 139 134 114 111 Bin 27 (837): 174 166 138 134 114 111 Bin 28 (836): 173 166 137 135 114 111 Bin 29 (837): 173 167 137 135 113 112 Bin 30 (835): 172 167 136 135 113 112 Bin 31 (835): 171 168 136 135 113 112

    And the code:

    Good Day,
        Dean

      I haven't run your code, but as I see no reference to either 840 or 900 within it, I somehow think you missed some of the requirements?


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        Here's some code using randomizations and a modified "best fit" algorithm. It enforces the 900 cap, but still ignores the 840 minimum (mostly out of necessity). The naïve approach above has better load balance than I have been able to achieve here (and is also simpler and faster).

        Update: Added some more packing algorithms (though, "next fit" and "worst fit" are unlikely to produce any results)

        #!/usr/bin/perl use strict; use warnings; use 5.010; use POSIX 'ceil'; use List::Util qw/ sum shuffle min /; my @a = sort { $b <=> $a } qw( 121 182 111 160 105 113 121 97 123 157 133 161 141 135 137 145 133 1 +37 151 118 126 141 174 181 154 109 198 114 122 162 91 99 116 122 195 19 +9 150 192 163 88 112 157 182 210 124 105 144 166 144 257 164 156 173 154 1 +93 142 143 126 118 130 107 86 131 154 131 147 134 118 115 135 141 158 1 +29 143 126 128 134 129 167 130 135 117 127 146 96 117 99 99 139 152 149 + 136 105 124 136 160 160 139 177 115 123 103 150 183 132 171 121 114 111 +113 131 144 122 141 111 139 145 109 114 122 103 160 153 147 172 155 122 +296 124 112 161 124 311 99 157 122 120 198 152 140 162 177 98 138 156 17 +7 103 180 187 173 150 135 168 132 196 112 195 126 113 116 105 116 151 216 +188 158 121 166 148 132 89 197 92 115 98 130 103 120 261 143 126 167 203 + 95 165 129 ); my $MAX_ITEMS = 6; my $bins = try_hard( ceil(@a/6), 900, \@a, \&pack_best_fit, 5000 ); show_bins( $bins ); sub min_bin { min(map sum(@$_), @{$_[0]}) } sub show_bins { my $bins = shift; my $i = 1; for (@$bins) { my $size = sum @$_; say "Bin $i ($size): @$_"; $i++; } } sub try_hard { my ($bins, $size, $data, $sub, $attempts) = @_; my $best; my $max; for (1..$attempts) { my $new = $sub->($bins, $size, $data); next unless @$new == $bins; my $test = min_bin($new); next if $max and $test <= $max; $best = $new; $max = $test; } continue { @$data = shuffle @$data } return $best; } sub pack_best_fit { my ($n, $size, $data) = @_; my @bins = map [], 1..$n; our @free = map $size, 1..$n; for my $item (@$data) { my $i = [-1, 1+$size]; for (0..$#free) { next unless $free[$_] >= $item and @{$bins[$_]} < $MAX_ITEMS; @$i = ($_, $free[$_]) if $free[$_] < $$i[1]; } $i = $$i[0]; if ($i >= 0) { push @{$bins[$i]}, $item; $free[$i] -= $item; } else { push @bins, [$item]; push @free, $size-$item; } } return \@bins; } sub pack_next_fit { my ($n, $size, $list) = @_; my @bins = ([]); my $free = $size; for my $item (@$list) { if ($free >= $item and @{$bins[-1]} < $MAX_ITEMS) { push @{$bins[-1]}, $item; $free -= $item; } else { push @bins, [$item]; $free = $size-$item; } } return \@bins; } sub pack_first_fit { my ($n, $size, $list) = @_; my @bins = ([]); my @free = ($size); for my $item (@$list) { my $i; for (0..$#free) { if ($free[$_] >= $item and @{$bins[$_]} < $MAX_ITEMS) { $i = $_; last; } } if (defined $i) { push @{$bins[$i]}, $item; $free[$i] -= $item; } else { push @bins, [$item]; push @free, $size-$item; } } return \@bins; } sub pack_worst_fit { my ($n, $size, $data) = @_; my @bins = map [], 1..$n; our @free = map $size, 1..$n; for my $item (@$data) { my $i = 0; for (1..$#free) { $i = $_ if $free[$_] > $free[$i]; } if ($free[$i] >= $item and @{$bins[$i]} < $MAX_ITEMS) { push @{$bins[$i]}, $item; $free[$i] -= $item; } else { push @bins, [$item]; push @free, $size-$item; } } return \@bins; }

        Good Day,
            Dean

        No, I saw those. I just chose to ignore them :) ... In particular, I did some additional tests using randomization with the traditional bin packing algorithms and found that the less-packed bins were typically filled to only 760 or so. Additionally, since sum(@a)/31 = 851.7, strictly enforcing the 840-900 limits is difficult/unlikely. Thus, I took the 840-900 limits to really mean "as balanced as possible", hence my solution.

        Good Day,
            Dean

Re: Bin packing problem variation repost (see[834245])
by ambrus (Abbot) on Apr 15, 2010 at 14:03 UTC

      I don't think brute force combinatorics is going to work. I get the number of groupings to be checked as: 53,011,617,022

Re: Bin packing problem variation repost (see[834245])
by Limbic~Region (Chancellor) on Apr 17, 2010 at 22:40 UTC
    BrowserUk,
    I have skim read the referenced thread and these are apparently random numbers which implies a normal distribution as many values above the mean as below the mean. Ignoring the requirement from the other thread about not having duplicates in any of the groups, I would apply a similar technique as I did in Re: Average Price Algorithm. Essentially:
    1. Determine the average of the entire list
    2. Start filling buckets of 6 picking the value in the list that brings the average of the group closest to the target average without going over the max for the group

    As it turns out, the average * 6 = 851.71. This allows a little wiggle room. If there are left overs, you can spend some time swapping things around but also limit how many total iterations or time you spend swapping. I would code this up if I had time but I don't tonight. The code from the referenced node should make it clear the intent though.

    Cheers - L~R

      Ignoring the requirement from the other thread about not having duplicates in any of the groups,

      If you re-read the other thread, the OP was concerned that no index was re-used. That is, whilst there are only 94 values, there are 186 indexes and each index should only be used once.

      To simplify in the traditional way. There are 186 balls each with a number on that must be distributed into 31 boxes, 6 per box, such that the sums of the values on the balls in each box is between 840-900 inclusive.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        BrowserUk,
        I am just now getting a chance to look at this. While I haven't had an opportunity to implement my heuristic algorithm outlined above, brute forcing it should be in the realm of possibility.
        186 Choose 6 = 53,011,617,022

        See Arbitrarily Nested Loops where I indicated I could get 25-30 million iterations per second using a modest windows machine (albeit in C). That makes checking all possible combinations in about 35 minutes. Of course, not all possible combinations need to be checked because every item you choose reduces the possible choices (can't undershoot or overshoot the target sum). I am a terrible C programmer and could not make a generic interface though ikegami gave it a go.

        Since I don't really understand the OP's objective (I readily admit to having only skim read the original thread), I am not sure what approach to go with. My inclination would be a heuristic solution that almost always produces an acceptable solution though not always the best versus and exhaustive search that short circuits when it has found an acceptable solution or an exhaustive search that finds all possible acceptable solutions.

        Cheers - L~R