Divide an array into 2 subsets to verify their sum is equal or not.

bimleshsharma has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Divide an array into 2 subsets to verify their sum is equal or not. by choroba (Cardinal) on May 02, 2013 at 07:50 UTC
To sum a list, I used the `sum` from List::Util. I made the following observation: if the array can be split, then the sum of each part is the half of the sum of the whole array. I used vec to generate binary vectors to be used as Indicator function. Checking a half of the possible vectors is enough, the rest is complementary (i.e. the two subsets are swapped). `#!/usr/bin/perl use warnings; use strict; use List::Util qw(sum); sub is_divisible { my $array = shift; my $sum = sum(@$array) / 2; for my $bitmask (1 .. 2 ** $#$array - 1) { return 1 if sum(map { $array->[$_] * vec $bitmask, $_, 1} 0 .. + $#$array) == $sum; } return; } my @arrays = ( [qw(1 3 5 7)], [qw(1 3 8 4)], [qw(1 6 2)], [qw(5 5 4 6 2 8 1 9)], ); for my $array (@arrays) { print "@$array: ", is_divisible($array) ? 'yes' : 'no', "\n"; }` [download] لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ	[reply] [d/l] [select]
Re: Divide an array into 2 subsets to verify their sum is equal or not. by BrowserUk (Patriarch) on May 02, 2013 at 09:51 UTC
choroba's Explore-all-possible-combinations mechanism works okay for smallish sets (max:32 or 64 depending upon your Perl), but will get very slow for anything much larger than 20 or so. This will very quickly (less than 0.001 of a second) find a solution, if one exists, for sets of 100s or 1000s of elements. : #! perl -slw use strict; use Time::HiRes qw[ time ]; use List::Util qw[ sum ]; sub partition { my $sum = sum @_; return if $sum & 1; $sum /= 2; my @s = sort{ $b <=> $a } @_; my @a; my( $t, $n ) = ( 0, -1 ); $t + $s[$n] <= $sum and $t+= $s[$n] and push @a, $n while ++$n < @ +s and $t <= $sum; @a = delete @s[ @a ]; @s = grep defined, @s; return unless sum( @a ) == sum( @s ); return \@a, \@s; } our $N //= 64; my( $a, $b ) = partition 1,3,5,7; print "sum( @{ $a } ) == sum( @{ $b } )" if $a; my @set = map int( rand 100 ), 1 .. $N; my $start = time; ( $a, $b ) = partition @set; printf "Took %f seconds\n", time() - $start; if( $a ) { printf "(%u) == sum( @{ $a } ) == sum( @{ $b } )\n", sum @$a; } else { print "No solution existed for the $N element set @set"; } [download] A few runs: <Reveal this spoiler or all in this thread> With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday' Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.	[reply] [d/l] [select]
Re^2: Divide an array into 2 subsets to verify their sum is equal or not. by choroba (Cardinal) on May 02, 2013 at 10:52 UTC
It gives a wrong answer for `2, 12, 4` [download] لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ	[reply] [d/l]
Re^3: Divide an array into 2 subsets to verify their sum is equal or not. by BrowserUk (Patriarch) on May 02, 2013 at 11:14 UTC
Corrected. Thanks. With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday' Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.	[reply]
Re: Divide an array into 2 subsets to verify their sum is equal or not. by kcott (Archbishop) on May 02, 2013 at 10:35 UTC
G'day bimleshsharma, Update: <Reveal this spoiler or all spoilers in this node or all in this thread> Update2: Original solution substantially rewritten! It had various problems: Failed with data like `[ 7, 5, 3, 3]` (see ++roboticus' response) and `[ 8, 4, 4, 7, 6, 3 ]` (see ++hdb's explanation). Contained some fairly crappy code (see ++BrowserUk's questions). Didn't test edge/corner cases such as empty array, single element array, arrays containing only zeros. Here's the original code I posted: Here's my take on a solution. I added some additional test data (one copied from choroba's solution). #!/usr/bin/env perl -l use strict; use warnings; use List::Util qw{sum}; my @test_arrays = ( [1, 3, 8, 4], [1, 6, 2], [1, 3, 5, 7], [4, 3, 2, 2, 1], [4, 3, 2, 2, 2, 2, 1], [5, 5, 4, 6, 2, 8, 1, 9], ); check_arrays($_) for @test_arrays; sub check_arrays { my @full_array = @{shift()}; print "Checking: (", join(", " => @full_array), ")"; my $sum = sum @full_array; if ($sum % 2) { print "\tSubsets not equal"; return; } my $half = $sum / 2; my (@a1, @a2); for (sort { $b <=> $a } @full_array) { push @{(sum(@a1) \|\| 0) + $_ <= $half ? \@a1 : \@a2} => $_; } print "\tSubsets: (", join(", " => sort { $a <=> $b } @a1), ") and (", join(", " => sort { $a <=> $b } @a2), ")"; print "\tSubset sum = $half"; } [download] Output: `$ pm_split_equal_sums.pl Checking: (1, 3, 8, 4) Subsets: (8) and (1, 3, 4) Subset sum = 8 Checking: (1, 6, 2) Subsets not equal Checking: (1, 3, 5, 7) Subsets: (1, 7) and (3, 5) Subset sum = 8 Checking: (4, 3, 2, 2, 1) Subsets: (2, 4) and (1, 2, 3) Subset sum = 6 Checking: (4, 3, 2, 2, 2, 2, 1) Subsets: (1, 3, 4) and (2, 2, 2, 2) Subset sum = 8 Checking: (5, 5, 4, 6, 2, 8, 1, 9) Subsets: (1, 2, 8, 9) and (4, 5, 5, 6) Subset sum = 20` [download] Here's the rewrite: #!/usr/bin/env perl -l use strict; use warnings; use List::Util qw{sum}; use Test::More; my @test_equal_subsets = ( [1, 3, 8, 4], [1, 3, 5, 7], [4, 3, 2, 2, 1], [4, 3, 2, 2, 2, 2, 1], [5, 5, 4, 6, 2, 8, 1, 9], [1, 1], [2, 2], [], [0], [0, 0], [0, 0, 0], [0, 0, 0, 0], ); my @test_unequal_subsets = ( [1, 6, 2], [7, 5, 3, 3], [8, 4, 4, 7, 6, 3], [0, 1], [1, 2], [1], [2], ); plan tests => scalar @test_equal_subsets + scalar @test_unequal_subset +s; for (@test_equal_subsets) { is(check_arrays($_), 1, 'Expecting equal subsets.'); } for (@test_unequal_subsets) { is(check_arrays($_), 0, 'Not expecting equal subsets.'); } sub check_arrays { my $full_array = shift; print "Checking: (", join(", " => @$full_array), ")"; if (! grep { $_ } @$full_array) { print "\tSubsets: (", join(", " => @$full_array), ') and ()'; print "\tSubset sum = 0"; return 1; } my $full_sum = sum @$full_array; if ($full_sum % 2) { print "\tSubsets not equal."; return 0; } my $half_sum = $full_sum / 2; my (@a1, @a2); my $a1_total = 0; for (sort { $b <=> $a } @$full_array) { if ($a1_total + $_ <= $half_sum) { push @a1, $_; $a1_total += $_; } else { push @a2, $_; } } if ($a1_total == $half_sum) { print "\tSubsets: (", join(", " => sort { $a <=> $b } @a1), ") and (", join(", " => sort { $a <=> $b } @a2), ")"; print "\tSubset sum = $half_sum"; return 1; } else { print "\tSubsets not equal."; return 0 } } [download] Here's `prove` output: `$ prove pm_split_equal_sums.pl pm_split_equal_sums.pl .. ok All tests successful. Files=1, Tests=19, 0 wallclock secs ( 0.03 usr 0.01 sys + 0.02 cusr + 0.00 csys = 0.06 CPU) Result: PASS` [download] Here's the full output: $ pm_split_equal_sums.pl 1..19 Checking: (1, 3, 8, 4) Subsets: (8) and (1, 3, 4) Subset sum = 8 ok 1 - Expecting equal subsets. Checking: (1, 3, 5, 7) Subsets: (1, 7) and (3, 5) Subset sum = 8 ok 2 - Expecting equal subsets. Checking: (4, 3, 2, 2, 1) Subsets: (2, 4) and (1, 2, 3) Subset sum = 6 ok 3 - Expecting equal subsets. Checking: (4, 3, 2, 2, 2, 2, 1) Subsets: (1, 3, 4) and (2, 2, 2, 2) Subset sum = 8 ok 4 - Expecting equal subsets. Checking: (5, 5, 4, 6, 2, 8, 1, 9) Subsets: (1, 2, 8, 9) and (4, 5, 5, 6) Subset sum = 20 ok 5 - Expecting equal subsets. Checking: (1, 1) Subsets: (1) and (1) Subset sum = 1 ok 6 - Expecting equal subsets. Checking: (2, 2) Subsets: (2) and (2) Subset sum = 2 ok 7 - Expecting equal subsets. Checking: () Subsets: () and () Subset sum = 0 ok 8 - Expecting equal subsets. Checking: (0) Subsets: (0) and () Subset sum = 0 ok 9 - Expecting equal subsets. Checking: (0, 0) Subsets: (0, 0) and () Subset sum = 0 ok 10 - Expecting equal subsets. Checking: (0, 0, 0) Subsets: (0, 0, 0) and () Subset sum = 0 ok 11 - Expecting equal subsets. Checking: (0, 0, 0, 0) Subsets: (0, 0, 0, 0) and () Subset sum = 0 ok 12 - Expecting equal subsets. Checking: (1, 6, 2) Subsets not equal. ok 13 - Not expecting equal subsets. Checking: (7, 5, 3, 3) Subsets not equal. ok 14 - Not expecting equal subsets. Checking: (8, 4, 4, 7, 6, 3) Subsets not equal. ok 15 - Not expecting equal subsets. Checking: (0, 1) Subsets not equal. ok 16 - Not expecting equal subsets. Checking: (1, 2) Subsets not equal. ok 17 - Not expecting equal subsets. Checking: (1) Subsets not equal. ok 18 - Not expecting equal subsets. Checking: (2) Subsets not equal. ok 19 - Not expecting equal subsets. [download] Update3: Fixed a bug and added some features: <Reveal this spoiler> Update4: Fixed some bugs and changed volume testing. Display issue with subset output. Duplicate data due to "`@a2 ... @a2`" not being "`@a1 ... @a2`". - FIXED. `while` loop in `check_arrays()` had flaws. This has been pretty much rewritten. - FIXED. Volume testing was decimal-based, now it's octal-based. Previously, `--volume_power_max=3` [or `--vpm=3`] created arrays of up to `103` (1,000) elements; now, the value is `83` (512) elements. Decimal-based was a bad choice as neither `1 .. 10` nor `1 .. 100` can be split into two equal portions. Added a few more tests. Here's Update4's version of `pm_split_equal_sums.pl`: #!/usr/bin/env perl -l use strict; use warnings; use List::Util qw{first sum}; use Test::More; use Time::HiRes qw{time}; use Getopt::Long; my %opt = ( test_more => 1, time_hires => 1, volume_tests => 0, volume_power_max => 3, array_limit => 3, ); GetOptions(map { join('\|' => @{[join '' => /(?>^\|_)([a-z])/gi]}, $_) . ':i' => \$op +t{$_} } keys %opt); my $test_equal_subsets = [ [1, 3, 8, 4], [1, 3, 5, 7], [4, 3, 2, 2, 1], [4, 3, 2, 2, 2, 2, 1], [5, 5, 4, 6, 2, 8, 1, 9], [8, 4, 4, 7, 6, 3], [1, 1], [2, 2], [], [0], [0, 0], [0, 0, 0], [0, 0, 0, 0], [ (1) x 100 ], [ 1 .. 1000 ], ]; my $test_unequal_subsets = [ [1, 6, 2], [7, 5, 3, 3], [1, 2 ,3, 7], [0, 1], [1, 2], [1], [2], [8, 1, 2, 3], [ 1 .. 10 ], [ 1 .. 100 ], ]; if ($opt{volume_tests}) { for (1 .. $opt{volume_power_max}) { my @volume = map { (($_), ($_)) } 1 .. 8$_ / 2; push @$test_equal_subsets, [@volume]; push @$test_unequal_subsets, [@volume, 8(2 * $_)]; } } if ($opt{test_more}) { plan tests => scalar @$test_equal_subsets + scalar @$test_unequal_ +subsets; } my @expectations = ('Not expecting equal subsets.', 'Expecting equal s +ubsets.'); my @subsets_data = ([$test_unequal_subsets, 0, 0], [$test_equal_subset +s, 1, 1]); for (@subsets_data) { my ($subsets, $expect_code, $expect_name_index) = @$_; my $expect_name = $expectations[$expect_name_index]; for (@$subsets) { my $start = time if $opt{time_hires}; if ($opt{test_more}) { is(check_arrays($_), $expect_code, $expect_name); } else { check_arrays($_); } printf "Took %f seconds\n", time() - $start if $opt{time_hires +}; } } sub check_arrays { my $full_array = shift; print 'Checking: (', array_string($full_array), ')'; if (! grep { $_ } @$full_array) { print "\tSubsets: (", array_string($full_array), ') and ()'; print "\tSubset sum = 0"; return 1; } my $full_sum = sum @$full_array; if ($full_sum % 2) { print "\tSubsets not equal: sum of starting array is odd ($ful +l_sum)."; return 0; } my $half_sum = $full_sum / 2; my @sorted_array = sort { $b % 2 <=> $a % 2 \|\| $b <=> $a } @$full_ +array; if (my $big = first { $_ > $half_sum } @sorted_array) { print "\tSubsets not equal: element ($big) larger than sum of +rest."; return 0; } my (@a1, @a2); my $total = 0; while (@sorted_array) { push @a1, shift @sorted_array; $total += $a1[$#a1]; @sorted_array = map { $total + $_ <= $half_sum ? do { push @a1, $_; $total += $_; () } : $_ } @sorted_array; if ($total == $half_sum) { (@a2, @sorted_array) = (@a2, @sorted_array); } else { push @a2, pop @a1 if @a1; } } if ($total == $half_sum) { print "\tSubsets: (", array_string([sort { $a <=> $b } @a1]), +')'; print "\t and (", array_string([sort { $a <=> $b } @a2]), +')'; print "\tSubset sum = $half_sum"; return 1; } else { print "\tSubsets not equal: no solution found."; return 0 } } sub array_string { my $array = shift; return join(', ' => @$array > 3 * $opt{array_limit} ? ( @$array[0 .. $opt{array_limit} - 1], " ... [snip: @{[@$array - 2 * $opt{array_limit}]} elements +] ...", @$array[@$array - $opt{array_limit} .. $#$array] ) : @$array); } [download] Here's a test run. Note that this uses `--vpm=8` and final volume test "`Took 89.489836 seconds`" — you might want to start with a lower value. <Reveal this spoiler> -- Ken	[reply] [d/l] [select]
Re^2: Divide an array into 2 subsets to verify their sum is equal or not. by hdb (Monsignor) on May 02, 2013 at 11:08 UTC
It does not work for `[ 8, 4, 4, 7, 6, 3 ]`. This is an NP-complete problem http://en.wikipedia.org/wiki/Knapsack_problem. Which means there is no (known) solution in polynomial time. Applies to BrowserUK's solution as well.	[reply] [d/l]
Re^3: Divide an array into 2 subsets to verify their sum is equal or not. by BrowserUk (Patriarch) on May 02, 2013 at 11:29 UTC
This is an NP-complete problem Only true if the OP was looking for an optimum solution. He isn't: Subset size is not matter, Behooves you to read the actual question. With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday' Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.	[reply]
Re^3: Divide an array into 2 subsets to verify their sum is equal or not. by BrowserUk (Patriarch) on May 02, 2013 at 11:23 UTC
Applies to BrowserUK's solution as well. Look again :) #! perl -slw use strict; use Time::HiRes qw[ time ]; use List::Util qw[ sum ]; sub partition { my $sum = sum @_; return if $sum & 1; $sum /= 2; my @s = sort{ $b <=> $a } @_; my @a; my( $t, $n ) = ( 0, -1 ); $t + $s[$n] <= $sum and $t+= $s[$n] and push @a, $n while ++$n < @ +s and $t <= $sum; @a = delete @s[ @a ]; @s = grep defined, @s; return unless sum( @a ) == sum( @s ); return \@a, \@s; } my $start = time; my( $a, $b ) = partition 8, 4, 4, 7, 6, 3; my @set = map int( rand 100 ), 1 .. $N; printf "Took %f seconds\n", time() - $start; if( $a ) { printf "(%u) == sum( @{ $a } ) == sum( @{ $b } )\n", sum @$a; } else { print "No solution existed for 8, 4, 4, 7, 6, 3"; } __END__ No solution existed for 8, 4, 4, 7, 6, 3 Took 0.000258 seconds [download] With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday' Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice. div	[reply] [d/l]
Re^4: Divide an array into 2 subsets to verify their sum is equal or not. by hdb (Monsignor) on May 02, 2013 at 11:27 UTC
Re^5: Divide an array into 2 subsets to verify their sum is equal or not. by BrowserUk (Patriarch) on May 02, 2013 at 11:32 UTC
Re^3: Divide an array into 2 subsets to verify their sum is equal or not. (Partition Problem) by LanX (Saint) on May 02, 2013 at 22:08 UTC
> This is an NP-complete problem http://en.wikipedia.org/wiki/Knapsack_problem Yes, to be precise a sub class known as "Partition Problem". See WP article for some efficient algorithms and further links. I wonder who and why is posting well known scientific problems w/o references ...? Cheers Rolf ( addicted to the Perl Programming Language)	[reply]
Re^4: Divide an array into 2 subsets to verify their sum is equal or not. (Partition Problem) by hdb (Monsignor) on May 03, 2013 at 08:14 UTC
Re^5: Divide an array into 2 subsets to verify their sum is equal or not. (NP != unsolvable)) by BrowserUk (Patriarch) on May 03, 2013 at 10:35 UTC
Some notes below your chosen depth have not been shown here
Re^5: Divide an array into 2 subsets to verify their sum is equal or not. (Partition Problem) by kcott (Archbishop) on May 03, 2013 at 18:54 UTC
Re^5: Divide an array into 2 subsets to verify their sum is equal or not. (Partition Problem) by LanX (Saint) on May 03, 2013 at 11:47 UTC
Re^4: Divide an array into 2 subsets to verify their sum is equal or not. (Partition Problem) by hdb (Monsignor) on May 03, 2013 at 05:52 UTC
Re^5: Divide an array into 2 subsets to verify their sum is equal or not. (Partition Problem) by LanX (Saint) on May 03, 2013 at 11:41 UTC
Some notes below your chosen depth have not been shown here
Re^5: Divide an array into 2 subsets to verify their sum is equal or not. (Partition Problem) by bimleshsharma (Beadle) on May 03, 2013 at 06:48 UTC
Re^3: Divide an array into 2 subsets to verify their sum is equal or not. by kcott (Archbishop) on May 02, 2013 at 21:12 UTC
++ Thanks. I've rewritten the solution to handle those sorts of cases. -- Ken	[reply]
Re^4: Divide an array into 2 subsets to verify their sum is equal or not. by LanX (Saint) on May 02, 2013 at 22:29 UTC
Re^5: Divide an array into 2 subsets to verify their sum is equal or not. by kcott (Archbishop) on May 02, 2013 at 22:47 UTC
Re^2: Divide an array into 2 subsets to verify their sum is equal or not. by roboticus (Chancellor) on May 02, 2013 at 11:00 UTC
kcott: `[ 7, 5, 3, 3]` ;^) ...roboticus When your only tool is a hammer, all problems look like your thumb.	[reply] [d/l]
Re^3: Divide an array into 2 subsets to verify their sum is equal or not. by kcott (Archbishop) on May 02, 2013 at 21:08 UTC
++ Thanks. That node was something of a minor disaster. I've rewritten the solution. -- Ken	[reply]
Re^2: Divide an array into 2 subsets to verify their sum is equal or not. by BrowserUk (Patriarch) on May 02, 2013 at 11:10 UTC
A couple of questions: Why pass in a reference if the first thing you are going to do is copy the reference array to a local array? `sub check_arrays { my @full_array = @{shift()};` [download] Why make a local copy of the array at all, when all the uses (join, sum, sort) of it require you to pass a list? Ie. Why not `my $sum = sum @$ref;` etc. Isn't re-summing your partial array over and over wildly inefficient? With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday' Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.	[reply] [d/l] [select]
Re^3: Divide an array into 2 subsets to verify their sum is equal or not. by kcott (Archbishop) on May 02, 2013 at 21:19 UTC
++ Thanks. All your points are perfectly valid. There were other issues as well. I've substantially rewritten the solution. -- Ken	[reply]
Re: Divide an array into 2 subsets to verify their sum is equal or not. by hdb (Monsignor) on May 02, 2013 at 09:47 UTC
Using recursion: use strict; use warnings; use List::Util qw/sum/; # finds one solution sub findsum { my ($target, @array) = @_; while( @array ) { my $cand = shift @array; return () if $cand > $target; return ( $cand ) if $cand==$target; my @sol = findsum( $target-$cand, @array ); return ($cand, @sol) if @sol; } return (); } my @array = qw(1 3 5 7); my $total = sum(@array); die "Odd total $total cannot be split!\n" if $total % 2; my @sol = findsum( $total/2, sort @array ); if( @sol ) { print "Solution: ",join( ",", @sol), "\n"; } else { print "No solution."; } [download]	[reply] [d/l]
Re: Divide an array into 2 subsets to verify their sum is equal or not. by davido (Cardinal) on May 03, 2013 at 16:10 UTC
Here's my try, using Algorithm::Bucketizer: use strict; use warnings; use Algorithm::Bucketizer; use List::Util qw( sum ); use POSIX qw( ceil ); my @arrays = ( [ qw( 1 3 8 4 ) ], [ qw( 1 6 2 ) ], [ qw( 1 3 5 7 ) ], ); foreach my $array ( @arrays ) { print "( @{$array} ) can ", can_evenly_distribute( @{$array} ) ? '' : 'not ', "be evenly distributed.\n"; } sub can_evenly_distribute { my @elements = @_; my $b_size = ceil( sum( @elements ) / 2 ); my $b = Algorithm::Bucketizer->new( bucketsize => $b_size, algorithm => 'retry' ); $b->add_item( $_, $_ ) foreach @elements; $b->optimize( algorithm => 'random', maxrounds => @elements * 10 ); my @buckets = $b->buckets; return @buckets == 2 && sum( $buckets[0]->items ) == sum( $buckets[1]->items ); } [download] I haven't found any cases where it fails to return the correct answer... but having just said that, someone will probably find one. ;) Dave	[reply] [d/l]


Problems? Is your data what you think it is?
	PerlMonks