http://www.perlmonks.org?node_id=1003561


in reply to Mark Jason Dominus And Me - The Partition Problem

In your code, try modifying $target to 12 and @treasures to qw(1 2 4 8 16 32). The result is that it fails to figure out that the answer is 4 + 8. (And then trips another bug on the way to reporting its wrong answer.)

This is not an incidental mistake. The fundamental problem in your approach is that you've hardcoded a finite set of patterns of inclusion/exclusion patterns that you're willing to try. With that approach the question is when, not if, your solution will break. In this case it was a pattern of inclusion that went "no, no, yes, yes" that broke. But patch that and you'll have another hole in your logic.

You can't incrementally fix your approach to get a working solution. Please don't try. If you disregard that advice, then change @treasures to qw(1 2 4 8 16 32 64 128 256 512) and try to generate every $target in the range 0..1023. For any reasonable amount of work you can do with your solution, you'll find a bug with that test.

MJD's solution, by contrast, works perfectly. How does he do that?

MJD uses recursion. As the joke goes, To understand recursion, first you must understand recursion. The joke has truth to it. The trick is that to see how recursion works for a problem of given size, you only need to understand how recursion works for all problems of smaller size. To understand, you start with examples of the smallest possible size and works your way up.

For 0 treasures, MJD's algorithm tries out [] (that's the check for $target == 0) then gives up.

For 1 treasures, MJD's algorithm tries out []. does sanity checks, makes its first recursive call and tries out [A]. then makes a second recursive call and tries out [] again (silly, isn't it?), then gives up.

For 2 treasures, MJD's algorithm tries out [], does sanity checks, then makes its first recursive call inside of which it may try out [A], [A, B], [A] then makes a second recursive call inside of which it may try out [], [B], []. Basically "empty set", followed by "all subsets with A", followed by "all subsets without A". (With some duplication due to the empty set being checked in the sanity check, then again in the second recursive call.)

For 3 or more treasures, the pattern repeats. We test the empty set. Then all subsets with the first element in the first recursive call. Then all subsets without the first element in the second recursive call (and the empty set multiple times). There is some redundancy in the checks, but we're guaranteed to actually try every subset. If the answer is to be found, we'll find it! (And if not, then we tried really, really hard.)

Hopefully this has helped you figure this case out. But that raises an interesting question. If you got through the Fibonacci and directory walk examples then you obviously understand the principle of recursion. Why would you balk on this example? My most likely guess is that you try to visualize what is happening. In the case of Fibonacci it is easy enough to verify, number by number, what is happening. In the case of a directory walk, it is easy enough to visualize what it is doing, and see how it works. But in the case of the partition problem, what we're working with is subsets of sets. That's pretty abstract. When things jump in abstraction, it is easy for our brains to get lost. Once our brains get lost, we lose track and get confused.

If that is the case for you, then focus on the concept of the size of the problem you are trying to solve. As long as all of the base cases are handled, and the recursive calls always are to "simpler" problems, the recursive algorithm will work. If the recursive calls are not to simpler problems, then you can get caught in infinite recursion.

Replies are listed 'Best First'.
Re^2: Mark Jason Dominus And Me - The Partition Problem
by Tommy (Chaplain) on Nov 13, 2012 at 13:33 UTC

    Wow, thanks tilly. I really appreciate your taking time to look at my meditation and read my code. I'd like to respond to a few of the things you said, and mostly in the form of questions (if you don't mind)

    You said:

    "If you got through the Fibonacci and directory walk examples then you obviously understand the principle of recursion. Why would you balk on this example? My most likely guess is that you try to visualize what is happening. In the case of Fibonacci it is easy enough to verify, number by number, what is happening. In the case of a directory walk, it is easy enough to visualize what it is doing, and see how it works."

    YES, exactly. I can confidently say that I know what recursion is, and more importantly I'd like to believe that I "understand" recursion; the directory walk is a piece of cake (I've written very similar code in the past), and the Fibonacci exercise was also very easy to grasp. I could easily see where/why the code recursed and when/what it returned.

    The two things I don't understand:

    1.) Why the ambiguous output?

    My confusion lies in the final 3 lines of the output of my (more verbose) version of the MJD recursive solution, as shown in the original post. I don't understand why something that appears to be a partial answer gets passed back up the stack and marked as a "solution", per the "$solution" variable name taken from the MJD code. When the code claims to have found a solution the first time, the solution it finds is not the real solution to the problem. Only higher on up the stack is the real solution identified. The first claim to have found a solution doesn't ring true. Sub-question: What made this solution work then?

    2.) When is recursion the right thing to do?

    What criteria make recursion the method of choice for approaching problems as opposed to iteration? For a predefined tree structure, the answer is obvious, however I don't see why it is the only way to accomplish the solution of the Partition Problem which has no pre-defined tree that I can see. I don't doubt your assertion that recursion is the correct methodology for finding a solution to the Partition Problem, but could you tell me why? How would I recognize that I needed recursion there?

    I had to jot this down quickly before taking the kids to school and going to $work, so please forgive me if there's any muddled grammar or disjointed sentences...

    --
    Tommy
    $ perl -MMIME::Base64 -e 'print decode_base64 "YWNlQHRvbW15YnV0bGVyLm1lCg=="'
      For 1 the trick is in the first recursive call. When we subtract the current treasure from the target, we want to find solutions that include the current treasure. However inside that call, we don't know about the existence of that current treasure on the inner call - that fact is implicit in the fact that the target is smaller, and it will only get added back to the solution after the inner call returns.

      For 2 the answer is, "Whenever we have a problem that we don't know how to solve, which can be somehow reduced to simpler versions of the same problem." For Fibonacci, that means smaller $n. In the case of the directory walk, that means farther down the directory tree. In the case of the partition problem, simpler means "fewer treasures".

        Ummm. Just doesn't make sense to me to try and hit a moving $target... pun intended. It makes a helluva lot more sense now that my target stays constant and my "share" is the only thing really changing (see code below).

        Thank you tilly for helping me understand why I didn't understand. That's when everything started making sense.

        NEW CODE

        This is how I would do it... (remove "print" statements after debugging done).

        #!/usr/bin/perl use strict; use warnings; my ( $target, @treasures, $calls ); $target = shift @ARGV || 7; @treasures = @ARGV; @treasures = ( 2, 3, 4, 5, 1 ) unless @treasures; sub find_share { my ( $trial_set, $avail_choices, $depth ) = @_; $depth ||= 0; $depth += 1; $calls++; my $ts_total = 0; $ts_total += $_ for @$trial_set; printf qq{%-8s Call %-5d Depth: %-5d Trial set: %-10s Remaining: %- +16s ... }, ( '-' x $depth ) . '>', $calls, $depth, "[ @$trial_set ]", "[ @$avail_choices ]"; print qq{Found solution: [ @$trial_set ] == $target\n\n} and return $trial_set if $ts_total == $target; print qq{Returning undef because trial set total $ts_total > $targe +t\n} and return undef if $ts_total > $target; print qq{Returning undef because no available choices remain } . qq{and $ts_total < $target\n} and return undef if @$avail_choices == 0; my ( @my_trial_set ) = @$trial_set; my ( @my_avail_choices ) = @$avail_choices; push @my_trial_set, shift @my_avail_choices; print qq{$ts_total < $target, so I put "$my_trial_set[-1]" } . qq{into the trial set, and now I'll recurse\n}; my $solution = find_share( \@my_trial_set, \@my_avail_choices, $dep +th ); print +( '-' x $depth ) . '>', qq(A recursive call at depth @{[ $depth + 1 ]} found that a solu +tion } ) . qq(of [ @$solution ] adds up to my target of $target at depth $d +epth\n) and return [ @$solution ] if $solution; print '-' x $depth, '> ', qq{Situation failed. Omitting treasure [ $my_trial_set[-1] ] an +d } . qq{backtracking to choices of [ @my_avail_choices ] at depth $de +pth\n\n}; pop @my_trial_set; return find_share( \@my_trial_set, \@my_avail_choices, $depth ); } print qq{Trying to hit target of $target...\n\n}; print <<__ANS__; Solution: [ @{ find_share( [], \@treasures ) || [ "no solution possibl +e" ] } ] __ANS__ exit; __END__ DING! That moment when... \ | / _---_ \ / \ / __ | | __ \ 8-8 / / \l_l/ \ === =u= ...the light comes on and you get it.

        OUTPUT

        Click "Download code" link to see the actual, pretty, wide-formatted text

        $ perl find_shares_myway_2.pl Trying to hit target of 7... -> Call 1 Depth: 1 Trial set: [ ] Remaining: [ 2 +3 4 5 1 ] ... 0 < 7, so I put "2" into the trial set, and now I'll + recurse --> Call 2 Depth: 2 Trial set: [ 2 ] Remaining: [ 3 +4 5 1 ] ... 2 < 7, so I put "3" into the trial set, and now I'll + recurse ---> Call 3 Depth: 3 Trial set: [ 2 3 ] Remaining: [ 4 +5 1 ] ... 5 < 7, so I put "4" into the trial set, and now I'll + recurse ----> Call 4 Depth: 4 Trial set: [ 2 3 4 ] Remaining: [ 5 +1 ] ... Returning undef because trial set total 9 > 7 ---> Situation failed. Omitting treasure [ 4 ] and backtracking to ch +oices of [ 5 1 ] at depth 3 ----> Call 5 Depth: 4 Trial set: [ 2 3 ] Remaining: [ 5 +1 ] ... 5 < 7, so I put "5" into the trial set, and now I'll + recurse -----> Call 6 Depth: 5 Trial set: [ 2 3 5 ] Remaining: [ 1 +] ... Returning undef because trial set total 10 > 7 ----> Situation failed. Omitting treasure [ 5 ] and backtracking to c +hoices of [ 1 ] at depth 4 -----> Call 7 Depth: 5 Trial set: [ 2 3 ] Remaining: [ 1 +] ... 5 < 7, so I put "1" into the trial set, and now I'll + recurse ------> Call 8 Depth: 6 Trial set: [ 2 3 1 ] Remaining: [ ] + ... Returning undef because no available choices remain +and 6 < 7 -----> Situation failed. Omitting treasure [ 1 ] and backtracking to +choices of [ ] at depth 5 ------> Call 9 Depth: 6 Trial set: [ 2 3 ] Remaining: [ ] + ... Returning undef because no available choices remain +and 5 < 7 --> Situation failed. Omitting treasure [ 3 ] and backtracking to cho +ices of [ 4 5 1 ] at depth 2 ---> Call 10 Depth: 3 Trial set: [ 2 ] Remaining: [ 4 +5 1 ] ... 2 < 7, so I put "4" into the trial set, and now I'll + recurse ----> Call 11 Depth: 4 Trial set: [ 2 4 ] Remaining: [ 5 +1 ] ... 6 < 7, so I put "5" into the trial set, and now I'll + recurse -----> Call 12 Depth: 5 Trial set: [ 2 4 5 ] Remaining: [ 1 +] ... Returning undef because trial set total 11 > 7 ----> Situation failed. Omitting treasure [ 5 ] and backtracking to c +hoices of [ 1 ] at depth 4 -----> Call 13 Depth: 5 Trial set: [ 2 4 ] Remaining: [ 1 +] ... 6 < 7, so I put "1" into the trial set, and now I'll + recurse ------> Call 14 Depth: 6 Trial set: [ 2 4 1 ] Remaining: [ ] + ... Found solution: [ 2 4 1 ] == 7 ----->A recursive call at depth 6 found that a solution } of [ 2 4 1 ] + adds up to my target of 7 at depth 5 --->A recursive call at depth 4 found that a solution } of [ 2 4 1 ] a +dds up to my target of 7 at depth 3 ->A recursive call at depth 2 found that a solution } of [ 2 4 1 ] add +s up to my target of 7 at depth 1 Solution: [ 2 4 1 ]
        --
        Tommy
        $ perl -MMIME::Base64 -e 'print decode_base64 "YWNlQHRvbW15YnV0bGVyLm1lCg=="'

      Just an additional point. When you'll go further down into MJD's brilliant Higher Order Perl, you'll find out that he warns about problems about recursion (paragraph 1.8, "Where Recursion Blows"), in which he revisits the Fibonacci and Partition problems for larger Fibonacci numbers or larger collections of treasures) and that, later in the book, he is spending quite a bit of time on methods to get rid of recursion (especially Chapter 5, "From Recursion To Iterators, but not only there).

      Recursion is a beautiful model not only when you have a predefined tree structure (such as the directory walk example), but also when a problem is very complicated, you often need to reduce it to a collection of smaller or simpler problems, and possibly again and again, until the individual problems become (almost) trivial (this is what divide and conquer algorithms are about). As an example, some of the efficient sort algorithms just do that, although their actual implementation may well have no recursive call at all. You may also want to use recursion if you want to generate all subsets of a set, or combinations or permutations. This may seem to be maths, but it is quite commonly necessary in software engineering. Many of the Operational Research problems also often fall into the category of candidates for recursion.

      In brief, recursion gives you a very powerful tool to modeling complex problems, especially when you want to make sure that you have examined absolutely all possibilities. The downside is that, sometimes, recursion will let you visit each possibility a number of times, possibly a huge number of times (there are some ways around that to a certain extent, one of them being caching, which MJD examines in some depth, but not always). So, sometimes you use recursion to better understand the problem and how to solve it, but then change it to iteration or something else when is comes to actual implementation. In such cases, recursion can be thought of as a prototype.

      Reading further MJD's book, you will also find out that even for directory walk, recursion is not necessarily the best solution, the reason being that recursion is very good at depth-first traversing of the directory tree; but sometimes you need to traverse the tree in breadth-first order (or some other order), and then, recursion becomes very unpractical to use.

      Keep reading this excellent book. I really did not have any problem with recursion when I first read it, but I learned an awful lot of other things from it, and there are other with which I am still fighting after having read them three times. There are some chapters that I will probably read for the fourth time, because I haven't fully understood their full implications and I still have quite a but to learn for them.

        Indeed. People who only know how to traverse a directory using recursion will suffer grief if they are asked to do it breadth-first instead. Recursion gives you an implicit stack. If you can unwind it to get an explicit stack instead, then switching between depth-first and breadth-first is as simple as switching from a stack to a queue.

        But, one step at a time. Recursion is a great technique for thinking through a problem that you don't know how solve up front. Once you understand that solution, you can start trying to think of other approaches with other trade-offs. :-)