perlmeditation
Aristotle
<p>I am currently working my way through <i>Higher-Order Perl</i>, and as you’d expect, the subject of tree traversal makes frequent appearances:</p>
<ol>
<li>First, as an introductory example to recursion;</li>
<li>then, when discussing how to turn recursive functions into iterators using an explicit stack (which permits breadth-first searching);</li>
<li>again recursively, in the section on tail call elimination, where the tail-recursive call is eliminated first, and the other recursive call is then replaced by an explicit stack.</li>
</ol>
<p>There may be even more appearances later in the book that I’ve yet to discover; as I said, I’m not through with it yet. However, the book changes topic after that, at least momentarily, so I stopped to ponder. It occurred to me that this is about as far as discussions of tree traversal typically go. Another obvious option, which occurred to me many years ago, is not discussed anywhere that I’ve seen, though it is occasionally mentioned in passing as a possibility:</p>
<p>You can get rid of any stacks whatsoever by keeping a parent pointer in the tree node data structure. Effectively, this turns the tree into a (sort of) state machine. While traversing, you need no memory other than the current and the previous node/state. The traversal algorithm is very simple:</p>
<ol>
<li>If the previous node is this node’s parent node, descend to the left child node.</li>
<li>If the previous node is this node’s left child node, descend to the right child node.</li>
<li>If the previous node is this node’s right child node, ascend to the parent node.</li>
</ol>
<p>Obviously, if there is no left child to descend to, you try the right one; and if there is no right child to descend to, you ascend to the parent. Traversal is complete when an attempt to ascend to the parent node fails because there is no parent. Pre-, post- and in-order traversal can be implemented simply by changing which of the conditions implies that the current node must be visited: if you visit the node when coming from…</p>
<ol>
<li>… the parent node, you get pre-order traversal.</li>
<li>… the left child node, you get in-order traversal.</li>
<li>… the right child node, you get post-order traversal.</li>
</ol>
<readmore>
<p>Assuming all tree nodes are instances of a class which has <code>parent</code>, <code>left</code> and <code>right</code> methods and uses <code>undef</code> to signify the absence of a pointer, the following is an implementation of the in-order version of the traversal algorithm in Perl:</p>
<code>
sub traverse_tree {
    my ( $tree_root, $visitor_callback ) = @_;

    # $prev_node starts out undef, which matches the root's undef parent
    my ( $curr_node, $prev_node ) = $tree_root;

    while( $curr_node ) {
        my $next_node;
        if( $prev_node == $curr_node->parent ) {
            # arrived from above: try to descend to the left
            $next_node = $curr_node->left;
            if( not $next_node ) {
                # no left child: visit here, then go right or back up
                $visitor_callback->( $curr_node );
                $next_node = $curr_node->right || $curr_node->parent;
            }
        }
        elsif( $prev_node == $curr_node->left ) {
            # left subtree done: visit, then go right or back up
            $visitor_callback->( $curr_node );
            $next_node = $curr_node->right || $curr_node->parent;
        }
        elsif( $prev_node == $curr_node->right ) {
            # right subtree done: ascend
            $next_node = $curr_node->parent;
        }
        ( $prev_node, $curr_node ) = ( $curr_node, $next_node );
    }
}
</code>
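<p>For anyone who wants to try this out, here is a minimal sketch of a node class that satisfies the assumptions above. Only the <code>parent</code>, <code>left</code> and <code>right</code> methods come from the description; the class name <code>Node</code>, the <code>value</code> accessor and the <code>set_child</code> helper are made up for illustration. The parent pointer is weakened with <code>Scalar::Util</code>, because a strong parent reference would form a reference cycle with the child and the tree would never be freed.</p>

```perl
use strict;
use warnings;

package Node;    # hypothetical class name, not from the original post

use Scalar::Util qw(weaken);

sub new {
    my ( $class, %args ) = @_;
    my $self = bless { value => $args{value} }, $class;
    $self->set_child( left  => $args{left} )  if $args{left};
    $self->set_child( right => $args{right} ) if $args{right};
    return $self;
}

# Accessors the traversal relies on; each returns undef when unset.
sub parent { $_[0]{parent} }
sub left   { $_[0]{left} }
sub right  { $_[0]{right} }
sub value  { $_[0]{value} }    # payload accessor, also an assumption

# Attach $child on $side ('left' or 'right') and record its parent.
sub set_child {
    my ( $self, $side, $child ) = @_;
    $self->{$side}   = $child;
    $child->{parent} = $self;
    weaken( $child->{parent} );    # avoid a parent/child reference cycle
    return $self;
}

package main;

# A three-node tree: 2 at the root, 1 on the left, 3 on the right.
my $root = Node->new(
    value => 2,
    left  => Node->new( value => 1 ),
    right => Node->new( value => 3 ),
);

print $root->left->parent->value, "\n";    # prints 2
```

<p>Passing <code>$root</code> and a callback such as <code>sub { print $_[0]->value, "\n" }</code> to <code>traverse_tree</code> should then visit the nodes in order. Note that because the parent pointers are weak, something has to keep holding on to the root for as long as the tree is in use.</p>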
<p>This is the most straightforward implementation, but it has a flaw: some code is duplicated between the coming-from-parent and coming-from-left-child states. The complication arises because a node must be visited even when it lacks the pointer it would normally be arrived from. E.g., in in-order traversal you visit the current node when you come from its left child; but a node with no left child must still be visited. The absence of the left child is discovered when the previous node was the parent, so that state has to visit the current node itself before going on to try to descend to the right.</p>
<p>The fix is conceptually simple, but not easy to express in code. You need a way to fall through from the body of one branch to another’s without checking the condition for that branch, much the way C’s <code>switch</code> statement works, where branches fall through by default and require an explicit <code>break</code> to exit. A <code>switch</code> statement in C is simply a structured expression of a jump table (but note that you couldn’t actually use a <code>switch</code> statement in C for this because the <code>case</code> conditions in this algorithm wouldn’t be constant expressions); so the Perl version will need a couple of explicit <code>goto</code>s:</p>
<code>
sub traverse_tree {
    my ( $tree_root, $visitor_callback ) = @_;
    my ( $curr_node, $prev_node ) = $tree_root;
    while( $curr_node ) {
        my $next_node;
        {
            # dispatch on where we came from, then fall through the labels
            goto FROM_PARENT if $prev_node == $curr_node->parent;
            goto FROM_LEFT   if $prev_node == $curr_node->left;
            goto FROM_RIGHT  if $prev_node == $curr_node->right;

            # "last" leaves this bare block as soon as a child to descend to is found
            FROM_PARENT:
                last if $next_node = $curr_node->left;
            FROM_LEFT:
                $visitor_callback->( $curr_node );
                last if $next_node = $curr_node->right;
            FROM_RIGHT:
                $next_node = $curr_node->parent;
        }
        ( $prev_node, $curr_node ) = ( $curr_node, $next_node );
    }
}
</code>
<p>In this rendition of the algorithm, the reformulation required to implement pre- or post-order traversal is trivial: you just move the callback invocation to the appropriate label.</p>
<p>(It is in fact quite simple to implement all three variants in a single function: put a call at every label and make each one conditional on an extra parameter, e.g. <code>$visitor_callback->( $curr_node ) if $order == -1;</code>. If <code>$order == 0</code> means in-order traversal, the parameter is optional in that case, since an omitted (undefined) <code>$order</code> compares numerically equal to 0.)</p>
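<p>A rough sketch of that combined function follows. It assumes <code>-1</code> selects pre-order and <code>1</code> selects post-order; the text above only fixes <code>0</code> as in-order. The <code>TinyNode</code> class and its <code>attach</code> helper are hypothetical stand-ins so that the example runs on its own; any class with <code>parent</code>, <code>left</code> and <code>right</code> methods should do. Warnings about uninitialized values are switched off inside the function because absent pointers are compared with <code>==</code>.</p>

```perl
use strict;
use warnings;

# Minimal stand-in node class (hypothetical, for this demo only).
package TinyNode;

sub new    { my ( $class, $value ) = @_; return bless { value => $value }, $class }
sub value  { $_[0]{value} }
sub parent { $_[0]{parent} }
sub left   { $_[0]{left} }
sub right  { $_[0]{right} }

sub attach {    # hypothetical helper: set a child and its parent pointer
    my ( $self, $side, $child ) = @_;
    $self->{$side}   = $child;
    $child->{parent} = $self;    # strong reference: fine for a short demo
    return $self;
}

package main;

# $order: -1 = pre-order, 0 or omitted = in-order, 1 = post-order
# (the meanings of -1 and 1 are assumptions made for this sketch).
sub traverse_tree {
    my ( $tree_root, $visitor_callback, $order ) = @_;
    $order ||= 0;
    my ( $curr_node, $prev_node ) = $tree_root;
    no warnings 'uninitialized';    # undef pointers are compared with ==
    while ( $curr_node ) {
        my $next_node;
        {
            goto FROM_PARENT if $prev_node == $curr_node->parent;
            goto FROM_LEFT   if $prev_node == $curr_node->left;
            goto FROM_RIGHT  if $prev_node == $curr_node->right;
          FROM_PARENT:
            $visitor_callback->( $curr_node ) if $order == -1;
            last if $next_node = $curr_node->left;
          FROM_LEFT:
            $visitor_callback->( $curr_node ) if $order == 0;
            last if $next_node = $curr_node->right;
          FROM_RIGHT:
            $visitor_callback->( $curr_node ) if $order == 1;
            $next_node = $curr_node->parent;
        }
        ( $prev_node, $curr_node ) = ( $curr_node, $next_node );
    }
}

# Demo tree:      4
#               /   \
#              2     6
#             / \   /
#            1   3 5
my %node = map { $_ => TinyNode->new( $_ ) } 1 .. 6;
$node{4}->attach( left => $node{2} )->attach( right => $node{6} );
$node{2}->attach( left => $node{1} )->attach( right => $node{3} );
$node{6}->attach( left => $node{5} );

for my $order ( -1, 0, 1 ) {
    my @seen;
    traverse_tree( $node{4}, sub { push @seen, $_[0]->value }, $order );
    print "$order: @seen\n";
}
# prints:
# -1: 4 2 1 3 6 5
# 0: 1 2 3 4 5 6
# 1: 1 3 2 5 6 4
```

<p>Because each label is passed at most once per arrival at a node, and a node with a missing child simply falls through to the next label, every node is visited exactly once whichever order is chosen.</p>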
</readmore>
<p align="right" class="pmsig pmsig-114691"><i>Makeshifts last the longest.</i></p>