Beefy Boxes and Bandwidth Generously Provided by pair Networks
good chemistry is complicated,
and a little bit messy -LW
 
PerlMonks  

Collecting data in a recursive routine.

by Anonymous Monk
on Mar 07, 2002 at 03:29 UTC ( [id://149924]=perlquestion: print w/replies, xml ) Need Help??

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'm posting this anonymously mostly because I'm ashamed to ask for help for something that is probably terribly stupid and rudimentary. I have been tooling over this and similar routines for more than two weeks now and most of my time is spent staring blankly at the screen in completely stupification. The completion of this routine is imperetive to the next blocks of my project however and until I complete it I can do little else. I know that once I see how to do it, I'll find the solution applicable in numerous other areas as I grow but it's impeding me so at this point that I'm almost too frustrated to see the issue clearly anymore.

This routine pulls data from a table in a PostgreSQL database like the following:

 cat_id |     name       | parent 
--------+----------------+--------
      1 | Books          |        
      4 | Comics         |      1 
      5 | Graphic Novels |      4 
      6 | Music          |        
      7 | Sheet Music    |      6 
      8 | Instruments    |      6 
      9 | Guitar Tabs    |      7 

And produces a nested list of results of these categories as in the below example:

Books
  Comics
    Graphic Novels
Music
  Instruments
  Sheet Music
    Guitar Tabs

The problem is that I implement HTML::Template with most of my projects and in this case, I can't figure out how to properly 'catch' the data instead of outputting it to the browser directly from the code. That is, I need a $cat_output or a @cat_output that contains the entire sorted and nested results as displayed above and as would otherwise have been output to the browser. Once I have a scalar or an array, I can feed it to HTML::Template. But because of the recursiveness of this routine, I can't figure out how to do this without ending up with completely mangled, partial or repetitive results.

Here is the code that simply print()'s out the categories to the browser as they are processed. I need help taking it from this into a routine that can store all of the data in a variable that I can then use.

sub ListCategories { my (%cat, $sql_query, $already_run, $parent_cat_id, $indent, $all_cats ); $indent = $_[0]; $parent_cat_id = $_[1]; $already_run = $_[2]; print $query->header if ($already_run ne 'y'); if ($parent_cat_id >= 0) { if ($parent_cat_id == 0) { $all_cats = $dbh->selectall_arrayref("SELECT category_id,n +ame FROM $sql{categories} WHERE parent IS NULL"); } elsif ($parent_cat_id > 0) { $all_cats = $dbh->selectall_arrayref("SELECT category_id,n +ame FROM $sql{categories} WHERE parent=$parent_cat_id"); $indent .= "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"; } foreach my $current_cat (sort { $a->[1] cmp $b->[1] } @{$all_c +ats}) { ($cat{id}, $cat{name}) = @$current_cat; print "$indent $cat{name} ($cat{id})<br>"; &ListCategories($indent, $cat{id}, 'y'); } } }

Replies are listed 'Best First'.
Re: Collecting data in a recursive routine.
by chipmunk (Parson) on Mar 07, 2002 at 04:38 UTC
    I see that you're fetching categories in groups, grabbing all the children of a specific category, then going on to the next parent category... My inclination would be to fetch all the categories at once, then organize them. Here's one way to do this:
    my $cats_sth = $dbh->prepare(<<"EndOfSQL"); SELECT category_id, name, parent FROM $sql{categories} ORDER BY category_id ASC -- assumes parent id is always less than child id EndOfSQL my @cats; # build the data structure while (my($cat_id, $name, $parent_id) = $cats_sth->fetchrow_array()) { $cats[$cat_id] = [$name, $parent_id, []]; if ($cats[$parent_id]) { push @{$cats[$parent_id][2]}, $cat_id; } elsif ($parent_id) { warn "Parent $parent_id not found for category $cat_id '$name' +\n"; } } # print the categories foreach my $cat (@cats) { next unless $cat and !$cat->[1]; print_category($cat, \@cats, ''); } # recursively print a category and its children sub print_category { my($cat, $cats, $indent) = @_; print $indent, $cat->[0], "\n"; foreach my $child_id (@{$cat->[2]}) { print_category($cats->[$child_id], $cats, "$indent "); } }
    The data structure that this uses is:
    [ [ name, parent id, [ child ids ] ], [ name, parent id, [ child ids ] ], ... ]
    where the ids are also the indexes into the array.

    This probably is not the best data structure for your task, although it does get the job done. It should give you some ideas about solving the problem.

Re: Collecting data in a recursive routine.
by steves (Curate) on Mar 07, 2002 at 05:29 UTC

    I had a similar database construct that I built a tree structure around. I won't attempt to write your code, but I'll explain what I did and maybe it will help.

    Each level of the tree was an array reference. The array members were hash references representing the individual nodes. Those nodes had optional references to other array references (children nodes), etc ...

    To implement this you have two functions: One to create the tree, and one to traverse it. The creation function algorithm is:

    • Accept one optional argument: A parent node (hash reference)
    • Get nodes for the parent (or root nodes if no parent given) from the database
    • If the DB hands back any nodes ...
      • Create an array reference
      • For each DB node returned ...
        • Create a hash reference representing the node. Keys are name, ID, etc.
        • Push the created hash (node) reference onto the array created above
        • Call yourself recursively with the node reference as an argument
      • If a parent was given set a "children" key in it to the created array reference
    • Return the created array reference (or undef if no nodes)
    Traversal is implemented as a function taking an array reference of nodes, a depth, and a function (CODE) reference to call for each node. The called node function takes two arguments: A node and a depth. The traversal function algorithm looks like this:
    • Accept three arguments: An array reference of nodes, a depth, and a CODE reference to a function to call per node
    • For each node in the given array reference ...
      • Call the node processing function with the node and the given depth
      • If the node has a "children" array reference ...
        • Call yourself recursively with the "children" array reference, depth+1, and the node processing CODE reference
    You build the tree by calling the creation function with no arguments. Then you print the header and call the traversal function with the tree (array reference handed back by creation), a depth of zero, and a reference to a sub that does your printing. Your printing sub takes a node (hash reference) and a depth. You indent based on depth and print the name that's in the hash reference representing the node.

    Once I got this far I optimized by loading all nodes with one DB call and building the tree as a secondary step. But I was able to do that by virtue of having another set of data that defined the relationships (the node "links"). So your mileage may vary there.

    I was proud of this one. I built it fast and none of our Java programmers ever built anything like it. That wasn't so much a Java versus Perl thing (which I don't argue) -- it was more a function of me having done dozens of similar things in other languages and being very comfortable with the model. But it showed that Perl was powerful and could handle complex data structures. For a long time that Perl package was the key to people getting a fast, hierarchical view of the data. It's still used heavily today, several years later. One key to its longevity IMO is the abstracted traversal that takes a CODE reference to do the work. I've built dozens of functions to do different things as each node is visited. Easy to do once the core creation and traversal is implemented.

Re: Collecting data in a recursive routine.
by vladb (Vicar) on Mar 07, 2002 at 04:38 UTC
    I'm not overly familiar with PostgresSQL but wouldn't you do the sorting inside your SQL query? like this:
    $all_cats = $dbh->selectall_arrayref("SELECT category_id,name FROM $sq +l{catego +ries} WHERE parent=$parent_cat_id ORDER BY category_id");

    That's a minor point however ;-).

    Now, onto HTML::Template and how you could possibly produce a 'variable' (array actually) that could later be fed to HTML::Template and used reasonably therein.

    I know that HTML::Template allows for nested TMPL_LOOP statements, however, I hardly see them being useful in your case. Instead, you might try doing the following. Create an array structure like this:
    @categories = ( { id => <i>insert category_id value here</i>, name => <i>insert name value here</i> level => <i>level (inside a category etc.) </i> }, { ... }, ... );
    Notice introduction of the 'level' field? You may use this one in your HTML template to determine how much 'indentation' is required. To set 'level' value properly, you might modify your code to increment a $level variable instead of adding string filled with a bunch of 'spaces' to the $indent variable as it's no longer going to be needed. Basically, try to introduce $level variable in place of your $indent and also change this line:
    $indent .= "&nbsp; ...";
    to
    $level++;
    and later in your foreach loop, just push this hash onto your @categories array (this array may be global to start with.. however, it does feel yucky and you might have to change it later :)
    foreach my $current_cat (sort { $a->[1] cmp $b->[1] } @{$all_c +ats}) { ($cat{id}, $cat{name}) = @$current_cat; push @main::categories {level=>$level,name=>$cat{name},i +d=>$cat{id}}; &ListCategories($indent, $cat{id}, 'y'); }
    This way your Template code is left to decide which way to format the output based on the 'level' value ;-).

    See if it works for you and let me know. (I didn't test the code, though).

    "There is no system but GNU, and Linux is one of its kernels." -- Confession of Faith
Re: Collecting data in a recursive routine.
by beppu (Hermit) on Mar 07, 2002 at 05:45 UTC
    DISCLAIMER: I have not tried this code out, but I think it will work. All I know is that :perl -wc doesn't complain. Let me know what happens.

    If it works I will be happy, because I've been away from Perl for a long time, and I'm trying to shake off some rust. Even if it doesn't work, it's got to be close. Anyway:

    $tree = GetCategories($node)

    # # # # # # #' . . `# # i # making an ass out of u and me since 1975 # assumptions # #+_. ._+# my %sql; # is filled w/ names of tables that hold category data my $dbh; # database handle =head2 The Structure of a Node { id => $number, name => $string, children => $array_ref_of_more_nodes, } =head2 Usage To get the whole tree: my $tree = GetCategories(); To get a partial tree, pass in a hashref that has an C<id> key representing the parent C<category_id> to start with: my $subtree = GetCategories($starting_node); =cut sub GetCategories { my $node = shift || undef; my $select_children = "SELECT category_id,name FROM $sql{categories} "; # Create the root node only on the 1st time through, # and complete the SQL statement for # selecting all the nodes children. if (not defined $node) { $node = { id => 0, name => '/', children => [ ], }; $select_children .= "WHERE parent IS NULL"; } else { $select_children .= "WHERE parent=$node->{id}"; } # Find all the children of the current node. my $all_cats = $dbh->selectall_arrayref($select_children); my $c; # Add each child to $node->{children}, and # then recursively add get each child's children foreach $c (sort { $a->[1] cmp $b->[1] } @{$all_cats}) { my $child_node = { id => $c->[0], name => $c->[1], children => [ ], }; # ............. depth-first style ..... push @{$node->{children}}, $child_node; GetCategories($child_node); } return $node; }
Re: Collecting data in a recursive routine.
by mojotoad (Monsignor) on Mar 07, 2002 at 17:12 UTC
    Make sure and check for cycles; the recursive code I've seen so far would be vulnerable to setting the "parent" of a category to some lower-level denizen. Track your "levels" as you visit the lineages, or simply note which categories have been visited, and hop out of the calls if you land in a familiar category.

    Or is your graph guaranteed to be a tree with no cycles?

    Matt

Re: Collecting data in a recursive routine.
by traveler (Parson) on Mar 07, 2002 at 21:44 UTC
    Maybe I missing something here and if so, please just follup with what I am missing. You say ListCategories works. What is wrong with
    1. replacing the print with $output_string .=
    2. changing the function to this
      { my $output_string; sub DoListCategories { $output_string = ""; ListCategories(@_); } sub ListCategories { #Just as you had it with the change in 1 above } }
      Note the outer braces so $output_string is "private" to these two functions.
    3. Changing the calls in the main program from ListCategories to DoListCategories
    This saves you from having to re-write working code.

    HTH, --traveler

      I used your solution first because it seemed the most straight forward and required the least amount of change to my existing code.

      It worked perfectly and accomplished exactly what I wanted. The problem I had been running into was, indeed, that I was trying to be too "smart". From the moment I sat down to write the routine, I did it with the mindset that I'd accomplish everything in one sub which obviously was rediculous (now I see the error of my ways).

      I had suspected that doing what you suggested would work and may be the necessary way, but I also thought that there must be a standard solution to doing it all in one simple routine and that I was just too inexperienced to see it. Forest for the trees.

      Now that I at least have it working, I intend to go back and try implementing some of the changes the other posters above suggested for more efficiency, readability and flexibility.

      Thanks for the help!

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://149924]
Approved by root
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others chilling in the Monastery: (4)
As of 2024-04-20 02:20 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found