Beefy Boxes and Bandwidth Generously Provided by pair Networks
P is for Practical
 
PerlMonks  

Re: Correction to Database problem

by demerphq (Chancellor)
on Feb 15, 2003 at 22:50 UTC ( [id://235619]=note: print w/replies, xml ) Need Help??


in reply to Correction to Database problem

Hi nasa,

What you are doing is a fairly classical data structure manipulation. You have a tree in list form. Or a list of node id's and their parent id's. The terms you have used are little different, usually we say that we have a parent-child relationship, and we dont talk about numbers, but rather id's or identifiers. I say all this because in future questions you may find the terms useful and easier to explain what you want to do. Anyway, here is yet another soluton of your problem.

use strict; use warnings; sub ancestors { my $node_parents = shift; my $node = shift; my @ancestors; while (defined($node_parents->{$node})) { push @ancestors,$node_parents->{$node}; $node=$node_parents->{$node}; } return wantarray ? @ancestors : \@ancestors } my %node_parent; #my $data_file="reff.data"; #open my $dat_fh, $data_file or die "Could not open $data_file: $!"); #while (<$dat_fh>) { while (<DATA>) { if (/^\s*(\d+)\s+(\d+)/) { $node_parent{$1}=$2; } } my $check_node='6'; print "Ancestory of $check_node from newest to oldest:\n"; print join(", ",ancestors(\%node_parent,$check_node)); # the tree looks like: # 0 # +-+-+ # 1 2 # +-+ +++ # 3 4 5 # +-+-+ # 6 7 8 # # so we should get the output: # Ancestory of 6 from newest to oldest: # 3, 1, 0 __DATA__ 1 0 2 0 3 1 4 2 5 2 6 3 7 3 8 3

Potentially you dont need to read the whole list first. This would be true if and only if every parent had a lower id of all of its children and and every "son" id in the list was greater in id than all of the preceding son id's.

Depending on what types of operations you will frequently need to perform you may wish to generate a more complicated data structure than a simple parent list. (which is what your data represents and what the hash does). You may wish to do some thing like this:

use strict; use warnings; # subroutines for walking a node tree # each takes a hash of nodes as the first parameter # and the id of the node to start with # # each $nodes->{$id} is of the form # { # id => $id, # integer # parent => undef, # undef for no parent, otherwise an integer # children => [3,5,6], # undef or a ref to an array of optional chil +dren. # } sub dump_nodes { my $nodes=shift; foreach my $node ( sort { $a->{id} <=> $b->{id} } values %$nodes ) { printf "Node %10s Parent: %- 10s Children: %s\n", $node->{id}, defined($node->{parent}) ? $node->{parent} : "undef", join(", ",sort {$a <=>$b } @{$node->{children}||[]}); } } sub dump_nodes_as_tree { my $nodes = shift; my $node_id = shift; my $depth = shift||0; unless (defined $node_id) { my @root_nodes=grep { !defined $_->{parent} } values %$nodes; foreach my $root ( sort { $a->{id} <=> $b->{id} } @root_nodes +) { dump_nodes_as_tree($nodes,$root->{id},0); } } else { my $node=$nodes->{$node_id}; printf "%s%d(%s)[%s]\n", (" " x $depth), $node_id, defined($node->{parent}) ? $node->{parent} : "undef", join(",",sort {$a <=>$b } @{$node->{children}||[]}); foreach my $child_id (sort {$a<=>$b} @{$node->{children}||[]}) + { dump_nodes_as_tree($nodes,$child_id,$depth+1); } } } sub ancestors { my $nodes = shift; my $node_id = shift; my @ancestors; while (defined($nodes->{$node_id}{parent})) { push @ancestors,$nodes->{$node_id}{parent}; $node_id=$nodes->{$node_id}{parent}; } return wantarray ? @ancestors : \@ancestors } sub children_eldestline { # perorder traversal of the tree my $nodes = shift; my $node_id = shift; my $children = shift || []; foreach my $child_id (sort {$a<=>$b} @{$nodes->{$node_id}{children +}||[]}) { push @$children,$child_id; children_eldestline($nodes,$child_id,$children); } return wantarray ? @$children : $children } sub children_bygen { # breadthfirst traversal of the tree my $nodes = shift; my $parent_id = shift; my @children; my @queue=($parent_id); while (@queue) { my $node_id=shift @queue; foreach my $child_id (sort {$a<=>$b} @{$nodes->{$node_id}{chil +dren}||[]}) { push @children,$child_id; push @queue,$child_id; } } return wantarray ? @children : \@children; } my %nodes; #my $data_file="reff.data"; #open my $dat_fh, $data_file or die "Could not open $data_file: $!"); #while (<$dat_fh>) { while (<DATA>) { if (/^\s*(\d+)\s+(\d+)$/) { #print "$2 is parent of $1\n"; $nodes{$1}{id}=$1; $nodes{$1}{parent}=$2; $nodes{$2}{id}=$2; push @{$nodes{$2}{children}},$1; } } print "Nodes:\n"; print dump_nodes(\%nodes),"\n"; print "Nodes as tree:\n"; print dump_nodes_as_tree(\%nodes),"\n"; print "Ancestory of 6 from newest to oldest (parent list):\n"; print join(", ",ancestors(\%nodes,6)),"\n"; print "Children of 0 by the liniage of the oldest child (pre-order):\n +"; print join(", ",children_eldestline(\%nodes,0)),"\n"; print "Children of 0 in by generation (breadth-first):\n"; print join(", ",children_bygen(\%nodes,0)),"\n"; # the tree looks like: # 0 # +-+-+ # 1 2 # +-+ +++ # 3 4 5 # +-+-+ # 6 7 8 # # so we should get the following output: # # Nodes: # Node 0 Parent: undef Children: 1, 2 # Node 1 Parent: 0 Children: 3 # Node 2 Parent: 0 Children: 4, 5 # Node 3 Parent: 1 Children: 6, 7, 8 # Node 4 Parent: 2 Children: # Node 5 Parent: 2 Children: # Node 6 Parent: 3 Children: # Node 7 Parent: 3 Children: # Node 8 Parent: 3 Children: # # Nodes as tree: # 0(undef)[1,2] # 1(0)[3] # 3(1)[6,7,8] # 6(3)[] # 7(3)[] # 8(3)[] # 2(0)[4,5] # 4(2)[] # 5(2)[] # # Ancestory of 6 from newest to oldest (parent list): # 3, 1, 0 # Children of 0 by the liniage of the oldest child (pre-order): # 1, 3, 6, 7, 8, 2, 4, 5 # Children of 0 in by generation (breadth-first): # 1, 2, 3, 4, 5, 6, 7, 8 __DATA__ 1 0 2 0 3 1 4 2 5 2 6 3 7 3 8 3
which builds a tree of nodes with previous pointer and children pointers and provides a couple of utility subs for finding the children (in different forms), printing out the data structure and the like. A more robust, but probably too confusing approach for right now would be to use real references instead of id lookups (actually on second thought using pure references might be less confusing, but for now whatever :-). This would have the advantage that if you had operations like delete in mind that you could delete entire subtrees with a single assignment.

Anyway, hopefully you find this node and its contents a useful introduction into different ways of representing heirarchical relationships, and operations that can be performed on them. I hope however that you will take the time to search on "Tree representation" and "Trees" for tree specific material (Binary Tree's are the canonical data structure in many respects) and obtain a book like "Algorithms and Data Structures in Perl" (O'Reiley). This will walk you through the different forms of representations in much more depth and precision.

You will need to look under "Graphs" and "Graph theory" if you want to research up on general material related to this. To computer science people and mathematicians all of this stuff comes under the heading "Graph Theory". A tree with only parent pointers or only child pointer is technically called a "DAG", directed acyclic graph, and when it has both parent and child pointers it is a "Directed Graph". There are also other names originating in specific subfields, such as databases (where they talk about keys and constrained indexs), networking (a graph is a network amd vice versa), or various forms of system programming where nodes, pointers and children are the terms used, or graphics where they use lots of different terms (its not my field :-))

Cheers and good luck. And definately dont put off asking a question about any of this if you feel the slightest confusion. If something is over your head we will do our best not to let you drown. :-)

---
demerphq


Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://235619]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others rifling through the Monastery: (6)
As of 2024-03-28 14:08 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found