http://www.perlmonks.org?node_id=278024

Isanchez has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks, I have a dataset that contains vocabulary entries such as:

Term_Id Parent_Id Term
1 0 domestic_animal
2 1 dog
3 2 terrier
4 2 collie
5 3 fox_terrier

The first number in each row is a unique index for the word. The second is the parent id: the unique index of the word naming the category it belongs to. So terrier has the unique index 3, but it is a type of dog, so its parent id is the unique id of dog, i.e. 2. Collie is another type of dog: it has the unique id 4, and its parent id is also 2. Fox terrier is a type of terrier, so its parent id is 3, the index of terrier.
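To make the layout concrete, here is a minimal sketch of reading such rows into two hashes. The names %name_of and %children_of are hypothetical, and it assumes domestic_animal has id 1 and parent 0, so that the parent id of dog resolves to it.

```perl
use strict;
use warnings;

# Sample rows in the layout described above: term id, parent id, term.
# Assumption: domestic_animal has id 1 and parent 0, so that the
# parent id of dog (1) points at it.
my @rows = (
    '1 0 domestic_animal',
    '2 1 dog',
    '3 2 terrier',
    '4 2 collie',
    '5 3 fox_terrier',
);

my (%name_of, %children_of);    # hypothetical names
for my $row (@rows) {
    my ($id, $parent, $term) = split ' ', $row;
    $name_of{$id} = $term;                    # id => term
    push @{ $children_of{$parent} }, $id;     # parent id => daughter ids
}

# Immediate daughters of dog (id 2)
print join(', ', map { $name_of{$_} } @{ $children_of{2} }), "\n";
# prints "terrier, collie"
```

Keying the daughters by parent id means each level of the hierarchy can be looked up directly, whatever the depth.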

In the actual file some categories are up to 17 levels deep. The root term, i.e. the highest node in the hierarchy, is "vocabulary" and has 12 immediate daughters. Given input such as the one above, I have to produce output that lists each category followed by all of its descendant terms. Something like:

domestic_animal dog, terrier, collie, fox_terrier, cat, cheshire_cat
vehicles car, SUV, Ford, Ford_Passat, airplane, boeing_747

I imagine that a recursive function is needed to keep collecting daughter terms, and that the structure in which they have to be stored is probably a hash of arrays.
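A rough sketch of that idea, under some assumptions: %children_of is a hash of arrays mapping a parent id to the ids of its immediate daughters, %name_of maps an id to its term, and the data contains no cycles. The names descendants, %children_of and %name_of are made up for illustration, and domestic_animal is assumed to have id 1.

```perl
use strict;
use warnings;

# Hash of arrays: parent id => ids of immediate daughters.
# Assumption: domestic_animal has id 1 and hangs off parent 0.
my %children_of = (0 => [1], 1 => [2], 2 => [3, 4], 3 => [5]);
my %name_of     = (
    1 => 'domestic_animal',
    2 => 'dog',
    3 => 'terrier',
    4 => 'collie',
    5 => 'fox_terrier',
);

# Return the names of all descendants of $id, depth first.
# Assumes the hierarchy has no cycles; otherwise this never terminates.
sub descendants {
    my ($id) = @_;
    my @found;
    for my $child (@{ $children_of{$id} || [] }) {
        push @found, $name_of{$child}, descendants($child);
    }
    return @found;
}

print "$name_of{1} ", join(', ', descendants(1)), "\n";
# prints "domestic_animal dog, terrier, fox_terrier, collie"
```

This walks depth first, so fox_terrier comes right after terrier rather than after collie as in the sample output above; a level-by-level order would need a queue instead of plain recursion.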

Can anyone point me to at least a starting point, some piece of code or an algorithm, to do this?
Thank you very much,
Ivo