Questions about Recursion and "Extract your transversal", by chromatic

by mascip (Pilgrim)
on Jun 06, 2013 at 23:38 UTC
mascip has asked for the wisdom of the Perl Monks concerning the following question:

I haven't managed to comment on "Improve your extracted transversal" on chromatic's blog, so i'll ask my questions and make my remarks here, hoping to learn through discussion. My questions won't make sense unless you have read the original posts.

First, what would happen if we split the process into two subroutines? For example (untested code):

    use List::Util 'reduce';

    # %action (tag => handler sub) is assumed to be defined elsewhere,
    # as in the original posts.

    ## Concatenates the texts retrieved from a node's (possibly nested) descendants
    # Note: mutually recursive with process_text_in($node)
    sub get_all_text_in {
        my $node = shift;

        # Concatenate the texts extracted from each child node.
        # The '' seed ensures the first child is processed as well.
        return reduce { $a . process_text_in($b) } '', $node->content_list;
    }

    ## Gets the text in a node, processing it if needed
    # Note: mutually recursive with get_all_text_in($node)
    sub process_text_in {
        my $node = shift;

        # Just text => return it
        return $node unless ref $node;

        # Not a special tag => get its children's texts
        my $tag = $node->tag;
        return get_all_text_in($node) unless $action{$tag};

        # Special tag => process it accordingly
        return $action{$tag}->($node);
    }

I feel like there would be both benefits and drawbacks:
  • + separation of concerns: concatenation, text processing
  • + simpler logic (no need for if/else, nor $text)
  • - we lose __SUB__, which makes it clear that we are dealing with recursion
  • - we lose the value of "having it all in one place" (that's why i added "Note" comments)
Is that right? Did i forget something?

Having it all in one place feels nice and safe, but so does keeping the logic very simple. Maybe keeping it all in one place matters more for recursion.
My question is: in terms of reference cycles, what would be the consequence? Would my solution remove the need for "undef $traverse"?
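
To make the cycle question concrete, here is a minimal sketch of the pattern that "undef $traverse" appears to guard against: an anonymous sub stored in a lexical that calls itself through that same lexical. It is not chromatic's code. The nested array refs standing in for a parsed tree, the Sentinel class and build_and_run are hypothetical names introduced only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper class: counts how many of its objects were freed.
package Sentinel {
    our $destroyed = 0;
    sub new     { return bless {}, shift }
    sub DESTROY { $destroyed++ }
}

# Builds a self-referencing closure, runs it once, and optionally
# breaks the reference cycle before returning.
sub build_and_run {
    my ($break_cycle) = @_;
    my $sentinel = Sentinel->new;

    my $traverse;
    $traverse = sub {
        my $node = shift;
        my $keep_alive = $sentinel;    # makes the sub capture $sentinel
        return $node unless ref $node;
        # The sub captures $traverse, and $traverse holds the sub: a cycle.
        return join '', map { $traverse->($_) } @$node;
    };

    my $text = $traverse->([ 'a', [ 'b', 'c' ], 'd' ]);
    undef $traverse if $break_cycle;   # releases the sub, breaking the cycle
    return $text;
}
```

Calling build_and_run(0) leaves $Sentinel::destroyed unchanged, because the closure keeps itself (and everything it captured) alive after the call returns. Calling build_and_run(1) increments it. Two named subroutines never go through such a lexical, so there is nothing to undef.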

Next, a related question: because both entries (a, p) in the %action hash start with $traverse->($node), it would be possible to move that call out of the hash, which i think would solve one of the weak reference problems. (Would it?) A good reason not to do that would be if you later add entries to the hash that don't start with $traverse->($node). I am not familiar enough with HTML parsing to know whether this is likely to happen. Is it?
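
A rough sketch of what moving the traversal call out of the hash could look like, assuming each handler only needs the node and the text already collected from its children. TinyNode is a hypothetical stand-in for a parsed node (it borrows the method names tag, attr and content_list), text_of is a made-up name, and the formatting applied for a and p is invented for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a parsed HTML node; only the methods used here.
package TinyNode {
    sub new {
        my ($class, $tag, $attr, @content) = @_;
        return bless { tag => $tag, attr => $attr, content => \@content }, $class;
    }
    sub tag          { $_[0]{tag} }
    sub attr         { $_[0]{attr}{ $_[1] } }
    sub content_list { @{ $_[0]{content} } }
}

# Handlers receive the node plus its already-collected text, so none of
# them needs to call (or capture) the traversal routine.
my %action = (
    a => sub { my ($node, $text) = @_; $text . ' (' . $node->attr('href') . ')' },
    p => sub { my ($node, $text) = @_; $text . "\n" },
);

sub text_of {
    my $node = shift;
    return $node unless ref $node;                    # plain text
    # The traversal happens once, here, for every kind of tag.
    my $text    = join '', map { text_of($_) } $node->content_list;
    my $handler = $action{ $node->tag };
    return $handler ? $handler->($node, $text) : $text;
}
```

The trade-off is the one raised above: a handler that must not traverse its children, or that must do something before traversing them, no longer fits this shape.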

And finally, if the answer to the last question is yes, are there other ways than using a hash? What if the hash contained references to named subroutines, which would call get_all_text_in($node)? Would we still have a leak? Sorry, i'm not clear enough on how leaks work. What should i read?
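
A sketch of the "hash of references to named subroutines" idea, under the assumption that handlers may call get_all_text_in() by name. TinyNode, text_of_link and text_of_paragraph are hypothetical names, and the formatting they apply is invented. Named subs are compiled once and stay in the symbol table for the life of the program, so no new closure is created per traversal and nothing accumulates from one call to the next.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a parsed HTML node (a tag plus its children).
package TinyNode {
    sub new {
        my ($class, $tag, @kids) = @_;
        return bless { tag => $tag, kids => \@kids }, $class;
    }
    sub tag          { $_[0]{tag} }
    sub content_list { @{ $_[0]{kids} } }
}

# The hash stores plain references to named subs. The handlers look up
# get_all_text_in() by name when they run, so no lexical like $traverse
# is captured anywhere.
my %action = (
    a => \&text_of_link,
    p => \&text_of_paragraph,
);

sub get_all_text_in {
    my $node = shift;
    return join '', map { process_text_in($_) } $node->content_list;
}

sub process_text_in {
    my $node = shift;
    return $node unless ref $node;                    # plain text
    my $handler = $action{ $node->tag }
        or return get_all_text_in($node);             # not a special tag
    return $handler->($node);
}

# Made-up formatting, only to give the handlers something to do.
sub text_of_link      { my $node = shift; return '[' . get_all_text_in($node) . ']' }
sub text_of_paragraph { my $node = shift; return get_all_text_in($node) . "\n" }
```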

Thank you for any insight or discussion.

Re: Questions about Recursion and "Extract your transversal", by chromatic
by Anonymous Monk on Jun 07, 2013 at 00:46 UTC

      Thank you. Now i understand the 2 sources of circularity in the original solution. With my solution they would be gone, as we would be using named subroutines instead of references to a subroutine. Thus, using get_all_text_in() in the hash won't create any leak.

      I can now add in the list of pros and cons above:

      • + no risk of inducing a memory leak

      I guess that personal preferences will differ here. Personally i would go for the "named subroutines" solution, because i can spot recursion errors more easily than memory leaks, and because i like simple logic.

      I am interested to hear people's reasons for preferring one or the other.

        In fact, the solution with __SUB__ could use a named subroutine, in which case there's no more memory leak risk. Is that right?
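
For reference, a minimal sketch of a named subroutine that recurses through __SUB__, assuming Perl 5.16 or later for the current_sub feature. The name text_in and the nested array refs standing in for a parsed tree are assumptions made to keep the example short.

```perl
use strict;
use warnings;
use feature 'current_sub';    # enables __SUB__ (Perl 5.16 or later)

# A named sub that recurses through __SUB__ instead of its own name.
# No lexical holds a reference to the sub, so there is no cycle to break.
sub text_in {
    my $node = shift;
    return $node unless ref $node;                    # plain text
    return join '', map { __SUB__->($_) } @$node;     # recurse into children
}
```

As far as i understand, __SUB__ avoids the cycle for anonymous subs too, since the sub no longer has to capture a variable that holds itself. With a named sub it mostly serves to signal the recursion.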

        We are then left with the pros and cons that i outlined in my first message. And i don't know which i would prefer.
