A follow up to "Question about recursively generated iterators"by perlfan (Curate)
|on Oct 16, 2007 at 16:46 UTC||Need Help??|
This is a follow up to my question, Question about recursively generated iterators, to which Limbic~Region and ikegami provide some very helpful answers.
The following is an affirmation of what I've learned regarding the generation of iterators.
IntroductionThe motivating application for learning to do this is related to problem of generating valid strings given a deterministic finite automata (DFA), which is a machine that can be described using pure regular expressions.
Normally these machines are used as string acceptors, but here I wanted to do the opposite, and use them as string generators. I had recursive solutions, but I wanted something that I could use as an actual iterator - i.e., produce the next string and halt the execution until I wanted the next one.
The idea of string generation is not as intimidating as it seems (especially if I am playing with it:) because the DFA can be taken as a directed graph where the states are the nodes, and the transitions between states are the edges. Each transition may have multiple labels (i.e., symbols), but this fortunately does not make what I need to do any more difficult conceptually.
In order to find all paths that go from the start state (node) to an accepting state, one may use a depth first traversal (DFT) of the directed graph. Using this method, a valid string is simply the concatenation of the symbols labeling each edge in the valid path. A path is valid if it is acyclic and goes from the start state to some accepting state. A related method may find just the acyclic paths, but a DFT is also able to detect (and follow to a certain depth) cycles.
I am familiar with implementing this as a recursive routine, and that works fine when all I want is a dump of all strings. It doesn't work so well if I want to create a real iterator that offers some control of the traversal's execution. Some DFAs may also create a lot of strings depending on how "deep" one wants to go, so it is not a good idea to have to generate a ton of strings if all I want is a few.
The Basic SolutionConceptually, all that I really needed to do was to intercept the recursive calls before they were made, push them onto my own call stack, then manage the call stack in some way.
Using this scheme, an "iteration" consists of pop'ing off the top most anonymous subroutine, executing it, then pushing the set of resulting anonymous subroutines back down the stack. If on a particular call a terminating condition of the recursion is met, there are potentially no subroutines returned.
It should be noted that iterators in general do not need to ensure an exhausted call stack, but a recursive algorithm run indefinitely (like in a while loop) will eventually halt. Because the caller is in control of the execution stack, iterators are often used to control memory efficient infinite data generators.
Recursive Iterator Generator Pseudo-code I
Recursive Iterator Generator Pseudo-code II
In the above pseudo-code, it is important to note that the next level of recursion is never followed immediately. Control is returned back to the caller once all of the next set of subs are generated. The return value is a set of 0 or more newly manufactured subroutines that are ready to be pushed onto the call stack.
These dynamically manufactured subroutines are no different than explicitly declared subroutines except that during their creation, their input parameters were determined. This is why we execute these manufactured subroutines with out any parameters, for example $sub_ref->().
It is perfectly valid to have a manufactured subroutine accept run time parameters, but more often this is unnecessary. It is merely an additional dimension of flexibility one may add to these dynamically generated subroutines.
What About Getting Actual Return Values?Recursive functions are rarely useful if they do not return something to the original caller. Fortunately, Perl allows us to return complex data structures, so in this case we would return an anonymous hash where one field contained the anonymous array of generated subroutines, and anything else that needed to be returned could be contained in its own hash field.
This requires an extra step after each call from the top of the stack is made, but it is a small price to pay for the convenience. For example, below we return an anonymous hash reference with an array of subroutine references as one of its members:
and now we make the call and extract the proper values from the returned hash ref.
An Example Interface to My ImplementationThis is not the implementation, but the interface to the iterator implemenation. I show this first to illustrate what I was originally envisioning. This code provides valid string generation via my hobby module, Perl FLaT, using both the acyclic path and deep dft methods.
The Actual ImplementationFor the full context of this code snippet, see the full file.
Many Thanks To...
FeedbackI welcome feedback of all kinds, so please feel free - especially if you notice an problem anywhere :).