Re: help with a new type of Markov

I am trying to write a program that implements the Markov algorithm ... (two words) must appear in the sentence at some random fixed point.

Depending on what you mean by "random fixed point", you may have moved out-of-bounds with respect to the Markov algorithm.

For the benefit of those following along, the traditional Markov random-text generator samples text to (conceptually) build a big sparse matrix. Each cell in the matrix represents the probability of transitioning from one state to the next, where "state" is a tuple of the N adjacent tokens. For example, by sampling

the quick red fox jumped over the lazy perl hacker

with N=1, the matrix will show that the probability of transitioning from (the) to (quick) is 50%. By sampling some larger text with N=2, the matrix might show that the probability of transitioning from (Perl is) to (is neat) to be 90%, and to (is hard) to be 10%.

The transition matrix is easier to implement using a tree, but conceptually it's just a sparse matrix.

To generate random text, we just roll the dice and choose a legal transition.

Your problem complicates this considerably. By saying that you want two words A and B to appear in the sentence at random points, you require an algorithm that finds a legal set of transitions that first gets you to A (or B), and then eventually gets you to B (or A), and then eventually terminates with the end of a sentence. You could do this with a tree search, but there's no guarantee that it will terminate unless a valid sequence of paths existed in your source text. And depending on how you record sentence start and sentence end, there might not be.

You might try this. Start with A, run the algorithm backwards until you find a valid sentence start. To do this (assuming you're using a tree), you'll need to build a second tree to do backwards transitions. Then run the algorithm forward building a tree breadth-first (so that you can terminate at some arbitrary depth) looking for B. If you find it, run the algorithm forward until you find the end of then sentence (again using a breadth-first search).

And I recommend that you don't try to hold your breath while the algorithm runs. :)

Comment on Re: help with a new type of Markov


Perl: the Markov chain saw
	PerlMonks