I like the way you get the chatterbox, by breaking the page into its nodelet units. I may do that instead of the match on the whole page that I do right now.
I think the problem with your cache is that you are resetting old messages every time. Here's what happens. Assume the first time through the chatbox has the following lines:
So you add those to $newmessages, which then becomes $oldmessages. The next time, the box contains:
As you go through the lines, the only one that gets added to $newmessages is the last one, because the others are already in $oldmessages.
Then, and here is the problem, you remove $oldmessages! So the only message you have a record of is the last one. So the next time through, the first three messages are printed again, because they are no longer in the cache.
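To make the sequence concrete, here is a minimal sketch of what I think is going on. It is a reconstruction, not your actual code: only the names $oldmessages and $newmessages come from your program (I use hashes here), while poll_buggy and the message text are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical reconstruction of the cache juggling described above.
# The hash names follow the post; the message text is invented.
my %oldmessages;

sub poll_buggy {
    my @lines = @_;
    my (%newmessages, @printed);
    for my $line (@lines) {
        next if exists $oldmessages{$line};   # seen on the previous poll: skip
        $newmessages{$line} = 1;
        push @printed, $line;
    }
    # The bug: the cache is overwritten with only this poll's new lines,
    # so everything remembered from earlier polls is forgotten.
    %oldmessages = %newmessages;
    return @printed;
}

my @first  = poll_buggy('<a> one', '<b> two', '<c> three');
my @second = poll_buggy('<a> one', '<b> two', '<c> three', '<d> four');
my @third  = poll_buggy('<a> one', '<b> two', '<c> three', '<d> four');

print "poll 1: @first\n";
print "poll 2: @second\n";
print "poll 3: @third\n";
```

On the third poll the three older lines come back, because after the second poll the cache held only the newest line.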
I don't think you need to juggle $oldmessages and $newmessages. You can just keep one hash in which you cache every message you have already printed. The problem with this (and the reason I didn't do it that way in my code) is that you have no way of knowing which messages are older or newer, so unless you attach a timestamp to each entry and expire old ones, your cache will grow indefinitely. Furthermore, if the same user says the same thing on two different occasions, it will not be printed the second time, because your program will think it is a repeated message.
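A single-hash cache with timestamps could look roughly like the sketch below. Treat it as one possible approach under assumptions, not as what either of our programs does: %seen, $max_age, poll_cached and the 600-second expiry are all names and values I made up, and the right expiry depends on how quickly lines scroll out of the box.

```perl
use strict;
use warnings;

my %seen;            # message text => time it was last seen in the box
my $max_age = 600;   # assumed expiry in seconds; tune to the box's turnover

sub poll_cached {
    my ($now, @lines) = @_;

    # Expire entries that have not appeared in the box for $max_age seconds,
    # so the cache stays bounded and a later repeat can be printed again.
    for my $line (keys %seen) {
        delete $seen{$line} if $now - $seen{$line} > $max_age;
    }

    my @printed;
    for my $line (@lines) {
        push @printed, $line unless exists $seen{$line};
        $seen{$line} = $now;   # refresh while the line is still visible
    }
    return @printed;
}

# Times are passed in explicitly here; a real script would pass time().
my @p1 = poll_cached(0,    '<a> hi', '<b> hello');
my @p2 = poll_cached(60,   '<a> hi', '<b> hello', '<c> hey');
my @p3 = poll_cached(120,  '<b> hello', '<c> hey');
my @p4 = poll_cached(1000, '<a> hi');   # same text, long after it expired
```

This only partly solves the repeated-message problem: identical text from the same user is still suppressed if it comes back before its cache entry has expired.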
Hope this helps,