Given some recent discussions and the availability of XML::SAX::Machines, I took a stab at getting a basic Flow-Based Programming system up and running, and I finished it tonight (woohoo!). At some later point, after I've tidied up the code, I'll provide it in full. As an example of what I've got going so far, the flow assembly code looks something like:
The naming of the components should be self-explanatory, and while the arguments to them are currently hard-coded, there's no reason they have to be. What the code above does is take in a file, read it line by line, add a formatted line number and a tab stop before each line, and print the result to stdout. Certainly not complicated, but the tricky part was handling the merge points, where the data that was split off is brought back together into one piece.
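To make the split-and-merge idea concrete, here is a minimal sketch in plain Perl. It is not the framework code: it uses closures and hash refs instead of SAX events, and the names `make_merger` and `run_flow`, the `id` field, and the `%4d` format are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the flow described above: no SAX, no XML,
# just closures passing hash-ref "chunks" downstream.

# Build a merger that waits for one chunk per branch with the same id,
# then joins their data in branch order and passes the result on.
sub make_merger {
    my ($branches, $downstream) = @_;
    my %pending;
    return sub {
        my ($branch, $chunk) = @_;
        my $slot = $pending{ $chunk->{id} } ||= {};
        $slot->{$branch} = $chunk->{data};
        return if scalar( keys %$slot ) < scalar(@$branches);
        delete $pending{ $chunk->{id} };
        $downstream->( join '', map { $slot->{$_} } @$branches );
    };
}

# Split each line into a "number" branch and a "text" branch, then merge.
sub run_flow {
    my ($lines, $sink) = @_;
    my $merge = make_merger( [ 'number', 'text' ], $sink );
    my $count = 0;
    for my $line (@$lines) {
        my $id = ++$count;    # shared id ties the two branches together
        # Branch 1: counter -> sprintf, followed by a tab stop.
        $merge->( 'number', { id => $id, data => sprintf( "%4d\t", $count ) } );
        # Branch 2: the line itself, untouched.
        $merge->( 'text', { id => $id, data => $line } );
    }
}

my @out;
run_flow( [ "first\n", "second\n" ], sub { push @out, $_[0] } );
print @out;
```

The merger only releases a line once both branches have delivered a chunk with the same id, which is the part that needs care as soon as the branches do different amounts of work.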
The basic concept I've got here is that you send 'chunks' of data around the framework. The chunks are transferred as XML (they could be anything, but with XML as the data language there's no reason a flow can't go out to a networked computer and then come back). However, chunks are only meant to represent a small piece of data, such as a single line in a file. Because of this, while all the basic components of the system above are XML::SAX::Bases, the only function a user would typically need to overload is a recieve_chunk function as demonstrated below for the sprintf component:
(All the SAX event functions are buried away in the base class.) The tricky part is the history feature, as indicated by the last part of the argument list for emit_chunk: the programmer has to add new history entries so that components like the merger work correctly. (The merger looks at some past history point to decide when two chunks should be merged.)
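For readers who want something to run, the following is a rough sketch of how such a component could look. It is an assumption-laden stand-in, not the actual framework: the `Sketch::*` packages, `set_handler`, and the shape of the history (a plain array ref of strings) are all hypothetical, and there are no SAX events involved. Only the names recieve_chunk (spelled as in this post) and emit_chunk, and the fact that the history comes last in the argument list, are taken from the description above.

```perl
use strict;
use warnings;

# Hypothetical base class; the real one would be an XML::SAX::Base
# subclass that turns SAX events into recieve_chunk calls.
package Sketch::Component;

sub new {
    my ($class, %args) = @_;
    my $self = { handler => undef, %args };
    return bless $self, $class;
}

sub set_handler { $_[0]{handler} = $_[1]; return $_[1] }

# Pass a chunk downstream: payload first, history list last.
sub emit_chunk {
    my ($self, $data, $history) = @_;
    $self->{handler}->recieve_chunk( $data, $history ) if $self->{handler};
}

# Subclasses override this; the default just forwards the chunk.
sub recieve_chunk {
    my ($self, $data, $history) = @_;
    $self->emit_chunk( $data, $history );
}

package Sketch::Sprintf;
our @ISA = ('Sketch::Component');

sub recieve_chunk {
    my ($self, $data, $history) = @_;
    # Copy the history and append our own entry, so the caller's list
    # is not modified and a merger can still find earlier points.
    my @new_history = ( @$history, 'sprintf' );
    $self->emit_chunk( sprintf( $self->{format}, $data ), \@new_history );
}

# Hypothetical end of the line: records whatever arrives.
package Sketch::Collector;
our @ISA = ('Sketch::Component');

sub recieve_chunk {
    my ($self, $data, $history) = @_;
    push @{ $self->{seen} }, [ $data, $history ];
}

package main;

my $fmt  = Sketch::Sprintf->new( format => "%4d\t" );
my $sink = Sketch::Collector->new( seen => [] );
$fmt->set_handler($sink);
$fmt->recieve_chunk( 7, ['line:7'] );

printf "%s| history: %s\n", $sink->{seen}[0][0],
    join( ',', @{ $sink->{seen}[0][1] } );
```

The point of the sketch is the one-method override: the subclass touches only the payload and the history, and everything about delivery stays in the base class.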
Again, this is only a start. I will post the full code once I've tidied up what I have and made some improvements. For example, note how I have to use XML::Filter::SAXT to split the stream into multiple parts; I should be able to build something similar into my component system. While I'm using SAXT, I have commented out the functions that would be called on every component before and after the run, which would normally be used to reset counters and free resources if needed.
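One way a built-in splitter could carry those before-and-after calls is sketched below. This is only a guess at a possible design, not what the framework does: `Sketch::Tee`, `Sketch::Counter`, `before_run`, and `after_run` are hypothetical names, and the chunks are plain Perl values rather than XML.

```perl
use strict;
use warnings;

# Hypothetical splitter: forwards every chunk, and the run hooks, to
# each downstream component. All names here are illustrative.
package Sketch::Tee;

sub new {
    my ($class, @handlers) = @_;
    my $self = { handlers => [@handlers] };
    return bless $self, $class;
}

sub before_run { $_->before_run for @{ $_[0]{handlers} } }
sub after_run  { $_->after_run  for @{ $_[0]{handlers} } }

sub recieve_chunk {
    my ($self, $data, $history) = @_;
    # Each branch gets its own copy of the history list.
    $_->recieve_chunk( $data, [@$history] ) for @{ $self->{handlers} };
}

# Hypothetical counter component that resets itself in before_run.
package Sketch::Counter;

sub new { my $self = { count => 0, log => [] }; return bless $self, $_[0] }
sub before_run    { $_[0]{count} = 0 }
sub after_run     { push @{ $_[0]{log} }, "saw $_[0]{count} chunks" }
sub recieve_chunk { $_[0]{count}++ }

package main;

my @branches = ( Sketch::Counter->new, Sketch::Counter->new );
my $tee      = Sketch::Tee->new(@branches);

# Two runs of three chunks each; the counters start from zero each time.
for my $run ( 1 .. 2 ) {
    $tee->before_run;
    $tee->recieve_chunk( $_, ["line:$_"] ) for 1 .. 3;
    $tee->after_run;
}

print "$_->{log}[-1]\n" for @branches;
```

Because the splitter is itself a component, the hooks reach every branch without the caller needing to know how many branches there are.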
Of course, at some point it would be best to have a textual way to describe the Machine without using Perl code. That's well in the future, but it should be a simple addition once the framework is settled.
Part of the reason I'm posting this now is that there's talk on the perl-xml list about XML Pipelines. While I think there's an overlap, it's not the same as what I'm trying to do here. My impression is that with XML Pipelines you feed in whole documents at a time and process at the document level, leaving character-level or other changes to other transformations such as XSLT. This is probably great for, as some of Matt's preliminary examples suggest, setting up easy document conversion for a web server. I think my approach here is more finely grained, and thus may not be as appropriate for that purpose, but my initial idea when working on this, way back even in Java, was a rapid 'scripting' type of programming.
Any initial comments are welcome. Again, this is only a starting point, and there's a lot of work left to get what I consider the key base down and stable. Once that's in place, additions should be very easy to do.