PerlMonks  

Given some recent discussions and the availability of XML::SAX::Machines, I took a stab at getting a basic Flow Based Programming system up and running, which I've completed tonight (woohoo!). At some later point, after I've done some code tidying up, I'll provide the full code, but as an example of what I've got going so far, the flow assembly code looks something like:

#!/usr/bin/perl -w
use strict;

use XML::SAX::Machines qw( Machine );

use Language::Flow::Simple::Reader;
use Language::Flow::Simple::Writer;
use Language::Flow::Simple::Chomper;
use Language::Flow::Simple::Counter;
use Language::Flow::Simple::Merger;
use Language::Flow::Simple::Constant;
use Language::Flow::Simple::Sprintf;
use XML::Filter::SAXT;

my $reader   = Language::Flow::Simple::Reader->new;
my $writer   = Language::Flow::Simple::Writer->new;
my $chomper  = Language::Flow::Simple::Chomper->new;
my $counter  = Language::Flow::Simple::Counter->new;
my $merger   = Language::Flow::Simple::Merger->new;
my $constant = Language::Flow::Simple::Constant->new;
my $sprintf  = Language::Flow::Simple::Sprintf->new;
my $merger2  = Language::Flow::Simple::Merger->new;

# Each row is: [ part name => handler => names of the parts it feeds ]
my $m = Machine (
    [ Intake => $reader           => qw( B ) ],
    [ B      => $chomper          => qw( T ) ],
    [ T      => XML::Filter::SAXT => qw( N I C ) ],
    [ C      => $counter          => qw( D ) ],
    [ D      => $sprintf          => qw( M ) ],
    [ I      => $constant         => qw( M ) ],
    [ M      => $merger           => qw( N ) ],
    [ N      => $merger2          => qw( OUT ) ],
    [ OUT    => $writer ]
);

#for ( $m->parts ) { $_->preprocess() };
$m->parse();
#for ( $m->parts ) { $_->postprocess() };

The naming of the components should be self-explanatory, and while their arguments are currently hard-coded, there's no reason they have to be. What the code above does is take in a file, read it line by line, add a formatted line number and a tab stop before each line, and print the result to stdout. Certainly not complicated, but the tricky part was handling the merge points, where the data that was split off is brought back together into one piece.
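
For comparison, here is roughly what that flow computes when written as straight-line Perl with no framework at all. This is only a sketch: number_lines is a name made up for this illustration, and I'm assuming the count starts at 1, is formatted with "%4d" (as in the Sprintf component), and that the writer puts the newline back after the Chomper removed it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the framework): number a list of lines.
sub number_lines {
    my @lines = @_;
    my $count = 0;
    my @out;
    for my $line (@lines) {
        chomp( my $text = $line );            # the Chomper stage
        my $label = sprintf "%4d", ++$count;  # the Counter and Sprintf stages
        push @out, "$label\t$text\n";         # the Constant (tab) and both Mergers
    }
    return @out;
}

print number_lines( "alpha\n", "beta\n" );
```

The point of the flow version is that each of those commented steps becomes its own reusable component instead of a line inside one loop.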

The basic concept I've got here is that you send 'chunks' of data around the framework. The chunks are transferred as XML (they could be anything, but with XML as the data language there's no reason a flow can't go out to a networked computer and then come back). A chunk is only meant to represent a small piece of data, such as a single line in a file. Because of this, while all the basic components of the system above are XML::SAX::Base subclasses, the only function a user would typically need to override is a recieve_chunk function, as demonstrated below for the sprintf component:

package Language::Flow::Simple::Sprintf;

use Language::Flow::Base::Component;
use Data::Dumper;

BEGIN {
    @ISA = qw( Language::Flow::Base::Component );
}

my $counter = 0;

# Called once for every chunk that arrives at this component
sub recieve_chunk {
    my $self  = shift;
    my $chunk = shift;
    my $data  = $chunk->get_data();
    $self->emit_chunk(
        sprintf( "%4d", $data ),
        "string",
        $chunk->get_history(),
        { node => "LFSSprintf", id => $counter++ }
    );
}

1;

(All the SAX event functions are buried away in the base class.) The tricky part is the history feature, as indicated by the last arguments to emit_chunk: the programmer has to add a new history entry at each step so that components like the merger work right. (The merger looks at some past history point to decide when two chunks should be merged.)
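
To make the history idea concrete, here is a minimal sketch of a merger that pairs chunks by a shared history entry. It is not the real Language::Flow::Simple::Merger: chunks are plain hashes here, make_chunk, history_id and make_merger are names invented for the example, and joining the data in arrival order is an assumption.

```perl
use strict;
use warnings;

# Hypothetical chunk: 'data' plus a 'history' list of { node, id } stamps.
sub make_chunk {
    my ( $data, @history ) = @_;
    return { data => $data, history => [@history] };
}

# Return the id that $node stamped on this chunk, or undef if it never did.
sub history_id {
    my ( $chunk, $node ) = @_;
    for my $stamp ( @{ $chunk->{history} } ) {
        return $stamp->{id} if $stamp->{node} eq $node;
    }
    return;
}

# Build a merger that joins two chunks carrying the same stamp from $node.
sub make_merger {
    my ( $node, $emit ) = @_;
    my %waiting;    # stamp id => first chunk seen with that id
    return sub {
        my ($chunk) = @_;
        my $id = history_id( $chunk, $node );
        die "chunk has no history entry for $node\n" unless defined $id;
        my $first = delete $waiting{$id};
        if ( !$first ) {
            $waiting{$id} = $chunk;    # hold it until its partner shows up
            return;
        }
        # Assumption: data is joined in arrival order, first chunk's history kept
        $emit->( make_chunk( $first->{data} . $chunk->{data}, @{ $first->{history} } ) );
    };
}

my @merged;
my $merger = make_merger( "Reader", sub { push @merged, $_[0] } );

# Two branches of two different lines, arriving interleaved
$merger->( make_chunk( "   1\t", { node => "Reader", id => 0 }, { node => "Sprintf", id => 0 } ) );
$merger->( make_chunk( "   2\t", { node => "Reader", id => 1 }, { node => "Sprintf", id => 1 } ) );
$merger->( make_chunk( "beta",   { node => "Reader", id => 1 } ) );
$merger->( make_chunk( "alpha",  { node => "Reader", id => 0 } ) );

print "$_->{data}\n" for @merged;
```

Because the pairing goes by the stamp from the node where the stream was split, the two halves of a line find each other even when the branches deliver them in a different order. If a component forgets to pass the history along, the merger has nothing to match on, which is why the history arguments matter.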

Again, this is only a start. I will post full code once I've tidied up what I've got and made some improvements. For example, note how I have to use XML::Filter::SAXT to split the stream into multiple parts; I should be able to build something similar within my own component system. While I am still using SAXT, I have commented out the calls that would run on every component before and after the run, which would normally be used to reset counters and free resources if needed.

Of course, at some point it would be best to have a textual way to describe the Machine without using Perl code. That's well in the future, but it should be a simple addition once the framework is set.
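
As a rough idea of what such a textual description might look like, the sketch below parses a made-up one-line-per-part format into rows shaped like the ones passed to Machine above. Both the format (NAME: component -> OUT1 OUT2) and the parse_flow function are assumptions for illustration only; nothing like this exists in the framework yet.

```perl
use strict;
use warnings;

# Hypothetical format, one part per line:   NAME: component -> OUT1 OUT2 ...
# $components maps short names to handler objects; unknown names pass through as strings.
sub parse_flow {
    my ( $text, $components ) = @_;
    my @spec;
    for my $line ( split /\n/, $text ) {
        $line =~ s/#.*//;             # strip trailing comments
        next unless $line =~ /\S/;    # skip blank lines
        my ( $name, $comp, $outs ) =
            $line =~ /^\s*(\w+)\s*:\s*([\w:]+)\s*(?:->\s*(.*?))?\s*$/
            or die "cannot parse flow line: $line\n";
        my $handler = exists $components->{$comp} ? $components->{$comp} : $comp;
        push @spec, [ $name, $handler, defined $outs ? split( ' ', $outs ) : () ];
    }
    return \@spec;
}

my $spec = parse_flow( <<'FLOW', {} );
Intake: reader            -> B
B:      chomper           -> T
T:      XML::Filter::SAXT -> N I C   # fan out to three parts
OUT:    writer
FLOW

print join( " ", @$_ ), "\n" for @$spec;
```

With a real lookup table in place of the empty hash, the short names would be swapped for component objects before the rows are handed on.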

Part of the reason that I post this now is that there's talk on the perl-xml list about XML Pipelines. While I think there's an overlap, it's not the same as what I'm trying to do here. My impression is that with XML Pipelines you feed in whole documents at a time and process at the document level, leaving character-level or other changes to transformations like XSLT. This is probably great for, as some of Matt's preliminary examples suggest, setting up easy document conversion for a web server. I think my approach here is more finely grained, and thus may not be as appropriate for that purpose, but my initial idea when working on this, way back even in Java, was a rapid 'scripting' type of programming.

Any initial comments are welcome. Again, this is only a starting point, and there's a lot of work left to get what I consider to be the key base down and stable. Once that's in place, additions should be very easy to do.

-----------------------------------------------------
Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
"I can see my house from here!"
It's not what you know, but knowing how to find it if you don't know that's important


In reply to A preliminary stab at Flow-Based Programming by Masem
