PerlMonks  

Re: rough start of an axml compiler

by Boldra (Deacon)
on Aug 01, 2011 at 12:02 UTC (#917828)


in reply to rough start of an axml compiler

You say in your "offtopic epiphony" that you believe

<<a>b</a>>c</<a>b</a>>
to be unrepresentable in any kind of data structure, perl or otherwise. Here's a simple solution:

    my @nodes = (
        bless( {
            'data' => 'c',
            'tag'  => bless( { 'data' => 'b', 'tag' => 'a' }, 'Node' ),
        }, 'Node' ),
    );

The definition of the action to be performed on data 'c' is postponed until operation 'a' is performed on data 'b'.

I think you wrote a parser already, so I'm sure you can adapt it to produce a structure like above. Once you have the structure, generating the output is also straightforward:

    package Node;
    use Moose;

    has [ qw<data tag> ] => ( is => 'rw', isa => 'Any' );

    sub as_text {
        my ($self) = shift;
        my $tag = $self->tag;
        my $tag_processing_method = ref $tag ? $tag->as_text : $tag;
        return $self->$tag_processing_method( $self->data );
    }

    # Tag Processing Methods here:
    sub a       { "super_$_[1]" }    # prepend "super_"
    sub b       { "b_$_[1]" }        # prepend "b_"
    sub super_b { "B_$_[1]" }        # prepend "B_"

If you run it (say for map { $_->as_text } @nodes), you'll see that instead of sub b being called, super_b is called.
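
For anyone without Moose installed, here is a minimal core-Perl sketch of the same idea. The hand-written `new`, `data` and `tag` subs are my stand-ins for what Moose would generate (an assumption, not part of the code above); the tag methods are copied from the post.

```perl
use strict;
use warnings;

package Node;

# Hand-written constructor and accessors standing in for the Moose-generated ones.
sub new {
    my ( $class, %args ) = @_;
    my $self = {%args};
    return bless $self, $class;
}
sub data { return $_[0]{data} }
sub tag  { return $_[0]{tag} }

sub as_text {
    my ($self) = @_;
    my $tag = $self->tag;

    # If the tag is itself a Node, render it first; its text names the method.
    my $method = ref $tag ? $tag->as_text : $tag;
    return $self->$method( $self->data );
}

# Tag processing methods, as in the post.
sub a       { "super_$_[1]" }
sub b       { "b_$_[1]" }
sub super_b { "B_$_[1]" }

package main;

my @nodes = (
    Node->new(
        data => 'c',
        tag  => Node->new( data => 'b', tag => 'a' ),
    ),
);

print $_->as_text, "\n" for @nodes;    # prints "B_c"
```

The inner node renders to "super_b", which then becomes the method name applied to 'c'.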

I'd be very inclined to add string overloading to the Node package so:

    use overload
        q{""}    => 'as_text',
        fallback => 1;

which could make the calls even simpler (at a possible cost to debugging and maintainability). as_text becomes

    sub as_text {
        my ($self) = shift;
        my $processing_method = $self->tag;
        return $self->$processing_method( $self->data );
    }

and generating output once you have your @nodes array is simply stringification:

    print @nodes;
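
A small core-Perl sketch of the overloaded variant, under the assumption of a hand-written constructor instead of Moose. One deliberate difference from the version above: the tag is concatenated with an empty string so that a Node tag is stringified explicitly, rather than relying on how method lookup treats an object used as a method name.

```perl
use strict;
use warnings;

package Node;
use overload q{""} => 'as_text', fallback => 1;

# Hand-written constructor (stand-in for the Moose one).
sub new {
    my ( $class, %args ) = @_;
    my $self = {%args};
    return bless $self, $class;
}

sub as_text {
    my ($self) = @_;

    # Concatenation forces a Node tag through the q{""} overload;
    # a plain string tag is left unchanged.
    my $method = q{} . $self->{tag};
    return $self->$method( $self->{data} );
}

sub a       { "super_$_[1]" }
sub super_b { "B_$_[1]" }

package main;

my @nodes = (
    Node->new( data => 'c', tag => Node->new( data => 'b', tag => 'a' ) ),
);

my $text = join '', @nodes;    # each Node stringifies itself
print "$text\n";               # prints "B_c"
```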

update fixed some typos


Re^2: rough start of an axml compiler
by Logicus on Aug 01, 2011 at 19:29 UTC

    I said any kind I know of, but then I am renowned for being an uneducated thick-wit who won't listen to advice of my elders and betters.

    I'm going to have to have a good think about what you've put there above. Digestion should be complete in a few days, before which any comment I make will probably be seen as another example of my stupidity.

    The first thing that is running through the vacuous hole I sometimes laughingly refer to as my brain is how to derive this:

        my @nodes = (
            bless( {
                'data' => 'c',
                'tag'  => bless( { 'data' => 'b', 'tag' => 'a' }, 'Node' ),
            }, 'Node' ),
        );

    from the source:

        <<a>b</a>>c</<a>b</a>>

    I have a pathological aversion to all things OOP, but the apparent simplicity of what you have shown above is strangely appealing. Thanks!

Re^2: rough start of an axml compiler
by Logicus on Aug 02, 2011 at 12:40 UTC

    Well Boldra, you've thrown a proper little spanner into my works... I'm not complaining because I really like your example!

    I was going to run a small number of regex conversions on an aXML string and turn it into classic XML to feed XML::Simple for turning into a perl structure, but I can't do that now if I want to use the method above. .o0(~Hrm~)

    One quick question though, under this schema would every tag have to have a definition? As in what would happen to tags which are just markup around and within tags which have defined roles?

    Also, there is another thought that I don't know exactly how to describe. I guess you could call it orphan data, for example:

        listing actions/default/body.aXML
        ---------------------------------
        <html>
          <head><title>acme products</title></head>
          <body>
            some orphan text that needs to be in the output
            <use>actions/<qd>action</qd>/main.aXML</use>
            some more orphan text
          </body>
        </html>

    I'm guessing that the above would be mapped to your moose solution thusly:

        package actions::default::body;

        my @nodes = (
            bless( {
                'tag'  => 'html',
                'data' => [
                    bless( {
                        'tag'  => 'head',
                        'data' => bless( { 'tag' => 'title', 'data' => 'acme products' }, 'Node' ),
                    }, 'Node' ),
                    bless( {
                        'tag'  => 'body',
                        'data' => [
                            bless( { 'tag' => 'orphan', 'data' => 'some orphan text that needs to be in the output' }, 'Node' ),
                            bless( {
                                'tag'  => 'use',
                                'data' => [
                                    bless( { 'tag' => 'orphan', 'data' => 'actions/' }, 'Node' ),
                                    bless( { 'tag' => 'qd', 'data' => 'action' }, 'Node' ),
                                    bless( { 'tag' => 'orphan', 'data' => '/main.aXML' }, 'Node' ),
                                ],
                            }, 'Node' ),
                            bless( { 'tag' => 'orphan', 'data' => 'some more orphan text' }, 'Node' ),
                        ],
                    }, 'Node' ),
                ],
            }, 'Node' ),
        );

        sub getNodes { return @nodes; }

        1;

      Have you considered leaving the untagged content as plain text?

          my @nodes = (
              bless( {
                  'tag'  => 'html',
                  'data' => [
                      bless( {
                          'tag'  => 'head',
                          'data' => bless( { 'tag' => 'title', 'data' => 'acme products' }, 'Node' ),
                      }, 'Node' ),
                      bless( {
                          'tag'  => 'body',
                          'data' => [
                              'some orphan text that needs to be in the output',
                              bless( {
                                  'tag'  => 'use',
                                  'data' => [
                                      bless( { 'tag' => 'orphan', 'data' => 'actions/' }, 'Node' ),
                                      bless( { 'tag' => 'qd', 'data' => 'action' }, 'Node' ),
                                      bless( { 'tag' => 'orphan', 'data' => '/main.aXML' }, 'Node' ),
                                  ],
                              }, 'Node' ),
                              'some more orphan text',
                          ],
                      }, 'Node' ),
                  ],
              }, 'Node' ),
          );

      and it may interest you that with Moose's BUILDARGS you can easily set up the Node constructor to expect a tag and data, e.g. Node->new( qd => 'action' );. The output of Data::Dumper would still contain the bless( { ... }, 'Node' ) syntax, making it a good place to do debugging and testing.

          my @nodes = (
              Node->new( html => [
                  Node->new( head => Node->new( title => 'acme products' ) ),
                  Node->new( body => [
                      'some orphan text that needs to be in the output',
                      Node->new( use => [ 'actions/', Node->new( qd => 'action' ), '/main.aXML' ] ),
                      'some more orphan text',
                  ] ),
              ] ),
          );

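
One way the Node->new( qd => 'action' ) calling style could be wired up is by wrapping Moose's BUILDARGS. This is only a sketch and requires Moose from CPAN; it assumes the constructor is always called with exactly a tag and its data, so the usual named-argument form is no longer accepted.

```perl
package Node;
use Moose;

has [qw<data tag>] => ( is => 'rw', isa => 'Any' );

# Translate the positional pair (tag, data) into the named arguments
# that the default Moose constructor expects.
around BUILDARGS => sub {
    my ( $orig, $class, $tag, $data ) = @_;
    return $class->$orig( tag => $tag, data => $data );
};

package main;

my $node = Node->new( qd => 'action' );
print $node->tag, ' => ', $node->data, "\n";    # prints "qd => action"
```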
      but then why make nodes out of plain html if you have no action planned for them? Checking whether a tag is implemented during parsing is going to save you headaches later.

          my @nodes = (
              '<html> <head><title>acme products</title></head> <body> some orphan text that needs to be in the output',
              Node->new( use => [ 'actions/', Node->new( qd => 'action' ), '/main.aXML' ] ),
              'some more orphan text </body> </html>',
          );

      with which print @nodes would just do the right thing.
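
To make "print @nodes just does the right thing" concrete, here is a core-Perl sketch in which a node's data may be a plain string or an array ref mixing strings and Nodes. The handlers are hypothetical: `qd` looks a name up in a made-up %query hash, and `use` returns a placeholder instead of including a file. I used a dispatch table rather than methods here, which is my choice and not part of the Moose example.

```perl
use strict;
use warnings;

package Node;
use overload q{""} => 'as_text', fallback => 1;

my %query = ( action => 'default' );    # made-up query data for the sketch

# Hypothetical tag handlers; each receives the already-rendered data.
my %handler = (
    qd  => sub { my ($name) = @_; return $query{$name} // '' },
    use => sub { my ($path) = @_; return "[included: $path]" },
);

sub new {
    my ( $class, $tag, $data ) = @_;
    return bless { tag => $tag, data => $data }, $class;
}

sub as_text {
    my ($self) = @_;
    my $data = $self->{data};

    # join stringifies every element, so nested Nodes render themselves.
    my $text = ref($data) eq 'ARRAY' ? join( '', @{$data} ) : "$data";
    my $code = $handler{ $self->{tag} }
        or die "No handler for tag '$self->{tag}'";
    return $code->($text);
}

package main;

my @nodes = (
    '<body>',
    Node->new( use => [ 'actions/', Node->new( qd => 'action' ), '/main.aXML' ] ),
    '</body>',
);

print @nodes, "\n";    # prints "<body>[included: actions/default/main.aXML]</body>"
```

An unknown tag dies at render time in this sketch; checking for a handler while parsing would surface the problem earlier.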

        That makes life a lot easier!

        The only caveat I can think of is the refas plugin, i.e.:

            (refas tag="user")/path/to/user.xml(/refas)
            <p>Welcome back <user>username</user>, you were last here on :
            [time format="HH:MM:SS, DD/MM/YYYY"]<user>lastvisit</user>[/time].</p>

        Where <user> is not a known tag until refas creates a definition for it which maps the user tag data to the nodes in the user.xml file, and [time] takes an integer and gives back a formatted date/time string.
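
A rough sketch of how a tag could become known only at run time. Everything here is hypothetical: `load_record` stands in for reading user.xml and returns made-up data, and the %handler table is just one way to hold tag definitions.

```perl
use strict;
use warnings;

my %handler;    # tag name => code ref producing the tag's output

# Hypothetical stand-in for parsing the file at $path.
sub load_record {
    my ($path) = @_;
    return { username => 'alice', lastvisit => 1312200000 };
}

# refas: define a new tag whose data names a field in the loaded record.
sub refas {
    my ( $tag, $path ) = @_;
    my $record = load_record($path);
    $handler{$tag} = sub {
        my ($field) = @_;
        return $record->{$field} // '';
    };
    return '';    # refas itself contributes nothing to the output
}

sub is_known_tag {
    my ($tag) = @_;
    return exists $handler{$tag} ? 1 : 0;
}

my $before = is_known_tag('user');    # 0: not defined yet
refas( user => '/path/to/user.xml' );
my $after = is_known_tag('user');     # 1: refas registered it

print "before=$before after=$after\n";
print $handler{user}->('username'), "\n";    # prints "alice"
```

This means a parse-time "is this tag implemented?" check would have to process refas first, or defer the check for tags it cannot resolve yet.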

        I was planning on expressing the difference between the three tag types by adding an attribute called aXML_class to the tags when converting them to standard XML :

            (SQL mode="mask")
              <query>
                SELECT username,email FROM users;
              </query>
              <mask>
                [link action="showuser" username="<d>username</d>"]<d>username</d>[/link],
                [link to="mailto:<d>email</d>"]<d>email</d>[/link]
                <br>
              </mask>
            (/SQL)

        Becomes :

            <SQL aXML_class="primary" mode="mask">
              <query>
                SELECT username,email FROM users;
              </query>
              <mask>
                <link aXML_class="tertiary" action="showuser" username="<d>username</d>"><d>username</d></link>,
                <link aXML_class="tertiary" to="mailto:<d>email</d>"><d>email</d></link>
                <br>
              </mask>
            </SQL>

        Also, when tags have tags embedded in their attributes, like this :

            <a b="<c>d</c>">data</a>

        the expression would be converted to XML like this :

            <a aXML_class="standard">
              <attr>b="<c aXML_class="standard">d</c>"</attr>
              <contents>data</contents>
            </a>

        The examples above would map like this :

            <SQL aXML_class="primary" mode="mask">
              <query>
                SELECT username,email FROM users;
              </query>
              <mask>
                <link aXML_class="tertiary" action="showuser" username="<d>username</d>"><d>username</d></link>,
                <link aXML_class="tertiary" to="mailto:<d>email</d>"><d>email</d></link>
                <br>
              </mask>
            </SQL>

        becomes :

            my @nodes = (
                Node->new( SQL => {
                    aXML_class => 'primary',
                    attr       => { mode => 'mask' },
                    contents   => [
                        '<query>SELECT username,email FROM users;</query> <mask>',
                        Node->new( link => {
                            aXML_class => 'tertiary',
                            attr       => { action => 'showuser', username => '<d>username</d>' },
                            contents   => '<d>username</d>',
                        } ),
                        Node->new( link => {
                            aXML_class => 'tertiary',
                            attr       => { to => 'mailto:<d>email</d>' },
                            contents   => '<d>email</d>',
                        } ),
                        '<br></mask>',
                    ],
                } ),
            );

        and

            <a b="<c>d</c>">data</a>

        becomes :

            <a aXML_class="standard">
              <attr>b="<c aXML_class="standard">d</c>"</attr>
              <contents>data</contents>
            </a>

        then becomes :

            my @nodes = (
                Node->new( a => {
                    aXML_class => 'standard',
                    attr       => { b => Node->new( c => { aXML_class => 'standard', contents => 'd' } ) },
                    contents   => 'data',
                } ),
            );

Re^2: rough start of an axml compiler
by Anonymous Monk on Aug 02, 2011 at 02:34 UTC
    Corion, muba, and a few others already explained this independently of each other. He is just playing dumb; you're feeding the troll.
      Look, I can feed two at once!

        Look, I can feed two at once!

        Needs more catsup!
