
Re^2: Thanks to Ikegami, Chromatic & Corion

by Logicus
on Nov 01, 2011 at 21:46 UTC

in reply to Re: Thanks to Ikegami, Chromatic & Corion
in thread Thanks to Ikegami, Chromatic & Corion

The complete source code, including all site assets, will be available shortly in the "Castle Blueprints" section. (gosh a self-replicating castle... whatever next?)

In the table below, input refers to information either loaded from disc, retrieved from a db, created by a plugin, or entered through the query data by an end user.

Output refers to the final produced output at end of processing.

In between these two there is usually no need to consider these specials at all, as they are all dealt with automatically by the system. Unless you're doing something very bizarre, you're not likely to ever see them or need to know what they are about.

Having said that, for the sake of absolute completeness, the special symbols work as follows:

Input                         Output
&lab;                         <
&rab;                         >
&lcb;                         (
&rcb;                         )
&lsb;                         [
&rsb;                         ]
<special>lab</special>        &lab;
<special>lcb</special>        &lcb;
<special>lsb</special>        &lsb;
<special>rab</special>        &rab;
<special>rcb</special>        &rcb;
<special>rsb</special>        &rsb;
&lab;special&rab;             <special>
&lab;/special&rab;            </special>

Note: if you want to use a tag other than "special" to instruct the parser to insert a special char, perhaps because you're using an XML file containing tags of the same name, provision has been made to rename the special tag by changing a value in the file.

It's also possible in theory, since the end programmer is not expected to know or care about these symbols for the vast majority of their tasks, to automate this: the system could scan the input for existing <special> tags and automatically shift to a different delimiter, for instance <special1>. I'm not sure whether that would be overkill; there is such a thing as a sledgehammer and a nut.
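A minimal sketch of that automatic delimiter shift, assuming the fallback names are simply "special" followed by a counter. The helper name pick_special_tag is hypothetical and not part of aXML.

```perl
use strict;
use warnings;

# Hypothetical helper: choose a tag name for specials that does not
# already occur (as an opening or closing tag) in the input.
sub pick_special_tag {
    my ($input) = @_;
    my $tag = 'special';
    my $n   = 0;
    # Try special, special1, special2, ... until the name is unused.
    $tag = 'special' . ++$n while $input =~ m{</?\Q$tag\E>};
    return $tag;
}

print pick_special_tag('<p>plain text</p>'), "\n";        # special
print pick_special_tag('<special>lab</special>'), "\n";   # special1
```

The scan costs one regex match per candidate name, so for typical input (no clash at all) it is a single pass over the text.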

In the case where you wish to output aXML code such as <db_mask>, you can use either &lab;/&rab; or &lt;/&gt;. Either will work; which you use depends on whether you want the literal output or output encoded for display in a browser (the latter being, I suspect, the far more common requirement).

The following aXML file was used to test the round trip completeness for the parser code:

    Listing of actions/test/body.aXML
    ---------------------------------
    <html>
      <head></head>
      <body>
        (displayqd)input(/displayqd)
        <hr>
        <buildform action="test" method="POST">
          <input type="submit" value="Make it so!"><br/>
          <textarea name="input" rows="80" cols="80" >(textareaqd)input(/textareaqd)</textarea>
        </buildform>
        <br/><br/><br/>
      </body>
    </html>

As you can see, no mention of any specials is required at the document level, as they are all handled automatically at the parser/plugin level. The above code, run with the standard set, can take the parser as input in the text area, correctly display it above, and encode it in the text area for another go around the circle if you click submit again. Sending it around the circle multiple times has no detrimental effect; the output remains identical to the input.
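The round-trip property can be illustrated outside the parser with a small sketch. It only follows the Input/Output table above; the function names encode and decode are hypothetical, and the real displayqd/textareaqd plugins may well do this differently.

```perl
use strict;
use warnings;

my @names = qw(lab rab lcb rcb lsb rsb);
my %char_for;
@char_for{@names} = ('<', '>', '(', ')', '[', ']');
my %name_for = reverse %char_for;

# Hypothetical encoder: literal brackets become entities and literal
# entities become <special> tags, all in a single pass.
sub encode {
    my ($s) = @_;
    $s =~ s{([<>()\[\]])|&([lr][acs]b);}
           {defined $1 ? "&$name_for{$1};" : "<special>$2</special>"}ge;
    return $s;
}

# Hypothetical decoder: applies the "Output" column, again in one pass.
sub decode {
    my ($s) = @_;
    $s =~ s{&([lr][acs]b);|<special>([lr][acs]b)</special>}
           {defined $1 ? $char_for{$1} : "&$2;"}ge;
    return $s;
}

my $in = 'if (a < b) { $x[0] = "&lab;"; } <special>lab</special>';
print encode('a < b'), "\n";   # a &lab; b
print decode(encode($in)) eq $in ? "round trip ok\n" : "mismatch\n";
```

Doing each direction in one pass matters: decoding in two separate passes could turn a freshly produced &lab; into < on the second pass.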

Modifying the above like this:

    <html>
      <head></head>
      <body>
        (highlightcode)(displayqd)input(/displayqd)(/highlightcode)
        <hr>
        <buildform action="test" method="POST">
          <input type="submit" value="Make it so!"><br/>
          <textarea name="input" rows="80" cols="80" >(textareaqd)input(/textareaqd)</textarea>
        </buildform>
        <br/><br/><br/>
      </body>
    </html>

Causes fragments of Perl wrapped in <code> tags to be highlighted in a similar fashion to the "nano" text editor, and fragments of aXML wrapped in <aXML> tags to be highlighted in a similar vein.

The highlighting is done with <span> tags and can then be styled using CSS. PerlNights currently has two colour-schemes, midnight (light-text on darkblue bg) and daytime (dark text on white bg), and more might be added later.

P.S. If you want a sneak peek at the PerlNights code before I'm finished writing it, drop me an email and I will bundle you over a ZIP or tar.gz file; I'm sure you know my address. Otherwise you can just wait till I'm done and PerlNights is operational. It won't be too long!

Re^3: Thanks to Ikegami, Chromatic & Corion
by chromatic (Archbishop) on Nov 01, 2011 at 22:00 UTC

    If I want to display &lab;special&rab;, do I emit &amp;lab;special&amp;rab;?

    Improve your skills with Modern Perl: the free book.

      "&" is not a meta character in aXML unless followed by "lab;" or one of the other 5, so "&amp;" outputs "&amp;", so "&amp;lab;special&amp;rab;" produces "&amp;lab;special&amp;rab;".

      To output "&lab;special&rab;" one needs "<special>lab</special>special<special>rab</special>".

      The escape function is:

          my %escapes = (
              '<'     => '&lab;',
              '>'     => '&rab;',
              '('     => '&lcb;',
              ')'     => '&rcb;',
              '['     => '&lsb;',
              ']'     => '&rsb;',
              '&lab;' => '<special>lab</special>',
              '&lcb;' => '<special>lcb</special>',
              '&lsb;' => '<special>lsb</special>',
              '&rab;' => '<special>rab</special>',
              '&rcb;' => '<special>rcb</special>',
              '&rsb;' => '<special>rsb</special>',
          );

          #my $escapes_pat = join '', map quotemeta, keys %escapes;
          #my $escapes_re = qr/$escapes_pat/;
          my $escapes_re = qr/[<>()\[\]]|&[lr][acs]b;/;  # Manually tweaked.

          sub escape(_) {
              my ($s) = @_;
              $s =~ s/($escapes_re)/$escapes{$1}/g;
              return $s;
          }

      These are probably better choices:

          my %escapes = qw( & &AMP; < &LAB; > &RAB; ( &LCB; ) &RCB; [ &LSB; ] &RSB; );

          sub escape(_) {
              my ($s) = @_;
              # One pass over the string; each matched character is looked up in %escapes.
              $s =~ s/([&<>()\[\]])/$escapes{$1}/g;
              return $s;
          }

      Or using v5.14's s///r:

          my %escapes = qw( & &AMP; < &LAB; > &RAB; ( &LCB; ) &RCB; [ &LSB; ] &RSB; );

          # /r returns the modified copy and leaves the argument untouched.
          sub escape(_) { $_[0] =~ s/([&<>()\[\]])/$escapes{$1}/gr }

      Why is "parenthesis" abbreviated to "c"? I keep reading the "c" as curly, but "{" and "}" are the curly brackets.

        c as in "curved"

        Changing the specials from lower to uppercase would be quite easy, perhaps it would be better to support both? Or would that be confusing?

        I'm going to add that sub escape in the parser right now as I think it will be a lot faster than how I'm currently doing it:

            $aXML =~ s@&lab;@\<@gs;
            $aXML =~ s@&rab;@\>@gs;
            $aXML =~ s@&lcb;@\(@gs;
            $aXML =~ s@&rcb;@\)@gs;
            $aXML =~ s@&lsb;@\[@gs;
            $aXML =~ s@&rsb;@\]@gs;
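For comparison, the six substitutions above can be folded into one table-driven pass. This is only a sketch: the name unescape and the %unescapes table are hypothetical, and whether it is measurably faster would need benchmarking on real pages.

```perl
use strict;
use warnings;

# Entity name => literal character, matching the six substitutions.
my %unescapes = (
    lab => '<', rab => '>',
    lcb => '(', rcb => ')',
    lsb => '[', rsb => ']',
);

# Hypothetical single-pass equivalent of the six s@...@...@gs lines.
sub unescape {
    my ($s) = @_;
    $s =~ s/&([lr][acs]b);/$unescapes{$1}/g;
    return $s;
}

print unescape('&lab;b&rab;bold&lab;/b&rab; &lsb;1&rsb; &lcb;x&rcb;'), "\n";
# <b>bold</b> [1] (x)
```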

        Oh, btw there is also another special and token which I haven't mentioned yet.

            pseudocode
            ----------
            $aXML =~ s@`@<backtick>@gs;

            while ( commands remain unprocessed ) {
                foreach match for <any_command> {
                    substitute for <`any_command>
                    set found_command boolean flag true
                }
                if (found_command) {
                    while ( find match for <`command>data</command> ) {
                        process if ( no nested ` chars found )
                    }
                }
            }

        The reason for that is so that it forces the tags to be computed innermost to outermost, by negating the ` control char. Also the parser does not proceed to the lower priority tag types until all of the higher types have been processed.
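A runnable sketch of the innermost-first idea, under loud assumptions: the two plugins (uc and reverse) are made up for illustration, only one tag type is handled (so the priority levels are left out), and restoring <backtick> at the end is a guess at how the token is meant to be used.

```perl
use strict;
use warnings;

# Hypothetical plugin table; the real aXML plugins are not shown in this thread.
my %plugins = (
    uc      => sub { uc $_[0] },
    reverse => sub { scalar reverse $_[0] },
);
my $names = join '|', map quotemeta, sort keys %plugins;

sub process {
    my ($aXML) = @_;
    # Neutralise literal backticks so they cannot be mistaken for marks.
    $aXML =~ s/`/<backtick>/g;
    # Mark every opening command tag: <cmd> becomes <`cmd>.
    $aXML =~ s/<($names)>/<`$1>/g;
    # A marked tag whose body contains no mark is innermost: run it,
    # then repeat until no marked tags are left.
    1 while $aXML =~ s{<`($names)>([^`]*?)</\1>}{$plugins{$1}->($2)}e;
    # Assumption: the token is turned back into a backtick for output.
    $aXML =~ s/<backtick>/`/g;
    return $aXML;
}

print process('<uc>a<reverse>bcd</reverse>e</uc>'), "\n";   # ADCBE
```

The outer uc tag cannot match on the first pass because its body still contains the mark of the nested reverse tag, which is what forces the inside-out order.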

        File inclusions restart the parser so that they can have a complete new tag hierarchy within them, which can then include another hierarchy and so on recursively.

        I suspect the backtick char (and tag) probably won't be needed at all by a proper compiler.

        I also have some ideas about an editor for aXML, as far as I can tell it should be possible to run tags in isolation right inside the editor to see what they output.

        So if we have a bit of aXML like


        And we give the editor a query in like an address bar at the top;


        Then right clicking foo could run the plugin and return the result right there.


        This would make debugging really easy and quick! If "bar" is not what you're expecting to get from the tag, then you know the problem is with the plugin. If "bar" is correct, then you can click on one tag outwards to execute that and see what it does... and so on interactively.

        It would beat the hell out of having to swap between browser and editor windows constantly to see what is going on, and say a double right click could restore it back to its original state.

        Opening up an inc tag like that would load the appropriate file, ready for editing; then when you click back to close it again, the editor can automatically save the secondary file for you. This way you can navigate and traverse complex structures and hierarchies without having to load, save and close files manually.

        Erm... something ain't right, I just hacked your escapes code in and it's converted every "<" and ">" in the whole document!

            output
            ------
            &lab;!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://"&rab;
            &lab;html lang="en"&rab;
            &lab;head&rab;
            &lab;link href="/css/main.css" rel="stylesheet" type="text/css"&rab;
            &lab;null&rab;
            &lab;link href="/css/colours/daytime.css" rel="stylesheet" type="text/css"&rab;
            &lab;script type="text/javascript" src="/js/ajax.js"&rab;&lab;/script&rab;
            &lab;meta http-equiv="Content-Type" content="text/html; charset=utf-8"&rab;
            &lab;title&rab;Perl Nights&lab;/title&rab;
            &lab;/head&rab;
            &lab;body&rab;
            ...
            ...

      Yes, just tested that and it works as expected. You could also do:


      bit more verbose, but the output is the same, and possibly a bit more readable.
