
Re^3: Thanks to Ikegami, Chromatic & Corion

by chromatic (Archbishop)
on Nov 01, 2011 at 22:00 UTC (#935234)

in reply to Re^2: Thanks to Ikegami, Chromatic & Corion
in thread Thanks to Ikegami, Chromatic & Corion

If I want to display &lab;special&rab;, do I emit &amp;lab;special&amp;rab;?

Improve your skills with Modern Perl: the free book.

Re^4: Thanks to Ikegami, Chromatic & Corion
by ikegami (Pope) on Nov 01, 2011 at 23:20 UTC

    "&" is not a meta character in aXML unless followed by "lab;" or one of the other 5, so "&amp;" outputs "&amp;", so "&amp;lab;special&amp;rab;" produces "&amp;lab;special&amp;rab;".

    To output "&lab;special&rab;" one needs "<special>lab</special>special<special>rab</special>".

    The escape function is:

    my %escapes = (
       '<' => '&lab;',
       '>' => '&rab;',
       '(' => '&lcb;',
       ')' => '&rcb;',
       '[' => '&lsb;',
       ']' => '&rsb;',
       '&lab;' => '<special>lab</special>',
       '&lcb;' => '<special>lcb</special>',
       '&lsb;' => '<special>lsb</special>',
       '&rab;' => '<special>rab</special>',
       '&rcb;' => '<special>rcb</special>',
       '&rsb;' => '<special>rsb</special>',
    );

    #my $escapes_pat = join '|', map quotemeta, keys %escapes;
    #my $escapes_re = qr/$escapes_pat/;
    my $escapes_re = qr/[<>()\[\]]|&[lr][acs]b;/;  # Manually tweaked.

    sub escape(_) {
       my ($s) = @_;
       $s =~ s/($escapes_re)/$escapes{$1}/g;
       return $s;
    }
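To see what the function above does with a few inputs, here is a small self-contained sketch. It uses the same `%escapes` table and `$escapes_re`, only with the six `<special>` entries generated by a `map` to keep the listing short; that compaction is mine, not part of the post.

```perl
use strict;
use warnings;
use 5.010;    # the (_) prototype needs Perl 5.10 or later

my %escapes = (
    '<' => '&lab;', '>' => '&rab;',
    '(' => '&lcb;', ')' => '&rcb;',
    '[' => '&lsb;', ']' => '&rsb;',
    # Same six entity entries as in the post, generated instead of typed out.
    map { ("&$_;" => "<special>$_</special>") } qw( lab rab lcb rcb lsb rsb ),
);
my $escapes_re = qr/[<>()\[\]]|&[lr][acs]b;/;

sub escape(_) {
    my ($s) = @_;
    $s =~ s/($escapes_re)/$escapes{$1}/g;
    return $s;
}

print escape('<b>'), "\n";                # &lab;b&rab;
print escape('&lab;special&rab;'), "\n";  # <special>lab</special>special<special>rab</special>
print escape('a & b'), "\n";              # a & b   (a lone & is left alone)
```

Because `s///g` never rescans its own replacement text, the `&lab;` produced for a `<` is not escaped a second time within the same call.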

    These are probably better choices:

    my %escapes = qw(
       & &AMP;
       < &LAB;   > &RAB;
       ( &LCB;   ) &RCB;
       [ &LSB;   ] &RSB;
    );

    sub escape(_) {
       my ($s) = @_;
       $s =~ s/([&<>()\[\]])/$escapes{$1}/g;
       return $s;
    }

    Or using v5.14's s///r:

    my %escapes = qw(
       & &AMP;
       < &LAB;   > &RAB;
       ( &LCB;   ) &RCB;
       [ &LSB;   ] &RSB;
    );

    sub escape(_) { $_[0] =~ s/([&<>()\[\]])/$escapes{$1}/gr }
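One reason this scheme may be the better choice: since "&" itself is escaped, no separate `<special>` form is needed, and the reverse mapping is just the inverted table. The sketch below illustrates that with a round trip. The `unescape` function and the `%unescapes` table are hypothetical names added for illustration; they do not appear in the post.

```perl
use strict;
use warnings;
use 5.014;    # s///r needs Perl 5.14 or later

my %escapes = (
    '&' => '&AMP;', '<' => '&LAB;', '>' => '&RAB;',
    '(' => '&LCB;', ')' => '&RCB;', '[' => '&LSB;', ']' => '&RSB;',
);
my %unescapes = reverse %escapes;    # hypothetical inverse table

sub escape(_)   { $_[0] =~ s/([&<>()\[\]])/$escapes{$1}/gr }
sub unescape(_) { $_[0] =~ s/(&(?:AMP|[LR][ACS]B);)/$unescapes{$1}/gr }

my $text    = 'if (x < 1) print "&LAB;"';
my $escaped = escape($text);

print "$escaped\n";    # if &LCB;x &LAB; 1&RCB; print "&AMP;LAB;"
print unescape($escaped) eq $text ? "round trip ok\n" : "mismatch\n";
```

The literal `&LAB;` in the input becomes `&AMP;LAB;`, which unescapes back to `&LAB;` rather than to `<`.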

    Why is "parenthesis" abbreviated to "c"? I keep reading the "c" as "curly", but "{" and "}" are the curly brackets.

      c as in "curved"

      Changing the specials from lower to uppercase would be quite easy. Perhaps it would be better to support both? Or would that be confusing?

      I'm going to add that sub escape in the parser right now as I think it will be a lot faster than how I'm currently doing it:

      $aXML =~ s@&lab;@\<@gs;
      $aXML =~ s@&rab;@\>@gs;
      $aXML =~ s@&lcb;@\(@gs;
      $aXML =~ s@&rcb;@\)@gs;
      $aXML =~ s@&lsb;@\[@gs;
      $aXML =~ s@&rsb;@\]@gs;

      Oh, btw there is also another special and token which I haven't mentioned yet.

      pseudocode
      ----------
      $aXML =~ s@`@<backtick>@gs;

      while ( commands remain unprocessed ) {
          foreach match for <any_command> {
              substitute for <`any_command>
              set found_command boolean flag true
          }
          if (found_command) {
              while ( find match for <`command>data</command> ) {
                  process if ( no nested ` chars found )
              }
          }
      }

      The reason for that is so that it forces the tags to be computed innermost to outermost, by negating the ` control char. Also the parser does not proceed to the lower priority tag types until all of the higher types have been processed.
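The innermost-first idea can be sketched in runnable Perl roughly as follows. This is only an illustration under assumptions: the `%plugin` table and its `uc` and `rev` commands are hypothetical, only one tag type is handled (no priority levels between bracket types), and the `<backtick>` placeholder is left in place on the assumption that a later pass deals with it.

```perl
use strict;
use warnings;

# Hypothetical plugins, for illustration only; they are not part of aXML.
my %plugin = (
    uc  => sub { uc $_[0] },
    rev => sub { scalar reverse $_[0] },
);
my $names = join '|', map quotemeta, sort keys %plugin;

sub process {
    my ($aXML) = @_;
    $aXML =~ s/`/<backtick>/g;        # free up ` for use as a marker
    $aXML =~ s/<($names)>/<`$1>/g;    # mark every opening command tag

    # A marked tag whose body contains no ` has no unprocessed tag inside it,
    # so it is innermost and safe to run. Repeat until nothing matches.
    1 while $aXML =~ s{<`($names)>([^`]*?)</\1>}{ $plugin{$1}->($2) }e;

    return $aXML;
}

print process('<uc>a<rev>xyz</rev>b</uc>'), "\n";    # AZYXB
```

On the first pass the outer `<uc>` cannot match because its body still contains the marked `<rev>` tag, so `<rev>` runs first and `<uc>` only sees its result.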

      File inclusions restart the parser so that they can have a complete new tag hierarchy within them, which can then include another hierarchy and so on recursively.

      I suspect the backtick char (and tag) probably won't be needed at all by a proper compiler.

      I also have some ideas about an editor for aXML. As far as I can tell, it should be possible to run tags in isolation right inside the editor to see what they output.

      So if we have a bit of aXML like


      And we give the editor a query in like an address bar at the top;


      Then right clicking foo could run the plugin and return the result right there.


      This would make debugging really easy and quick! If "bar" is not what you're expecting to get from the tag then you know the problem is with the plugin. If "bar" is correct, then you can click on one tag outwards to execute that and see what it does... and so on interactively.

      It would beat the hell out of having to swap between browser and editor windows constantly to see what is going on, and say a double right click could restore it back to its original state.

      Opening up an inc tag like that would load the appropriate file, ready for editing; then when you click back to close it again the editor can automatically save the secondary file for you. This way you can navigate and traverse complex structures and hierarchies without having to load, save and close files manually.

        I'm going to add that sub escape in the parser right now

        The point of escape is to prevent aXML from processing that which is passed to the function. It is used by plugins, not aXML. aXML performs the *reverse* operation after the template has been fully processed.

        my %escapes = (
           '&lab;' => '<',
           '&rab;' => '>',
           '&lcb;' => '(',
           '&rcb;' => ')',
           '&lsb;' => '[',
           '&rsb;' => ']',
        );

        sub final_processing {
           my ($content) = @_;
           $content =~ s{
              (?: (&[lr][acs]b;)
              |   <special...>(...)</special>
              |   <post_include...>(...)</post_include>
              )
           }{
              if (defined($1)) {
                 $escapes{$1}
              }
              elsif (defined($2)) {
                 '&'.$2.';'
              }
              else {
                 ...
              }
           }xeg;
           return $content;
        }
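The outline above leaves parts elided, so it will not run as written. A cut-down runnable sketch might look like the following, assuming the special tag is written exactly as `<special>name</special>` with no attributes, and leaving out the `post_include` branch entirely. The name `final_processing_sketch` and the `%unescapes` table are mine, chosen to avoid suggesting this is the real implementation.

```perl
use strict;
use warnings;

my %unescapes = (
    '&lab;' => '<', '&rab;' => '>',
    '&lcb;' => '(', '&rcb;' => ')',
    '&lsb;' => '[', '&rsb;' => ']',
);

sub final_processing_sketch {
    my ($content) = @_;
    $content =~ s{
        (&[lr][acs]b;)                       # an escaped bracket
      | <special>([lr][acs]b)</special>      # an escaped entity, assumed form
    }{
        defined $1 ? $unescapes{$1} : "&$2;"
    }xeg;
    return $content;
}

print final_processing_sketch('&lab;b&rab; <special>lab</special>'), "\n";    # <b> &lab;
```

Both alternatives are handled in one pass, so the `&lab;` produced from `<special>lab</special>` is not turned into `<` by mistake.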

      Erm... something ain't right, I just hacked your escapes code in and it's converted every "<" and ">" in the whole document!

      output
      ------
      &lab;!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://"&rab;
      &lab;html lang="en"&rab;
      &lab;head&rab;
      &lab;link href="/css/main.css" rel="stylesheet" type="text/css"&rab;
      &lab;null&rab;
      &lab;link href="/css/colours/daytime.css" rel="stylesheet" type="text/css"&rab;
      &lab;script type="text/javascript" src="/js/ajax.js"&rab;&lab;/script&rab;
      &lab;meta http-equiv="Content-Type" content="text/html; charset=utf-8"&rab;
      &lab;title&rab;Perl Nights&lab;/title&rab;
      &lab;/head&rab;
      &lab;body&rab;
      ...
      ...

        The escapes need to be the other way around!

        my %escapes = (
           '&lab;' => '<',
           '&rab;' => '>',
           '&lcb;' => '(',
           '&rcb;' => ')',
           '&lsb;' => '[',
           '&rsb;' => ']',
        );

        I need a new escapes_re, because now it's simply destroying all the brackets!
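With the table reversed, the pattern has to match the entities (the new keys) rather than the raw brackets. One way to keep the two in step, sketched here rather than taken from the thread, is to build the pattern from the keys of `%escapes`; for this table the result is equivalent to `qr/&[lr][acs]b;/`. The sample input string is made up for the demonstration.

```perl
use strict;
use warnings;

my %escapes = (
    '&lab;' => '<', '&rab;' => '>',
    '&lcb;' => '(', '&rcb;' => ')',
    '&lsb;' => '[', '&rsb;' => ']',
);

# Build the pattern from the keys so the table and the regex cannot drift apart.
my $escapes_pat = join '|', map quotemeta, sort keys %escapes;
my $escapes_re  = qr/$escapes_pat/;    # equivalent to qr/&[lr][acs]b;/

my $aXML = '&lab;p&rab;f&lcb;x&rcb; &lsb;1&rsb;&lab;/p&rab; (kept)';
$aXML =~ s/($escapes_re)/$escapes{$1}/g;

print "$aXML\n";    # <p>f(x) [1]</p> (kept)
```

Raw brackets already present in the text, such as `(kept)`, are not matched by the pattern and pass through untouched.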

Re^4: Thanks to Ikegami, Chromatic & Corion
by Logicus on Nov 01, 2011 at 22:31 UTC

    Yes, just tested that and it works as expected. You could also do:


    bit more verbose, but the output is the same, and possibly a bit more readable.
