Beefy Boxes and Bandwidth Generously Provided by pair Networks
No such thing as a small change
 
PerlMonks  

Perl Code Colorizer

by BrentDax (Hermit)
on Jul 31, 2001 at 11:29 UTC ( #101088=sourcecode: print w/ replies, xml ) Need Help??

Category: Utility Scripts
Author/Contact Info Brent Dax
brentdax1@earthlink.net
Description: Okay, this one is pretty scary. This script reads in a (simple) chunk of Perl code (using normal filter behavior) and spits out an HTML file with certain things colorized. (You can see the list in the setup of the %config hash at the top.) It's extremely regexp-heavy. It's also pretty easy to confuse.

Notable bugs:
1. In a line like m#regexp#;  #comment, the comment won't be colorized. Sorry.
2. In a line like m{foo{1,2}bar}, the program will get confused and stop highlighting after the 2. I've really got to work on nesting...

For all that, however, there's a lot of cool things it /can/ do, like:
-recognizing and colorizing (most) heredocs
-colorizing statements like @{&{$foo{bar}}} nicely to show which curlies belong to which sigil
-actually working most of the time

Only the colors for sigils are well-thought-out--the rest were just temporary values I assigned on a whim.

Also note that this was a lot of monkeys and typewriters--I myself aren't quite sure how it all works correctly. Well, have fun with this chunk of code!

#!/usr/bin/perl -w

use strict;

our($text, %config, %complements);

$config{colors}={
    #sigils=red
    '$'    => '993333',
    '@'    => 'CC6633',
    '%'    => '660000',
    '&'    => '990033',
    '*'    => '990000',

    #quotes=blue
    "'"    => '6699FF',
    '"'     => '3366CC',
    '`'     => '333399',
    'qw'     => '0000FF',

    #regexes
    'm'    => '000088',
    's'     => '0000CC',
    'y'     => '0000FF',
    'tr'    => '0000FF',

    #various
    '()'    => '880000',
    '[]'    => 'CC0000',
    '{}'    => 'FF0000',
    '<<'    => '00CC00',
    '#'    => '339999',
};

%complements=(
    '{' => '}',
    '(' => ')',
    '[' => ']',
    '<' => '>'
);

for(32..127) {
    $complements{chr($_)}=chr($_) unless exists $complements{chr($_)};
}

$/=undef;
$text=<>;

$text=highlight_quotes($text);
$text=highlight_regexes($text);
$text=highlight_various($text);
$text=highlight_sigils($text);
$text=fix_it_up($text);

print <<"END";
<HTML><font face="Courier New">
    $text
</font></HTML>
END


sub fixit($) {
    local $_=shift;
    
    if(/qq/)    { '"' }
    elsif(/qx/) { '`' }
    elsif(/qw/) { 'qw'}
    elsif(/q/)  { "'" }
    else { die "$0: $_ is not a valid quoter\n" }
}

sub fix_it_up($) {
    local $_=shift;
    s/\n/\0<BR\0>\n/g;
    s/(?<!\0)</&lt;/g;
    s/(?<!\0)>/&gt;/g;
    s/  /&nbsp; /g;
    #since I'm using null as an escape character, I have to get rid of
+ the ones that are left
    s/\0//g;
    return $_;
}

sub highlight_quotes($) {
    local $_=shift;

    #heredocs: MUST BE DONE before '' and "" highlighting
    s|<<(['"]?)(.*)\1([^\n]*?)\n(.*?)\n\2\n|<<$1$2$1$3\n\0<font color=
+\0"\0#$config{colors}{'<<'}\0"\0>$4\0</font\0>\n$2\n|gs;

    #normal quoted strings
    s|(?<!\0)(['"`])(.*?)(?<!\\)\1|$1\0<font color=\0"\0#$config{color
+s}{$1}\0"\0>$2\0</font\0>$1|gs;

    # qX
    s|(q[qxwr]?)([^\w\s\[\{\(\<])(.*?)(?<!\\)(??{quotemeta($complement
+s{$2})})|qq($1$2\0<font color=\0"\0#).$config{colors}{fixit($1)}.qq(\
+0"\0>$3\0</font\0>$complements{$2})|ges;

    return $_;
}

sub highlight_various($) {
    local $_=shift;

    #highlight subscripting and function args
    s|(?<=\w)([\{\[\(])(.*?)(??{quotemeta($complements{$1})})|\0$1\0<f
+ont color=\0"\0#$config{colors}{$1.$complements{$1}}\0"\0>$2\0</font\
+0>\0$complements{$1}|sg;

    #highlight comments--unless the sharp appears to be the delimiter 
+of a q, qq, qw, qx, tr, y, m, or s
    s|^([^#]+)(?<![qwxmsry\0])#(.*)$|$1\0<font color=\0"\0#$config{col
+ors}{'#'}\0"\0>\0#$2\0</font\0>|gm;
    s|^#(.*)$|\0<font color=\0"\0#$config{colors}{'#'}\0"\0>\0#$1\0</f
+ont\0>|gm;

    return $_;
}

sub highlight_sigils($) {
    local $_=shift;

    #wrapped in {}
    1 while s|(?<!\0)([\$\@\%\&\*])(?<!\0)\{(.*?)(?<!\0)\}|\0<font col
+or=\0"\0#$config{colors}{$1}\0"\0>\0$1\0{$2\0}\0</font\0>|sg;

    #unwrapped
    s[(?<!\0)([\$\@\%\&\*])((?:[\w:]|\0(?:\{|\}|\[|\]))*)][\0<font col
+or=\0"\0#$config{colors}{$1}\0"\0>\0$1$2\0</font\0>]sg;

    return $_;
}

sub highlight_regexes($) {
    local $_=shift;

    #m//, m{}
    s{m([^\w\s\0])(.*?)(?<![\\\0])(??{quotemeta($complements{$1})})}
     {m$1\0<font color=\0"\0#$config{colors}{m}\0"\0>$2\0</font\0>$com
+plements{$1}}gs;

    #s///, tr///
    s{(s|tr|y)([^\w\s\0])(.*?)(?![\\\0])\2(.*?)(?![\\\0])\2}
     {$1$2\0<font color=\0"\0#$config{colors}{$1}\0"\0>$3\0</font\0>$2
+\0<font color=\0"\0#$config{colors}{'"'}\0"\0>$4\0</font\0>$2}gs;

    #s{}{}, tr{}{}
        s{(s|tr|y)([\{\[\(\<])(.*?)(?<![\\\0])(??{quotemeta($complemen
+ts{$2})})(\s*)([\{\[\(\<])(.*?)(?<![\\\0])(??{quotemeta($complements{
+$5})})}{$1$2\0<font color=\0"\0#$config{colors}{$1}\0"\0>$3\0</font\0
+>$complements{$2}$4$5\0<font color=\0"\0#$config{colors}{'"'}\0"\0>$6
+\0</font\0>$complements{$5}}gs;

    return $_;
}

Comment on Perl Code Colorizer
Download Code
Re: Perl Code Colorizer
by damian1301 (Curate) on Jul 31, 2001 at 19:10 UTC
    Don't reinvent the wheel! :) We already have Syntax::Highlight::Perl to do that for you, written by our own bbfu.

    $_.=($=+(6<<1));print(chr(my$a=$_));$^H=$_+$_;$_=$^H; print chr($_-39); # Easy but its ok.
Re: Perl Code Colorizer
by bikeNomad (Priest) on Jul 31, 2001 at 20:06 UTC
    I do this with my favorite Perl syntax utility (which is quite hard to fool) Perltidy with the -html flag.

    You can also use it to indent inside your editor.

Re: Perl Code Colorizer
by gabaux (Initiate) on Jun 26, 2003 at 07:24 UTC

    Hi,

    Emacs will colorize your Perl code on the fly...

Back to Code Catacombs

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: sourcecode [id://101088]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others having an uproarious good time at the Monastery: (6)
As of 2014-07-25 03:03 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My favorite superfluous repetitious redundant duplicative phrase is:









    Results (167 votes), past polls