http://www.perlmonks.org?node_id=134822
Category: PerlMonks Related Scripts
Author/Contact Info crazyinsomniac at /tell in da chatterbox
Description: If you've ever been to scratch pad viewer and you've wanted to download some code from somebody's scratchpad, you've noticed there is no "d/l code link". It turns out, if you use my script, you don't need it.
 %>perl scratchcode.pl monkname

update: Tue Apr 2 10:37:58 2002 GMT ~ changed off trigger due to Petruchio

update: Fri May 24 11:19:30 2002 GMT ~ " as added, so gots to take it out (and also added a \n remove, jeez, i'm gonna make this obsolete soon ~ and & ~ i hope the goods see pmdev wiki soon)

update: Fri Jan 17 05:31:13 2003 GMT -- more changes, more fixes.

#!/usr/bin/perl -w

=head1 AUTHOR

 crazyinsomniac

=head1 SYNOPSIS

 %>perl scratchcode.pl monkname 

=head1 TRIGGERS

=head2 START

 <hr>

=head2 END

   <!-- nodelets start here -->

=cut

use strict;
use LWP::Simple qw(get);
use HTML::TokeParser;

my $PADD = "http://perlmonks.org/index.pl?node_id=108949;user=";

my $user = shift or die "usage: ". __FILE__ ." user";

my $file = get($PADD.$user) or die "Cannot get the page $PADD$user\n";

my $P = HTML::TokeParser->new(\$file);

my $TRIGGER=0; # oooh
my $CRAP = "";

while (my $T = $P->get_token() )
{                     #   0     1     2       3        4
    if($$T[0] eq "S") # ["S", $tag, $attr, $attrseq, $text]
    {
        if(not $TRIGGER and $TRIGGER !=-1 and $$T[1] eq 'hr')
        {
            $TRIGGER=1;
            print STDERR "TRIGGER ON\n";
        }
        else
        {
            $CRAP .= $$T[4] if $TRIGGER;
        }
    }
    elsif($$T[0] =~ /^(?:E|PI)$/ ) # end tag | process instruction
    {
        $CRAP .= $$T[2] if $TRIGGER;
    }
    elsif($$T[0] =~ /^(?:T|C|D)$/ ) # text | comment | declaration
    {
        if($$T[0] eq 'C'
           and $TRIGGER
           and $TRIGGER != -1
#           and $$T[1] eq '<!-- nodelets start here -->')
           and $$T[1] =~ m{\Q<!-- Begin nodelets -->\E})
        {
            $TRIGGER=-1;
            print STDERR "TRIGGER OFF\n";
        }

        $CRAP .= $$T[1] if $TRIGGER;
    }
} # endof while (my $T = $P->get_token)

=head2 STRIP HTML

Because I did wanted to match only the nodelets start here comment and
+ not
these tags before the comment, I just strip the html off usign substr
(last 55 characters).

  </td>
  <td width="20%" valign=top align=right>
   
=cut

# L to the V to the A to the L
substr($CRAP,-55) = '';

# and now to massage crap (reverse the effects of code tags
$CRAP =~ s{\n<FONT color="red">\+<\/FONT>}{}g;
# multiline code
$CRAP =~ s{<pre><TT><FONT size="-1">}{\<CODE\>}g;
$CRAP =~ s{</FONT></TT></pre>}{\<\/CODE\>}g;
# single line code
$CRAP =~ s{<TT><FONT size="-1">}{\<CODE\>}g;
$CRAP =~ s{</FONT></TT>}{\<\/CODE\>}g;
$CRAP =~ s{\&lt\;}{<}g;
$CRAP =~ s{\&gt\;}{>}g;
$CRAP =~ s{\&#91\;}{\[}g;
$CRAP =~ s{\&#93\;}{\]}g;
$CRAP =~ s{\&quot;}{"}g;
$CRAP =~ s{\&amp;}{\&}g;

print $CRAP if(@ARGV);

#print "$1\n\n\n" while($CRAP =~ m{\<code\>(.*?)<\/code\>}igs)

print "$1\n\n\n" while($CRAP =~ m{\Q<pre><tt class="code"><font size="
+-1">\E(.*?)\Q</font></tt></pre>\E}igs)
__END__