PerlMonks  

Playing with PerlMonks site (1) - Copy a CODE without the '+'

by gmpassos (Priest)
on Jul 26, 2002 at 07:43 UTC ( #185455=sourcecode )
Category: PerlMonks Related Scripts
Author/Contact Info Graciliano M. P.
Description: The '+' markers used to wrap long lines inside CODE blocks are very useful for reading in the browser, but not when you want to copy the code and test it. Just point this script at a node and it will return all the code blocks on the page without the '+'.
#####################################
# PLAY WITH THE PERLMONKS SITE (1). #
#####################################

use strict ;
use LWP::Simple ;

$|=1;

my $node = '185131' ;   # a node id, or a full URL
my $url = "http://www.perlmonks.org/index.pl?node_id=$node" ;
if ($node =~ /^https?:\/\//) { $url = $node ;}

print "Getting node $node...\n" ;
print "$url\n" ;

my $html = get($url);
die "Could not fetch $url\n" unless defined $html ;

my $lng = length($html) ;
print "$lng bytes.\n\n" ;

$html =~ s/\r\n?/\n/gs ;

my (@codes) = ( $html =~ /<pre><tt><font.*?>(.*?)<\/font><\/tt><\/pre>/gsi );

foreach my $code ( @codes ) {
  $code =~ s/\n<font.*?>\+<\/font>//gi ;
  $code = filter_from_html($code) ;
  print
"# CODE #################################################\n"
  if ($#codes > 0) ;
  print "$code\n" ;
}

####################
# FILTER_FROM_HTML #
####################

sub filter_from_html {
  my ( $code ) = @_ ;

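  # Each value is "source letters # replacement characters". The accented
  # replacements are written as Latin-1 escapes so that the table does not
  # depend on the encoding this file is saved in.
  my %SYMBOLS_html = (
  'acute' => "aeiouAEIOU#\xE1\xE9\xED\xF3\xFA\xC1\xC9\xCD\xD3\xDA" ,
  'grave' => "aeiouAEIOU#\xE0\xE8\xEC\xF2\xF9\xC0\xC8\xCC\xD2\xD9" ,
  'circ'  => "aeiouAEIOU#\xE2\xEA\xEE\xF4\xFB\xC2\xCA\xCE\xD4\xDB" ,
  'uml'   => "aeiouAEIOU#\xE4\xEB\xEF\xF6\xFC\xC4\xCB\xCF\xD6\xDC" ,
  'tilde' => "aoAO#\xE3\xF5\xC3\xD5" ,
  'cedil' => "cC#\xE7\xC7" ,
  'lt'    => '#<' ,
  'gt'    => '#>' ,
  'quot'  => '#"' ,
  ) ;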
  
  # Numeric entities first (single-byte values).
  $code =~ s/&#(\d{1,3});/pack("C",$1)/eg;

  $code =~ s/&nbsp;?/ /gsi ;
  
  my ($start,$end,@letras1,@letras2,$max);

  foreach my $Key ( keys %SYMBOLS_html ) {
    ($start , $end) = split('#' , $SYMBOLS_html{$Key}) ;
    @letras1 = split('' , $start) ;
    @letras2 = split('' , $end) ;
    
    $max = $#letras1 ;
    if ($#letras2 > $max) { $max = $#letras2 ;}
    
    for(0..$max) {
      $code =~ s/\&$letras1[$_](?i:$Key);?/$letras2[$_]/g ;
    }
  }

  # Decode &amp; last, so that an escaped entity such as "&amp;lt;"
  # comes out as the literal text "&lt;" instead of "<".
  $code =~ s/&amp;?/&/gsi ;

  return( $code ) ;
}

#######
# END #
#######

# Send your feedback!
#
# "The creativity is the expression of the liberty".
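The two steps the script performs on each code block (dropping the '+' wrap markers, then turning HTML entities back into characters) can also be tried without fetching anything. What follows is only a minimal sketch under stated assumptions, not the author's method: `decode_once`, `unwrap_code` and the `%ENTITY` table are hypothetical names, the table covers just five named entities, and the decoding is done in a single pass so that text produced by one replacement is never decoded a second time. On CPAN, `HTML::Entities::decode_entities` covers the full entity set.

```perl
use strict;
use warnings;

# Hypothetical table: only the entities that commonly show up in code blocks.
my %ENTITY = (
    amp  => '&',
    lt   => '<',
    gt   => '>',
    quot => '"',
    nbsp => ' ',
);

# Decode numeric (0-255) and known named entities in ONE pass.
# Unknown names and larger code points are left exactly as they were.
sub decode_once {
    my ($text) = @_;
    $text =~ s{&(?:\#(\d{1,3})|([A-Za-z]+));}{
        defined $1 ? ( $1 < 256 ? chr($1) : "&#$1;" )
                   : exists $ENTITY{ lc $2 } ? $ENTITY{ lc $2 } : "&$2;"
    }ge;
    return $text;
}

# Remove the wrap markers (a newline followed by a <font> tag that holds
# a single '+'), which rejoins the wrapped lines, then decode entities.
sub unwrap_code {
    my ($html) = @_;
    $html =~ s/\r\n?/\n/g;
    $html =~ s/\n<font[^>]*>\+<\/font>//gi;
    return decode_once($html);
}

# Assumed sample input, shaped like a wrapped line inside a code block.
my $sample = 'x &lt; y' . "\n" . '<font color="red">+</font> &amp;&amp; z';
print unwrap_code($sample), "\n";
```

Because every entity is matched once and the scan continues after the replacement, an escaped entity such as `&amp;lt;` ends up as the literal text `&lt;`, which is what the code's author originally typed.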
Replies are listed 'Best First'.
Re: Playing with PerlMonks site (1) - Copy a CODE without the '+'
by Anonymous Monk on Jul 26, 2002 at 08:01 UTC
    Or, just use the "d/l code" link right below the 'comment on' link and realize you just solved a non-problem.
Re: Playing with PerlMonks site (1) - Copy a CODE without the '+'
by Corion (Pope) on Jul 26, 2002 at 08:01 UTC

    Of course, the d/l code link does the same :-))

    perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The
    $d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider
    ($c = $d->accept())->get_request(); $c->send_response( new #in the
    HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web
Re: Playing with PerlMonks site (1) - Copy a CODE without the '+'
by gmpassos (Priest) on Jul 26, 2002 at 08:05 UTC
    Thanks! I didn't know about that. New user...

    But with the d/l link we still have to save the file first, since I'm using IE...

    Still, this code can be useful for educational purposes, especially &filter_from_html.

    "The creativity is the expression of the liberty".
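For anyone who wants the "d/l code" route without a save dialog, here is a hedged sketch that fetches the download view and prints it to STDOUT. It assumes the link resolves to a `displaytype=displaycode` view of the node; that URL form is an assumption and is not stated anywhere in this thread. `download_url` and `fetch_code` are hypothetical helper names.

```perl
use strict;
use warnings;

# Build the address of the download view for a node.
# Assumption: "d/l code" maps to displaytype=displaycode.
sub download_url {
    my ($node_id) = @_;
    die "node id must be numeric\n" unless $node_id =~ /^\d+$/;
    return "http://www.perlmonks.org/index.pl?node_id=$node_id;displaytype=displaycode";
}

# Fetch the code of a node as plain text, or die if the request fails.
sub fetch_code {
    my ($node_id) = @_;
    require LWP::Simple;    # loaded only when a fetch is actually requested
    my $code = LWP::Simple::get( download_url($node_id) );
    die "Could not fetch node $node_id\n" unless defined $code;
    return $code;
}

# Only runs when a node id is given on the command line.
print fetch_code( $ARGV[0] ) if @ARGV;
```

Saved under a name of your choice (say `fetch_code.pl`), it could be run as `perl fetch_code.pl 185455 > node.pl` to capture the code in a file.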