Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery


by epoptai (Curate)
on Jan 17, 2001 at 03:35 UTC ( #52413=CUFP: print w/replies, xml ) Need Help??

Merlyn Technique Viewer enhances the Web Techniques Perl Columns of Randal L. Schwartz (aka merlyn). On his home page he says of these articles,

"Because I'm generally busy (or lazy), I simply ran the source files through the pod2html translator so that I could put them online here. It makes them ugly sometimes, but you'll just have to live with that for now until someone comes up with a better solution."

A better solution
  1. Code description interleaved with source code.
  2. Code listed in a form for easy editing and saving.

    This program fetches selected articles and code listings from their source locations and interleaves the code with the article, makes some navigation links, and lists the code in a textarea. An index data file of all the current web technique articles must be built and refreshed once in a while.

    Here's a live demo.

    Update: Now uses HTML::TableExtract to resist subtle changes in the source files.
    #!perl -T # - Merlyn Technique Viewer by epoptai # 1. Interleaves source code with description. # 2. Code listed in textarea for editing and saving. use strict; use CGI qw(:standard); use LWP::Simple qw(get getstore); use HTML::Entities; use vars qw(%code $code $init $tmp $src $oop $wtc); use HTML::TableExtract; use CGI::Carp 'fatalsToBrowser'; my@oops=qw(10 13 28 38 48 55); # columns with double code listings my$url = url(); my$dat = 'index.dat'; unless(-e $dat){$init=1} &import_names('IN'); my$merlyn = ''; if($IN::action eq 'bindex'){&indexer} if(($IN::action eq 'save as')or($IN::codename)){&savecode} unless($IN::action eq 'view'){ print header,start_html(-title=>'MTV',-bgcolor=>'white',-text=>'black' +),blockquote,h1('MTV'); print<<HTML; Merlyn Technique Viewer enhances the <a href=" +m/merlyn/WebTechniques"><i> Web Techniques Perl Columns</i></a> of Randal L. Schwartz. On his home + page he says of these articles,<blockquote>"Because I'm generally busy (or lazy), I simply r +an the source files through the pod2html translator so that I could put them online here. It makes + them ugly sometimes, but you'll just have to live with that for now until someone comes up with + a better solution." </blockquote><h3>A better solution</h3> <ol><li>Code description interleaved with source code.<li>Code listed +in a form for easy editing and saving.</ol> This script fetches selected articles and code listin +gs from their source locations, modifies the contents, and displays the results. An index d +ata file of all the current web technique articles must be <a href="$url?action=bindex">built</a> +and <a href="$url?action=bindex">refreshed</a> once in a while. HTML if($init==1){ print qq~<p><b>Datafile not detected. <a href="$url?action=bindex" +>Build an index.</a>~; } if($init!=1){ my%dex = (); my$md = (stat($dat))[9]; open(FILE,"< $dat") or die "$!"; my@dat = <FILE>; close(FILE) or die "$!"; for(@dat){ my($num,$desc)=split(/\t/); $dex{$num}=$desc } my$c=keys(%dex); my$cc=($c/4); print qq~<p>Index of $c Web Techniques columns<br><small>~; printf "%s updated: %s\n", $dat, scalar localtime($md); print qq~</small><form><select name="getdoc" size="$cc">~; for(sort {$b<=>$a} keys(%dex)){ my$val=$_; for(@oops){ if($val=~$_){ $val='00ps. '.$_; } } print qq~<option value="$_">$val. $dex{$_}~; } print qq~</select><br><font size=-2>Note: '00ps' marks double code lis +tings (@oops) which choke MTV.<br>They are diplayed without enhanceme +nt.</font><br><input type="submit" name="action" value="view"></form> +<br><p align=right><small><a href=" +=merlyn+technique+viewer">coded</a> by <a href="">epoptai</a></small>~; } print end_html; exit; } if($IN::action eq 'view'){ $src = 'col' . $IN::getdoc . '.listing.txt'; $tmp = $^T . '.tmp'; my$col = 'col' . $IN::getdoc . '.html'; $wtc = $merlyn.'/'.$col; my$wts = $merlyn.'/'.$src; for(@oops){ if($IN::getdoc=~$_){$oop=1}} if($oop==1){ my$MTV=get($wtc); print header,blockquote; print $MTV; print end_html; exit(0); } my$res; getstore($wtc,$tmp); # had problems doing this with get $code = get($wts) or die "$!"; unless(-e $tmp){ print header,start_html(-title=>"Column $IN::getdoc source code listin +g",-bgcolor=>'white',-text=>'black'),blockquote,p("The newest columns + are not available online until after they appear in print in <a href +=''>Web Techniques magazine</a>. Until t +hen only the source code is published at <a href='http://web.stonehen'>Randal's site</a>. Try again +soon! <a href='$url'>Back to index.</a>"); encode_entities($code); print "<pre>$code</pre>"; print end_html; exit(0); } } open(F,"< $tmp") or die "$!"; my@col = <F>; close(F) or die "$!"; my$MTV=(); { open(F,"< $tmp") or die "$!"; local $/ = undef; $MTV = <F>; close(F) or die "$!"; } if($IN::action eq 'view'){ unlink $tmp} my@c=@col; for(@col){ if($_=~/=(\d+)=/){ # find code listing lines my$m=$&; # define hash key if($_=~/=1=/){$_=~s/<(.*?)>//g} # fix =1= $m=~s/=(\d+)=/$1/; # strip = = from key $_=~s/=(\d+)=/$1\./; # and value $_=~s/^\s+//g; # strip leading whitespace $code{$m}=$_; # build code hash } } my$h=(keys(%code)+3); my$form = "<form method='post'><input type='submit' name='action' valu +e='save as'> <input type='text' name='codename' size='30' value='col$'><br><textarea cols=80 rows='$h' name='codebody'>$code< +\/textarea></form>"; $MTV=~s/(([L,l]ines?\s(\d+))(\s(through|to|and)\s(\d+))?)/<mmm><p><mer +lyn>$1<\/merlyn>/gi; $MTV=~s/(<\/head>)/$1<base href="http:\/\/web\.stonehenge\.com\/merlyn +\/WebTechniques\/">/oi; $MTV=~s/(<BODY bgcolor=White text=Black>)/$1<blockquote>/oi; $MTV=~s/(<H1>)<A NAME="(.*?)">(.*?)<\/A>(<\/H1>)/$1<a name="$2" href=" +$wtc">$3<\/a>$4<b>As rendered by <a href="http:\/\/perlmonks\.org\/index\.pl\?node=merlyn\+technique\+v +iewer">MTV<\/a><\/b> - (<a href="$url">index<\/a>)<p>/iso; $MTV=~s/(<H2>.*?<\/H2>)(.*?)<\/PRE>/$1$form/iso; my@MTV=split(/<mmm>/,$MTV); for(@MTV){ if($_=~/^<p><merlyn>(([L,l]ines?\s(\d+))(\s(through|to|and)\s(\d+) +)?<\/merlyn>)/i){ my$d0=$&; my$d1=$3; my$c1=$5; my$d2=$6; if($c1=~/and/){ $_=~s/$d0/<p><tt> $code{$d1}<\/tt><br><tt> $code{$d2}<\/tt +><p>$d0/g; } elsif($c1=~/through|to/){ my@total; while($d1 <= $d2){ push @total, $code{$d1}; $d1++ } my$sum = join('<br>', @total); $_=~s/$d0/<pre>$sum<\/pre><p>$d0/g; } elsif(!$d2){ $_=~s/$d0/<p><tt> $code{$d1}<\/tt><p>$d0/g; } } } print header; print @MTV; print end_html; exit; sub indexer{ my$c; my$z = $merlyn . '/index.html'; my$dexter = get($z); my$te = new HTML::TableExtract( headers => [qw(Column Listing Descript +ion)] ); $te->parse($dexter); open (FILE,"> index.dat") or die "$!"; foreach my $ts ($te->table_states){ foreach my $row ($ts->rows){ chomp(@$row[0]); chomp(@$row[2]); @$row[0]=~s/col//o; print FILE "@$row[0]\t@$row[2]\n"; $c++ } } close(FILE) or die "$!"; print header,start_html(-bgcolor=>'white',-text=>'black'),blockquote,p +("$c Web Techniques columns indexed!<p><a href='$url'>Reload</a>"); exit } sub savecode{ open (FILE,"> $IN::codename") or die "$!"; print FILE $IN::codebody; close(FILE) or die "$!"; print header,start_html(-bgcolor=>'white',-text=>'black'),blockquote,p +("Code saved as <a href='$IN::codename'>$IN::codename</a> in same dir + as <a href='$url'>this script</a>."); exit }

Replies are listed 'Best First'.
Re: Merlyn Technique Viewer
by ichimunki (Priest) on Jan 17, 2001 at 18:59 UTC
    The HTML produced by this code generates a host of interesting errors when run through a validator. The first one is a DTD declaration (the second one in the document) inside the <head>...</head>. (comments based on sample output available at the link)
      You say this code generates "a host of interesting errors when run through a validator" followed by "comments based on sample output available at the link".

      Had you validated the output of the script you would have seen that it didn't generate the sample, a much earlier revision did. I neglected to update the sample because it wasn't as important to me as the code. After all it was only a sample, nothing i expected anyone to bother validating and criticizing on PERLmonks!

      Next time please run the perl code before running off to the HTML validator.

      And if you're really worried about a 5 space alignment variation in lynx (how nit picky can you get?) try looking at the source code to find out why it does that, and fix your copy, if you actually use lynx and can't live with it. I gave you the code for free, it's the least you could do.

      Update: I changed this node when i realized how ridiculous the criticism was. Apologies to davorg for eliminating the context of his reply. forgot to mention that these html errors are pretty much irrelevant since they don't interfere with rendering the document in a web browser, where most people will be using them.

        When Netscape (and later Microsoft) decided that their browsers were going to be lenient on pages that did not contain valid HTML, I wonder if they realised what a huge can of worms they were opening. The vast majority of the web is now made up of badly constructed pages of invalid HTML.

        When taken to task on this, the author's generally argue my that same way as you have, "but it works in both (sic) browsers" or "I'm not interested in the minority who use 'non-standard' browsers". Leaving aside the fact that there aren't just two browsers (and that there are many different versions of even the major two) and also the fact that there's no way that Netscape or IE can be described as 'standard' with any meaningful use of the term - there are two major flaws in this argument:

        • In my experience, it's just as easy to create standard HTML as it is to create non-standard HTML. This is particularly true if you're using something like to create your HTML.
        • Maybe now, most of your visitors are coming from Win32 PCs using IE5 or Netscape 4.x, but this situation is changing fast. The number of people surfing from Linux boxes is increasing all the time - and they have a much wider range of browsers to choose from. Also, what about the growing numbers of people surfing the web using PDAs or even WAP phones.

        There's really no good reason not to produce valid HTML (or, even better, XHTML) and it will make your site far more accessible and flexible.


        "Perl makes the fun jobs fun
        and the boring jobs bearable" - me

        Thanks for that concise and completely convincing defense of the "Best viewed with [INLINE]" school of web design.

        I viewed the results of this code in Lynx (a web browser), and I wonder why you have the code snippets being lined up with the left side of the window in some spots and lined up with the body text (about five characters in) in others.
UPDATE: Merlyn Technique Viewer
by epoptai (Curate) on Mar 08, 2001 at 09:02 UTC
    Now uses HTML::TableExtract to resist subtle changes in merlyn's HTML article index. Also fails gracefully for new columns that are listed but not yet posted (these display source code only). When new columns with double script listings come online simply add its column number to @oops to have MTV skip it (sorry but double code listings are very difficult to deal with, and there's only six of them out of sixty columns). Perhaps some genius will address that issue (hint hint).


Re: Merlyn Technique Viewer
by mkmcconn (Chaplain) on Jan 18, 2001 at 01:21 UTC

    Zowee epoptai! This wins my coveted Useful New Cool And Painless Things On PerlMonks award (The first recipient of said award!). Just uncork and pour, it does just what you say it does.

    You are right, this is a better idea. And after it's been run through the gauntlet I expect to see it used.


(crazyinsomniac) Re: MTV
by crazyinsomniac (Prior) on Aug 07, 2001 at 09:06 UTC
    I'm on dialup, and don't want to eat merlyns bandwidth, so, here's a little hack entitled

    MTV Cache

    #!C:/perl/bin/perl -w # generates an mtv cache use strict; my @oops = qw(10 13 28 38 48 55); # columns with double code listings my $mtv = 'perl -T '; my $dat = 'index.dat'; # make sure they match my %dex; $|=1; print "Binding Index\n"; `$mtv action=bindex`; print "reading $dat\n"; open(FILE,"< $dat") or die "$!"; my @dat = <FILE>; close(FILE) or die "$!"; print "generating list\n"; for(@dat) { my($num,$desc)=split(/\t/); chomp $desc; $dex{$num}= $desc; } undef @dat; print "deleting chokers\n"; for(@oops) { delete $dex{$_}; } undef @oops; print "building mtv.cache.index.html\n"; open(OUTFH, '> mtv.cache.index.html') or die $!; print OUTFH q~ <?xml version="1.0" encoding="utf-8"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.0//EN" ""> <html xmlns="" lang="en-US"><head><title>M +TV</title> </head><body text="black" bgcolor="white"><blockquote /><h1>MTV Cache< +/h1>~; for my $key (sort keys %dex) { open(OUTFH2, '> '."$key.mtvcol.html") or die "$key.mtvcol.html $!\ +n"; print "generating $key.mtvcol.html\n"; print OUTFH2 `$mtv getdoc=$key action=view`; close OUTFH2; print OUTFH '<a href="'."$key.mtvcol.html".'"'.">$key.$dex{$key}</ +a><BR>\n"; } print OUTFH q~</body></html>~; close OUTFH;exit; __END__ =head1 DESCRIPTION Just a dirty little script i hacked together in a few minutes, borrowing heavily from mtv. It should work regardless of platform. And it'll take down the server load. Currently, since i'm just capturing the output of, the http header is not stripped. It is kind of refreshing to be greeted by Content-Type: text/html; charset=ISO-8859-1 It is a "commandline" script, as if you couldn't tell.

    Disclaimer: Don't blame. It came from inside the void

    perl -e "$q=$_;map({chr unpack qq;H*;,$_}split(q;;,q*H*));print;$q/$q;"

Log In?

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: CUFP [id://52413]
Approved by root
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others romping around the Monastery: (7)
As of 2023-12-07 16:58 GMT
Find Nodes?
    Voting Booth?
    What's your preferred 'use VERSION' for new CPAN modules in 2023?

    Results (33 votes). Check out past polls.