HTML Utility

by vroom (Pope)
on May 25, 2000 at 20:16 UTC
on Aug 20, 2002 at 14:04 UTC
by PodMaster
Finally, a better link extractor, in a module: HTML::LinkExtractor (it does the things people wished HTML::LinkExtor did).

See pod for description and documentation.

Use pod2html with a patched version of Pod::Html which correctly interprets <a href="">f</a> in verbatim blocks (my mail to perl5 porters).

I do have a HTML::TokeParser::Simple version of this ;D

and later i fixed a typo

UPDATE: Mon Aug 26 11:09:37 2002 GMT
I just put it up on CPAN (version 0.04). Enjoy HTML::LinkExtractor

on Apr 13, 2002 at 19:04 UTC
by Amoe

Makes HTML::TokeParser return a list when get_tag and get_token are called in list context. Other than that, it is identical to iterating with a while loop. It's to enable me to say:

my @links = map { $_->[1]{href} } $parser->get_tag('a');

And expect it to work. Sating the addiction of map-junkies. :)

This code would possibly be better applied to HTML::PullParser, but if I applied it to that I'd have to reimplement get_tag and do some other stuff which I don't want to. I think, anyway.
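
For comparison, here is a minimal sketch of the plain while-loop form that the list-context behaviour is meant to replace. It uses only the stock HTML::TokeParser interface; the sample markup and variable names are made up for illustration.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Sample document (an assumption, just for the demo).
my $html = '<p><a href="/one">1</a> and <a href="/two">2</a></p>';

# HTML::TokeParser accepts a reference to a string holding the document.
my $parser = HTML::TokeParser->new(\$html)
    or die "Cannot create parser";

# Stock behaviour: get_tag returns one tag per call, so we loop.
my @links;
while (my $tag = $parser->get_tag('a')) {
    # For start tags, $tag is [$tagname, \%attr, \@attrseq, $text].
    my $href = $tag->[1]{href};
    push @links, $href if defined $href;
}

print "$_\n" for @links;
```

The map one-liner in the post does the same job in a single expression once get_tag returns every matching tag at once.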

Extracting information from the SETI@Home PM group
on Dec 11, 2001 at 20:51 UTC
by Rhose
I had never used the HTML::TableExtract module, so I created this script as a learning experience. As usual, if anyone has suggestions on things which I could do better, I would love to read them.
Web Deployment Schemes
on Nov 25, 2001 at 01:13 UTC
by Rich36

These applications allow the user to store deployment schemes in a database and use that information to FTP the files to a remote location.
The idea behind this code is to have an easy way to send a set of files for a web page out to an FTP server, and to resend that set of files when necessary. The applications:
  - store the deployment schemes in a database
  - FTP the files to the specified remote location

See the POD for more information. These applications have only been tested on the Windows platform.

UPDATE: Added checking for binary/ASCII files (19:35 11/24/01).
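
As a rough illustration of the binary/ASCII check and the upload step, here is a sketch built on the core Net::FTP module. It is not the posted code: the sub names, the use of Perl's -T file test as the text heuristic, and all arguments are assumptions, and the database side is left out entirely.

```perl
use strict;
use warnings;
use Net::FTP;

# Guess the transfer mode with Perl's -T heuristic (text vs. binary).
sub transfer_mode {
    my ($path) = @_;
    return (-T $path) ? 'ascii' : 'binary';
}

# Upload a set of local files into one remote directory.
# Host, credentials and directory are supplied by the caller.
sub deploy {
    my ($host, $user, $pass, $remote_dir, @files) = @_;
    my $ftp = Net::FTP->new($host)
        or die "Cannot connect to $host: $@";
    $ftp->login($user, $pass)
        or die "Login failed: ", $ftp->message;
    $ftp->cwd($remote_dir)
        or die "Cannot change to $remote_dir: ", $ftp->message;
    for my $file (@files) {
        # Switch mode per file before sending it.
        if (transfer_mode($file) eq 'ascii') { $ftp->ascii }
        else                                 { $ftp->binary }
        $ftp->put($file)
            or die "Upload of $file failed: ", $ftp->message;
    }
    $ftp->quit;
    return scalar @files;
}
```

Calling deploy() again with the same file list is all a "resend" needs in this sketch.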

on Nov 16, 2001 at 09:55 UTC
by staeryatz
This is a utility for a larger program of mine, Webcpp. It converts Webcpp's native colour schemes (*.scs) to CSS, which is a new compatible scheme format for the Webcpp 0.6+ series.
XML Pretty Printer
on Sep 04, 2001 at 02:21 UTC
by OeufMayo

This is a small script that turns a valid XML file into a colorful HTML file! Yay!

Some handlers have not been used (most notably entities, notations), but they should be, eventually.

Update Tue Sep 4 07:46:13 UTC 2001: added mirod's suggestion. Thanks mirod!

on Aug 31, 2001 at 03:13 UTC
by OeufMayo

Pyxie is an alternative way of representing XML data. The data is represented in a really simple way, one piece of information per line.
The nice thing about PYX is how easy the resulting information is to parse. On the other hand, there are a lot of features found in the XML format that can't be represented in PYX (CDATA, entities, ...).
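
To show how little parsing PYX needs, here is a small sketch that dispatches on the first character of each line: "(" opens an element, "A" carries an attribute, "-" carries text and ")" closes an element. The sub name is hypothetical, and the sketch does no entity escaping.

```perl
use strict;
use warnings;

# Turn a list of PYX lines back into markup.
sub pyx_to_markup {
    my (@lines) = @_;
    my $out  = '';
    my $open = 0;    # true while a start tag still needs its closing '>'
    for my $line (@lines) {
        next unless length $line;
        my $type = substr $line, 0, 1;
        my $rest = substr $line, 1;
        # Anything other than an attribute finishes a pending start tag.
        if ($open && $type ne 'A') {
            $out .= '>';
            $open = 0;
        }
        if ($type eq '(') {
            $out .= "<$rest";
            $open = 1;
        }
        elsif ($type eq 'A') {
            my ($name, $value) = split / /, $rest, 2;
            $out .= qq{ $name="$value"};
        }
        elsif ($type eq ')') {
            $out .= "</$rest>";
        }
        elsif ($type eq '-') {
            $rest =~ s/\\n/\n/g;    # PYX writes newlines as a literal \n
            $out .= $rest;
        }
    }
    return $out;
}

print pyx_to_markup('(p', 'Aclass x', '-Hi ', '(b', '-there', ')b', ')p'), "\n";
```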

Now, I know the module XML::PYX exists, and it even comes with a script called pyxhtml, which does pretty much what this code does.
But XML::PYX per se isn't really flexible if you want a finer control over what's being kept or not in the HTML file.

Hopefully, this code can be easily customized to suit your needs, provided you know how to use HTML::Parser (which is really fun to use, especially version 3).

And the really cool thing is that your HTML doesn't have to be a valid XML file! (I wouldn't try to feed it Word 2000 pseudo-HTML though...)

More info on PYX

Very Flexible HTML Template System
on Jun 19, 2001 at 21:09 UTC
by Torgo
This is a little bit of Perl code that can be included in any CGI script or HTML file generator. It has been an absolute life-saver for me, both at home and at work. I'm new to the site, so I thought I'd share it with you all.
HTML To ASP Converter
on Apr 30, 2001 at 21:48 UTC
by patgas
This script grabs an HTML file, and converts it into a VBScript Response.Write command for use in ASP pages. Allows custom levels of indenting, and does proper double-quote escaping. Simple, really, but I find myself using it all the time.
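
A minimal sketch of the idea, not the posted script: each HTML line becomes one Response.Write statement, double quotes are doubled (the VBScript way of escaping them inside a string literal), and the indent level is a parameter. The sub name and the trailing vbCrLf are assumptions.

```perl
use strict;
use warnings;

# Convert HTML text into VBScript Response.Write statements.
sub html_to_asp {
    my ($html, $indent) = @_;
    my $pad = ' ' x ($indent // 0);
    my @out;
    for my $line (split /\n/, $html) {
        (my $escaped = $line) =~ s/"/""/g;    # VBScript escapes " as ""
        push @out, qq{${pad}Response.Write "$escaped" & vbCrLf};
    }
    return join "\n", @out;
}

print html_to_asp(qq{<a href="x.html">x</a>\n<hr>}, 4), "\n";
```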
on Apr 30, 2001 at 08:51 UTC
by jeffa
This is now available as the CPAN module DBIx::XHTML_Table. Get it at CPAN or this cool mirror. Feel free to visit the homepage. The code posted here is left for others to point and laugh at. :D
Update HTML Doc
on Apr 16, 2001 at 16:09 UTC
by Rudif
The ActiveState Perl installer creates an HTML tree and a TOC file for access to the Perl documentation from a browser.
PPM updates this tree and the TOC when installing packages.

However, in several circumstances you may wish to use this script here, to convert pod found in module files to html and/or to update the TOC:
  1. you added html files found on the web
  2. you installed a module from CPAN whose files contain pod
  3. you installed your own scripts or modules containing pod
Update: ryddler's 'quick and dirty utility' that I started from was originally posted right here at PM. Thanks to $code or die for making the connection.

Web Color Spectrum Generator
on Apr 06, 2001 at 21:01 UTC
by extremely
This is a simple little color generator much like the ones discussed in this node Shading with HTML colors - color_munge. This one can do spectral rotation from red to green to blue without shifting brightness or can do all kinds of wacky color shifts. It can go thru the spectrum in either direction too. I'll post the code on my website too, and maybe even a CGI that you can tinker with. As a bonus I'll put up the original code for you to laugh at on the site this weekend.
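
One way to rotate through the spectrum while holding brightness steady is to step the hue of an HSV colour and keep saturation and value fixed. The sketch below does that; it is a stand-in technique rather than the posted algorithm, the sub names are hypothetical, and hues are assumed to lie between 0 and 360 degrees.

```perl
use strict;
use warnings;
use POSIX qw(fmod);
use List::Util qw(min max);

# Convert hue (degrees), saturation and value (0..1) to #RRGGBB.
sub hsv_to_hex {
    my ($h, $s, $v) = @_;
    my @rgb = map {
        my $k = fmod($_ + $h / 60, 6);
        $v - $v * $s * max(0, min($k, 4 - $k, 1));
    } (5, 3, 1);
    return sprintf '#%02X%02X%02X', map { int($_ * 255 + 0.5) } @rgb;
}

# Evenly spaced colours from hue $from to hue $to ($steps must be >= 2).
# Swap $from and $to to walk the spectrum in the other direction.
sub spectrum {
    my ($steps, $from, $to) = @_;
    return map { hsv_to_hex($from + ($to - $from) * $_ / ($steps - 1), 1, 1) }
        0 .. $steps - 1;
}

print join(' ', spectrum(7, 0, 240)), "\n";
```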
Make and index html doc files
on Mar 03, 2001 at 23:11 UTC
by Rudif
The script pods2html extracts the pod documentation from a multitude of pod, pm and pl files in a source directory tree into the corresponding html files. It will create/update an html directory tree, populate it with html files, and optionally create an index file and a 2-frame browser frameset with the index in the left-hand frame and the current html file in the right-hand frame.

My script is an extension of the script of the same name, which is distributed with the module Pod-Tree-1.06 by Steven McDougall.

I added the option and code that generates the 2-frame frameset similar to that used in ActiveState Perl doc.
I also fixed a few minor problems, documented in my script.

To install, drop the script below into a directory that is in your path and name it Next, install the prerequisite modules from CPAN: Pod-Tree and HTML-Stream.
To create or update a html doc tree from pods in your perl work directory, invoke
pods2html <workdir> <htmldir> --frames
To view the html doc index, point your browser to file <htmldir>/default.html.
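
The tree-mirroring part can be sketched with core modules only. Note that this swaps in Pod::Html for the Pod-Tree converter the script actually uses, and that the sub names are hypothetical; index and frameset generation are not shown.

```perl
use strict;
use warnings;
use File::Find;
use File::Spec;
use File::Path qw(make_path);
use File::Basename qw(dirname);
use Pod::Html;

# Map a source file to its place in the html tree.
sub html_path_for {
    my ($src_root, $html_root, $file) = @_;
    my $rel = File::Spec->abs2rel($file, $src_root);
    $rel =~ s/\.(?:pod|pm|pl)\z/.html/;
    return File::Spec->catfile($html_root, $rel);
}

# Walk the source tree and convert every pod, pm and pl file.
sub convert_tree {
    my ($src_root, $html_root) = @_;
    find({
        no_chdir => 1,    # keep $_ as the full path of each file
        wanted   => sub {
            return unless -f $_ && /\.(?:pod|pm|pl)\z/;
            my $out = html_path_for($src_root, $html_root, $_);
            make_path(dirname($out));
            pod2html("--infile=$_", "--outfile=$out");
        },
    }, $src_root);
    return;
}

# Example call (directories are placeholders):
# convert_tree('/path/to/workdir', '/path/to/htmldir');
```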

Template HTML
on Feb 13, 2001 at 21:00 UTC
by thealienz1

Takes a directory of files, and all its subdirectories, then copies the files and parses them into a template HTML file.

Used for a site I made where the people were too lazy to just insert the template into each page, but this also makes it easier to update the pages if you change the template again.

Yes, I understand that SSI can be used, but I'm still too lazy to do that too... ENJOY!

Change Absolute to Relative links in HTML files
on Feb 05, 2001 at 02:22 UTC
by dkubb

This utility will recurse through a specified directory, parse all the .htm and .html files, and replace any absolute URLs with relative URLs to a base you define.

You can also specify what types of links to parse: img, src, action, or any others. Please see HTML::Tagset's %linkElements hash, in the module's source, for a precise breakdown of supported tag-types.

This program was good practice for trying out Getopt::Declare, an excellent command-line parser. Please note the parameter specification below the __DATA__ tag.

Disclaimer: Always use the -b switch to force backups, just in case you have non-standard HTML and the HTML::TreeBuilder parser mangles it.

Comments and suggestions for improvement are always welcome and very much appreciated.
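
The core rewriting step can be sketched with the rel method of the CPAN URI module. That module choice is an assumption (the post only names HTML::TreeBuilder, HTML::Tagset and Getopt::Declare), as are the sub name and the example URLs; the sketch expects absolute links as input.

```perl
use strict;
use warnings;
use URI;

# Express an absolute link relative to a base URL.
# Links on another scheme or host come back unchanged.
sub relativize {
    my ($link, $base) = @_;
    return URI->new($link)->rel(URI->new($base))->as_string;
}

my $base = 'http://example.com/docs/index.html';
print relativize('http://example.com/docs/img/logo.png', $base), "\n";
print relativize('http://other.example.org/x.png',       $base), "\n";
```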

shtml publisher
on Jan 25, 2001 at 20:17 UTC
by willdooUK
Utility to expand Include statements in html files, allowing them to be viewed without running a web server.
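
A rough sketch of what such an expansion could look like, assuming the common SSI form <!--#include file="..." --> with paths relative to a single directory. The sub name is hypothetical, and nested includes and the virtual= form are not handled.

```perl
use strict;
use warnings;
use File::Spec;

# Replace each include directive with the contents of the named file.
sub expand_includes {
    my ($html, $dir) = @_;
    $html =~ s{<!--\s*#include\s+file="([^"]+)"\s*-->}{
        my $path = File::Spec->catfile($dir, $1);
        open my $fh, '<', $path or die "Cannot open $path: $!";
        local $/;                  # slurp the whole file
        my $content = <$fh>;
        close $fh;
        $content;
    }ge;
    return $html;
}

# Example call (file and directory are placeholders):
# print expand_includes($page_text, '/path/to/site');
```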
Image to table converter
on Sep 30, 2000 at 08:11 UTC
by bastard
I hacked this thing together during my quest to get around the "no images on the home node under level 5" rule. (yes i know there are other ways) I'm not sure how useful it is, but since someone requested it i'll post it here in case anyone else is interested. (I suppose the code could also provide a simple example of the use of the GD image module.)

What does it do, you may ask? Basically it converts an image to a relatively optimized table representation of the image. It accepts one parameter, which is the image file you are going to convert. It dumps the table to STDOUT. It can accept the following image types: PNG, JPEG, XPM and GD2.

Warning, this will create very large and complex tables. I have created a 120k table from a 6k PNG image, so this thing is not appropriate for larger images. (Before the COLSPAN enhancements it could generate tables many times larger.)
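
The COLSPAN saving can be illustrated without GD: runs of identical pixels in a row collapse into one cell. This is only a sketch of that idea with a hypothetical sub name; it takes one row of colour strings that has already been read from the image.

```perl
use strict;
use warnings;

# Turn one row of pixel colours into a table row, merging runs
# of equal neighbours into a single cell with a colspan.
sub row_to_cells {
    my (@pixels) = @_;
    my @cells;
    while (@pixels) {
        my $color = shift @pixels;
        my $span  = 1;
        while (@pixels && $pixels[0] eq $color) {
            shift @pixels;
            $span++;
        }
        my $colspan = $span > 1 ? qq{ colspan="$span"} : '';
        push @cells, qq{<td$colspan bgcolor="$color"></td>};
    }
    return '<tr>' . join('', @cells) . '</tr>';
}

print row_to_cells('#FFFFFF', '#FFFFFF', '#FFFFFF', '#000000'), "\n";
```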

Gtk+ HTML Tree Viewer
on Sep 20, 2000 at 21:17 UTC
by mdillon

this is a rewrite of a utility i did for a job where i was using HTML::TreeBuilder and XML::XPath to parse and search normal HTML documents using the powerful XPath query language.

this utility uses HTML::TreeBuilder to parse an HTML document from a URL specified on the command line or from an internal browser location line and displays it as a Gtk+ Tree in a window. only subtrees with text nodes or anchors are expanded.

there are (simple) XPath queries displayed in the status bar that could be used to extract that node from the document (for example, by converting it to XHTML with HTML::TreeBuilder and then using XML::XPath, or by traversing the TreeBuilder parse tree and programmatically constructing an XPath parse tree).

it's probably not a bad example of simple Gtk+ GUI programming. more may be yet to come in the way of functionality (and comments).

this was written and tested against Gtk 0.7003.

there is support for using GtkHTML as well, if your installation is functional (mine was partially functional when i wrote the code, but stopped working after i upgraded from GtkHTML 0.4 to 0.6.1 and recompiled Gtk::HTML)

most recently updated: 24 Sep 2000

Personal PerlMonks Stats plot creator
on Jul 31, 2000 at 10:47 UTC
by ase
Here's my contribution to statistics nuts like myself.
This utility logs in to PerlMonks (using ZZamboni's PerlmonksChat module), gets your writeup page and creates 3 plots from the data, which are FTP'd to a server of your choice.
I run it every few days to update the graphs. All the other modules are available at CPAN. See my home node for an example of the results.

Update: I no longer post the graphs on my home node. The updated code given in the replies to this node is more modern. Thanks to everyone for the kind comments I received when I first wrote this.

Automatic CODE-tag creation (Prototype)
on Jun 21, 2000 at 20:28 UTC
by Corion
Out of a discussion about how we can prevent newbies from posting unreadable rubbish, here is a program that tries to apply some heuristics to make posts more readable. This version isn't the most elegant, so it's called a prototype.
on Jul 05, 2000 at 03:23 UTC
by beppu
a filter to make your HTML delirious
Random Color Generator
on Feb 03, 2000 at 08:05 UTC
by Elihu
This is a cgi script that generates an 8 by 8 grid of random colors with their appropriate hex values. Useful for picking colors for web pages.
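
A minimal sketch of such a grid, written from the description rather than from the posted script; the sub names and the bare-bones markup are assumptions.

```perl
use strict;
use warnings;

# One random colour as #RRGGBB.
sub random_hex_color {
    return sprintf '#%02X%02X%02X', map { int(rand(256)) } 1 .. 3;
}

# A $size by $size table, each cell filled with and labelled by its colour.
sub color_grid {
    my ($size) = @_;
    my $html = "<table>\n";
    for my $row (1 .. $size) {
        $html .= '<tr>';
        for my $col (1 .. $size) {
            my $color = random_hex_color();
            $html .= qq{<td bgcolor="$color">$color</td>};
        }
        $html .= "</tr>\n";
    }
    return $html . "</table>\n";
}

# CGI-style output: header, blank line, then the grid.
print "Content-type: text/html\n\n", color_grid(8);
```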
embedded table remover
on May 26, 2000 at 11:15 UTC
by BigJoe
You can run this script on an HTML document to remove all the embedded (nested) tables in it, assuming that the tables were coded into the document correctly. By default it removes all embedded tables and leaves the main table, but you can also set how many embedded tables are allowed by changing the numofTables variable.
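
The approach can be sketched by counting table nesting depth and dropping everything deeper than a limit. The sub below is a hypothetical illustration, not the posted code: $max_depth counts the outermost table as level 1, which may not match how numofTables is counted, and like the original it assumes well-formed table tags.

```perl
use strict;
use warnings;

# Drop every table nested deeper than $max_depth, contents included.
sub strip_nested_tables {
    my ($html, $max_depth) = @_;
    my ($depth, $out) = (0, '');
    # The capture group keeps the table tags as separate pieces.
    for my $piece (split /(<\/?table\b[^>]*>)/i, $html) {
        if ($piece =~ /\A<table\b/i) {
            $depth++;
            $out .= $piece if $depth <= $max_depth;
        }
        elsif ($piece =~ /\A<\/table\b/i) {
            $out .= $piece if $depth <= $max_depth;
            $depth-- if $depth > 0;
        }
        else {
            $out .= $piece if $depth <= $max_depth;
        }
    }
    return $out;
}

my $doc = '<table><tr><td>a<table><tr><td>b</td></tr></table>c</td></tr></table>';
print strip_nested_tables($doc, 1), "\n";
```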
Code Viewer
on May 19, 2000 at 01:49 UTC
by BigJoe
This is a script that I put together for use on my source code page. This script then allows me to copy html and scripts into a dir and let people pick the ones they want to view and I don't have to set up a page for each. It does require a param sent to it by using ?html=filename.
Update 6/2/2000: With the help of Fastolfe I have added some testing on the $in{html} to make sure it is not tainted.
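
As an illustration of that kind of check (not the code actually added), a filename parameter can be accepted only when it matches a strict whitelist pattern, which is also the standard way to untaint a value under perl -T. The sub name and the allowed character set are assumptions.

```perl
use strict;
use warnings;

# Return the name if it is a plain file name, otherwise undef.
# Returning the regex capture is what untaints the value in taint mode.
sub safe_filename {
    my ($name) = @_;
    return undef unless defined $name;
    return undef if $name =~ /\.\./;    # no parent-directory tricks
    return $name =~ /\A([A-Za-z0-9_][A-Za-z0-9_.-]*)\z/ ? $1 : undef;
}

for my $candidate ('index.html', '../etc/passwd', 'a/b.html') {
    my $ok = safe_filename($candidate);
    print defined $ok ? "ok: $ok\n" : "rejected: $candidate\n";
}
```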
Dark Theme for /. through Perl
on May 04, 2000 at 23:07 UTC
by PipTigger
Here's a little script I wrote for myself since I like light text on dark backgrounds (thanks again for the nice PerlMonks theme Vroom!) and /. doesn't have one... I know it's pretty suckie and could be a lot simpler. If you can make it better, please email it to me, as I use it every day now. It doesn't werk yet for the ask/. section but when I find some time, I'll add that too. I hope someone else finds this useful. TTFN & Shalom.

p.s. I tried to submit this to the CUFP section but it didn't work so I thought I'd try here before giving up. Please put the script on your own server and change the $this to reflect your locale. Thanks!
Propaganda Tile Browser
on Apr 14, 2000 at 05:14 UTC
by Anonymous Monk
This perl script (when deployed and executed in a directory containing images) will generate a nice HTML front-end for viewing the images remotely.

For an example of this script's output, have a look here.
