
Web Stuff

by vroom (His Eminence)
on May 25, 2000 at 20:53 UTC
Australian weather with Web::Scraper
on Oct 07, 2008 at 13:52 UTC
by missingthepoint
Grab tomorrow's forecast from the Australian Bureau of Meteorology website using Web::Scraper
on Aug 30, 2008 at 09:23 UTC
by TheRealNeO
This script should be used on a web server; it has been tested on Apache. It has been designed to be multi-purpose: it can be used for logging form data or, with some small modification, for logging the people who enter your page. It also looks better in KWrite than on this page.
on Aug 17, 2008 at 22:23 UTC
This has been an ongoing project of mine for about two years. I wrote it mainly to demonstrate a concept for a web portal framework that reduces the need to edit the source/core to add in new scripts or functions.

It uses a MySQL back-end and has been tested on Apache servers only. It can work somewhat on Windows, but the suggested OS is *nix or another supported OS with case sensitivity.

Perl 5.8.8 or 5.10
GD::SecurityImage - for Captcha.cgi, and GD for the code located in /lib/module
Image::ExifTool - for upload.cgi only
This script can run under mod_perl, but has not had many hours of testing.

I posted this here for some feedback on how I built the framework and some of the modules for the portal. Keep in mind that some of the scripts are not fully completed and may have minor bugs. Also, I have not written much documentation, only enough to install it, but there are comments throughout a lot of the code.

Please request to receive the latest files.
on Feb 20, 2008 at 16:02 UTC
by jfroebe

If you've ever tried to back up your Flickr account or download the public Flickr photos of other users, your options on Linux were pretty limited. After trying and failing to get FlickrDown to work under Wine, I decided to write my own... in Perl :)

The various Flickr APIs on CPAN are incomplete with respect to obtaining the list of public/friends photos. I had to use XML::Parser::Lite::Tree::XPath to extract the photos from each Flickr page.

By default, we obtain the largest photo available (original -> large -> medium -> small) and populate the EXIF comments field with the Flickr photo title, description, and URL. We verify the file type with File::MimeInfo::Magic and change/add the file extension as necessary.

During testing I received numerous duplicate files which were mostly resolved by adding a rudimentary duplicate file checker.
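A rudimentary duplicate check of this sort can be done by fingerprinting file contents. The following is an illustrative sketch, not the script's actual code; it assumes the core Digest::MD5 module and a made-up function name:

```perl
use strict;
use warnings;
use Digest::MD5;

# Track a fingerprint of every file kept so far; seeing the same
# fingerprint again means an identical copy was already saved.
my %seen;

sub is_duplicate {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "open $path: $!";
    my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    return $seen{$digest}++ ? 1 : 0;
}
```

Hashing the bytes rather than comparing names is what lets the check catch the same photo downloaded twice under different filenames.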

It should work on just about any platform that Perl does.


Petrolhead Automatic Racer
on Dec 24, 2007 at 23:19 UTC
by davidov0009
A program that races a given Facebook account 10 times using the Petrolhead racing application (to gain you points). See the POD for more details.
Zooomr uploader
on Nov 13, 2007 at 03:25 UTC
by polettix
A simple upload script for Zooomr, an online photo album service that will hopefully reach Flickr standards. The script contains full documentation in POD; in short, you will be able to do the following:
shell$ zooomr -u -p pluto foto*.jpg
# Let the program ask you for the password, so that you don't have
# to show it on the screen
shell$ zooomr add -u photo*.jpg
# Add a tag
shell$ zooomr add -u --tag awesome photo*.jpg
# Add two tags
shell$ zooomr add -u -t awesome -t good photo*.jpg
Happy uploading! You can also save your recurrent options in a .zooomr configuration file inside your home directory.

Update: added more features and checks for good uploads vs fake ones (Zooomr sometimes simply seems to throw an upload in the dust bin).

Update: added support for browser cookie file, managing external authentication via browser (needed for OpenID accounts). Added some missing documentation.

Update: added support for summary file at the end of upload, to report managed files and their URIs (see this post in Zooomr groups).

Update: changed upload URI to conform to current non-flash upload URI.

Link Hunter
on Feb 13, 2007 at 15:39 UTC
by wizbancp
A script for exploring a site and catching links: simply specify the starting URL and the search depth; at the end, the script produces a text file with the addresses caught. (Sorry for my English! :-))
Twitter POSTer
on Feb 12, 2007 at 10:28 UTC
by wizbancp
A script for posting an update to your Twitter account :-)
Twitter Reader
on Feb 12, 2007 at 09:03 UTC
by wizbancp
A script to display the latest posts from you and your friends on Twitter. The number of posts to read is configurable (default 8).
download mp3s listed in RSS feed
on Jan 27, 2007 at 02:36 UTC
by blahblahblah
Scans WFMU's MP3 archive RSS Feed for certain show titles, and then downloads those shows.

There's no particular reason to use POE::Component::RSSAggregator rather than XML::RSS::Feed, other than the fact that I heard about the POE version first and was interested in trying something in POE. (Thanks again everyone for helping me get around the problems due to my out-of-date POE in POE::Component::RSSAggregator breaks LWP::Simple::get.)

Also, I heartily recommend this station to everyone!

Simple link extraction tool
on Jan 02, 2007 at 21:41 UTC
by Scott7477
This code gets a web page that you specify at the command line and extracts all links from that page to a list in ASCII text file format. Comments and suggestions for improvement welcome! Updated per suggestions by ikegami and merlyn.
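The post doesn't show which approach survived ikegami's and merlyn's suggestions, but the usual non-regex way to pull every link from a page is HTML::LinkExtor (from the HTML-Parser distribution). A minimal sketch, parsing an in-memory document rather than a fetched page:

```perl
use strict;
use warnings;
use HTML::LinkExtor;

# A small document standing in for a fetched page.
my $html = <<'HTML';
<a href="http://example.com/">home</a>
<img src="logo.png">
HTML

my $p = HTML::LinkExtor->new;
$p->parse($html);
$p->eof;

# Each link is an arrayref: [ tag, attr1 => url1, attr2 => url2, ... ]
my @urls;
for my $link ($p->links) {
    my ($tag, %attrs) = @$link;
    push @urls, sort values %attrs;
}
print "$_\n" for @urls;
```

In the real tool you would feed it the page fetched from the command-line URL and write @urls to the output text file, one address per line.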
on Nov 13, 2006 at 15:19 UTC
by BerntB
Screen scrape

Downloads all your posted comments to a directory. Generates a yaml file with definitions of all the comments, suitable for other scripts to use for parsing the data.

Can be the first of a number of scripts to e.g. find mod-bombers, etc.

Tab size 4.

on Oct 23, 2006 at 11:34 UTC
Module Link, Current version is 4.0:
Edited 2! 10-29-2006
I changed this post because lots of people here must have A.D.D.

Link to Better documentation.
Re: SF_form_secure

More Examples.
SFLEX's scratchpad
I am now working on a 5.0 that will use parts of CGI::Util for the expiration time and fix a bug in action 5 that returns the version if the matching code is blank.
I'm still trying to put together more documentation for this code so one can use it the right way.

JavaScript::XRay + HTTP::Proxy equals Crazy Delicious
on Jul 05, 2006 at 18:31 UTC
by diotalevi

Blame the author of JavaScript::XRay. He demoed this at YAPC::NA 2006, but there wasn't a proxy in the distribution, so I wrote this so I could use it. Just run it and tell your browser to use a proxy at port 8080 on your local host.

Simple Hit Counter
on May 29, 2006 at 01:59 UTC
by Anonymous Monk
This is a simple hit counter I threw together for my website. Nothing fancy.
CiteULike tools
on Dec 08, 2005 at 08:46 UTC
by eweaverp
- Takes a CiteULike URL or a .bib file and searches it against the Collection of Computer Science Bibliographies, outputting a canonicalized file with all available fields filled. It does not clobber your CiteULike URLs or tags.
- Downloads and caches all .pdfs it can find from your CiteULike account (including private ones) and spools them to a printer. It will not print duplicates, even over multiple runs, as long as you don't delete the cache folder. It's good for printing only the things you have most recently added. It outputs a "missing.html" file with links to the CiteULike articles for which it could not find a .pdf. You will probably have to customize some of the regexes for the databases you use the most.

on Oct 20, 2005 at 01:16 UTC
by cosecant
This script is used to download podcasts. It is very basic and uses LWP::Simple to download the rss file and the audio files. The feeds go under the __DATA__ token and it also requires an external podcast.conf file to store a list of already downloaded files.
KFAI radio recorder
on Sep 30, 2005 at 03:54 UTC
by diotalevi

A simple way to download stuff from KFAI's program archive. I just wrote the thing, it seems to work. I hope so. You'll need mplayer and lame.

KFAI is a public radio station in Minneapolis and it rocketh.
on Aug 19, 2005 at 05:36 UTC
by Spidy
A very barebones, quick-and-dirty mini-FTP client I built myself (the Perl Cookbook is a wonderful thing indeed!) when the nag screen of my not-so-free FTP client started to annoy me (read: I am a broke, non-pirating student). I haven't quite worked out the error reporting on it yet, but it's VERY useful for synchronizing something on your web server with whatever's in the folder you point it at.

If you're not too worried about having your password echoed back to your screen, you can remove the lines that involve Term::ReadKey and any changes in ReadMode.

on Jun 24, 2005 at 00:31 UTC
by tcf03
Simple app for leaving quick notes between myself and some other admins at work.
on Mar 27, 2005 at 20:32 UTC
by svetho
I hope this program to look up Wikipedia entries from the command line will come in handy for some people. My main motivation for writing it was my rather slow computer: I don't always have to start up my browser when I can look up a Wikipedia entry from within a shell. Also, if you display the long version of an article, the script automatically archives it for you in a subfolder. I'm not a Perl guru (I mention that in every post ;-)), but I hope the script is useful enough to be posted here. SveTho
linkextor - extract particular links from HTML documents
on Jan 30, 2005 at 00:01 UTC
by Aristotle




linkextor [ -b baseurl ] [ -f filter ] [ files and urls ]


linkextor prints links found in given HTML documents, filtered by any given expressions. If no files are specified on the command line, the input document is read from STDIN. You can specify URLs on the command line; they will be downloaded and parsed transparently.

By default, no links are filtered, so the output will include references to every external resource found in the file such as stylesheets, external scripts, images, other documents, etc.

After applying the filter criteria, all matching links are printed to STDOUT. You could pipe them to wget -i - to mass-download them, or to while read url ; do ... ; done or xargs to process them further, or do anything else you can do with a list of links.


  • -h, --help

    See a synopsis.

  • --man

    Browse the manpage.


  • -b, --base

    Sets base URL to use for relative links in the file(s).

    This only applies to local files and input read from STDIN. When parsing documents retrieved from a URL, the source URL is always the base URL assumed for that document.

  • -f, --filter

    Defines a filter.

    If no filter has been specified, all external links will be returned. This includes links to CSS and Javascript files specified in <link> elements, image links specified in <img> elements, and so on.

    If one or more filters have been specified, only links conforming to any of the specified criteria will be output.

    Filters take the form tagname:attribute:regex. tagname specifies which tag this filter will allow. attribute specifies which attribute of the allowed tags will be considered. regex specifies a pattern which candidate links must match. You can leave any of the criteria empty. For empty criteria on the right-hand side of the specification, you can even omit the colon.

    E.g., -f 'a:href:\.mp3$' will only extract links ending in .mp3 from the href attributes of the document's <a> tags. Since <a> tags can only carry links in their href attribute, you can shorten that to -f 'a::\.mp3$'. If you wanted any and all links from <a> tags, you could use -f a::, where both the attribute and regex components are empty; and because both components to the right are empty, you can leave out the colons entirely and write just -f a. Likewise, you could use -f img to extract all images.
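As a rough illustration of how a tagname:attribute:regex specification can be compiled into a predicate (this is a sketch, not linkextor's actual source; the function name is made up):

```perl
use strict;
use warnings;

# Compile a "tagname:attribute:regex" specification into a test
# function. Empty or omitted components match anything, as described
# above. (Sketch only: a regex containing ':' would need escaping.)
sub compile_filter {
    my ($spec) = @_;
    my ($tag, $attr, $re) = split /:/, $spec, 3;
    $tag  //= '';
    $attr //= '';
    $re   //= '';
    return sub {
        my ($t, $a, $url) = @_;
        return 0 if length $tag  and $t   ne $tag;
        return 0 if length $attr and $a   ne $attr;
        return 0 if length $re   and $url !~ /$re/;
        return 1;
    };
}

my $mp3s_only = compile_filter('a::\.mp3$');
print $mp3s_only->('a', 'href', 'http://example.com/show.mp3')
    ? "match\n" : "no match\n";
```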


None currently known.



(c)2005 Aristotle Pagaltzis


This script is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

on Jan 09, 2005 at 01:52 UTC
by rendler
Simple blogging client that uses Net::Blogger and thus supports all the blogs that Net::Blogger supports. It has features for local post/template storage and editing, spell checking, previewing, and regexp replaces.
Lookup seagate hard drives by model number
on Jan 03, 2005 at 00:53 UTC
by diotalevi

Lookup Seagate hard drives by model number.

$ bin/lookup_seagatehd ST118202FC ST336752FC ST150176FC
Desc=ST-118202FC Fibre Channel FC-AL, Dual Port (Cheetah 18) Capacity=18.2 RPM=10000 Access time=6
Desc=ST-336752FC Fibre Channel FC-AL, Dual Port (Cheetah X15 36LP) Capacity=36.7 RPM=15000 Access time=3.6
Desc=ST-150176FC Fibre Channel FC-AL, Dual Port (Barracuda 50) Capacity=50.01 RPM=7200 Access time=7.6
Link extractor
on Nov 17, 2004 at 22:01 UTC
by marcelo.magallon
Small script to extract all links from a given URL. I whipped up something real quick some time ago, and I use this for extracting links from different web pages (e.g. wget $(lnkxtor URL '\.tar\.gz$') and the like). Comments about further development directions and improvements are much welcomed!
Cron update of AWStats and database for "country report" feature.
on Nov 04, 2004 at 10:41 UTC
by Alexander
Update AWStats and keep a current database for the country reporting feature. Point cron to and set the directory variables to match your configuration.
    Three scripts:
  • Perl control script that fetches gzipped country database updates
  • Shell script to gunzip the compressed update and replace the old database file.
  • Shell script to update AWStats.
AWStats requires Geo::IP or Geo::IP::PurePerl to use the country reporting feature. The modules in turn are dependent on a database file called GeoIP.dat, which is updated about every month.

Both versions of Geo::IP assume users have the correct file permissions to /usr/local/share, and hard code the path /usr/local/share/GeoIP/GeoIP.dat for the database directly into the module.

I do not have the needed file permissions for the /usr/local/share area on my server, which created a problem.

I wrote my scripts with a correction to this problem in mind. Directory paths are in variables at the top of each script.

Rather than changing the large AWStats program, I opted for changing the directory paths within the module before installing it. If you have similar file permission problems, this may work for you also.

I have successfully run this on my server which uses FreeBSD 5.2.1, Apache 1.3.26, and Perl 5.8.2.

My directory structure used is:

NOTE (Nov. 7, 2004): I found a minor glitch I had not noticed earlier. I have changed the code provided below to fix it.

Original line:
if (mirror("$db_website$db_gz_file", "$db_gz_file") != RC_NOT_MODIFIED) {

Corrected line:
if (mirror("$db_website$db_gz_file", "$script_dir/$db_gz_file") != RC_NOT_MODIFIED) {

When executing the script from outside $script_dir, the mirrored file would be placed in the pwd from which the script was called. The corrected line prevents this.
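The same pwd pitfall can also be avoided by anchoring paths to the script's own location. A small sketch using the core FindBin module (illustrative; not part of Alexander's scripts):

```perl
use strict;
use warnings;
use FindBin qw($Bin);

# $Bin is the directory containing the running script, no matter which
# directory the script was invoked from, so the mirrored file always
# lands next to the script instead of in the caller's pwd.
my $db_gz_file = 'GeoIP.dat.gz';   # file name taken from the post above
my $target     = "$Bin/$db_gz_file";
print "would mirror to $target\n";
```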

Domain Name Expiration Reminder
on Oct 07, 2004 at 23:40 UTC
by knowmad
If you manage any domains at all, keeping up with expirations can become a burden. This script, which employs the excellent Net::Domain::ExpireDate module, will help make the task easier. Add this to a cron job and don't spend any more time worrying about whether your own or your clients' domains will expire without warning.
on Aug 12, 2004 at 07:52 UTC
by TZapper
A script to automatically update dynamic DNS with service. Version updated to 1.2.
Changes include:
+ Signal handling (per Nachoz's recommendations and code sample)
+ Self-daemonization
Class for getting film information from IMDB.
on Jul 28, 2004 at 13:35 UTC
by nite_man
IMDB::Film is an OO Perl interface to film information in the IMDB. I've tried to make it possible to inherit, override, and expand its functionality (I hope I succeeded). For example, if you need to retrieve the number of votes together with the user rating, you can use IMDB::Film as a base class and override its rating method. All critical remarks, suggestions, etc. will be appreciated. I hope this module will be useful for someone.

Updated: Fixed bug with retrieving film information by its title.

Proxy list fetcher
on Jul 07, 2004 at 22:46 UTC
by sock
This script retrieves several websites that list anonymous HTTP proxies and then creates a formatted list based on what was retrieved. Please refer to my website for updates and more information. Updates have been made: use strict and use warnings applied, as well as a huge code reduction.
web page update notifier
on Jun 17, 2004 at 20:51 UTC
by Juerd
Very simple script that keeps me informed of changes to a static page. It runs daily from cron, which emails me any changes or error messages.
Apache::GenericAuthen 0.01
on Apr 23, 2004 at 18:51 UTC
by japhy
Half authen/authz tutorial, half authen/authz base class. For people who want to write their own Apache authen/authz modules in Perl, but don't know how.
Web Crawler
on Mar 09, 2004 at 01:24 UTC
by mkurtis
This is a webcrawler that grabs links from web addresses in file1.txt and places them in file2.txt, then visits all of those links and places the links it finds in file3.txt, and so on. It prints the content of each page into numbered files in a separate folder.
Syndicate CAKE news
on Dec 24, 2003 at 03:49 UTC
by LTjake
This script grabs the contents of CAKE's news page, parses the entries and outputs it as either an RSS feed or Atom feed.
cpandir_asrss - Get an RSS feed for your CPAN directory
on Dec 12, 2003 at 15:36 UTC
by LTjake
Although there is already a "recent" RSS feed for CPAN, it returns results for all authors. If you have a personal page where you would like to display only those modules which you've authored, this script will generate that feed for you.
RiotJournal - Graphical LiveJournal Client
on Nov 25, 2003 at 17:52 UTC
by #include
A small, simple graphical LiveJournal client. Based on by amoe. Uses Perl/Tk, LWP::UserAgent, and HTTP::Request::Common.

UPDATE: According to jsprat, RiotJournal works on Win2K as well as Linux.
Updating Client
on Nov 23, 2003 at 23:18 UTC
by lithron
A "smart" client that will update your (or other domain from the company) to your current IP address. Works from behind a NAT (via Only tested under Win32 with Cygnus' Cygwin tools installed. Requires LWP::Simple and Socket.
on Nov 20, 2003 at 04:21 UTC
by thraxil

Tidy is a useful command-line utility that cleans up messy and invalid HTML, producing nice, pristine XHTML. I wrote this module as a wrapper around tidy to let you clean up HTML on the fly as it is served by Apache. It also uses the Apache::Filter framework, so you can use it to clean up the results of any other Filter-compliant Perl handler (or registry script using Apache::RegistryFilter). This can be very useful if you are trying to get your site to validate but are stuck with an old CMS that produces messy, invalid markup.

You can also download the module from:

Any and all suggestions are welcome. If no one finds any big problems, I may try to upload it to CPAN.

APB v1.0 (subject to better name change)
on Oct 19, 2003 at 17:57 UTC
by hacker

After rebuilding Apache hundreds of times on just as many boxes, each with their own tweaks and configuration options with mod_php, mod_perl, mod_ssl, MySQL and so on, I was tiring of doing it all manually. Normally you have to do the following:

  • Fetching the tarballs
  • Verifying the md5sums
  • Unpacking the tree
  • Configuring the tree
  • Building the tree
  • Testing the build
  • Installing the tree
  • ...etc.
I tried using Apache Toolbox several times, but it has quite a few flaws in design:
  • You can't instruct it to use the system-installed MySQL headers and libraries
  • It insists on building its own PHP from scratch
  • Many other libraries (gd, t1, xpm) all get built by ApacheToolbox and installed in /usr/local
  • ...and so on. Not clean, not scalable. Too intrusive to strictly-controlled systems.

Well, now I don't have to!

With this script, all you have to type is 'build', and it will fetch, verify, unpack, configure, build, test, and install Apache, PHP, mm (shared memory), mod_ssl, mod_perl, and so on for you, in your target prefix, all in one shot. You just run 'build' and it does the rest for you as DSO, with just a 14k httpd server daemon when it completes (on my test boxes).

For those wondering what this has to do with Perl, I rely on mod_perl heavily for my work, and rebuilding Apache with each new release of mod_perl, Perl, mod_ssl, and so on gets tedious. This removes that tedium, so I can get back to work, faster, with less babysitting of the code.

There are three files: build (the main script), functions (some helper routines), and source.md5 (md5sums of the target tarballs). You'll need all three in your work directory to properly run this.

Hopefully PerlMonk's wrapping won't kill this too much.

Suggestions and patches are welcome, of course.

LWP::Simple::Cookies - adds cookie support to LWP::Simple
on Oct 15, 2003 at 20:45 UTC
by diotalevi

This module alters the operation of LWP::Simple so that it keeps track of any cookies presented by the server. Any import options are passed directly to HTTP::Cookies.

Statistical Graphs with gnuplot and mod_perl
on Sep 11, 2003 at 09:19 UTC
by projekt21

When my company's sysadmin emailed me some log files from sysstat ( with the note "Can you make some graphs out of those?", I started to:

  • write a converter from sysstat logs into a gnuplot data file (basically a row to col converter)
  • write a form based web-frame to select files, columns and options, generate queries for a gnuplot backend and display the resulting graphs
  • finally write a gnuplot backend that outputs corresponding graphs

The gnuplot backend is a module currently called Apache::Gnuplot that can be used for nearly any statistical data files in gnuplot format.

The converter and web-frame are not published yet, but you may contact me if you like to have a look at it. See for a screenshot.

Fellow monks, I would like to read your comments on possible errors or improvements. And please let me know if this is of any interest to put it on CPAN. Thanks.

Update: code cleanup for readability

Update2: added input validation thanks to Abigail-II

Scrape Google's Image Search
on Aug 29, 2003 at 19:10 UTC
by Hutta
Script to get all images in Google's image search engine matching specified keyword(s). Image downloads are done in parallel and spoof the referring URL in case the host protects against offsite linking. If you just want a single image for a background, see hossman's Random Background from Net.
Image Gallery
on Jul 30, 2003 at 10:41 UTC
by sulfericacid - See the gallery print in action!

There are two scripts: image upload, image display. With these two files you can have yourself your very own image gallery!

The user uploads one image at a time and specifies some image information (such as title and description). The image display prints 20 images per page (5x4) as 100x100 thumbnails, each with a link that opens in a new window. Under each image it displays: file name, image title, image description, image height, image width.

What I learned from this:

  • Difference between positive and negative numbers when used as a hash reference [-10..-1] and [1..10]
  • How to use array splicing
  • and this very cool snippet $var ||= 1;

    You will need to go through both files and change all pertaining links to directories and paths as I left them in so you could see what format they had to be in.

  • Display most recent post to a YaBB forum
    on Jan 26, 2003 at 21:20 UTC
    by newrisedesigns
    This code (used in conjunction with Apache SSI) will display a link within a page to the most recently updated topic in your YaBB forum. See for information and source to the YaBB message board.
    Top Seti@home Teams
    on Jan 06, 2003 at 22:07 UTC
    by Mr. Muskrat

    I am still working on my top Seti@home teams script.

    This uses: (in no particular order)
    grantm's XML::Simple,
    samtregar's HTML::Template,
    mojotoad's HTML::TableExtract,
    gbarr's Net::FTP,
    and DBI.

    Many thanks to the authors of these wonderful modules!

    Output is available at Top 40 Seti@home Teams.

    The URL that Berkeley uses for the XML team stats may change once they "officially" announce it. They still need to fix some things... See my Malformed XML thread at the Seti@home Bulletin Board for more information.

    Updated the ParseTeamXML subroutine so that it no longer uses a hash as a reference. Thanks tye and thunders for helping me resolve this issue.

    Update 2 added color coding for rise/fall in rank.

    Update 3 Updated code to work around duplicate teams in the top teams page.

    Update 4 Thread is AWOL.

    Real Synthetic Audio downloader
    on Oct 24, 2002 at 06:36 UTC
    by diotalevi
    Downloads the weekly Real Synthetic Audio radio shows from
    Member Watcher
    on Oct 22, 2002 at 14:54 UTC
    by penguinfuz
    The script was written to report the daily changes of web users: new, cancelled, and remaining. It is suited to a specific system, but with slight modification it could be used elsewhere.

    For this specific instance, we have two services storing web usernames in multiple files (a la htpasswd). Once per day we read and log the usernames, find the difference between today and yesterday (if there was a yesterday), and email the report to the relevant parties.
    on Sep 28, 2002 at 14:00 UTC
    by jpj

    This script is designed to be run from /perl or /cgi-bin on an Apache webserver. Given a root directory, it creates dynamic pages based on the contents of the referenced directory. As it navigates the directory hierarchy via user choices, it determines whether it has already created a page for that directory; it creates a cache page in the directory and serves the cached page for subsequent requests, regenerating it whenever the directory contents change.

    In short, it gets faster the more it's used.
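The post doesn't show the staleness test itself; one plausible sketch of the policy described (regenerate when the directory is newer than its cache page, compared by mtime) looks like this:

```perl
use strict;
use warnings;

# A cached page is usable only if it exists and is at least as new as
# the directory it describes; a directory's mtime changes whenever
# entries are added or removed.
sub cache_is_fresh {
    my ($dir, $cache_page) = @_;
    return 0 unless -e $cache_page;
    my $dir_mtime   = (stat $dir)[9];
    my $cache_mtime = (stat $cache_page)[9];
    return $cache_mtime >= $dir_mtime ? 1 : 0;
}
```

Note this only notices additions and removals in the directory itself, which matches the "regenerate when the directory contents change" behavior described above.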

    Enhanced data display for MP3 files via MP3::Info

    Examples can be seen here. Project page is here.

    Comments, criticism welcome. This is one of my first "real" perl scripts.


    on Sep 05, 2002 at 17:56 UTC
    by LTjake
    This is a module to help access XML-RPC methods provided by the covers project. It needs to have Frontier::Client (and its dependencies) installed. HUGE thanks to Petruchio for fixes, rewrites and guidance.

    The covers project is "...a database of cover songs (songs performed by an artist other than the original performer) with the intention of creating cover 'chains.'"
    Marge - the Interactive Marginalia Processor
    on Jul 08, 2002 at 14:34 UTC
    by alfimp
    For letting people write in the margins of your pages. Doubles as a simplistic Wiki.
    on Jun 25, 2002 at 04:11 UTC
    by elusion
    This is a script I wrote to grab sites from the web to stick on my hard drive and/or handheld device. It follows links to a certain depth, can download images, follow offsite links, and remove some unwanted html.

    Code reviews/critiques are welcomed and requested.

    Updated: Jun. 28, 2002

    • Squashed 2 bugs
    • UserAgent support
    • Proxy support
    • Progress report
    (code) HTTP connectivity log, or nail up dial connection
    on Apr 23, 2002 at 01:08 UTC
    by ybiC

    Fetch a random web page from a list of URLs.   Waits a random length of time between fetches.

    Written with two purposes in mind:
      * log internet connectivity
      * nail up analog dial connection
    Mostly the latter, so my ISP won't drop my connection during looong downloads.
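The randomized wait between fetches reduces to a couple of lines of core Perl; the bounds here are illustrative, not ybiC's actual defaults:

```perl
use strict;
use warnings;

# Pick a random wait between fetches, inclusive of both bounds.
sub random_delay {
    my ($min, $max) = @_;
    return $min + int rand($max - $min + 1);
}

my $wait = random_delay(30, 300);
print "sleeping $wait seconds before the next fetch\n";
# sleep $wait;   # ...then fetch the next random URL from the list
```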

    There are a number of commandline options+arguments.   Perhaps the more interesting are

    • --verbose=1 which reports only delta of success/fail ala zeno from Internet Connection Uptime Logger
    • --errDelay=1 to cause almost-immediate retry on fetch failure
    • --daemonize which backgrounds and disconnects so vty can be closed
    • --logging to print progress, options, and versions to file

    From a Perlish perspective, this has been an exercise in rand, IO::Tee and Proc::Daemon, plus Getopt::Long bobs that I ended up not using.   I started out "use"ing Proc::Daemon, but had to copy+tweak subs from it to allow logging-to-file while daemonized.

    Thanks to tye, fletch, tachyon, lemming, chmrr and others for tips and suggestions.   And as always, constructive criticism is welcome and invited.

    on Apr 09, 2002 at 00:53 UTC
    by lshatzer
    This will detect whether it is given a URL, a file, or raw HTML, pass it to HTML::TokeParser, and return the HTML::TokeParser object. (This was my first venture into inheritance.)
    Updated: Changed a few things from Amoe's suggestions.
    Post a journal entry to
    on Apr 06, 2002 at 08:14 UTC
    by rjray

    This script is derived from one of the examples I developed for the chapter of my book that introduces the SOAP::Lite module.

    It posts a journal entry to the journal system at, without using the browser interface, and without faking a POST to the CGI interface. It uses instead the still-alpha SOAP interface that use.perl currently offers.

    All elements of the SOAP request may be specified on the command-line, including entry subject, whether to enable comments, uid and passwd, and the option of reading the entry from a file (the default is to read from STDIN, filter-like). It also will attempt to set the authentication credentials from your Netscape cookie file. But as I said, those can be set by the command-line (and the cmd-line switch overrides the Netscape cookie).

    In a first-response to this code posting, I will post a XEmacs defun (which should also work under GNU/Emacs) that I use to post from within XEmacs by selecting the text I want to upload into the region. More on that in the follow-up node.


    Web monitor
    on Apr 05, 2002 at 21:12 UTC
    by zeroquo
    It is a Win32-based script that tries to connect to a URL, with or without validation; the second portion is the INI generator. Enjoy.
    Muse - Personal Interlinked Encyclopedia Builder
    on Apr 04, 2002 at 05:50 UTC
    by Pedro Picasso
    Do you like Everything2 and Perlmonks but can't seem to get the code to work on your machine because you're running Mandrake and not Debian or RedHat and while EveryDevel is a great company with a great content manager, their documentation leaves a lot to the imagination, and you honestly don't have time to sift through database application code when all you want is to have your own easily maintainable set of interlinked content? Well this is the script for you!

    It's a simple CGI hack that imitates PerlMonks/Everything2's noding. You type a "musing" with links to other musings in brackets. It's great for keeping your own interlinked encyclopedia of personal notes.
    on Mar 30, 2002 at 17:54 UTC
    by Juerd
    Because the popular gnuvd is broken, I made this quick hack to query the Van Dale website for dictionary lookups. It's a quick hack, so no production quality here ;) Oh, and please don't bother me with Getopt or HTML::Parser: Don't want to use Getopt because I don't like it, and can't use HTML::Parser because has a lot of broken HTML, and because regexes are easier (after all, it's a quick hack because I can't live without a Dutch dictionary).

    This probably isn't of much use to foreigners :)

    Update (200306081719+0200) - works with the HTML updates now.
    image splitter
    on Mar 30, 2002 at 13:55 UTC
    by djw
    Split has one purpose: crop 654x300 images into six equal-sized smaller images spanning two rows and three columns. HTML output (one page per image) is optional (on by default).
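The 6-way split described above is just tile geometry: 654x300 divides into three 218-pixel columns and two 150-pixel rows. A sketch of the crop rectangles (the actual GD cropping is omitted; the function name is illustrative):

```perl
use strict;
use warnings;

# Compute the crop rectangles for a grid split, left to right and top
# to bottom: 654x300 into 3 columns x 2 rows gives six 218x150 tiles.
sub tile_rects {
    my ($width, $height, $cols, $rows) = @_;
    my $tw = int($width  / $cols);
    my $th = int($height / $rows);
    my @rects;
    for my $row (0 .. $rows - 1) {
        for my $col (0 .. $cols - 1) {
            push @rects, { x => $col * $tw, y => $row * $th,
                           w => $tw,        h => $th };
        }
    }
    return @rects;
}

my @tiles = tile_rects(654, 300, 3, 2);
printf "%d tiles of %dx%d\n", scalar @tiles, $tiles[0]{w}, $tiles[0]{h};
```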

    I decided to do this after I talked to a buddy of mine Chris (from the sexy about a site called

    We were looking at some of the very cool art in the photo albums and saw that some people cut up a single larger picture into 6 pieces so they could fit the entire thing into one page of the album (you will have to go check out the site to see what I mean). Chris was telling me that this process can take a long time, and I mentioned I could write something to automate it.

    TaDA! Split was created.

    This program was written specifically for the image gallery, but it could be expanded for your own use if you feel like it. Or maybe you just need a chunk of code from it for something you are doing.
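    If you do want to adapt it, the crop geometry is easy to sketch. This is a hypothetical helper, not the original code, which presumably does the actual cutting with an image library:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Compute the crop rectangles (x, y, width, height) for splitting an
# image into a grid -- 2 rows x 3 columns for the 654x300 case above.
sub crop_boxes {
    my ($width, $height, $rows, $cols) = @_;
    my ($w, $h) = (int($width / $cols), int($height / $rows));
    my @boxes;
    for my $r (0 .. $rows - 1) {
        for my $c (0 .. $cols - 1) {
            push @boxes, [ $c * $w, $r * $h, $w, $h ];
        }
    }
    return @boxes;
}

# A 654x300 source yields six 218x150 tiles.
for my $box (crop_boxes(654, 300, 2, 3)) {
    printf "x=%d y=%d w=%d h=%d\n", @$box;
}
```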

    Thought I'd post it anyhow.
    Sample: my car!

    Thanks, djw
    Frankenpage: Build on-the-fly customizable web pages from a directory of files
    on Mar 23, 2002 at 11:08 UTC
    by S_Shrum
    I made this script to deal with the hassle of having to maintain 4 different resumes for job recruiters. Split up a page into individual files and place them into a folder. Point the script at the folder (via the PATH parameter in the script call or by defining it as a default) and it will allow you to define a page with the files. Upon submitting the form, the script gets called again but this time will create the document from the parts you requested. The resulting URL can then be used as a link in your other web pages to display the frankenstein'ed page later (Get it? frankenpage...never mind). Acts as a pseudo SSI engine (I guess you could call it that). For more information, latest version, etc., you can view the script white paper here. It could probably do more but hey, I wrote it in like 30 minutes. "IT'S ALIVE!"

    Journal into RSS Feed
    on Mar 14, 2002 at 08:10 UTC
    by rjray

    The use Perl; site is starting to provide SOAP interface support to the journal system there. Much of the interface is still alpha, and I'm trying to help out pudge by testing what he has, and offering suggestions here and there as I run into things.

    This is a simple demonstration/proof-of-concept script that uses SOAP::Lite and XML::RSS to turn the last 15 journal entries from a given user into an RSS 1.0 syndication channel.

    Caveats: There is not yet an interface for turning a nickname into a user-ID, so you have to provide the user ID as the first (and only required) parameter to the script. You may provide the nickname as the second parameter. The titles and such are very simple, as this goes for minimal network traffic over feature-bloat. But it's mostly for illustrative purposes. As the interface fleshes out and becomes more stable, I expect to have a more functional version of this. Oh, and it writes to STDOUT, making it suitable as a filter to other XML/RSS processing apps.

    LWP::UserAgent subclass to make it follow redirects after POST (like Netscape)
    on Feb 26, 2002 at 16:09 UTC
    by gregorovius
    A subclass of LWP::UserAgent that replicates Netscape's behavior on redirects after a POST request (i.e. it will follow POST redirects, but it will turn them into GETs before doing so). I believe Microsoft's IE behaves like this as well.

    A lot of web applications rely on this non-standard behavior in browsers, so I think it would be a good idea to integrate this into LWP. See Redirect after POST behavior in LWP::UserAgent differs from Netscape for reference.
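    The rule being replicated can be sketched as a small decision function, separate from LWP itself. This is only an illustration of the behavior described above; the real code patches LWP::UserAgent's redirect handling directly.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Netscape-style rule: follow 301/302/303 redirects, but if the
# original request was a POST, re-issue the redirected request as GET.
# Returns the method to use for the follow-up, or undef for no redirect.
sub method_after_redirect {
    my ($method, $status) = @_;
    return undef unless $status == 301 || $status == 302 || $status == 303;
    return $method eq 'POST' ? 'GET' : $method;
}

print method_after_redirect('POST', 302), "\n";  # GET, like Netscape/IE
```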

    Look for the XXX marker in the code to see where this code differs from the one in LWP::UserAgent.
    on Feb 14, 2002 at 18:27 UTC
    by jjhorner
    For those of us who have to administer IIS installations, we find it backward that the mmc (THE IIS ADMIN TOOL!) can't set the DefaultLogonDomain property of the MSFTPSVC. Instead of using some loopy VB way to do it, I wrote a simple utility and use it now. Easy, straightforward and a small example of ADSI code.
    on Feb 06, 2002 at 14:44 UTC
    by shockme
    After reading this thread, I found myself facing some downtime, so I decided to throw this together. I needed a small project with which I could begin learning, and this seemed a good candidate. It's not perfect and could use some tweaking here and there, but it scratches my itch.

    Kudos to dws for his help when I inevitably became entangled. Sadly, I have yet to attain CGI-zen.

    Modify .htaccess files
    on Feb 04, 2002 at 22:38 UTC
    by cjf
    Uses Apache::Htpasswd to add, delete, or change the password of a user in a .htaccess file. User input is web-based using forms, includes an authorization check.
    CGI::Application::Session - A stateful extension to CGI::Application
    on Jan 28, 2002 at 17:34 UTC
    by rob_au
    While writing some code based upon the CGI framework concepts discussed here, I put together this small segment of code which introduces very simple session data storage into CGI::Application-inherited classes. This module, very much alpha with respect to code development level, allows sessional data to be stored server-side through the use of Apache::Session::File with _session_id being stored within a client cookie.

    Use of this module is very simple with the replacement of the use base qw/CGI::Application/; pragma with use base qw/CGI::Application::Session/;. All data to be stored and retrieved should be placed within the $self->{'__SESSION_OBJ'} hash. Additional new method parameters include SESSIONS_EXPIRY, SESSIONS_NAME and SESSIONS_PATH to set the respective parameters of the client-side ID cookie.

    This code can be expanded and improved upon greatly, but it demonstrates a proof-of-concept for session utilisation within a state-based CGI engine.

    on Jan 21, 2002 at 03:10 UTC
    by Amoe

    Replacement for the WWW::Search::Google module. I apologise for the scrappiness of the code, but at least it works.

    Thanks crazyinsomniac and hacker.

    Update 06/03/2002: Surprisingly, this module still works . After all the changes that Google has gone through since the time I first released it, I would expect it to have broken a long time ago, considering it parses HTML rather than some stable format. There's an interesting story at slashdot about googling via SOAP - maybe this is the future direction this module could take?

    Script Stripper
    on Dec 26, 2001 at 02:31 UTC
    by OeufMayo

    The following code can act as an HTML script filter, stripping Javascript, VBScript, JScript, PerlScript, etc. from the HTML code.

    This weeds out all the "scriptable" events from the HTML 4.01 specifications and all the <script> elements.

    It takes a filename as an argument or, if there's no argument, reads from STDIN. All output goes to STDOUT.

    This piece of code should be pretty reliable, but I'd be interested to know if there's a flaw in this code.
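    A regex-only sketch of the same idea, for flavor: drop <script> elements and "on*" event-handler attributes. The real script works from the HTML 4.01 event list with a proper parser; the regexes below are assumptions meant only to make the transformation concrete, and they will miss pathological markup.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Strip <script>...</script> blocks and on* event-handler attributes.
# Simplified illustration -- a real filter should use an HTML parser.
sub strip_scripts {
    my ($html) = @_;
    $html =~ s{<script\b[^>]*>.*?</script\s*>}{}gis;        # script elements
    $html =~ s{\s+on\w+\s*=\s*("[^"]*"|'[^']*'|\S+)}{}gi;   # event handlers
    return $html;
}

print strip_scripts(q{<body onload="evil()"><script>alert(1)</script><p>hi</p></body>}), "\n";
```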

    Yahoo Currency Exchange Interface
    on Dec 13, 2001 at 22:20 UTC
    by cacharbe
    A while back I had to re-write our intranet stock listing application that retrieved stock information from the Yahoo! finance site (I didn't write the original). It was a huge kludge, and took forever to render, as each stock symbol, 401k symbol, etc. was a separate LWP request, and each request had to be parsed for the necessary data. The application began breaking down when Yahoo went through some major HTML/display rewrites.

    While working on the re-write I discovered something that turned 75 LWP requests into two requests: one for the indices information, and one for the list of symbols for which I needed data. The discovery was that Yahoo has a publicly facing application server that offers up all financial data in CSV rather than HTML. I could request ALL the symbols I needed at once, and it was returned in orderly, well formatted, easy to parse CSV.

    Needless to say, I saw a significant performance increase.

    Yesterday I was asked if I could create a small web application that allowed users to get current currency exchange rates, and rather than re-inventing the wheel, I went back to the stock work I had done and found that the same server would do the exchange for you if given the correct URL. I have included an example of a correct URL, but I am leaving getting the correct Country Currency codes as an exercise to the user. They are readily available from a few different sources.

    Usage might be something like:

    use LWP::Simple;
    use LWP::UserAgent;
    use CGI;

    $|++;
    print "Content-type: text/html\n\n";
    &Header();
    &PrintForm();
    my $q = new CGI;
    if ($q->param("s") && $q->param("s") ne "DUH" && $q->param("t") ne "DUH") {
        my $sym = $q->param("s") . $q->param("t") . "=X";
        my ($factor, $date, $time) = (&GetFactor($sym))[2,3,4];
        $time =~ s/"//g;
        $date =~ s/"//g;
        print '<CENTER><p><b><font face="Arial" size="4">RESULTS:</font></b> ';
        printf("<b><font face=\"Arial\" color=\"#0000FF\" size=\"4\">%s %s = %s %s as of %s, %s</font></b></p>",
            &commify($q->param("a")), $q->param("s"),
            &commify($factor * $q->param("a")), $q->param("t"), $date, $time);
        print "</CENTER><P>";
    }
    &PrintFoot(\$q);


    HTML Link Modifier
    on Dec 23, 2001 at 21:45 UTC
    by OeufMayo

    As strange as it seems, I couldn't find any code here that cleanly modifies HREF attributes in <a> start tags in an HTML page.

    So here's one that I whipped up quickly to answer a question on fr.comp.lang.perl

    It surely could be easily improved to include other links (<script>, <img src="...">, etc.), but you get the idea...

    The only (slight) caveats are that the 'a' start tag is always lowercased and the order of the attributes is lost. But that should not matter at all.
    Also, this code won't print 'empty' attributes correctly (though I can't think right now of any empty attributes that are legal with 'a').

    To use this script, you have to modify the $new_link variable, and then call the script with the URL of the page to be modified. Every <a href="..."> will have the $new_link added at the start of the href, and the old URL will be properly escaped.
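    The rewrite step can be sketched like this. The $new_link value and the hand-rolled escaping sub are illustrative assumptions, not taken from the original (which parses the HTML properly rather than using a regex):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Example redirector prefix -- replace with your own $new_link.
my $new_link = '';

# Minimal percent-escaping (URI::Escape on CPAN does this properly).
sub uri_escape_lite {
    my ($s) = @_;
    $s =~ s/([^A-Za-z0-9\-_.~])/sprintf("%%%02X", ord $1)/ge;
    return $s;
}

# Prefix every double-quoted href with $new_link, escaping the old URL.
sub rewrite_links {
    my ($html) = @_;
    $html =~ s{(<a\b[^>]*\bhref\s*=\s*")([^"]+)(")}
              {$1 . $new_link . uri_escape_lite($2) . $3}gei;
    return $html;
}

print rewrite_links(q{<a href="">home</a>}), "\n";
```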

    It is probably useless as is, but with a minimum of tweaking, you can easily do what you want.
    Actually, it might be a good thing to turn this little script into a module where you would only have to do the URL munging, without worrying about the whole parsing stuff...

    Scripted Actions upon Page Changes
    on Dec 08, 2001 at 13:37 UTC
    by rob_au
    This code fragment was written with the intent of minimising some of my system administration overhead by providing a script framework that allows arbitrary scripting actions to be performed should a web page be modified or updated. The code uses LWP::UserAgent to fetch a web page and then, should the page have changed since the last execution of the script (as measured by the Last-Modified header or, should that be unavailable, an MD5 digest of the page contents), executes a script subroutine or method.

    Independent subroutines can be specified for different URLs, in the single example provided, the subroutine virus_alert is executed should the Symantec web page have changed since the last execution of the script.
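    The MD5 fallback described above can be sketched as follows. The one-digest-per-state-file layout is an assumption; the original keys its state per URL.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);   # core module

# Compare an MD5 digest of the fetched content against the digest
# saved on the previous run; save the new digest either way.
sub content_changed {
    my ($state_file, $content) = @_;
    my $digest = md5_hex($content);
    my $old = '';
    if (open my $in, '<', $state_file) {
        chomp($old = <$in> // '');
        close $in;
    }
    open my $out, '>', $state_file or die "can't write $state_file: $!";
    print $out "$digest\n";
    close $out;
    return $digest ne $old;
}
```

On a real run you would pass in $response->content from LWP::UserAgent and dispatch to the per-URL subroutine when this returns true.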

    Local::SiteRobot - a simple web crawling module
    on Nov 24, 2001 at 17:09 UTC
    by rob_au
    Earlier this month, George_Sherston posted a node where he submitted code for a site indexer and search engine. I took this code and decided to build upon it for my own site, and in evaluating it and other options available, I found HTML::Index. This code offered the ability to create site indexes for both local and remote files (through the use of WWW::SimpleRobot by the same author). This ability to index by URL was important to me, as a great deal of content on the site is dynamic in nature. This was where my journey hit a stumbling block ... WWW::SimpleRobot didn't work!

    So, I set about writing my own simplified robot code which had one and only one function - return a list of crawled URLs from a start URL address.

    #!/usr/bin/perl -w
    use strict;
    use Local::SiteRobot;

    my $robot = Local::SiteRobot->new(
        DEPTH        => 10,
        FOLLOW_REGEX => '^',
        URLS         => [ '' ]
    );
    my @pages = $robot->crawl;
    print STDOUT $_, "\n" foreach @pages;

    The code I feel is quite self explanatory - /msg me if you have any questions on usage.

    yet another quiz script
    on Oct 30, 2001 at 22:30 UTC
    by mandog
    There are probably better quizzes out there... merlyn has one, and TStanley has one too. This one differs in that it uses HTML::Template and puts a nice wrong.gif next to questions that are answered wrong.

    - command-line LiveJournal client
    on Oct 20, 2001 at 15:43 UTC
    by Amoe
    Very simple LiveJournal console client. You need to make a file called 'ljer.rc' in the same dir as the script, containing the following lines:

    user: my_username
    password: my_password

    No command line options required. Yes, I am aware that there are other Perl LJ clients. The problem was, one was appallingly coded (lots of manually parsed command-line options, hundreds of warnings, no strict) and the other was an 800-line behemoth using Socket (which I dislike - and not even IO::Socket) when LWP is practically built for the task. This is just a quick script for the impatient. Feel free to cuss as you feel appropriate.

    xml-rpc update notifier
    on Oct 18, 2001 at 21:22 UTC
    by benhammersley
    A command line Perl tool that uses XML-RPC to tell that your blog has been updated. See here. Fun from both the command line and cron. It uses command line options, so invoke it like this: perl --title=BLOG_TITLE --url=BLOG_URL
    Dynamically Generate PDF's On The Fly
    on Oct 10, 2001 at 17:00 UTC
    by LostS
    Recently I have had the joy of dynamically generating a PDF on the fly. A lot of people suggested putting the data in a text file and then converting it to PDF format. However, I needed the ability to add graphics and also to add colors. So I did some research and found a nice little module called PDF::Create. You can get the most recent version from . The sad part is most of the developers who made this module have pretty much stopped working on it. What they do have works great... except for the adding of GIFs. JPEGs work great, but not GIFs. So here is my code I used to generate my PDF on the fly.

    I contacted the creator of the PDF::Create module about the errors I was having. He looked at the code, found the problem, fixed it, and sent me the updated code... so below my code is the updated version :)
    Link Checker
    on Oct 10, 2001 at 10:47 UTC
    by tachyon

    This script is a website link checking tool. It extracts and checks *all* links for validity including anchors, http, ftp, mailto and image links.

    The script performs a recursive, breadth-first search to a user-defined depth. External links are checked for validity but are not followed, for obvious reasons - we don't want to check the whole web.

    Broken anchors and links are reported along with the server error. All email addresses harvested are checked for RFC 822 compliance and optionally against an MX or A DNS listing.

    More details in the POD.
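    The email sanity check mentioned above might look something like this in deliberately simplified form. The full RFC 822 address grammar is far richer; Email::Valid on CPAN handles it properly, and this regex is only an assumed approximation:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Quick plausibility check for a mailto: address -- NOT full RFC 822.
sub looks_like_email {
    my ($addr) = @_;
    return $addr =~ /^[\w.+-]+\@(?:[\w-]+\.)+[A-Za-z]{2,}$/ ? 1 : 0;
}

print looks_like_email('') ? "ok\n" : "bad\n";
```

A DNS MX/A lookup (e.g. via Net::DNS) would be the optional second stage the description mentions.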

    topwebdiff - analyse the output of topweb
    on Sep 14, 2001 at 12:26 UTC
    by grinder
    To make the best use of topweb snapshots, the idea is to generate the files day by day, and then run topwebdiff to pinpoint the ranking changes.

    See also topweb - Squid access.log analyser.
    topweb - Squid access.log analyser
    on Sep 14, 2001 at 12:19 UTC
    by grinder
    I've had a look a number of analysis tools for Squid access logs, but I didn't find anything simple that met my needs -- I just wanted to know how much direct web traffic was pulled down from what sites.

    See also topwebdiff - analyse the output of topweb.
    Currency Exchange Grabber
    on Aug 29, 2001 at 01:40 UTC
    by bladx
    Taken from a POD snippet within the code:

    This program's use and reason for being created is simply for anyone that wants to be able to check the exchange rates for different types of money, such as from other countries. Say one was going on a trip to Japan, and they currently live in the United States. They would need to find out how much money they should bring in order to have a good amount, and there isn't an easier way (almost) than to just enter the amount you want converted using CEG and telling it to convert from one type of money to the other country's money.
    Cheesy Webring
    on Aug 20, 2001 at 20:26 UTC
    by OeufMayo

    Don't blame me, ar0n had the idea first.

    But nonetheless, it's a fully functional webring.
    You too can make your own and impress your friends, by showing them how many people share your love of Cheesy things.

    Big thanks to virtualsue for hosting the Cheese Ring! (see node below)

    update: Eradicated a couple of bugs that prevented the code from compiling. oops.

    update - Thu Aug 23 07:47:26 2001 GMT: The truncate filehandle was not right, but virtualsue spotted it!

    update - Sun Aug 26 13:32:41 2001 GMT: Fixed the check for, thanks crazyinsomniac!

    Sat Sep 22 21:35:45 UTC 2001: Fixed the encoding of characters other than 0-9a-zA-Z.

    On-demand single-pixel GIFs
    on Aug 19, 2001 at 10:17 UTC
    by dws
    A short CGI script for generating a single pixel GIF of a desired color. Useful, for example, when generating HTML that embeds color-coded, image-based bar charts. Ordinarily, using color in this way requires existing GIFs for all colors used. This script removes the need to make all of those GIFs by hand, allowing one to experiment freely.
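    For illustration, a 1x1 GIF can even be built with pack() alone, no image library required. This is a sketch of what such a CGI emits; the original script may well use GD or similar instead:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hand-assemble a 1x1 GIF89a in the requested color.
sub pixel_gif {
    my ($r, $g, $b) = @_;
    return 'GIF89a'
         . pack('vv', 1, 1)                          # 1x1 logical screen
         . "\x80\x00\x00"                            # 2-entry global color table
         . chr($r) . chr($g) . chr($b)               # palette entry 0: our color
         . "\x00\x00\x00"                            # palette entry 1: unused
         . ',' . pack('vvvv', 0, 0, 1, 1) . "\x00"   # image descriptor
         . "\x02\x02\x44\x01\x00"                    # LZW data: clear, pixel 0, end
         . ';';                                      # GIF trailer
}

# In the CGI you would print the header and then the bytes:
# print "Content-type: image/gif\n\n", pixel_gif(255, 0, 0);
```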
    IIS Restart
    on Jul 13, 2001 at 21:04 UTC
    by trell
    We needed to automatically restart IIS when it got into a state where it would not respond to requests. This code uses the following... Unix command from MKS Toolkit (there are others out there); this is to kill the PID when net stop won't work. (from CPAN). (written from scratch, feel free to upgrade and let me know of improvements). ActiveState Perl 5.6.1. I have this broken into three scripts: one is the .pm, the second is the actual restart script, and the last is a script that checks the socket for content, then calls the restart script if needed.
    Agent00013's URL Checking Spider
    on Jul 11, 2001 at 03:00 UTC
    by agent00013
    This script will spider a domain checking all URLs and outputting status and overall linkage statistics to a file. A number of settings can be modified such as the ability to strip anchors and queries so that dynamic links are ignored.

    (code) Directory of HTML to PostScript and PDF
    on May 30, 2001 at 17:02 UTC
    by ybiC
    Create PostScript and PDF versions of all HTML files in given directory.   Ignore files listed in @excludes.

    Fetches HTML files by URL instead of file so html2ps will process images.   Add <!--NewPage--> to HTML as needed for html2ps to process.   Links *not* converted from HTML to PDF   8^(

    Requires external libraries html2ps and gs-aladdin, but no perl modules.

    From a Perlish perspective, this has been a minor exercise in (open|read)dir, grep, and difference of lists.   As always, comments and critique gladly welcomed.

    Latest update(s):     2001-05-30     22:25

    • Thanks to jeroenes for noticing funky indents and for suggesting cleaner exclusions syntax.
    • Implement and test simpler exclusions syntax.
    • Eliminate redundant code with PrintLocs()
    • Add explanatory comments for html2ps and ps2pdf syntax.
    on May 15, 2001 at 00:28 UTC
    by JSchmitz
    Simple webserver load monitor; nothing too fancy, probably could be improved on....
    Cam Check
    on May 04, 2001 at 19:45 UTC
    by djw
    This code helps me manage my cam on

    I have a cam at work, and a cam at home - they aren't on at the same time, I have them set on a schedule. I could use just one cam image for both cams but that isn't any fun - I also wanted my page to report to the visitor where the cam image was coming from.

    Anyhow, this script checks my two cam files for creation date (File::Stat) to see which is newer. It also checks to see if I have updated my cam image less than 10 minutes ago; if not, it's offline.

    For example, let's say the newest file is work.jpg, and its creation time is less than 10 minutes ago - the script changes 3 text files in my web directory (used in Server-Side Includes) to reflect the fact that I'm at work. If the newest file (in this case work.jpg) is older than 10 minutes, then the cam is offline, and it reports that using the text files and SSI.

    I have this script run on my linux box on a schedule using cron.
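    The decision logic described above can be sketched like this, using Perl's built-in stat rather than the module named in the write-up; the function and its return values are illustrative:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pick the newer of two cam images; call the cam offline if even the
# newest one is more than $max_age seconds old (default 10 minutes).
sub cam_status {
    my ($home_jpg, $work_jpg, $max_age) = @_;
    $max_age //= 10 * 60;
    my %mtime = map { ($_, (stat $_)[9] // 0) } ($home_jpg, $work_jpg);
    my ($newest) = sort { $mtime{$b} <=> $mtime{$a} } keys %mtime;
    return 'offline' if time() - $mtime{$newest} > $max_age;
    return $newest eq $home_jpg ? 'home' : 'work';
}
```

The script would then rewrite the three SSI text files according to the status returned.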
    Opera Bookmarks to XBEL
    on Apr 24, 2001 at 00:31 UTC
    by OeufMayo
    The python guys (they can't be all bad, after all :) have created a DTD called XML Bookmark Exchange Language which "is a rich interchange format for "bookmark" data as used by most Internet browsers." As I wanted to play with the XML::Writer module and clean up my bookmark files, I ended up whipping up this code. Have fun!

    Update 2001-11-13: Complete rewrite of the adr2xbel script. It follows a bit more closely the python script found in the PyXML library demos.

    DB Manager
    on Apr 03, 2001 at 01:48 UTC
    by thabenksta

    This program is a web based GUI for your database. It has currently only been tested with MySQL, but the idea is for it to work with any DB.

    The program gives you a list of tables and allows you to Create, Edit and Drop them, as well as view their schema and data. It also provides a command line for other functions.

    Feel free to give me feedback/criticism. If you make any major additions, please let me know.

    Update 4/3/2001:
    Added privilege granting and revoking functions.

    SHTML generator for image viewing
    on Mar 21, 2001 at 19:13 UTC
    by djw

    This utility takes a directory of images, copies the files to a specified dir, creates an shtml page for each block of pictures (how many per block is decided in config), can prompt for a description of each pic (otherwise it uses the image name), and creates a menu.shtml plus an shtml page for each block needed, depending again on how many pics per page you want and how many pics you have.

    It requires you to have server-side includes enabled in your specified web dir so that you can include header, footer, and menu server-side include files. It also puts an image at the top right of each page, and a "created by" thing at the top left - but you can change that to have some other text, or nothing if you like.

    I thought about expanding this thing to use a text file for image descriptions, and adding html support, but I don't need those features so I decided not to. It certainly can be done easily enough.

    You can see a demo at which has some pics from around my office.

    btw, I'm really sorry about the spacing that got mangled by using notepad. This was written in Vi with a 4 space tab, but got converted because I posted this from a win32 box and I used notepad..../me sighs.

    ciao, djw
    on Jan 27, 2001 at 07:15 UTC
    by Anonymous Monk
    MysqlTool provides a web interface for managing one or more mysql server installations. It's actually a pretty big application, with about 3000 lines of code spread across nine different modules. I'm not sure if this is a proper posting for Code Catacombs, but most everyone who's seen it and uses mysql & perl on a regular basis has loved it.
    Live365 Broadcaster Info grabber
    on Jan 18, 2001 at 08:13 UTC
    by thealienz1
    I was working one day on setting up a radio station on, and I noticed that their main http site was really slow. So what I did was make a script that checks my broadcasting information really fast. Thanks, Perl. Basically all you have to do is type in the username and the script does the rest for you. You do have to make sure that you have the LWP modules installed. Otherwise I trust that you know how to use this truly useless script. Thanks, Perl.
    on Apr 16, 2001 at 02:50 UTC
    by Masem
    A notes-to-self CGI script, useful for those (like me) that have multiple locations where they work, and want a quick, low-overhead way to leave notes to themselves on a central server (eg no DBI). Note that the script has no security checks, but this can easily be done at web server level.
    SSI Emulation Library
    on Jan 13, 2001 at 10:37 UTC
    by EvanK
    An alternative to using big clumsy modules when you need to emulate SSI in Perl. It's been written to theoretically work on both win32 and *nix systems, though I've only gotten to test it on Windows. Works fine for me though. Any comments and feedback welcome.
    (code) Toy Template
    on Jan 06, 2001 at 03:01 UTC
    by ybiC
    Toy Template

    Simple scripted website generator.
    Uses, HTML::Template and param('page') to eliminate duplication of code and markup in toy website.   Also uses CSS to separate style from structure (as much as possible, anyway).   Clearly not suited for anything industrial-strength.

    Code and CSS files go in a web-published directory.   Common, content and template files go in a non-webpub directory.   Each web page in the site is defined by its own content file.

    Thanks to ovid, chipmunk, Petruchio, davorg, chromatic and repson for helping me figure this out.

    Most recent update:   2001-05-10
    Hashamafied a few of the passel o'scalar variables.

    Update: 2002-05-11 note to self: look into Storable to potentially reduce response times, or Data::Denter so don't have to unTaint incoming data structure.   Props to crazy for suggesting Denter and for initial benchmark strongly favoring Storable as perf king.

    use Benchmark    qw( cmpthese timethese );
    use Storable     qw( freeze thaw );
    use Data::Denter qw( Indent Undent );
    timethese( 2_000, { Data::Denter => \&DENTOR, Storable => \&STORKO });
    print "\n\n\n";
    cmpthese( 2_000, { Data::Denter => \&DENTOR, Storable => \&STORKO });
    sub DENTOR {
      my $in  = Indent \%BLARG;
      my %out = Undent $in;
    }
    sub STORKO {
      my $in  = freeze \%BLARG;
      my %out = %{ thaw($in) };
    }
    Benchmark: timing 2000 iterations of Data::Denter, Storable...
    Data::Denter: 12 wallclock secs (11.71 usr +  0.01 sys = 11.72 CPU) @ 170.69/s (n=2000)
    Storable:     6 wallclock secs  ( 5.60 usr +  0.00 sys =  5.60 CPU) @ 357.27/s (n=2000)
    Benchmark: timing 2000 iterations of Data::Denter, Storable...
    Data::Denter: 12 wallclock secs (11.80 usr +  0.00 sys = 11.80 CPU) @ 169.53/s (n=2000)
    Storable:     6 wallclock secs  ( 5.62 usr +  0.00 sys =  5.62 CPU) @ 356.06/s (n=2000)
                  Rate Data::Denter     Storable
    Data::Denter 170/s           --         -52%
    Storable     356/s         110%           --
    Cookbook Recipe Bot
    on Dec 11, 2000 at 11:32 UTC
    by epoptai
    There are two scripts here, and a sample crontab. gets & stores daily cookbook recipes from Written to run as a cronjob (no output except the file) & includes a sample daily crontab to automate the process. lists the saved recipes in alphabetical order. The viewer requires two images to work properly, 1. a small transparent gif named blank.gif, and 2. a Perl Cookbook cover image such as these at fatbrain and amazon, both stored in the same dir as the scripts & saved files.

    Update: fixed url to point to the new recipe location at

    US Library of Congress perl module
    on Dec 10, 2000 at 13:24 UTC
    by eg

    A Perl module to access the Library of Congress' book database.

    Unlike Amazon or Barnes and Noble, you can't just look up a book in the Library of Congress database with an ISBN; you need to first initialize a session with their web server. That's all this module does, it initializes a session for you and returns either a url of a book's web page or a reference to a hash containing that book's data (author, title, etc.)

    fix bad HTML comments
    on Nov 29, 2000 at 09:45 UTC
    by chipmunk

    This Perl filter fixes bad HTML comments, such as <!----------- really ---------bad ------ comments ---------->. (Such comments are bad because, according to the spec, each -- -- pair within <! > delimits a comment. This means <!-- -- --> is not a complete comment, for example.)

    The code reads in the entire file, then finds each occurrence of <!-- ... --> and uses tr/// to squash each run of hyphens to a single hyphen. The assignment to $x is necessary because $1 is read-only.
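    The squash step described above can be written as a one-liner inside a substitution; this sketch follows the description (copy to $x because $1 is read-only, then tr/// with the squash flag):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Inside each <!-- ... -->, collapse every run of hyphens to a single
# hyphen so no stray "--" pairs remain to confuse SGML-strict parsers.
sub fix_comments {
    my ($html) = @_;
    $html =~ s/(<!--)(.*?)(-->)/my $x = $2; $x =~ tr{-}{}s; "$1$x$3"/ges;
    return $html;
}

print fix_comments("<!------ really ----bad ---- comment ------>\n");
```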

    Dynamic Well Poisoner
    on Oct 25, 2000 at 21:50 UTC
    by wombat
    "Poisoning the well" is a term sometimes used to mean feeding a bot false information, either to make it crash, or to devalue the rest of the data that it is supposed to be mining. We all hate spambots. They're the ones that scour webpages, looking for little <a href=mailto:> tags. They then take down your email address and some spammer then sells it on a "100,000,000 VALID WORKING ADDRESSES" CD-Rom to other unpleasent people. You can all help stop these evil robots, by putting up a Poisoned Well on your homepage.

    Basically it's a list of randomly generated email addresses that look valid (cause they're made of dictionary words), have valid DNS entries, but then bounce when the spammers try to send mail to them. This program has a twofold purpose. Older Poison Wells just generate [a-z]{random}@[a-z]{random}.[com org net]. This one takes domains that you specify from a list. Thus, you can put domains you don't like in the list, and then cause THEM to have the burden of sending back lots of bounced messages. As stated before, any spambot worth its silicon would check to see if the address was a valid domain. This would circumvent that.
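    The generator part can be sketched in a few lines. The word and domain lists here are placeholders (the real point, as described above, is to load the domain list from a file of domains you'd like to see eat the bounces):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Dictionary-word addresses at domains taken from a list -- placeholders.
my @words   = qw(velvet marmot quince sprocket gossamer);
my @domains = qw(;

sub poisoned_address {
    my $user = join '.', map { $words[rand @words] } 1 .. 2;
    return "$user\@" . $domains[rand @domains];
}

print poisoned_address(), "\n" for 1 .. 5;
```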

    For my list of evil domains, I've put the six top generators of banner ads. Especially the ones that are suspected of selling personal data. >:-D

    Some of the amusing email addys that this generated were
    Velar Central File Submitter
    on Sep 30, 2000 at 05:08 UTC
    by strredwolf
    The main code I use to send artwork to the Vixen Controled Library. Try it out as an example to code a HTTP POST w/multipart client. Oh, it does do authorization, but I've removed my password (so nyah!)
    Serving Images from Databases
    on Sep 23, 2000 at 00:51 UTC
    by Ovid
    If our Web server receives an image request and the image is not found, the Web server calls a cgi script to serve the image. The script analyzes the request and serves the appropriate image from MS SQL 7.0 Server.

    I have posted this code to show an extensible method of serving images in the event that any monks are called upon to write a similar script.

    This can easily be modified to allow calls directly from a CGI script in the following format:

    <img src="/cgi-bin/images.cgi?image=category1234" height=40 width=40 alt="some text">

    Then, you'd add the following to your code:

    use CGI; my $query = new CGI;
    Then, substitute the following line:
    $ENV{'QUERY_STRING'} =~ m!([a-zA-Z]+)(\d+)\.($types)$!;
    With this line:
    $query->param('image') =~ m!([a-zA-Z]+)(\d+)\.($types)$!;
    Disable animated GIFs and blinking text in Netscape
    on Sep 20, 2000 at 22:16 UTC
    by Anonymous Monk
    Perl script that disables/enables blinking text and/or animated GIFs in Netscape. I've tried this on Netscape for Intel Linux, Digital Unix, and AIX, and I've been told it also works on Netscape for M$ Windoze. Animated GIFs will still run through the cycle one time, stopping on the last frame. As you can see, all it actually does is call perl again to replace a couple of strings inside the Netscape binary to cripple the blinking text and animated GIF support.
    Home Page Manager
    on Sep 17, 2000 at 22:17 UTC
    by Ovid
    So, you want to read in the morning before work, at night when you get home, and every Wednesday at 5:00 PM your favorite Web site issues an update. Rather than scramble for your bookmarks or search through links on your links bar, here's a utility which stores homepages based upon time. Set this script as your home page and enjoy! If you don't have a homepage set up for a particular day/time, it sends you to the default home page that you have specified.
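    The time-slot lookup might be sketched like this; the schedule keys, URLs, and function are all invented examples, not the script's actual data format:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Map (weekday, hour) slots to homepages, with a default fallback.
# Entries here are illustrative.
my %schedule = (
    'Wed:17' => '',
    'Mon:07' => '',
);
my $default = '';

sub homepage_for {
    my ($wday, $hour) = @_;    # 0 = Sunday, 24-hour clock
    my @days = qw(Sun Mon Tue Wed Thu Fri Sat);
    return $schedule{ sprintf '%s:%02d', $days[$wday], $hour } || $default;
}

# As a CGI homepage, it would redirect:
# print "Location: ", homepage_for((localtime)[6, 2]), "\n\n";
```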

    It's definitely beta quality, so any suggestions would be appreciated.

    on Sep 12, 2000 at 08:42 UTC
    by araqnid
    Given a set of photo images (e.g. JPEG) and an XML file with descriptions, generate mini-images and HTML pages to navigate through them. Allows photos to be arranged hierarchically. Allows for terms in descriptions to be indexed and a cross-references page to be built. Unfortunately, the HTML templates are hardcoded atm :(
    TLD generator
    on Jul 17, 2000 at 04:56 UTC
    by j.a.p.h.
    Ever get a bit ticked off about the lack of TLDs/domains available? I wrote this little program to print out every possible 3-6 letter TLD. If ICANN would approve them all, I doubt it would be a problem anymore. But there's little to no chance of that happening (at least any time soon).
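    One way to enumerate fixed-length letter strings (a sketch; the original's method may differ) is glob's csh-style brace expansion. Three letters alone gives 26**3 = 17,576 candidate TLDs, and six letters would be over 300 million, so only the 3-letter run is shown here:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One brace group per position; glob expands the cross product.
my $letter = '{a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z}';
print ".$_\n" for glob($letter x 3);
```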
    Slashdot Headline Grabber for *nix
    on Jul 11, 2000 at 03:02 UTC
    by czyrda

    Gets the Slashdot headlines every 30 minutes.

    Daily Comix
    on Jul 07, 2000 at 00:57 UTC
    by Arjen
    I created this lil script a couple of months ago to have all the comix I read in the morning on one page. I also wanted to be able to choose what to see, and to be able to add comix without having to modify the script itself.

    The plugins are simple config files with a piece of code as a value which gets eval'ed. The script then puts them all in a nice overview on 1 single page.

    An example plugin (UserFriendly) is included in the code section.
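    The plugin mechanism can be sketched like this (the plugin name, URL pattern, and code string are all hypothetical stand-ins for whatever a real plugin file contains):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A hypothetical plugin entry: a name plus a string of Perl code that,
# when eval'ed, returns the URL of today's strip.
my %plugin = (
    name => 'UserFriendly',
    code => q{
        my @t = localtime;
        sprintf 'http://example.com/strips/%04d%02d%02d.gif',
            $t[5] + 1900, $t[4] + 1, $t[3];
    },
);

my $url = eval $plugin{code};
die "plugin '$plugin{name}' failed: $@" if $@;
print "$plugin{name}: $url\n";
```

    Because the config value is handed straight to eval, plugins can compute date-dependent URLs without the core script knowing anything about them.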


    Apache log splitter/compressor
    on Jun 12, 2000 at 18:06 UTC
    by lhoward
    This program is designed to read from an apache logfile pipe. It automatically compresses the data and splits it into files, each containing one day's worth of data. The directory to write the log files to should be set in the environment variable LOG_DIR. Log files are named access_log_YYYY_MM_DD.N.gz. Here is how I have apache configured to call this program.

    CustomLog |/path/to/ combined
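    A minimal sketch of the pipe-fed splitter, assuming the filename scheme above (the ".0" sequence number and the rotate-on-date-change handling are simplified here):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);

my $dir = $ENV{LOG_DIR} || '.';
my ($current_day, $fh);

# Apache feeds access-log lines to us on stdin via the CustomLog pipe.
while (my $line = <STDIN>) {
    my $day = strftime('%Y_%m_%d', localtime);
    if (!defined $current_day or $day ne $current_day) {
        close $fh if $fh;
        # Append through gzip; concatenated gzip streams are valid.
        open $fh, '|-', "gzip >> $dir/access_log_$day.0.gz"
            or die "can't start gzip: $!";
        $current_day = $day;
    }
    print $fh $line;
}
close $fh if $fh;
```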

    Apache Timeout module
    on Jun 09, 2000 at 06:00 UTC
    by jjhorner

    I wrote this mod_perl handler to give easy timeouts to restricted web pages. It is very elementary, but useful. Please give me some comments at my email address above if you wish.

    It requires a directory "times" under your /usr/local/apache/conf/ directory, owned by the user:group running the Apache child processes, for your timestamp files.

    Usage: See in-code docs.

    Update v0.21

    • I added better docs and fixed a bug or two.
    • I also moved most of the config info into the httpd.conf file, leaving only the user-configurable stuff in .htaccess.
    • Added concept of Minimum Time Out and Mode.

    Update v0.20

    • I sped up the routine that checks the time since the last visit. It now stats a file, grabs the number of seconds since the last modification, and uses that for $last_time. It then opens the time file read-write to update the modification time.
    • I added option to put the DEBUG mode into the .htaccess file.

    TO DO:

    • Write documentation
    • Make into format usable on CPAN
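    The v0.20 timestamp trick described above can be sketched as follows ($stamp stands in for a per-user file under the conf/times directory; the timeout comparison itself is elided):

```perl
use strict;
use warnings;

# Hypothetical per-user timestamp file, as described above.
my $stamp = '/usr/local/apache/conf/times/someuser';

# Create the file on first visit.
open my $fh, '>>', $stamp or die "can't open $stamp: $!";
close $fh;

my $last_time = time - (stat $stamp)[9];  # seconds since last visit
# ...compare $last_time against the configured timeout here...
utime undef, undef, $stamp;               # "touch": set mtime to now
```

    Using the filesystem's mtime avoids reading and parsing a timestamp out of the file's contents, which is the speedup the changelog mentions.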
    Dark Theme for /. through Perl
    on May 04, 2000 at 23:07 UTC
    by PipTigger
    Here's a little script I wrote for myself since I like light text on dark backgrounds (thanks again for the nice PerlMonks theme Vroom!) and /. doesn't have one... I know it's pretty sucky and could be a lot simpler. If you can make it better, please email it to me ( as I use it every day now. It doesn't work yet for the ask/. section but when I find some time, I'll add that too. I hope someone else finds this useful. TTFN & Shalom.

    p.s. I tried to submit this to the CUFP section but it didn't work so I thought I'd try here before giving up. Please put the script on your own server and change the $this to reflect your locale. Thanks!
    webster client
    on May 04, 2000 at 21:07 UTC
    by gregorovius
    This is a simple web client that will bring a word definition from the UCSD web dictionary into your X terminal.

    If people are interested I also have a client-server version, with a much simpler client, that allows this program to be deployed on a large number of machines without having to install the LWP and HTML packages it uses on each of them.
    Usage: % webster_client word
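    The core of such a client can be sketched with LWP (the URL below is a placeholder, not the real UCSD dictionary address, and the tag-stripping is deliberately crude):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

my $word = shift or die "Usage: $0 word\n";
my $ua   = LWP::UserAgent->new(timeout => 10);
my $res  = $ua->get("http://example.com/webster?word=$word");
die $res->status_line, "\n" unless $res->is_success;

# Crude markup strip so the definition reads cleanly in a terminal.
(my $text = $res->decoded_content) =~ s/<[^>]*>//g;
print $text;
```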
    Resolve addresses in web access logs
    on Apr 29, 2000 at 01:20 UTC
    by ZZamboni
    Where I work, apache is configured not to resolve IP addresses into names for the access logs. To be able to properly process the access logs for my pages with a log-processing program (I use webalizer), I wrote the following script to resolve the IP addresses. Note that the local domain name needs to be set; it is appended when a name resolves to a local machine name only, without a domain. This happens sometimes when the abbreviated name comes before the full name in /etc/hosts, for example.

    Updated: as suggested by kudra, added a comment to the code about double-checking the name obtained, and why we don't do it in this case.
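    The resolve-and-append logic can be sketched like this ($localdomain is a placeholder; the per-address cache is an obvious optimization the real script may or may not share):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket;

my $localdomain = 'example.com';
my %cache;                       # one DNS lookup per distinct address

while (my $line = <>) {
    # Replace the leading IP of each access-log line with its hostname.
    $line =~ s{^(\d+\.\d+\.\d+\.\d+)}{resolve($1)}e;
    print $line;
}

sub resolve {
    my $ip = shift;
    return $cache{$ip} if exists $cache{$ip};
    my $name = gethostbyaddr(inet_aton($ip), AF_INET) || $ip;
    # Bare machine name (no dot): append the local domain.
    $name .= ".$localdomain" if $name ne $ip and $name !~ /\./;
    return $cache{$ip} = $name;
}
```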

    RSS Headline Sucker
    on Apr 27, 2000 at 22:30 UTC
    by radixzer0

    Make your own Portal! Amaze your friends! Confuse your enemies!

    This quick hack takes a list of RSS feeds and pulls the links into a local database. If you don't know what RSS is, it's the cool XML headline standard used by My Netscape. Lots of sites provide RSS feeds that you can use to make headline links on your own site (like, ZDNet, etc.). I used this to make a little headline scroller using DHTML on our company Intranet. This script works best with a scheduler (like cron) to update on a periodic basis.

    For a comprehensive list of available feeds, take a look at

    Comments/improvements more than welcome ;)
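    The fetch-and-extract loop can be sketched like this, with a crude regex in place of a real XML parser and a print standing in for the database insert (the feed URL is a placeholder):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple qw(get);

my @feeds = ('http://example.com/headlines.rss');

for my $url (@feeds) {
    my $xml = get($url) or next;
    # Pull title/link pairs out of each <item>.  A real script should
    # use an XML parser rather than regexes.
    while ($xml =~ m{<item>.*?<title>(.*?)</title>.*?<link>(.*?)</link>}sg) {
        # ...insert ($1, $2) into the local database here...
        print "$1\t$2\n";
    }
}
```

    Run from cron, a loop like this keeps the local headline table fresh without any manual intervention.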

    Resume logger
    on Apr 07, 2000 at 21:04 UTC
    by turnstep
    Small script to check if anyone has looked at my online resume, and if so, mails the information to me. Run as a periodic cronjob; exits silently if no new matches are found.
    Poor Man's Web Logger
    on Apr 05, 2000 at 21:56 UTC
    by comatose

    Since I don't have access to my ISP's server logs but still want to have some idea of who's visiting my website, I developed this little script that can integrate easily with any existing site. All you need is CGI access of some type.

    To install, save the script so it is executable. Also, you'll need to set the $home, $logfile, $ips (IP addresses you want to ignore), and %entries (labels and expressions to match) variables. Be sure to "touch" the logfile and make it writable by the web server's user.

    Pick an image, preferably a small one, on your page.

    <img src="/cgi-bin/showpic/path_to_pic_in_document_root/pic.jpg">

    Each time someone accesses the page with that image, an entry is made in the log with the date, time, and either hostname or IP address. Here's an example of the output. Enjoy.

    Wed Apr 05 13:08:26 2000 resume
    Wed Apr 05 13:29:29 2000
    Thu Apr 06 01:31:47 2000
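    The beacon mechanism can be sketched as below. All paths are placeholders, and the %entries label-matching from the description is reduced to a single hard-coded "resume" check:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $home    = '/home/user/public_html';
my $logfile = '/home/user/visits.log';

# Untaint the requested image path.  (A real script should also
# reject ".." to prevent directory traversal.)
my ($pic) = ($ENV{PATH_INFO} || '') =~ m!^(/[\w./-]+)$!;
my $who   = $ENV{REMOTE_HOST} || $ENV{REMOTE_ADDR} || 'unknown';
my $label = ($ENV{HTTP_REFERER} || '') =~ /resume/ ? 'resume' : '';

# Log the visit: date, visitor, and which page matched.
open my $log, '>>', $logfile or die "log: $!";
print $log scalar(localtime), " $who $label\n";
close $log;

# Then serve the image so the page renders normally.
print "Content-type: image/jpeg\n\n";
if ($pic and open my $img, '<', "$home$pic") {
    binmode $img;
    print while <$img>;
}
```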
    Slashdot Headline Grabber for Win32
    on May 03, 2000 at 04:33 UTC
    by httptech
    A Win32 GUI program to download Slashdot headlines every 30 minutes and display them in a small window.
    on Oct 04, 1999 at 22:34 UTC
    by gods
    Server Monitor via Web
    on May 23, 2000 at 03:02 UTC
    by BigJoe
    This is a quick script to keep track of the load on the server. Three files are needed: the HTML file used to access it, the history text file, and the script itself. The HTML display is very generic.
    OPPS - Grab weather from yahoo (2)
    on May 24, 2000 at 10:18 UTC
    by scribe
    Grabs weather from yahoo. One of my first perl scripts
    and my first use of IO::Socket. It could use some work
    but I only use it as part of an efnet bot.
    Seti stats
    on May 24, 2000 at 14:02 UTC
    by orthanc
    An Anon Monk sent in something similar to SOPW so I modified the code a bit and here it is. Could do with a bit more error checking but it works.
    on May 25, 2000 at 13:55 UTC
    by ask
    Neat little program to look up and display quotes from Includes robust caching so it's fast enough to run from your .bash_profile. Does not even load the http libs when displaying cached information. Documentation included in the file in POD format.
    Stock Quotes
    on May 24, 2000 at 20:39 UTC
    by scribe
    Retrieves basic stock quotes from yahoo.
    Uses IO::Socket to connect to webserver.
    Can only retrieve one quote at a time now.

    Sample output:
    (( ELON 51.625 -3.5625 ) ( 49 -- 56.5 ) ( Vol: 1646200 ))
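    Talking HTTP by hand over IO::Socket, as the description mentions, looks roughly like this (the host, path, and query parameter are placeholders, and the response parsing is elided):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;

my $sym  = shift || 'ELON';
my $sock = IO::Socket::INET->new(
    PeerAddr => 'example.com',
    PeerPort => 80,
    Proto    => 'tcp',
) or die "connect: $!";

# Hand-rolled HTTP/1.0 request; the server closes when done.
print $sock "GET /quote?s=$sym HTTP/1.0\r\n",
            "Host: example.com\r\n\r\n";
my $page = do { local $/; <$sock> };
# ...parse price, change, day range and volume out of $page here...
```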
    Powerball Frequency Analyzer
    on May 04, 2000 at 18:56 UTC
    by chromatic
    This little darling grabs the lottery numbers from Powerball's web site and runs a quick analysis of the frequency of the picks. It also makes recommendations for the next drawing.

    I run it from a cron script every Thursday and Sunday morning.
    on Apr 27, 2000 at 17:58 UTC
    by ergowolf
    This program checks for books from fatbrain, but it can easily be modified to search any site's page.
    wsproxy: Perl Web Proxy
    on May 20, 2000 at 22:58 UTC
    by strredwolf
    wsproxy is a web proxy in perl! It has filtering and caching abilities, but you need to tell it what to ban or save. It works very well with netpipes, or even inetd!

    To use it, try faucet 8080 -io

    v0.4 and 0.5 added banning of Javascript and HTML pages.
    v0.6 had a file locking bug nuked and preliminary tempcache support.
    v0.7 now has an embedded gif.
    v0.8 now has mostly functional (self-cleaning) tempcache support.
    strip grabber
    on Mar 15, 2000 at 01:08 UTC
    by billyjoeray
    This script can be run at regular intervals to download the latest strips from I'm putting this here because it's a little big for the front page, but for whoever reads this, I'd like some tips on how to clean up my code, and some wisdom on how to write code easily with 'use strict;': I always seem to need to use variables in a global sense.
    Full Size version forwarder :)
    on Mar 30, 2000 at 11:46 UTC
    by ash
    You hate having to load the small version of userfriendly, only to then click the link to the full-size one? Well, this script forwards you directly to today's full-size userfriendly strip :) But since userfriendly doesn't update before 9 a.m. in Norway, I've inserted this little line: ($a[2]<9)&&$a[3]--; so that you'll be forwarded to yesterday's strip if it's not up yet. Change the 9 to suit your timezone :)