
Web Stuff

by vroom (Pope)
on May 25, 2000 at 20:53 UTC ( #14809=sourcecodesection )
on Sep 05, 2002 at 17:56 UTC
by LTjake
This is a module to help access XML-RPC methods provided by the covers project. It needs to have Frontier::Client (and its dependencies) installed. HUGE thanks to Petruchio for fixes, rewrites and guidance.

The covers project is "...a database of cover songs (songs performed by an artist other than the original performer) with the intention of creating cover 'chains.'"
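For readers who haven't used Frontier::Client, a minimal XML-RPC call looks roughly like this; the endpoint URL, method name and result fields below are invented for illustration, not the covers project's actual interface:

#!/usr/bin/perl
use strict;
use warnings;
use Frontier::Client;

# Hypothetical endpoint and method, for illustration only.
my $server = Frontier::Client->new( url => 'http://example.com/RPC2' );

# Call a remote method and walk the returned structure.
my $covers = $server->call( 'covers.getCovers', 'Some Artist', 'Some Song' );
foreach my $cover ( @{ $covers } ) {
    print "$cover->{artist} covered $cover->{title}\n";
}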
Marge - the Interactive Marginalia Processor
on Jul 08, 2002 at 14:34 UTC
by alfimp
For letting people write in the margins of your pages. Doubles as a simplistic Wiki.
on Jun 25, 2002 at 04:11 UTC
by elusion
This is a script I wrote to grab sites from the web to stick on my hard drive and/or handheld device. It follows links to a certain depth, can download images, follow offsite links, and remove some unwanted html.
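As a rough sketch of the general technique (not elusion's code), fetching a page and recursively following its links to a fixed depth with LWP::UserAgent and HTML::LinkExtor might look like this:

#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTML::LinkExtor;

my $ua = LWP::UserAgent->new( agent => 'sitegrab/0.1' );
my %seen;

sub grab {
    my ( $url, $depth ) = @_;
    return if $depth < 0 or $seen{$url}++;
    my $res = $ua->get($url);
    return unless $res->is_success;
    print "fetched $url\n";          # a real grabber would save the content here
    return unless $res->content_type eq 'text/html';
    my $parser = HTML::LinkExtor->new( undef, $url );   # base URL makes links absolute
    $parser->parse( $res->decoded_content );
    for my $link ( $parser->links ) {
        my ( $tag, %attr ) = @$link;
        grab( $attr{href}, $depth - 1 ) if $tag eq 'a' and $attr{href};
    }
}

grab( $ARGV[0], 2 );    # start URL and depth from the command line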

Code reviews/critiques are welcomed and requested.

Updated: Jun. 28, 2002

  • Squashed 2 bugs
  • UserAgent support
  • Proxy support
  • Progress report
(code) HTTP connectivity log, or nail up dial connection
on Apr 23, 2002 at 01:08 UTC
by ybiC

Fetches a random web page from a list of URLs.   Waits a random length of time between fetches.

Written with two purposes in mind:
  * log internet connectivity
  * nail up analog dial connection
Mostly the latter, so my ISP won't drop my connection during looong downloads.

There are a number of command-line options and arguments.   Perhaps the more interesting are:

  • --verbose=1 which reports only the delta of success/fail, à la zeno from Internet Connection Uptime Logger
  • --errDelay=1 to cause almost-immediate retry on fetch failure
  • --daemonize which backgrounds and disconnects so vty can be closed
  • --logging to print progress, options, and versions to file

From a Perlish perspective, this has been an exercise in rand, IO::Tee and Proc::Daemon, plus Getopt::Long bobs that I ended up not using.   I started out "use"ing Proc::Daemon, but had to copy+tweak subs from it to allow logging-to-file while daemonized.
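Stripped of the daemonizing, logging and option handling, the core loop amounts to something like this sketch (the URL list and delay are placeholders, not ybiC's code):

#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple qw(get);

# Placeholder URL list; the real script reads options with Getopt::Long.
my @urls      = ( 'http://www.example.com/', 'http://www.example.org/' );
my $max_delay = 300;    # seconds

while (1) {
    my $url = $urls[ rand @urls ];             # pick a random page
    my $ok  = defined get($url) ? 'ok' : 'FAIL';
    print scalar localtime, "  $ok  $url\n";
    sleep 1 + int rand $max_delay;             # wait a random time between fetches
}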

Thanks to tye, fletch, tachyon, lemming, chmrr and others for tips and suggestions.   And as always, constructive criticism is welcome and invited.

on Apr 09, 2002 at 00:53 UTC
by lshatzer
This detects whether its input is a URL, a file, or raw HTML, passes it to HTML::TokeParser, and returns the HTML::TokeParser object. (This was my first venture into inheritance.)
Updated: Changed a few things from Amoe's suggestions.
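HTML::TokeParser itself accepts a filename, a filehandle, or a reference to a string of HTML, so the detection layer can be thin; a sketch of the idea (not lshatzer's module):

#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser;
use LWP::Simple qw(get);

# Return an HTML::TokeParser for a URL, a file on disk, or a raw HTML string.
sub parser_for {
    my ($source) = @_;
    if ( $source =~ m{^https?://}i ) {
        my $html = get($source);
        die "could not fetch $source\n" unless defined $html;
        return HTML::TokeParser->new( \$html );
    }
    elsif ( -f $source ) {
        return HTML::TokeParser->new($source);       # filename
    }
    else {
        return HTML::TokeParser->new( \$source );    # assume raw HTML
    }
}

my $p = parser_for( shift @ARGV );
while ( my $tag = $p->get_tag('a') ) {
    print $tag->[1]{href}, "\n" if $tag->[1]{href};
}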
Post a journal entry to
on Apr 06, 2002 at 08:14 UTC
by rjray

This script is derived from one of the examples I developed for the chapter of my book that introduces the SOAP::Lite module.

It posts a journal entry to the journal system at use.perl, without using the browser interface and without faking a POST to the CGI interface. Instead, it uses the still-alpha SOAP interface that use.perl currently offers.

All elements of the SOAP request may be specified on the command-line, including entry subject, whether to enable comments, uid and passwd, and the option of reading the entry from a file (the default is to read from STDIN, filter-like). It also will attempt to set the authentication credentials from your Netscape cookie file. But as I said, those can be set by the command-line (and the cmd-line switch overrides the Netscape cookie).

In a first response to this code posting, I will post an XEmacs defun (which should also work under GNU/Emacs) that I use to post from within XEmacs by selecting the text I want to upload into the region. More on that in the follow-up node.


Web monitor
on Apr 05, 2002 at 21:12 UTC
by zeroquo
A Win32-based script that tries to connect to a URL, with or without validation; the second portion is the INI generator. Enjoy.
Muse - Personal Interlinked Encyclopedia Builder
on Apr 04, 2002 at 05:50 UTC
by Pedro Picasso
Do you like Everything2 and Perlmonks but can't seem to get the code to work on your machine because you're running Mandrake and not Debian or RedHat and while EveryDevel is a great company with a great content manager, their documentation leaves a lot to the imagination, and you honestly don't have time to sift through database application code when all you want is to have your own easily maintainable set of interlinked content? Well this is the script for you!

It's a simple CGI hack that imitates PerlMonks/Everything2's noding. You type a "musing" with links to other musings in brackets. It's great for keeping your own interlinked encyclopedia of personal notes.
on Mar 30, 2002 at 17:54 UTC
by Juerd
Because the popular gnuvd is broken, I made this quick hack to query the Van Dale website for dictionary lookups. It's a quick hack, so no production quality here ;) Oh, and please don't bother me with Getopt or HTML::Parser: I don't want to use Getopt because I don't like it, and can't use HTML::Parser because the site has a lot of broken HTML, and because regexes are easier (after all, it's a quick hack because I can't live without a Dutch dictionary).

This probably isn't of much use to foreigners :)

Update (200306081719+0200) - works with the site's HTML updates now.
image splitter
on Mar 30, 2002 at 13:55 UTC
by djw
Split has one purpose: crop 654x300 images into 6 equal-sized smaller images that span two rows and three columns. HTML output (one page per image) is optional (on by default).

I decided to do this after I talked to a buddy of mine, Chris, about a site he'd shown me.

We were looking at some of the very cool art in the photo albums and saw that some people cut up a single larger picture into 6 pieces so they could fit the entire thing into one page of the album (you will have to go check out the site to see what I mean). Chris was telling me that this process can take a long time, and I mentioned I could write something to automate it.

TaDA! Split was created.

This program was written specifically for the image gallery, but it could be expanded for your own use if you feel like it. Or maybe you just need a chunk of code from it for something you are doing.

Thought I'd post it anyhow.
Sample: my car!

Thanks, djw
Frankenpage: Build on-the-fly customizable web pages from a directory of files
on Mar 23, 2002 at 11:08 UTC
by S_Shrum
I made this script to deal with the hassle of having to maintain 4 different resumes for job recruiters. Split up a page into individual files and place them into a folder. Point the script at the folder (via the PATH parameter in the script call or by defining it as a default) and it will allow you to define a page with the files. Upon submitting the form, the script gets called again but this time will create the document from the parts you requested. The resulting URL can then be used as a link in your other web pages to display the frankenstein'ed page later (Get it? frankenpage...never mind). Acts as a pseudo SSI engine (I guess you could call it that). For more information, latest version, etc., you can view the script white paper here. It could probably do more but hey, I wrote it in like 30 minutes. "IT'S ALIVE!"
Journal into RSS Feed
on Mar 14, 2002 at 08:10 UTC
by rjray

The use Perl; site is starting to provide SOAP interface support to the journal system there. Much of the interface is still alpha, and I'm trying to help out pudge by testing what he has, and offering suggestions here and there as I run into things.

This is a simple demonstration/proof-of-concept script that uses SOAP::Lite and XML::RSS to turn the last 15 journal entries from a given user into a RSS 1.0 syndication channel.

Caveats: There is not yet an interface for turning a nickname into a user-ID, so you have to provide the user ID as the first (and only required) parameter to the script. You may provide the nickname as the second parameter. The titles and such are very simple, as this goes for minimal network traffic over feature-bloat. But it's mostly for illustrative purposes. As the interface fleshes out and becomes more stable, I expect to have a more functional version of this. Oh, and it writes to STDOUT, making it suitable as a filter to other XML/RSS processing apps.
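In outline, the script does something like the following sketch; the SOAP namespace, endpoint, method name and result fields are placeholders, since the real use.perl interface isn't documented here:

#!/usr/bin/perl
use strict;
use warnings;
use SOAP::Lite;
use XML::RSS;

my $uid  = shift or die "usage: $0 uid [nickname]\n";
my $nick = shift || "user $uid";

# Placeholder namespace and endpoint; the real service details are not shown here.
my $soap = SOAP::Lite
    ->uri('http://example.org/Journal')
    ->proxy('http://example.org/soap.pl');

# Assume the service returns a list of hashrefs with id/subject keys.
my $entries = $soap->get_entries( $uid, 15 )->result || [];

my $rss = XML::RSS->new( version => '1.0' );
$rss->channel(
    title       => "Journal of $nick",
    link        => "http://example.org/~$nick/journal/",
    description => "Last 15 journal entries for $nick",
);
$rss->add_item(
    title => $_->{subject},
    link  => "http://example.org/~$nick/journal/$_->{id}",
) for @$entries;

print $rss->as_string;    # write the RSS 1.0 document to STDOUT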

LWP::UserAgent subclass to make it follow redirects after POST (like Netscape)
on Feb 26, 2002 at 16:09 UTC
by gregorovius
A subclass of LWP::UserAgent that replicates Netscape's behavior on redirects after a POST request (i.e., it will follow POST redirects but will turn them into GETs before doing so). I believe Microsoft's IE behaves like this as well.

A lot of web applications rely on this non-standard behavior in browsers so I think it would be a good idea to integrate this to LWP. See Redirect after POST behavior in LWP::UserAgent differs from Netscape for reference.

Look for the XXX marker in the code to see where this code differs from the one in LWP::UserAgent.
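For comparison, here is a rough sketch of the same idea against the current LWP::UserAgent API (overriding redirect_ok on the prospective redirected request), not the patched internals this node describes; treat the details as an approximation:

package NetscapeLikeUA;
use strict;
use warnings;
use LWP::UserAgent;
use base 'LWP::UserAgent';

# Allow redirects to be followed after a POST, and turn the redirected
# request into a GET, roughly as old Netscape did.
sub redirect_ok {
    my ( $self, $new_request, $response ) = @_;
    if ( $response->request->method eq 'POST' ) {
        $new_request->method('GET');
        $new_request->content('');
        $new_request->remove_header( 'Content-Length', 'Content-Type' );
        return 1;
    }
    return $self->SUPER::redirect_ok( $new_request, $response );
}

package main;
my $ua  = NetscapeLikeUA->new;
my $res = $ua->post( 'http://example.com/form', { q => 'hello' } );
print $res->request->uri, "\n";    # final URI after any redirects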
on Feb 14, 2002 at 18:27 UTC
by jjhorner
For those of us who have to administer IIS installations, we find it backward that the mmc (THE IIS ADMIN TOOL!) can't set the DefaultLogonDomain property of the MSFTPSVC. Instead of using some loopy VB way to do it, I wrote a simple utility and use it now. Easy, straightforward and a small example of ADSI code.
on Feb 06, 2002 at 14:44 UTC
by shockme
After reading this thread, I found myself facing some downtime, so I decided to throw this together. I needed a small project with which I could begin learning, and this seemed a good candidate. It's not perfect and could use some tweaking here and there, but it scratches my itch.

Kudos to dws for his help when I inevitably became entangled. Sadly, I have yet to attain CGI-zen.

Modify .htaccess files
on Feb 04, 2002 at 22:38 UTC
by cjf
Uses Apache::Htpasswd to add, delete, or change the password of a user in an .htaccess file. User input is web-based using forms and includes an authorization check.
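For reference, the Apache::Htpasswd calls behind such a tool are brief; a minimal command-line sketch (password file and user names invented, not cjf's script):

#!/usr/bin/perl
use strict;
use warnings;
use Apache::Htpasswd;

# Hypothetical password file and users, for illustration.
my $ht = Apache::Htpasswd->new('/var/www/.htpasswd');

$ht->htpasswd( 'alice', 's3cret' );              # add a user
$ht->htpasswd( 'alice', 'newpass', 's3cret' );   # change a password (old one required)
$ht->htDelete('bob');                            # delete a user

print "alice ok\n" if $ht->htCheckPassword( 'alice', 'newpass' );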
CGI::Application::Session - A stateful extension to CGI::Application
on Jan 28, 2002 at 17:34 UTC
by rob_au
While writing some code based upon the CGI framework concepts discussed here, I put together this small segment of code which introduces very simple session data storage into CGI::Application-inherited classes. This module, very much alpha with respect to code development level, allows sessional data to be stored server-side through the use of Apache::Session::File with _session_id being stored within a client cookie.

Use of this module is very simple with the replacement of the use base qw/CGI::Application/; pragma with use base qw/CGI::Application::Session/;. All data to be stored and retrieved should be placed within the $self->{'__SESSION_OBJ'} hash. Additional new method parameters include SESSIONS_EXPIRY, SESSIONS_NAME and SESSIONS_PATH to set the respective parameters of the client-side ID cookie.

This code can be expanded and improved upon greatly, but it demonstrates a proof-of-concept for session utilisation within a state-based CGI engine.

on Jan 21, 2002 at 03:10 UTC
by Amoe

Replacement for the WWW::Search::Google module. I apologise for the scrappiness of the code, but at least it works.

Thanks crazyinsomniac and hacker.

Update 06/03/2002: Surprisingly, this module still works . After all the changes that Google has gone through since the time I first released it, I would expect it to have broken a long time ago, considering it parses HTML rather than some stable format. There's an interesting story at slashdot about googling via SOAP - maybe this is the future direction this module could take?

Script Stripper
on Dec 26, 2001 at 02:31 UTC
by OeufMayo

The following code can act as an HTML script filter, stripping Javascript, VBScript, JScript, PerlScript, etc. from the HTML code.

This weeds out all the "scriptable" events from the HTML 4.01 specifications and all the <script> elements.

It takes a filename as an argument or, if there's no argument, reads from STDIN. All output goes to STDOUT.

This piece of code should be pretty reliable, but I'd be interested to know if there's a flaw in this code.
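One way to attack the same job with HTML::Parser's version-3 event API, as a rough sketch rather than the posted code (note that attribute values are not re-escaped here):

#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

my $in_script = 0;

# Pass through everything except <script> elements and on* event attributes.
my $p = HTML::Parser->new(
    api_version => 3,
    start_h => [ sub {
        my ( $tag, $attr, $attrseq ) = @_;
        if ( $tag eq 'script' ) { $in_script = 1; return }
        my @keep = grep { $_ ne '/' and !/^on/i } @$attrseq;
        print "<$tag", ( map { qq{ $_="$attr->{$_}"} } @keep ), ">";
    }, 'tagname, attr, attrseq' ],
    end_h => [ sub {
        my ( $tag, $text ) = @_;
        if ( $tag eq 'script' ) { $in_script = 0; return }
        print $text;
    }, 'tagname, text' ],
    default_h => [ sub { print $_[0] unless $in_script }, 'text' ],
);

$p->parse_file( shift @ARGV || \*STDIN );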

Yahoo Currency Exchange Interface
on Dec 13, 2001 at 22:20 UTC
by cacharbe
A while back I had to re-write our intranet stock listing application that retrieved stock information from the Yahoo! finance site (I didn't write the original). It was a huge kludge, and took forever to render, as each stock symbol, 401k symbol, etc. was a separate LWP request, and each request had to be parsed for the necessary data. The application began breaking down when Yahoo went through some major HTML/display rewrites.

While working on the re-write I discovered something that turned 75 LWP requests into two: one for the indices information, and one for the list of symbols for which I needed data. The discovery was that Yahoo has a publicly facing application server that offers up all financial data in CSV rather than HTML. I could request ALL the symbols I needed at once, and the data came back as orderly, well-formatted, easy-to-parse CSV.

Needless to say, I saw a significant performance increase.

Yesterday I was asked if I could create a small web application that allowed users to get current currency exchange rates, and rather than re-inventing the wheel, I went back to the stock work I had done and found that the same server would do the exchange for you if given the correct URL. I have included an example of a correct URL, but I am leaving finding the correct country currency codes as an exercise for the user. They are readily available from a few different sources.

Usage might be something like:

use LWP::Simple;
use LWP::UserAgent;
use CGI;

$|++;
print "Content-type: text/html\n\n";
&Header();
&PrintForm();

my $q = new CGI;
if ($q->param("s") && $q->param("s") ne "DUH" && $q->param("t") ne "DUH") {
    my $sym = $q->param("s") . $q->param("t") . "=X";
    my ($factor, $date, $time) = (&GetFactor($sym))[2,3,4];
    $time =~ s/"//g;
    $date =~ s/"//g;
    print '<CENTER><p><b><font face="Arial" size="4">RESULTS:</font></b> ';
    printf("<b><font face=\"Arial\" color=\"#0000FF\" size=\"4\">%s %s = %s %s as of %s, %s</font></b></p>",
        &commify($q->param("a")), $q->param("s"),
        &commify($factor * $q->param("a")), $q->param("t"),
        $date, $time);
    print "</CENTER><P>";
}
&PrintFoot(\$q);


HTML Link Modifier
on Dec 23, 2001 at 21:45 UTC
by OeufMayo

As strange as it seems, I couldn't find code here that cleanly modifies HREF attributes in A start tags in an HTML page.

So here's one that I whipped up quickly to answer a question on fr.comp.lang.perl

It surely could be easily improved to include other links (<script>, <img src="...">, etc.), but you get the idea...

The only (slight) caveats are that the 'a' start tag is always lowercased and the order of the attributes is lost. But that should not matter at all.
Also, this code won't print 'empty' attributes correctly (though I can't think right now of any empty attributes that are legal with 'a').

To use this script, you have to modify the $new_link variable, and then call the script with the URL of the page to be modified. Every <a href="..."> will have the $new_link added at the start of the href, and the old URL will be properly escaped.

It is probably useless as is, but with a minimum of tweaking, you can easily do what you want.
Actually, it might be a good thing to turn this little script into a module where you would only have to do the URL munging, without worrying about the whole parsing stuff...

Scripted Actions upon Page Changes
on Dec 08, 2001 at 13:37 UTC
by rob_au
This code fragment was written with the intent to minimise some of my system administration overhead by providing a script framework that allows arbitrary scripting actions to be performed should a web page be modified or updated. The code uses LWP::UserAgent to fetch a web page and then, should the page have changed since the last execution of the script (as measured by the Last-Modified header or, if that is unavailable, an MD5 digest of the page contents), executes a script subroutine or method.

Independent subroutines can be specified for different URLs; in the single example provided, the subroutine virus_alert is executed should the Symantec web page have changed since the last execution of the script.
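The technique itself fits in a short sketch (URL, state file and action are placeholders, not rob_au's code):

#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use Digest::MD5 qw(md5_hex);

# Placeholder URL and state file for illustration.
my $url   = 'http://www.example.com/status.html';
my $state = '/tmp/pagewatch.state';

my $res = LWP::UserAgent->new->get($url);
die "fetch failed: ", $res->status_line, "\n" unless $res->is_success;

# Prefer the Last-Modified header; fall back to an MD5 digest of the content.
my $fingerprint = $res->last_modified || md5_hex( $res->content );

my $previous = '';
if ( open my $fh, '<', $state ) { chomp( $previous = <$fh> || '' ) }

if ( $fingerprint ne $previous ) {
    page_changed();                                  # the arbitrary action
    open my $fh, '>', $state or die "cannot write $state: $!\n";
    print $fh "$fingerprint\n";
}

sub page_changed { print "$url has changed\n" }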

Local::SiteRobot - a simple web crawling module
on Nov 24, 2001 at 17:09 UTC
by rob_au
Earlier this month, George_Sherston posted a node, where he submitted code for a site indexer and search engine - I took this code and decided to build upon it for my own site and in evaluating it and other options available, I found HTML::Index. This code offered the ability to create site indexes for both local and remote files (through the use of WWW::SimpleRobot by the same author) - This ability for indexing based upon URL was important to me as a great deal of content on the site is dynamic in nature. This was where my journey hit a stumbling block ... WWW::SimpleRobot didn't work!

So, I set about writing my own simplified robot code which had one and only one function - return a list of crawled URLs from a start URL address.

#!/usr/bin/perl -w
use Local::SiteRobot;
use strict;

my $robot = Local::SiteRobot->new(
    DEPTH        => 10,
    FOLLOW_REGEX => '^',
    URLS         => [ '' ],
);
my @pages = $robot->crawl;
print STDOUT $_, "\n" foreach @pages;

The code I feel is quite self explanatory - /msg me if you have any questions on usage.

yet another quiz script
on Oct 30, 2001 at 22:30 UTC
by mandog
There are probably better quizzes out there... merlyn has one, and TStanley has another. This one differs in that it uses HTML::Template and puts a nice wrong.gif next to questions that are answered wrong.
command-line LiveJournal client
on Oct 20, 2001 at 15:43 UTC
by Amoe
Very simple LiveJournal console client. You need to make a file called 'ljer.rc' in the same dir as the script, containing the following lines:

user: my_username
password: my_password

No command line options required. Yes, I am aware that there are other Perl LJ clients. The problem was, one was appallingly-coded (lots of manually parsed command-line options, hundreds of warnings, no strict) and the other was using Socket (which I dislike - and not even IO::Socket) when LWP is practically built for the task, and it was an 800-line behemoth. This is just a quick script for the impatient. Feel free to cuss as you feel appropriate.
xml-rpc update notifier
on Oct 18, 2001 at 21:22 UTC
by benhammersley
A command-line Perl tool that uses XML-RPC to announce that your blog has been updated. Fun from both the command line and cron. It uses command-line options, so invoke it like this: perl --title=BLOG_TITLE --url=BLOG_URL
Dynamically Generate PDF's On The Fly
on Oct 10, 2001 at 17:00 UTC
by LostS
Recently I have had the joy of dynamically generating a PDF on the fly. A lot of people suggested putting the data in a text file and then putting it into PDF format. However, I needed the ability to add graphics and also to add colors. So I did some research and found a nice little module called PDF::Create. The sad part is that most of the developers who made this module have pretty much stopped working on it. What they do have works great... except for the adding of GIFs. JPEGs work great, but not GIFs. So here is my code I used to generate my PDF on the fly.

I contacted the creator of the PDF::Create module about the errors I was having. He looked at the code, found the problem, fixed it, and sent me the updated code... So below my code is the updated version. :)
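For orientation, the PDF::Create calls involved look roughly like this sketch, based on the module's documented interface rather than LostS's script (file names and text are invented):

#!/usr/bin/perl
use strict;
use warnings;
use PDF::Create;

my $pdf = PDF::Create->new(
    filename => 'report.pdf',
    Author   => 'example',
    Title    => 'Generated on the fly',
);

# Letter-sized media box, one page, one built-in font.
my $root = $pdf->new_page( 'MediaBox' => [ 0, 0, 612, 792 ] );
my $page = $root->new_page;
my $font = $pdf->font( 'BaseFont' => 'Helvetica' );

$page->stringc( $font, 24, 306, 700, 'Hello from PDF::Create' );
$page->line( 72, 680, 540, 680 );

# JPEG images work; GIF support was the problem area mentioned above.
# See the PDF::Create docs for $pdf->image(...) and $page->image(...).

$pdf->close;    # writes report.pdf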
Link Checker
on Oct 10, 2001 at 10:47 UTC
by tachyon

This script is a website link checking tool. It extracts and checks *all* links for validity including anchors, http, ftp, mailto and image links.

The script performs a recursive search, breadth-first, to a user-defined depth. External links are checked for validity but are not followed, for obvious reasons - we don't want to check the whole web.

Broken anchors and links are reported along with the server error. All email addresses harvested are checked for RFC 822 compliance and optionally against an MX or A DNS listing.

More details in the POD.

topwebdiff - analyse the output of topweb
on Sep 14, 2001 at 12:26 UTC
by grinder
To make the best use of topweb snapshots, the idea is to generate the files day by day, and then run topwebdiff to pinpoint the ranking changes.

See also topweb - Squid access.log analyser.
topweb - Squid access.log analyser
on Sep 14, 2001 at 12:19 UTC
by grinder
I've had a look a number of analysis tools for Squid access logs, but I didn't find anything simple that met my needs -- I just wanted to know how much direct web traffic was pulled down from what sites.

See also topwebdiff - analyse the output of topweb.
Currency Exchange Grabber
on Aug 29, 2001 at 01:40 UTC
by bladx
Taken from a POD snippet within the code:

This program's purpose is simply to let anyone check the exchange rates for different types of money, such as from other countries. Say one was going on a trip to Japan, and they currently live in the United States. They would need to find out how much money they should bring in order to have a good amount, and there isn't an easier way (almost) than to enter the amount you want converted using CEG and tell it to convert from one type of money to the other country's money.
Cheesy Webring
on Aug 20, 2001 at 20:26 UTC
by OeufMayo

Don't blame me, ar0n had the idea first.

But nonetheless, it's a fully functional webring.
You too can make your own and impress your friends by showing them how many people share your love of Cheesy things.

Big thanks to virtualsue for hosting the Cheese Ring! (see node below)

update: Eradicated a couple of bugs that prevented the code from compiling. oops.

update - Thu Aug 23 07:47:26 2001 GMT: The truncate filehandle was not right, but virtualsue spotted it!

update - Sun Aug 26 13:32:41 2001 GMT: Fixed the check for, thanks crazyinsomniac!

Sat Sep 22 21:35:45 UTC 2001: Fixed the encoding of characters other than 0-9a-zA-Z.

On-demand single-pixel GIFs
on Aug 19, 2001 at 10:17 UTC
by dws
A short CGI script for generating a single pixel GIF of a desired color. Useful, for example, when generating HTML that embeds color-coded, image-based bar charts. Ordinarily, using color in this way requires existing GIFs for all colors used. This script removes the need to make all of those GIFs by hand, allowing one to experiment freely.
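The whole trick fits in a few lines of GD; a sketch assuming a GD build with GIF support (dws's actual script may differ, and the color parameter name is invented):

#!/usr/bin/perl
use strict;
use warnings;
use CGI qw(param);
use GD;

# Expect a colour like "ff0000" in the query string, e.g. dot.cgi?color=ff0000
my $color = param('color') || '000000';
my ( $r, $g, $b ) = map { hex } $color =~ /^([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i
    or die "bad color\n";

my $img = GD::Image->new( 1, 1 );
$img->colorAllocate( $r, $g, $b );      # first allocated colour fills the image

print "Content-type: image/gif\n\n";
binmode STDOUT;
print $img->gif;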
IIS Restart
on Jul 13, 2001 at 21:04 UTC
by trell
We needed to automatically restart IIS when it got into a state where it would not respond to requests. This code uses the following:

  • a Unix command from MKS Toolkit (there are others out there), used to kill the PID when net stop won't work
  • a module (from CPAN)
  • a module written from scratch (feel free to upgrade and let me know of improvements)
  • ActiveState Perl 5.6.1

I have this broken into three scripts: one is the .pm, the second is the actual restart script, and the last is a script that checks the socket for content, then calls the restart script if needed.
Agent00013's URL Checking Spider
on Jul 11, 2001 at 03:00 UTC
by agent00013
This script will spider a domain checking all URLs and outputting status and overall linkage statistics to a file. A number of settings can be modified such as the ability to strip anchors and queries so that dynamic links are ignored.

(code) Directory of HTML to PostScript and PDF
on May 30, 2001 at 17:02 UTC
by ybiC
Create PostScript and PDF versions of all HTML files in given directory.   Ignore files listed in @excludes.

Fetches HTML files by URL instead of file so html2ps will process images.   Add <!--NewPage--> to HTML as needed for html2ps to process.   Links *not* converted from HTML to PDF   8^(

Requires external libraries html2ps and gs-aladdin, but no perl modules.

From a Perlish perspective, this has been a minor exercise in (open|read)dir, grep, and difference of lists.   As always, comments and critique gladly welcomed.

Latest update(s):     2001-05-30     22:25

  • Thanks to jeroenes for noticing funky indents and for suggesting cleaner exclusions syntax.
  • Implement and test simpler exclusions syntax.
  • Eliminate redundant code with PrintLocs()
  • Add explanatory comments for html2ps and ps2pdf syntax.
on May 15, 2001 at 00:28 UTC
by JSchmitz
Simple webserver load monitor. Nothing too fancy; probably could be improved on....
Cam Check
on May 04, 2001 at 19:45 UTC
by djw
This code helps me manage my cam.

I have a cam at work and a cam at home - they aren't on at the same time; I have them set on a schedule. I could use just one cam image for both cams but that isn't any fun - I also wanted my page to report to the visitor where the cam image was coming from.

Anyhow, this script checks my two cam files for creation date (File::Stat) to see which is newer. It also checks to see if I have updated my cam image less than 10 minutes ago; if not, it's offline.

For example, let's say the newest file is work.jpg, and its creation time is less than 10 minutes ago - the script changes 3 text files in my web directory (used in Server-Side Includes) to reflect the fact that I'm at work. If the newest file (in this case work.jpg) is older than 10 minutes, then the cam is offline, and it reports that using the text files and SSI.

I have this script run on my linux box on a schedule using cron.
Opera Bookmarks to XBEL
on Apr 24, 2001 at 00:31 UTC
by OeufMayo
The python guys (they can't be all bad, after all :) have created a DTD called XML Bookmark Exchange Language which "is a rich interchange format for "bookmark" data as used by most Internet browsers." As I wanted to play with the XML::Writer module and clean up my bookmark files, I ended up whipping up this code. Have fun!

Update 2001-11-13: Complete rewrite of the adr2xbel script. It follows a bit more closely the python script found in the PyXML library demos.
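The XML::Writer half of the job is small; a sketch that emits a minimal XBEL document from invented bookmark data (the real script parses Opera's bookmark file first):

#!/usr/bin/perl
use strict;
use warnings;
use XML::Writer;

# Invented sample data, for illustration only.
my @bookmarks = (
    { href => 'http://www.perlmonks.org/', title => 'PerlMonks' },
    { href => 'http://www.cpan.org/',      title => 'CPAN' },
);

my $writer = XML::Writer->new( OUTPUT => \*STDOUT, DATA_MODE => 1, DATA_INDENT => 2 );
$writer->xmlDecl('UTF-8');

$writer->startTag('xbel');
for my $bm (@bookmarks) {
    $writer->startTag( 'bookmark', href => $bm->{href} );
    $writer->dataElement( 'title', $bm->{title} );
    $writer->endTag('bookmark');
}
$writer->endTag('xbel');
$writer->end;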

DB Manager
on Apr 03, 2001 at 01:48 UTC
by thabenksta

This program is a web-based GUI for your database. It has currently only been tested with MySQL, but the idea is for it to work with any DB.

The program gives you a list of tables and allows you to Create, Edit and Drop them, as well as view their schema and data. It also provides a command line for other functions.

Feel free to give me feedback/criticism. If you make any major additions, please let me know.

Update 4/3/2001:
Added privilege granting and revoking functions.

SHTML generator for image viewing
on Mar 21, 2001 at 19:13 UTC
by djw

This utility takes a directory of images, copies files to a specified dir, creates an shtml page for each block of pictures (how many per block is decided in config), can prompt for a description for each pic (otherwise it uses the image name), and creates a menu.shtml and an shtml page for each block needed, depending again on how many pics per page you want and how many pics you have.

It requires you to have server-side includes enabled in your specified web dir so that you can include header, footer, and menu server-side include files. It also puts an image at the top right of each page, and a "created by" thing at the top left - but you can change that to have some other text, or nothing if you like.

I thought about expanding this thing to use a text file for image descriptions, and adding html support, but I don't need those features so I decided not to. It certainly can be done easily enough.

You can see a demo with some pics from around my office.

btw, I'm really sorry about the spacing that got mangled by using notepad. This was written in Vi with a 4 space tab, but got converted because I posted this from a win32 box and I used notepad..../me sighs.

ciao, djw
on Jan 27, 2001 at 07:15 UTC
by Anonymous Monk
MysqlTool provides a web interface for managing one or more mysql server installations. It's actually a pretty big application, with about 3000 lines of code spread across nine different modules. I'm not sure if this is a proper posting for Code Catacombs, but most everyone who's seen it and uses mysql & perl on a regular basis has loved it.
Live365 Broadcaster Info grabber
on Jan 18, 2001 at 08:13 UTC
by thealienz1
I was working one day on setting up a radio station on Live365, and I noticed that their main http site was really slow. So, what I did was make a script that checks my broadcasting information really fast. Thanks PERL. Basically all you have to do is type in the username and the script does the rest for you. You do have to make sure that you have the LWP modules installed. Otherwise I trust that you know how to use this truly useless script. Thanks PERL.
on Apr 16, 2001 at 02:50 UTC
by Masem
A notes-to-self CGI script, useful for those (like me) that have multiple locations where they work, and want a quick, low-overhead way to leave notes to themselves on a central server (eg no DBI). Note that the script has no security checks, but this can easily be done at web server level.
SSI Emulation Library
on Jan 13, 2001 at 10:37 UTC
by EvanK
An alternative to using big clumsy modules when you need to emulate SSI in perl. It's been written to theoretically work on both win32 and *nix systems, though I've only gotten to test it on Windows. Works fine for me though. Any comments and feedback welcome.
(code) Toy Template
on Jan 06, 2001 at 03:01 UTC
by ybiC
Toy Template

Simple scripted website generator.
Uses HTML::Template and param('page') to eliminate duplication of code and markup in a toy website.   Also uses CSS to separate style from structure (as much as possible, anyway).   Clearly not suited for anything industrial-strength.

Code and CSS files go in a web-published directory.   Common, content and template files go in a non-webpub directory.   Each web page in the site is defined by its own content file.

Thanks to ovid, chipmunk, Petruchio, davorg, chromatic and repson for helping me figure this out.

Most recent update:   2001-05-10
Hashamafied a few of the passel o'scalar variables.

Update: 2002-05-11 note to self: look into Storable to potentially reduce response times, or Data::Denter so don't have to unTaint incoming data structure.   Props to crazy for suggesting Denter and for initial benchmark strongly favoring Storable as perf king.

use Benchmark    qw( cmpthese timethese );
use Storable     qw( freeze thaw );
use Data::Denter qw( Indent Undent );

timethese( 2_000, { 'Data::Denter' => \&DENTOR, 'Storable' => \&STORKO });
print "\n\n\n";
cmpthese( 2_000, { 'Data::Denter' => \&DENTOR, 'Storable' => \&STORKO });

sub DENTOR {
  my $in  = Indent \%BLARG;
  my %out = Undent $in;
}

sub STORKO {
  my $in  = freeze \%BLARG;
  my %out = %{ thaw($in) };
}

Benchmark: timing 2000 iterations of Data::Denter, Storable...
Data::Denter: 12 wallclock secs (11.71 usr +  0.01 sys = 11.72 CPU) @ 170.69/s (n=2000)
Storable:     6 wallclock secs  ( 5.60 usr +  0.00 sys =  5.60 CPU) @ 357.27/s (n=2000)

Benchmark: timing 2000 iterations of Data::Denter, Storable...
Data::Denter: 12 wallclock secs (11.80 usr +  0.00 sys = 11.80 CPU) @ 169.53/s (n=2000)
Storable:     6 wallclock secs  ( 5.62 usr +  0.00 sys =  5.62 CPU) @ 356.06/s (n=2000)
              Rate Data::Denter     Storable
Data::Denter 170/s           --         -52%
Storable     356/s         110%           --

Cookbook Recipe Bot
on Dec 11, 2000 at 11:32 UTC
by epoptai
There are two scripts here, and a sample crontab. The first script gets & stores daily cookbook recipes. It is written to run as a cronjob (no output except the file) & includes a sample daily crontab to automate the process. The second script lists the saved recipes in alphabetical order. The viewer requires two images to work properly: 1. a small transparent gif named blank.gif, and 2. a Perl Cookbook cover image such as those at fatbrain and amazon, both stored in the same dir as the scripts & saved files.

Update: fixed url to point to the new recipe location.

US Library of Congress perl module
on Dec 10, 2000 at 13:24 UTC
by eg

A perl module to access the Library of Congress' book database.

Unlike Amazon or Barnes and Noble, you can't just look up a book in the Library of Congress database with an ISBN; you need to first initialize a session with their web server. That's all this module does, it initializes a session for you and returns either a url of a book's web page or a reference to a hash containing that book's data (author, title, etc.)

fix bad HTML comments
on Nov 29, 2000 at 09:45 UTC
by chipmunk

This Perl filter fixes bad HTML comments, such as <!----------- really ---------bad ------ comments ---------->. (Such comments are bad because, according to the spec, each -- -- pair within <! > delimits a comment. This means <!-- -- --> is not a complete comment, for example.)

The code reads in the entire file, then finds each occurrence of <!-- ... --> and uses tr/// to squash each run of hyphens to a single hyphen. The assignment to $x is necessary because $1 is read-only.
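Reconstructed from that description, the heart of the filter is something like this sketch (not necessarily chipmunk's exact code):

#!/usr/bin/perl -w
use strict;

local $/;                      # slurp the whole file
my $html = <>;

# Squash every run of hyphens inside <!-- ... --> down to a single hyphen,
# so no stray "--" pairs are left inside the comment.
$html =~ s{<!--(.*?)-->}{
    my $x = $1;                # copy: $1 itself is read-only
    $x =~ tr/-/-/s;
    "<!--$x-->";
}gse;

print $html;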

Dynamic Well Poisoner
on Oct 25, 2000 at 21:50 UTC
by wombat
"Poisoning the well" is a term sometimes used to mean feeding a bot false information, either to make it crash, or to devalue the rest of the data that it is supposed to be mining. We all hate spambots. They're the ones that scour webpages, looking for little <a href=mailto:> tags. They then take down your email address and some spammer then sells it on a "100,000,000 VALID WORKING ADDRESSES" CD-Rom to other unpleasent people. You can all help stop these evil robots, by putting up a Poisoned Well on your homepage.

Basically it's a list of randomly generated email addresses that look valid (cause they're made of dictionary words), have valid DNS entries, but then bounce when the spammers try to send mail to them. This program has a twofold purpose. Older Poison Wells just generate [a-z]{random}@[a-z]{random}.[com org net]. This one takes domains that you specify from a list. Thus, you can put domains you don't like in the list, and then cause THEM to have the burden of sending back lots of bounced messages. As stated before, any spambot worth its silicon would check to see if the address was a valid domain. This would circumvent that.

For my list of evil domains, I've put the six top generators of banner ads. Especially the ones that are suspected of selling personal data. >:-D

Some of the email addresses this generated were quite amusing.
Velar Central File Submitter
on Sep 30, 2000 at 05:08 UTC
by strredwolf
The main code I use to send artwork to the Vixen Controled Library. Try it out as an example to code a HTTP POST w/multipart client. Oh, it does do authorization, but I've removed my password (so nyah!)
Serving Images from Databases
on Sep 23, 2000 at 00:51 UTC
by Ovid
If our Web server receives an image request and the image is not found, the Web server calls a cgi script to serve the image. The script analyzes the request and serves the appropriate image from MS SQL 7.0 Server.

I have posted this code to show an extensible method of serving images in the event that any monks are called upon to write a similar script.

This can easily be modified to allow calls directly from a CGI script in the following format:

<img src="/cgi-bin/images.cgi?image=category1234" height=40 width=40 alt="some text">

Then, you'd add the following to your code:

use CGI; my $query = new CGI;
Then, substitute the following line:
$ENV{'QUERY_STRING'} =~ m!([a-zA-Z]+)(\d+)\.($types)$!;
With this line:
$query->param('image') =~ m!([a-zA-Z]+)(\d+)\.($types)$!;
Disable animated GIFs and blinking text in Netscape
on Sep 20, 2000 at 22:16 UTC
by Anonymous Monk
Perl script that disables/enables blinking text and/or animated GIFs in Netscape. I've tried this on Netscape for Intel Linux, Digital Unix, and AIX, and I've been told it also works on Netscape for M$ Windoze. Animated GIFs will still run through the cycle one time, stopping on the last frame. As you can see, all it actually does is call perl again to replace a couple of strings inside the Netscape binary to cripple the blinking text and animated GIF support.
Home Page Manager
on Sep 17, 2000 at 22:17 UTC
by Ovid
So, you want to read one site in the morning before work, another at night when you get home, and every Wednesday at 5:00 PM your favorite Web site issues an update. Rather than scramble for your bookmarks or search through links on your links bar, here's a utility which stores homepages based upon time. Set this script as your home page and enjoy! If you don't have a homepage set up for a particular day/time, it sends you to the default home page that you have specified.

It's definitely beta quality, so any suggestions would be appreciated.

on Sep 12, 2000 at 08:42 UTC
by araqnid
Given a set of photo images (e.g. JPEG) and an XML file with descriptions, generate mini-images and HTML pages to navigate through them. Allows photos to be arranged hierarchically. Allows for terms in descriptions to be indexed and a cross-references page to be built. Unfortunately, the HTML templates are hardcoded atm :(
TLD generator
on Jul 17, 2000 at 04:56 UTC
by j.a.p.h.
Ever get a bit ticked off about the lack of TLDs/domains available? I wrote this little program to print out every possible 3-6 letter TLD. If ICANN would approve them all, I doubt it would be a problem anymore. But there's little to no chance of that happening (at least any time soon).
Slashdot Headline Grabber for *nix
on Jul 11, 2000 at 03:02 UTC
by czyrda

Gets the Slashdot headlines every 30 minutes.

Daily Comix
on Jul 07, 2000 at 00:57 UTC
by Arjen
I created this lil script a couple of months ago to have all the comix I read in the morning on one page. I also wanted to be able to choose what to see, and I wanted to be able to add comix without having to modify the script itself.

The plugins are simple config files with a piece of code as a value which gets eval'ed. The script then puts them all in a nice overview on 1 single page.

An example plugin (UserFriendly) is included in the code section.


Apache log splitter/compressor
on Jun 12, 2000 at 18:06 UTC
by lhoward
This program is designed to read from an apache logfile pipe. It automatically compresses the data and splits it into files, each containing one day's worth of data. The directory to write the log files to should be set in the environment variable LOG_DIR. Log files are named access_log_YYYY_MM_DD.N.gz. Here is how I have apache configured to call this program.

CustomLog |/path/to/ combined

Apache Timeout module
on Jun 09, 2000 at 06:00 UTC
by jjhorner

I wrote this mod_perl handler to give easy timeouts to restricted web pages. It is very elementary, but useful. Please give me some comments at my email address above if you wish.

It requires a directory "times" under your /usr/local/apache/conf/ directory, owned by the user:group running the Apache child processes, for your timestamp files.

Usage: See in-code docs.

Update v0.21

  • I added better docs and fixed a bug or two.
  • I also moved most of the config info into the httpd.conf file and only moved configurable stuff to .htaccess.
  • Added concept of Minimum Time Out and Mode.

Update v0.20

  • I sped up the routine that checks time since last visit. It now stats a file, grabs the number of seconds since last modification, and uses that for $last_time. Then opens the time file rw to update the modification time.
  • I added option to put the DEBUG mode into the .htaccess file.


  • Write documentation
  • Make into format usable on CPAN
Dark Theme for /. through Perl
on May 04, 2000 at 23:07 UTC
by PipTigger
Here's a little script I wrote for myself since I like light text on dark backgrounds (thanks again for the nice PerlMonks theme Vroom!) and /. doesn't have one... I know it's pretty suckie and could be a lot simpler. If you can make it better, please email it to me ( as I use it everyday now. It doesn't werk yet for the ask/. section but when I find some time, I'll add that too. I hope someone else finds this useful. TTFN & Shalom.

p.s. I tried to submit this to the CUFP section but it didn't work so I thought I'd try here before giving up. Please put the script on your own server and change the $this to reflect your locale. Thanks!
webster client
on May 04, 2000 at 21:07 UTC
by gregorovius
This is a simple web client that will bring a word definition from the UCSD web dictionary into your X terminal.

If people are interested I also have a server plus a simpler client version that allows this program to be deployed on a large number of machines without having to install the LWP and HTML packages it uses on each of them.
Usage: % webster_client word
Resolve addresses in web access logs
on Apr 29, 2000 at 01:20 UTC
by ZZamboni
Where I work, apache is configured not to resolve IP addresses into names for the access logs. To be able to properly process the access logs for my pages with a log-processing program (I use webalizer) I wrote the following script to resolve the IP addresses. Note that the local domain name needs to be changed for when the resolved name is local (machine name only, without domain). This happens sometimes when the abbreviated name is before the full name in /etc/hosts, for example.

Updated: as suggested by kudra, added a comment to the code about double-checking the name obtained, and why we don't do it in this case.
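The core of such a filter is a cached gethostbyaddr lookup; a sketch, with the local domain as a placeholder you would change:

#!/usr/bin/perl -w
use strict;
use Socket;

my $local_domain = 'example.com';   # placeholder: your local domain
my %cache;

while (my $line = <>) {
    $line =~ s{^(\d+\.\d+\.\d+\.\d+)}{
        $cache{$1} ||= do {
            my $name = gethostbyaddr( inet_aton($1), AF_INET ) || $1;
            # qualify bare machine names that resolved without a domain part
            $name .= ".$local_domain" if $name ne $1 and $name !~ /\./;
            $name;
        };
    }e;
    print $line;
}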

RSS Headline Sucker
on Apr 27, 2000 at 22:30 UTC
by radixzer0

Make your own Portal! Amaze your friends! Confuse your enemies!

This quick hack takes a list of RSS feeds and pulls the links into a local database. If you don't know what RSS is, it's the cool XML headline standard used by My Netscape. Lots of sites provide RSS feeds that you can use to make headline links on your own site (ZDNet, etc.). I used this to make a little headline scroller using DHTML on our company Intranet. This script works best with a scheduler (like cron) to update on a periodic basis.

For a comprehensive list of available feeds, take a look at

Comments/improvements more than welcome ;)

Resume logger
on Apr 07, 2000 at 21:04 UTC
by turnstep
Small script to check if anyone has looked at my online resume, and if so, mails the information to me. Run as a periodic cronjob; exits silently if no new matches are found.
Poor Man's Web Logger
on Apr 05, 2000 at 21:56 UTC
by comatose

Since I don't have access to my ISP's server logs but still want to have some idea of who's visiting my website, I developed this little script that can integrate easily with any existing site. All you need is CGI access of some type.

To install, save the script so it is executable. Also, you'll need to set the $home, $logfile, $ips (IP addresses you want to ignore), and %entries (labels and expressions to match) variables. Be sure to "touch" the logfile and make it writable by the web server's user.

Pick an image, preferably a small one, on your page.

<img src="/cgi-bin/showpic/path_to_pic_in_document_root/pic.jpg">

Each time someone accesses the page with that image, an entry is made in the log with the date, time, and either hostname or IP address. Here's an example of the output. Enjoy.

Wed Apr 05 13:08:26 2000 resume
Wed Apr 05 13:29:29 2000
Thu Apr 06 01:31:47 2000
Slashdot Headline Grabber for Win32
on May 03, 2000 at 04:33 UTC
by httptech
A Win32 GUI program to download Slashdot headlines every 30 minutes and display them in a small window.
Server Monitor via Web
on May 23, 2000 at 03:02 UTC
by BigJoe
This is a quick script to keep track of the load on the server. There are 3 files that need to be made: the HTML file to access it, the history text file, and the script itself. The HTML display is very generic.
OPPS - Grab weather from yahoo (2)
on May 24, 2000 at 10:18 UTC
by scribe
Grabs weather from yahoo. One of my first perl scripts
and my first use of IO::Socket. It could use some work
but I only use it as part of an efnet bot.
Seti stats
on May 24, 2000 at 14:02 UTC
by orthanc
An Anon Monk sent in something similar to SOPW so I modified the code a bit and here it is. Could do with a bit more error checking but it works.
on May 25, 2000 at 13:55 UTC
by ask
Neat little program to look up and display quotes. Includes robust caching so it's fast enough to run from your .bash_profile. Does not even load the http libs when displaying cached information. Documentation included in the file in POD format.
Stock Quotes
on May 24, 2000 at 20:39 UTC
by scribe
Retrieves basic stock quotes from yahoo.
Uses IO::Socket to connect to webserver.
Can only retrieve one quote at a time now.

Sample output:
(( ELON 51.625 -3.5625 ) ( 49 -- 56.5 ) ( Vol: 1646200 ))
Powerball Frequency Analyzer
on May 04, 2000 at 18:56 UTC
by chromatic
This little darling grabs the lottery numbers from Powerball's web site and runs a quick analysis of the frequency of the picks. It also makes recommendations for the next drawing.

I run it from a cron script every Thursday and Sunday morning.
on Apr 27, 2000 at 17:58 UTC
by ergowolf
This program checks for books from fatbrain, but it can easily be modified to search any site's page.
wsproxy: Perl Web Proxy
on May 20, 2000 at 22:58 UTC
by strredwolf
wsproxy is a web proxy in perl! It has filtering and caching abilities, but you need to tell it what to ban or save. It works very well with netpipes, or even inetd!

To use it, try faucet 8080 -io

v0.4 and 0.5 added banning of Javascript and HTML pages.
v0.6 had a file locking bug nuked and preliminary tempcache support.
v0.7 now has an embedded gif (here). v0.8 now has mostly functional (self-cleaning) tempcache support.
strip grabber
on Mar 15, 2000 at 01:08 UTC
by billyjoeray
This script can be run at regular intervals to download the latest strips. I'm putting this here because it's a little big for the front page, but for whoever reads this, I'd like some tips on how to clean up my code and some wisdom on how to easily write code with 'use strict;'; I always seem to need to use variables in a global sense.
Full Size version forwarder :)
on Mar 30, 2000 at 11:46 UTC
by ash
Do you hate having to load the small version of userfriendly only to then click on the link to the full-size one? Well, this script forwards you directly to today's full-size userfriendly strip :) But since userfriendly doesn't update before 9 a.m. in Norway, I've inserted this little line: ($a[2]<9)&&$a[3]--; so that you'll be forwarded to yesterday's strip if it's not up yet. Change the 9 to your timezone :)