Introduction
Yes, I know the title is bad code. I wanted something memorable so we can point back to this node again and again.
We've seen it again and again. Everybody and their dog at one time or another seems to have toyed with an alternative to CGI.pm. If you think it's too bloated, try CGI::Lite, but don't go rolling your own. This node, and (hopefully) the resulting thread, is just something convenient to toss to newbies who aren't aware of the issues involved.
Common Problems with Alternatives
Here are some common reasons not to use alternatives to CGI.pm:
- Your version probably won't allow for file uploads. For a good example of why, please check out japhy's online CGI course (particularly chapter 2). [1]
- Did you know that color=blue&color=red is a valid query string? Most alternatives don't properly handle multiple values for one parameter. Those that do typically use a null byte (ASCII zero) to deal with this. This leaves the potential for opening up a nasty security hole. [2]
- Typically, these alternatives do not allow for any delimiter besides the ampersand. Semicolons are sometimes used to delimit name/value pairs, but you'd never know it from examining most homemade alternatives.
- When was the last time you saw a hand-rolled version verify that the length of data read from STDIN matched $ENV{ CONTENT_LENGTH }? If the browser screws up, you could have corrupt data, but if you don't verify the content length, you'll never know. This, being an intermittent bug, is incredibly difficult to debug.
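The content-length check from the last point takes only a few lines. This is a minimal sketch, not code taken from CGI.pm: read_post_body is a hypothetical helper name, and in-memory filehandles stand in for STDIN so that it runs outside a web server.

```perl
use strict;
use warnings;

# Hypothetical helper: read exactly $expected bytes from $fh and die
# if the declared length is unusable or the stream ends early.
sub read_post_body {
    my ($fh, $expected) = @_;
    die "Missing or invalid CONTENT_LENGTH\n"
        unless defined $expected && $expected =~ /\A\d+\z/;

    my $body = '';
    while (length($body) < $expected) {
        # Append to $body at its current end; read() may return less
        # than requested, so loop until done or end of stream.
        my $got = read($fh, $body, $expected - length($body), length($body));
        die "Read error: $!\n" unless defined $got;
        last if $got == 0;
    }
    die sprintf("Truncated POST: expected %d bytes, got %d\n",
                $expected, length $body)
        unless length($body) == $expected;
    return $body;
}

# Complete request: 20 bytes declared, 20 bytes delivered.
my $data = 'color=blue&color=red';
open my $fh, '<', \$data or die $!;
my $body = read_post_body($fh, 20);
print "read ", length($body), " bytes\n";

# Broken request: 20 bytes declared, only 10 delivered.
my $short = 'color=blue';
open my $short_fh, '<', \$short or die $!;
my $ok = eval { read_post_body($short_fh, 20); 1 };
print "rejected: $@" unless $ok;
```

In a real CGI script you would presumably call binmode(STDIN) first and pass \*STDIN along with $ENV{CONTENT_LENGTH}. Better still, let CGI.pm do the reading.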
Those are some of the biggies. The following is a list of complaints that, while not directly related to the "hand-rolled" problem, tend to crop up in the code of those who insist upon doing it themselves.
Related Problems
If you want instant verification of this stuff, use Super Search and search for CONTENT_LENGTH in the text of articles. Not all are applicable, but there are some real doozies out there. Here's my favorite:
use CGI qw/:standard/;
read(STDIN, $formdata, $ENV{'CONTENT_LENGTH'});
@pairs = split(/\&/, $formdata);
foreach $pair (@pairs){
($name, $value) = split(/=/, $pair);
$value =~ tr/+/ /;
$value =~ s/%0D%0A/\n/g;
$value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
$FORM{$name} = $value;
}
This person is using CGI.pm but still (incorrectly) hand-parsing the data.
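For contrast, here is a sketch of the same job done by the module. It assumes CGI.pm is installed (recent Perl releases no longer bundle it). form_values is a hypothetical helper name, and the query string handed to new() stands in for a real request.

```perl
use strict;
use warnings;
use CGI;

# Hypothetical helper: collect every value of every parameter into
# a hash of name => array reference.
sub form_values {
    my ($q) = @_;
    my %form;
    for my $name ($q->param) {
        # multi_param() exists in newer CGI.pm releases; older ones
        # return all values from param() in list context.
        my @values = $q->can('multi_param')
            ? $q->multi_param($name)
            : $q->param($name);
        $form{$name} = \@values;
    }
    return \%form;
}

# Repeated names, a semicolon delimiter, and URL escapes are all
# left to the module.
my $q    = CGI->new('color=blue&color=red;note=a%2Bb+c');
my $form = form_values($q);
print "$_: @{ $form->{$_} }\n" for sort keys %$form;
```

No split, no tr, no pack, and both values of color survive.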
Benefits of CGI.pm
No sense in showing you the stick if I don't bother with the carrot.
Cheers,
Ovid
Join the Perlmonks Setiathome Group or just click on the link and check out our stats.
Footnotes
[1] Yeah, I know that I have an online CGI course, also. The course that japhy is preparing seems to be much more of a rigorous analysis than mine. Mine is targeted at a different audience. (read: japhy's Perl is way better than mine so I pander to the masses :-).
[2] Why should that be a security hole? If you only have one file with a given name, you won't be separating them with null bytes, right? Not necessarily. A wily cracker can simply add another parameter with the same name and your script will politely add a null byte for you. Of course, proper taint checking will stop this, but so will using CGI.pm.
[3] I don't know who first started the annoying habit of trying to strip out SSI's in the parameter processing routine, but here's the potential benefit: let's say you let users sign up at your site and create a home page. You use CGI to capture their home page data and write it to an HTML file, but you don't want to allow people to run SSIs (a huge security hole, if you're configured wrong). This code will strip out SSIs, HTML comments, and everything between them if you have more than one. See Death to Dot Star! if you're unfamiliar with that issue.
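To make the null byte footnote concrete, here is a sketch of the kind of hand-rolled parser it describes. naive_parse is a hypothetical stand-in written for illustration, not code from any particular script.

```perl
use strict;
use warnings;

# Hypothetical hand-rolled parser that joins repeated values with a
# null byte, as many homemade routines do.
sub naive_parse {
    my ($query) = @_;
    my %form;
    for my $pair (split /[&;]/, $query) {
        next unless length $pair;
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        for ($name, $value) {
            tr/+/ /;
            s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
        }
        $form{$name} = exists $form{$name}
            ? "$form{$name}\0$value"
            : $value;
    }
    return \%form;
}

# The form only ever sends one "page" field, but nothing stops a
# visitor from sending two.
my $form = naive_parse('page=index&page=extra');

# Make the embedded null byte visible before printing.
(my $visible = $form->{page}) =~ s/\0/\\0/g;
print "page = $visible\n";
```

The script that expected a single clean value now holds a string with a null byte in the middle, placed there by its own parser at the visitor's request.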
(ichimunki) re: use CGI or die;
by ichimunki (Priest) on Jan 11, 2001 at 17:41 UTC
Okay. I need a place to vent on the topic of CGI and this seems like the perfect place.
I've been doing some research on HTML4, parsing HTML, and related topics. I've recently been trying to build a browser in Perl (and yes, I use the modules when I know about them).
As the very first parsing I did, I grabbed all the <h1> to <h6> tags. These tags, when used correctly, should give a good outline of what's on the page. But guess what? I checked a lot of major sites, like search engines, news sites, then some great discussion sites that are success stories for Perl, then a few random Monk home pages. Almost nobody uses these tags. I thought there was a problem with my program! Everybody is using the <font> tag instead of the classic header notations.
Then I got curious, so I headed to an HTML validator. I checked all the same pages again. I found two pages that were even close to "valid" compared to the standard. And one of those was w3's own home page. The other was missing a single alt attribute on a gif.
So here's my sore spot. Using CGI.pm is obviously recommended. I can't think of a single reason not to use something that comes with every default install of Perl. That would be like writing foreach loops to perform an action on list elements instead of using map. But even those that I can't imagine are not using CGI.pm (slashdot or perlmonks, for instance) do not generate valid HTML. Not even close according to the error report.
While I've seen plenty of loud complaints when people roll their own form parsers, I am not seeing those same loud complaints when people (mis)use CGI to generate suboptimal HTML, or use it to parse the forms but then completely ignore it for generating HTML. It would appear that even those people who are using CGI.pm can't count on it to put in some default alt attribute for their gifs when they forget; it creates correctly formed HTML that may be gibberish according to the standards.
The module might as well be CGI::ParseForms and skip all the HTML building routines, for the ways it seems to be used in the wild. And frankly, given how much trouble I have with fonts that get too small, or pages that are completely unreadable in text-only mode (yes, I like to browse in Lynx sometimes just to get away from all the image rendering issues and time wasted waiting for them to download over the modem), I'd like to see us make stronger, more frequent recommendations to use CGI for building HTML and then to remember that using it is no guarantee of perfect HTML either.
Update: Some of the above has been altered based on the responses below; I'm not sure what I said that muddied my point. What I'm saying is simple. Feel free to keep harping away on the poor souls who roll their own parsing routines instead of using CGI.pm. But please, consider applying the same critical eye to people who use only 5% of the functionality of the module and continue to hand code HTML (often hard coding large chunks of it into their scripts), or who use the module to create crummy HTML by exploiting the fact that, while it writes well-formed HTML, it does not validate tags, attributes, or block/inline nesting.
The problem is that many years ago it was decided by the
powers that be that browsers would be lenient towards bad
HTML. This is generally seen as a Bad Thing. As you've seen,
the vast majority of the web is now made up of invalid
HTML.
Using the HTML shortcuts in CGI.pm helps in one way as
a construction like:
ul(li([1 .. 10]));
will at least be well-formed. Unfortunately, it
doesn't prevent you from doing something like:
p(font({size=>'larger', color=>'red'}, 'Heading'));
instead of
h1('heading');
and using CSS to handle the appearance.
I haven't looked at a new version of CGI.pm for some
time, but I'm hoping that it either has or will soon have
an XHTML mode, but that still won't stop people from
Doing The Wrong Thing :( You can't get away from the fact
that it's the web page author's responsibility to create
valid HTML.
The only option is for browsers to suddenly stop working
on invalid X?HTML, but the chances of that happening are
approximately zero.
Dave...
(who tries to validate all of his web pages, but admits that a few errors do creep in)
--
<http://www.dave.org.uk>
"Perl makes the fun jobs fun
and the boring jobs bearable" - me
> The only option is for browsers to suddenly stop
> working on invalid X?HTML, but the chances of that
> happening are approximately zero.
Suddenly? No. Most of the web is still a non-wellformed
mixture of HTML3, HTML4, and imaginary tags made up by
specific browsers. However, current browsers
do choke on non-wellformed markup if
it is served with a content-type of text/xml, and that's
a first step. As things like XSLT and RDF start to catch
on, sites that want to harness the value of those things
will have to be redone in wellformed XML, and that's
that. (They won't necessarily have to provide and
validate against Schemata, but we have to start someplace.)
Incidentally, if CGI.pm is now improved to the point
of being capable of producing anything that remotely
resembles XHTML, maybe I should have another look at it;
I've been avoiding it because of two things, and one was
the execrable state of its output. If that has been
shaped up, maybe the other thing (the tendency to
obfuscate the Perl code) has been improved too, since I
looked at it (which has been a bit), and I
should have a second look.
--jonadab
use CGI qw( glark yurp );
my $q= CGI->new();
print $q->h1( "This is not really HTML" );
print glark( { flinge=>"worz", plutch=>"erff" } );
print yurp( { huid=>"queez", urst=>"hmmph" } );
print $q->font( { crypet=>"swoom", whalk=>"47" } );
which produces
<H1>This is not really HTML</H1>
<GLARK FLINGE="worz" PLUTCH="erff">
<YURP URST="hmmph" HUID="queez">
<FONT WHALK="47" CRYPET="swoom">
I hate hammers and screwdrivers. Well, not exactly hate them
because they come with every toolbox it seems but I want to
complain about how people use them. People are always
using these tools to build things that are dangerous. The
planks on the deck are loose, the shelves in the bookcase
are wobbly, people try and open cans with them, etc. We should
change their names to "nail-driver" and "threaded-metal-cylinder-turner"
until we can fix these tools to alert the user when they
are using them incorrectly or at least get the hammer to
countersink, putty, and sand.
No offense ichimunki, I'm in your camp on this, I just
think that you shot at the wrong criminal. People write
shitty HTML with any tool, CGI can't make things worse
and frequently makes things better.
That is a ++ to ichimunki in case I was equally unclear =)
--
$you = new YOU;
honk() if $you->love(perl)
Re: use CGI or die;
by gildir (Pilgrim) on Jan 11, 2001 at 16:51 UTC
For Apache's mod_perl there is an excellent CGI.pm alternative:
libapreq. But this is only available under mod_perl, as it uses Apache's internals.
I totally agree with you, gildir. This is the method I prefer; it's a lot faster. We tested this through our module site, and the site is quick to load because it doesn't have to access CGI.pm every time. Apache::Request and Apache::Cookie are great too. =D
Re: use CGI or die;
by Maclir (Curate) on Jan 11, 2001 at 04:50 UTC
There are also other tools / modules available. I have used Embperl (http://perl.apache.org/embperl/index.html), which uses CGI.pm under the covers. I am not sure about HTML::Mason, but I would not be surprised if it also uses CGI.pm. There are probably other HTML generation / templating / web site generation tools that use CGI.pm.
Just as an FYI, HTML::Mason does provide access to CGI.pm.
You are advised to not access the structure directly, even
though you can. Mason provides access through object
references.
Also, you are able to take advantage of the HTML constructs,
as documented in the Mason FAQ.
I'm inclined to believe that using the CGI constructs is
probably a "good thing" as you get the side benefit,
as merlyn states in Re: Re: use CGI or die;, of CGI.pm spitting out XHTML.
My favourite for embedding Perl into *ML is
Apache::ASP. The homepage is
here.
Last time I checked,
it relied on CGI.pm for file upload and allowed you to freely
mix CGI.pm and Apache::ASP calls in the same page.
I find the Active Server Pages model (with Perl as
a programming language) quite
useful when writing Web applications.
-- TMTOWTDI
Re: use CGI or die;
by ColonelPanic (Friar) on Jan 12, 2001 at 02:08 UTC
Another huge debugging benefit of CGI is CGI::Carp:
use CGI::Carp qw(fatalsToBrowser);
This is invaluable for figuring out a CGI problem. Not only do you see errors in the browser, but you can easily insert your own die() statements to see what's going on, instead of printing your own header and HTML in several lines.
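A minimal sketch of how that looks in a script, assuming CGI.pm and CGI::Carp are installed. find_record is a hypothetical function invented for the example, and the GATEWAY_INTERFACE guard is only there so the file does nothing when run outside a web server.

```perl
use strict;
use warnings;
use CGI;
use CGI::Carp qw(fatalsToBrowser);

# Hypothetical lookup routine: any die() in here ends up in the
# browser as an error page instead of a bare "500 Server Error".
sub find_record {
    my ($id) = @_;
    die "No id parameter supplied\n" unless defined $id && length $id;
    die "Bad id '$id'\n" unless $id =~ /\A\d+\z/;
    return "record $id";
}

# Only behave as a CGI script when a web server invoked us.
if ($ENV{GATEWAY_INTERFACE}) {
    my $q = CGI->new;
    print $q->header('text/html');
    my $record = find_record(scalar $q->param('id'));
    print '<p>', $q->escapeHTML($record), "</p>\n";
}
```

Since error pages like this can reveal details about your script, you will probably want to drop fatalsToBrowser once the script goes live.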
perl foo.pl > foo.html
but it would just whirl and give no output via CGI. I found that
the code looks and behaves perfectly via CGI unless
both -w and fatalsToBrowser are enabled! Shut
either one off (leaving the other on) and it works fine.
I found a certain loop in the program that seems to cause
this by throwing pairs of =cut around, but the loop seems
mundane and similar to the others.
Is this odd interplay between -w and fatalsToBrowser documented?
Update: I said i'd node the entire script to craft
in a day or two but am finding it difficult to abstract a
simplified example. So i'll just suggest that if your
error-free cgi script mysteriously hangs, turning off
either -w or fatalsToBrowser may help.
dws - Try this for a good html :
™
Re: use CGI or die;
by Anonymous Monk on Feb 04, 2013 at 11:05 UTC