Nocturnus has asked for the wisdom of the Perl Monks concerning the following question:
Dear monks,
I have a site which consists only of cgi scripts; these are programmed in Perl. The scripts are generating the HTML content dynamically, including links to other Perl CGI scripts. The scripts are usually called with certain parameters via query string, and that is working so far.
I am using the routines from CGI.pm (of course) to parse / fetch the URL query parameters; I am using Apache 2.2 as HTTP daemon (Debian squeeze).
Now, when calling such scripts without any parameters (i.e. without query string after the script name / path), the scripts are seeing an unwanted parameter "keywords" (which does not have a value).
I have googled and noticed that it might have something to do with ISINDEX tags in HTML. But I am definitely not using such tags. Another site suggested that this parameter is automatically created by CGI.pm if there is no ampersand and no = in the query string.
Anyways, I would just like to know how that parameter could be removed efficiently at the CGI level. Of course, this must be done in a way that still allows a wanted parameter named "keywords".
I hope I have described the problem clearly.
Regards,
Nocturnus
Re: Unwanted parameter when executing CGI scripts
by tobyink (Canon) on Jan 04, 2013 at 14:27 UTC
use CGI;
my $cgi = CGI->new;
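# Drop the "keywords" entry when the query string is non-empty
# and contains neither "&" nor "=" (the isindex-style case).
for ($ENV{QUERY_STRING}) {
    delete $cgi->{param}{keywords} if length && !/[&=]/;
}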
... rest of the code goes here ...
Of course, the better answer is: stop using CGI.pm; use Plack.
PS: yes, it does have to do with <isindex>. This is a very old HTML tag that was the predecessor to modern HTML forms. It submitted just a single field which was intended as a "search" field. Because only a single field was ever submitted, there was no need for the "&fieldname=" bits of the query string. Thus CGI.pm assumes that when there is no ampersand nor equals sign in the query string, an isindex-style query has been made.
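The rule described above can be sketched as a small predicate. This is an illustration only: `looks_like_isindex` is a hypothetical helper name, not part of CGI.pm, and it mirrors the `[&=]` test from the snippet above rather than CGI.pm's actual source.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CGI.pm): true when a query string is
# non-empty and contains neither "&" nor "=", i.e. the case treated
# above as an isindex-style query.
sub looks_like_isindex {
    my ($query_string) = @_;
    return 0 unless defined $query_string && length $query_string;
    return $query_string !~ /[&=]/ ? 1 : 0;
}

# Try the predicate on a few sample query strings.
for my $qs ('', 'Test', 'Test=', 'keywords=foo', 'a=1&b=2') {
    printf "%-14s %s\n", "'$qs'",
        looks_like_isindex($qs) ? 'isindex-style' : 'ordinary';
}
```

Only the bare word `Test` is reported as isindex-style here; an explicit `keywords=foo` contains `=` and is therefore treated as an ordinary parameter.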
perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
Wow, that was fast!
But wouldn't that kill a "wanted" parameter "keywords" as well? In other words, as I wrote above, if I call
/cgi-bin/script.pl
the "keywords" parameter should not exist, but if I call
/cgi-bin/script.pl?keywords=foo
the "keywords" parameter should be available in the script.
Could you please let me know if I have misunderstood the concept of your suggestion?
Regards,
Nocturnus
Please forget about my comment above. It referred to the first version of your answer and does not make much sense any more; unfortunately, I posted it while I was not logged in and thus can't delete it now.
I think your solution will work. Thank you very much - I'll try that way. Nevertheless, I was hoping that there is something more elegant. I will look into Plack.
Thanks again,
Nocturnus
After thinking again about your code, I have one question and one additional remark. Please note that C and assembler are my main business, so I usually use the sort of Perl syntax which is similar to C, and I might have misunderstood what you are doing in your code. Anyways:
delete $cgi->{param}{keywords} if length && !/[&=]/;
Do I get this right: The "keywords" parameter is deleted if the length of the query string is >0 and the query string does not contain = or &? The code would then fail when the script is called without any query string (which was the situation where I was first surprised by the problem). If I am right, we should write
delete $cgi->{param}{keywords} if (!defined || !length || !/[&=]/);
instead. Correct?
One more problem / remark: Suppose the script is called via
/cgi-bin/script.pl?Test
CGI.pm would then generate a parameter "keywords=Test", and we would remove that parameter accordingly. But that would be only half of the way: I think that "Test" in that case should be a KEY in the parameter list which has an empty (or undefined) value. How exactly should we handle this? What is the difference between the following calls (from the viewpoint of the CGI specification)?
/cgi-bin/script.pl
/cgi-bin/script.pl?
/cgi-bin/script.pl?Test
/cgi-bin/script.pl?Test=
Of course, I'd test myself before boring others, but I am not sure if it would be a good idea to find out by using CGI.pm or Plack, because I now have learned that (at least) CGI.pm does unexpected things, so I could not use that for testing. Not sure about Plack, though.
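One way to check this without going through any parameter-parsing module is to have a script report the raw `QUERY_STRING` environment variable that the web server hands to it. The following is a minimal sketch; `describe_query_string` is a hypothetical helper name, and what a given server actually sets for each of the four URLs is exactly what the script is meant to reveal.

```perl
use strict;
use warnings;

# Hypothetical helper: describe the raw query string without
# interpreting it in any way.
sub describe_query_string {
    my ($qs) = @_;
    return 'QUERY_STRING is not set' unless defined $qs;
    return 'QUERY_STRING is empty'   unless length $qs;
    return "QUERY_STRING is '$qs'";
}

# Plain-text response, so the browser shows what the server passed in.
print "Content-Type: text/plain\r\n\r\n";
print describe_query_string($ENV{QUERY_STRING}), "\n";
```

Installed as a CGI script and requested with each of the four URLs in turn, this shows what arrives before any module has touched it.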
Regards,
Nocturnus
You needn't test for whether $ENV{QUERY_STRING} is defined or not, because undefined things automatically have zero length.
"CGI.pm would then generate a parameter "keywords=Test", and we would remove that parameter accordingly. But that would be only half of the way: I think that "Test" in that case should be a KEY in the parameter list which has an empty (or undefined) value."
Then you could try something like:
for ($ENV{QUERY_STRING}) {
    $cgi->{param}{delete $cgi->{param}{keywords}} = ""
        if length && !/[&=]/;
}
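The same idea can be tried in isolation on a plain hash. This sketch assumes the parameter store maps each name to an array reference of values, as the Data::Dump output further down this thread shows for `param`; `promote_keywords` is a hypothetical name. Under that assumption `delete` returns an array reference, so the sketch dereferences it before using the words as keys.

```perl
use strict;
use warnings;

# Hypothetical helper: turn each isindex-style keyword into a parameter
# name with a single empty value. $params is assumed to look like
# { keywords => ['Test'] }, i.e. name => array reference of values.
sub promote_keywords {
    my ($params, $query_string) = @_;
    return $params unless defined $query_string && length $query_string;
    return $params if $query_string =~ /[&=]/;    # ordinary parameters

    my $words = delete $params->{keywords};
    return $params unless ref($words) eq 'ARRAY';
    $params->{$_} = [''] for @$words;
    return $params;
}

my $params = promote_keywords({ keywords => ['Test'] }, 'Test');
print join(',', sort keys %$params), "\n";    # prints "Test"
```

An explicitly passed `keywords=foo` is left alone, because its query string contains `=`.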
Re: Unwanted parameter when executing CGI scripts
by sundialsvc4 (Abbot) on Jan 04, 2013 at 16:12 UTC
It also occurs to me to wonder, and with considerable alarm, “exactly what is it about your present algorithm which makes you care that an ‘extra’ parameter value is appearing in the data?” Of particular concern to me is that perhaps the software is iterating through the parameters given and attempting to do something with them “no matter what they are,” on the very naïve and very dangerous assumption that only the “expected” parameter names could actually be there. (PHP was notorious for this in its earliest days, when it would “helpfully” introduce every POST/GET variable that it saw, right into the variable pool. “Helpful” indeed it was, if one is not thinking of malice, and of course it didn’t last long. But there are still probably some vulnerable web sites out there that are still being hacked because of it.) Your application should know exactly what variable names it might consider. It should look only for submitted variables under those names, to the exclusion of any and all others. It should, furthermore, validate each of these, using a regular expression (say), before accepting any of them. Never allow the user (who could well actually be a malicious “bot!”) to “stuff” anything into your application’s brain.
(Edit: Parameter validation should always occur in two places. First, the user interface should filter out typographic errors before attempting to submit anything to the host. Second, the host should validate everything it receives, first syntactically and then semantically. I am of the opinion that, if anything is found to be wrong in second-stage verification, the host should perhaps throw out all of it, perhaps responding with 400 Bad Request. Ditto the presence of “unexpected” parameters in the input, on the rationale that it is “a bug, at best” for them to be present. The client-side validation is for user convenience (and to reduce network bandwidth); the server-side validation is for the server’s own protection and to facilitate the detection of legitimate errors “somewhere” in the software. Even in 100%-innocuous scenarios, the hardest thing about troubleshooting any problem is knowing that a problem exists.)
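The server-side half of this advice could look roughly like the sketch below. The parameter names `id` and `lang`, their patterns, and the helper name `validate_params` are all made up for illustration; a real application would list its own.

```perl
use strict;
use warnings;

# Hypothetical whitelist: parameter name => pattern its value must match.
my %ALLOWED = (
    id   => qr/\A\d{1,9}\z/,
    lang => qr/\A[a-z]{2}\z/,
);

# Returns a hash reference of accepted parameters, or undef if any
# parameter is unknown or fails its pattern (the caller could then
# answer with 400 Bad Request).
sub validate_params {
    my ($input) = @_;
    my %clean;
    for my $name (keys %$input) {
        my $pattern = $ALLOWED{$name} or return undef;    # unexpected name
        my $value   = $input->{$name};
        return undef unless defined $value && $value =~ $pattern;
        $clean{$name} = $value;
    }
    return \%clean;
}

for my $input ({ id => '42', lang => 'en' }, { id => '42', keywords => '' }) {
    print validate_params($input) ? "accepted\n" : "rejected\n";
}
```

The second input is rejected as a whole because `keywords` is not on the whitelist, which matches the "throw out all of it" policy suggested above.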
Of course, you are right, and I have already implemented such checks at different levels of the application architecture.
Nevertheless, there is one situation where script A should just "pass" all parameters to another script B. To be precise, A generates HTML code with <a href> to B, where the href's URI contains all parameters which had been included in the call of A. Generating this link is done by generic code in a module which is used by several of the scripts; thus, when generating the link, the parameters are not checked. This is no security problem since B will check its parameters for correctness when called.
In nearly all browsers, when moving the mouse to the generated link, the complete destination URI of the link, including the parameters, is visible (e.g. in the status bar). Now, if script A is called WITHOUT parameters (which is perfectly acceptable), the generated link to script B contains a query string ("?keywords=") where no query string should be.
This worries users, makes debugging more complicated, and is ugly, so I would like to change that.
I will do it the way which has been proposed above, but I was hoping that we could "configure" CGI.pm somehow to disable that behavior, or that there is another more elegant way.
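Independently of how the stray parameter is removed, the link generator can make sure that an empty parameter list never produces a query string. A minimal sketch: `build_link` and `escape_component` are hypothetical names, values are assumed to be plain byte strings, and HTML-escaping of the result for use inside an href attribute is left out.

```perl
use strict;
use warnings;

# Minimal percent-encoding of everything outside the unreserved set
# (assumes byte strings, not wide characters).
sub escape_component {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord $1)/ge;
    return $text;
}

# Hypothetical helper: build a link to $path carrying %$params, adding
# "?" only when there is at least one parameter to pass on.
sub build_link {
    my ($path, $params) = @_;
    my @pairs = map { escape_component($_) . '=' . escape_component($params->{$_}) }
                sort keys %$params;
    return @pairs ? $path . '?' . join('&', @pairs) : $path;
}

print build_link('/cgi-bin/b.pl', {}), "\n";                     # /cgi-bin/b.pl
print build_link('/cgi-bin/b.pl', { keywords => 'foo' }), "\n";  # /cgi-bin/b.pl?keywords=foo
```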
Regards,
Nocturnus
Re: Unwanted parameter when executing CGI scripts
by Anonymous Monk on Jan 05, 2013 at 00:05 UTC
Now, when calling such scripts without any parameters (i.e. without query string after the script name / path), the scripts are seeing an unwanted parameter "keywords" (which does not have a value).
Trivial to test
$ perl -MData::Dump -MCGI -e " dd( CGI->new(q{?noampernoequal}) )"
bless({
".charset" => "ISO-8859-1",
".fieldnames" => {},
".parameters" => ["keywords"],
"escape" => 1,
"param" => { keywords => ["?noampernoequal"] },
"use_tempfile" => 1,
}, "CGI")
$ perl -MCGI -e " print CGI->new(q{?noampernoequal})->param "
keywords
$ perl -MCGI -e " print CGI->new(q{?noampernoequal})->keywords "
?noampernoequal
$ perl -MCGI -e " print for CGI->new(q{?noampernoequal})->keywords "
?noampernoequal
$ perl -MCGI -e " print for CGI->new(q{?a=b;noampernoequal})->keywords "
Easy to remove
use Data::Dump;
use CGI;
my $q = CGI->new( q{?noampernoequal});
dd $q;
dd $q->param;
dd $q->keywords;
$q->delete('keywords') if $q->keywords;
dd $q->param;
dd $q;
__END__
bless({
".charset" => "ISO-8859-1",
".fieldnames" => {},
".parameters" => ["keywords"],
"escape" => 1,
"param" => { keywords => ["?noampernoequal"] },
"use_tempfile" => 1,
}, "CGI")
"keywords"
"?noampernoequal"
()
bless({
".charset" => "ISO-8859-1",
".fieldnames" => {},
".parameters" => [],
"escape" => 1,
"param" => {},
"use_tempfile" => 1,
}, "CGI")
I don't think I've ever used this feature in 10 years :)
Well, I think that
$q->delete('keywords') if $q->keywords;
will delete the "keywords" parameter in every case, i.e. regardless of whether it has been automatically generated by CGI.pm or has actually been passed via query string. But as I wrote in my initial post, I would like to remove it only in the former case (for example when the script has been called without any parameters), but I want to keep it if it has been "actively" passed via query string.
Thus, I think I really have to check if there is no query string, or if there is a query string which does not contain = or &, and remove the "keywords" parameter accordingly.
Anyways, thank you very much for your suggestion. I have learned a lot from how you use dd and Data::Dump.
Regards,
Nocturnus
"will delete the "keywords" parameter in every case, i.e. regardless if it has been automatically generated by CGI.pm"
You're wrong :) I didn't read past that statement, but if you need convincing you are wrong, read the CGI.pm source
Re: [SOLVED] Unwanted parameter when executing CGI scripts
by Nocturnus (Beadle) on Jan 07, 2013 at 08:36 UTC
Well, there is more misbehavior in CGI.pm. For example, if we call
/cgi-bin/script.pl?keywords=bla
then $q->url_param contains the key "keywords", but the respective value is empty. That means that you can't pass a parameter named "keywords" via query string if you use $q->url_param. The value of the "keywords" parameter will just be overwritten by CGI.pm.
Due to the problems mentioned in this thread, and due to the problem described above, I dumped CGI.pm and now parse the query string myself (with the help of uri_unescape). My code is now working as expected.
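The actual code is not shown in this thread, so the following is only a sketch of what such a hand-rolled parser might look like. The names `parse_query_string` and `unescape_component` are hypothetical; the latter is a dependency-free regex stand-in for uri_unescape that additionally treats "+" as a space, as form encoding does.

```perl
use strict;
use warnings;

# Stand-in for uri_unescape: decode "+" as a space and %XX escapes.
sub unescape_component {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Hypothetical parser: returns { name => [values...] }. A bare word such
# as "Test" becomes a name with one empty value; no "keywords" entry is
# ever invented.
sub parse_query_string {
    my ($qs) = @_;
    my %params;
    return \%params unless defined $qs && length $qs;
    for my $pair (split /[&;]/, $qs) {
        next unless length $pair;
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        push @{ $params{ unescape_component($name) } }, unescape_component($value);
    }
    return \%params;
}

# Show which parameter names each sample query string yields.
for my $qs ('', 'Test', 'Test=', 'keywords=foo') {
    my $parsed = parse_query_string($qs);
    print "'$qs' => [", join(', ', sort keys %$parsed), "]\n";
}
```

With this approach an empty query string yields no parameters at all, and `keywords=foo` is an ordinary parameter like any other.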
Regards,
Nocturnus
"Well, there is more misbehavior in CGI.pm...."
Nope, wrong again.
But I'm glad you've found a solution that works for you