http://www.perlmonks.org?node_id=19154
Category: CGI Programming
Author/Contact Info Daniel Cormier, dcormier@themaker.net
Description: I don't use CGI.pm execpt to pull the params from the users request. I prefer to wrote my own raw HTML, so I never use that feture of CGI.pm. I'm sure there are more of you out there who do the same thing, so I'm posting what I use to pull params from a request. It pulls the users cookie, form (for either get or post) and puts them into 3 hashes: %cookie, %form and %params. %params contains %form and %cookie. If there's a param with the same name in the form and cookie, it's over-written by the form, but still accessable through %cookie. I've done a good amount of testing and it seems fool proof, so you just know someone will be making a better fool. Enjoy!
sub get_params {
	my @cookie_data = split(/; /, $ENV{'HTTP_COOKIE'});
	foreach $cookie (@cookie_data){
		(my $one, my $two) = split(/=/, $cookie);
		$cookies{$one} = $two;
		$params{$one} = $two;
	}

	if ($ENV{'REQUEST_METHOD'} eq "GET"){
		my @form_data = split(/&/, $ENV{'QUERY_STRING'});
		foreach $form (@form_data){
			(my $one, my $two) = split(/=/, $form);
			if ($one =~ m/\+/){
				$one =~ s/(\+)/" "/eg;
			}
			if ($two =~ m/\+/){
				$two =~ s/(\+)/" "/eg;
			}
			$one =~ s/%([\da-fA-F][\dA-Fa-f])/pack("C", hex($1))/eg;
			$two =~ s/%([\da-fA-F][\dA-Fa-f])/pack("C", hex($1))/eg;
			$form{$one} = $two;
			$params{$one} = $two;
		}
	}

	if ($ENV{'REQUEST_METHOD'} eq "POST"){
		read(STDIN, my $form_data, $ENV{'CONTENT_LENGTH'});
		my @form_data = split(/&/, $form_data);
		foreach $form (@form_data){
			(my $one, my $two) = split(/=/, $form);
			if ($one =~ m/\+/){
				$one =~ s/(\+)/" "/eg;
			}
			if ($two =~ m/\+/){
				$two =~ s/(\+)/" "/eg;
			}
			$one =~ s/%([\da-fA-F][\dA-Fa-f])/pack("C", hex($1))/eg;
			$two =~ s/%([\da-fA-F][\dA-Fa-f])/pack("C", hex($1))/eg;
			$form{$one} = $two;
			$params{$one} = $two;
		}
	}

	return %cookie, %form, %params;
}
Replies are listed 'Best First'.
RE: Get those parameters without CGI.pm
by Ovid (Cardinal) on Jun 21, 2000 at 07:17 UTC
    This is an example of why the best Perl books say we should stick with Lincoln Stein's CGI module rather than try to write this ourselves.

    The above code, aside from reinventing the wheel, has a variety of problems. The most immediately obvious one is the following section of code:

    foreach $form (@form_data){ (my $one, my $two) = split(/=/, $form); if ($one =~ m/\+/){ $one =~ s/(\+)/" "/eg; } if ($two =~ m/\+/){ $two =~ s/(\+)/" "/eg; } $one =~ s/%([\da-fA-F][\dA-Fa-f])/pack("C", hex($1))/eg; $two =~ s/%([\da-fA-F][\dA-Fa-f])/pack("C", hex($1))/eg; $form{$one} = $two; $params{$one} = $two; }
    This code is duplicated! It should be a subroutine.

    If you are going to reinvent the wheel, you should only do so when you are really comfortable with the language in question. Otherwise, these problems creep in and later down the road, when you're still using your reinventions, you'll come back to them and find they need to be rewritten to get around issues like this.

    And why are the hashes %forms and %params duplicated? The way they're created above, they are identical. I could go on, but that's enough for now.

    Cheers!

      No, %form and %params are not identical. %form contains (of course) the form data submitted with the users request. %params contains %cookie and %form. If there is a param with the same name in both hashes, it's overwritten by %form. However, if you know you want the value from the cookie, and a form was submitted with the same param, you can still get it from the %cookie hash.

      And thank you for your suggestion on how I could make the code better (instead of just telling me that it was flawed and not telling me anything useful).

RE: Get those parameters without CGI.pm
by davorg (Chancellor) on Jun 21, 2000 at 13:39 UTC

    Have you considered what your code will do when faced with a query string like:

    key1=val1&key2=val2a&key2=val2b

    which is a perfectly valid query string. Your %params hash will end up looking like:

    %param = (key1 => 'val1', key2 => 'val2b');

    The first value of key2 will have been lost. CGI.pm's parameter routines handle this correctly. If you have a key which has mulitple values, then the param function will return a list of values.


    --
    <http://www.dave.org.uk>

    European Perl Conference - Sept 22/24 2000
    <http://www.yapc.org/Europe/>
Re: Get those parameters without CGI.pm
by coolmichael (Deacon) on Jun 13, 2002 at 19:18 UTC
    I've been thinking about trying to do this for a while now. It would be an interesting challenge to try and write something like CGI.pm. It would be nice to make it smaller, possibly chop it into smaller pieces. I hope you understand that I'm not going to attack you for trying to write a replacement.

    You wrote:

    Well, instead of just telling me it's bad and doesn't solve anything, how about telling me what I could do to improve it? I doubt you wrote wonderful, perfictly optimised code shortly after you started learning perl.

    Ok here goes.

    1. It doesn't pass use strict. In order to make that happen, you need to declare these varriables:
      $cookie, %cookie, %params, $form, %form
    2. In the body of the function you store the cookie information in %cookies but you return %cookie. I really don't think that's right at all. use strict; would have caught that one.
    3. $one and $two are bad varriable names. It would be more readable if you had used $cookiename and $cookievalue, but that is more of a style thing.
    4. It is a bad idea to store name=value pairs in cookies, but a lot of websites do just that. I don't think your cookie code will work on that. For example (which is an actual google cookie) this isn't the correct result.

      Cookie:ID=7b6dfe0a3ce82015:FF=4:LD=en:NR=20:TM=1023854170:LM=1023854194:S=wEgMDFiEqMG
      one:ID
      two:7b6dfe0a3ce72015:FF

    5. I don't understand why you need to put your cookie data in to your %params. That just duplicates data needlessly. It also has a bug as you mentioned, that form data will override your cookie data in the %params. Why not just avoid the situation all together and keep %cookies, %params, and %form seperate?
    6. Ovid is quite right. You are using the same regular expression on four lines. This is a bad thing. It should be a function.
      s/%([\da-fA-F][\dA-Fa-f])/pack("C", hex($1))/eg;
    7. Other monks have mentioned that you don't properly handle ALL valid query strings (ie ?foo=bla&foo=lsi will ignore the second foo parameter)
    8. Merlyn is right, your regular expression to decode the hex strings is slow.
    9. s/\+/" "/eg isn't wrong, it's just not the best way to do it. If you insist on using s/// then try s/\+/ /g; (you don't need the /e modifier and you don't need quotes around your space.) a faster way still would be to use the tr operator I think. You don't need if $one =~ m/\+/. Also, you've got that same s/\+/" "/eg regular expressioin in four places. That would be much better off as a seperate subroutine. You'll see why if you change all four of them.
    10. return %cookies, %form, %params; doesn't do what you probably think it does. It returns one list, not three. In perl you can not pass or return more than one list. for example, with this query string

      http://localhost/cgi-bin/japh.cgi?abc=123&foo=bar&foo=were&blar&cose

      print "Content-Type: text/plain\n\n&";
      my (%cookie, %form, %params) = get_params();
      print Dumper \(%cookie, %form, %params)

      output the following:

      $VAR1 = {
            'foo' => 'were',
            'cose' => undef,
            'abc' => '123',
            'blar' => undef,
            };
      $VAR2 = {};
      $VAR3 = {};
      
            

    That is ten things that I see needing to be changed or fixed. Number 10 is the biggie. Once you have parsed your query string and filled %form and %params you don't get them returned the way you seem to be expecting. Also, because of the way that perl combines the hashes into lists, you can't tell ahead of time which keys will over ride which. I would suggest that you return references to your hashes, to keep them seperate.

RE: Get those parameters without CGI.pm
by swiftone (Curate) on Jun 21, 2000 at 18:48 UTC
    I haven't tried it, but if you are looking to avoid the overhead of CGI.pm, you might look into CGI_Lite.pm, which isn't as expansive, but has presumably been tested for problems.

    Another solution is the Perl 4 library cgi-lib.pl (or cgilib.pl). It's been through a great deal of testing, and you can always lift the subs right out of it if you don't want separate libraries.

    CGI.pm is better all around, but sometimes you do want something else.

      Lincoln has said for some time that he knows that CGI.pm is too big and is planning to split the module out into a number of smaller modules. No idea when this might happen tho'.

      There was also some talk about building a series of smaller modules from scratch on the ny.pm mailing list few months ago. Michael Schwern was leading the attack, but nothing has made it into his CPAN directory yet!


      --
      <http://www.dave.org.uk>

      European Perl Conference - Sept 22/24 2000
      <http://www.yapc.org/Europe/>
RE: Get those parameters without CGI.pm
by httptech (Chaplain) on Jun 21, 2000 at 18:38 UTC
    I use CGI.pm because it makes debugging from the command line much easier when you can enter your name/value pairs on STDIN before executing.
RE: Get those parameters without CGI.pm
by merlyn (Sage) on Jun 22, 2000 at 02:50 UTC
    I voted this code down. It doesn't solve anything in any way that makes it worth throwing away CGI.pm. It also does the de-hex slower than it needs.

    Basically, it's cargo cult code. Just remove it.

    -- Randal L. Schwartz, Perl hacker

      Well, instead of just telling me it's bad and doesn't solve anything, how about telling me what I could do to improve it? I doubt you wrote wonderful, perfictly optimised code shortly after you started learning perl.
        I didn't write great code right away. Sometimes, I don't even write good code now, and I'm always looking for feedback about what I could have done different or better.

        The point is that a beginner does not rewrite an expert's code (CGI.pm here) and say "I have a better way". This is not hubris. This is arrogance.

        Only when you understand why CGI.pm is the size that it is, can you build upon the art as an improvement. There are a lot of things that have to be done right to handle security; and a lot of others that also have to be done to handle portability. There's a cost to that, and you gotta pay the piper at some point; not cheat and pretend you can get away without it.

        -- Randal L. Schwartz, Perl hacker

        Which is, of course, why using modules is such a good idea. That way beginners don't need to write own (potentially flawed) code. They can reuse code written by experts and debugged by usage on thousands of servers over many years.


        --
        <http://www.dave.org.uk>

        European Perl Conference - Sept 22/24 2000
        <http://www.yapc.org/Europe/>
        He implied how to improve it, delete it and use CGI.pm. One way to become a better programmer is to not reinvent wheels which already roll.

        Cheers,
        KM

Re: Get those parameters without CGI.pm
by Anonymous Monk on May 05, 2010 at 08:14 UTC
    many thanks for inspiration ;)
    sub get_cgi_params_parse ($;$){ my ($str, $params) = @_; my $res = {}; $str =~ tr/+/ /; foreach (split /&/, $str){ my ($one, $two) = split /=/, $_; $one =~ s/%([\da-fA-F][\dA-Fa-f])/pack("C", hex($1))/eg; $two =~ s/%([\da-fA-F][\dA-Fa-f])/pack("C", hex($1))/eg; if (exists $res->{$one}){ $res->{$one} .= "\n".$two; }else{ $res->{$one} = $two; } $params->{$one} = $res->{$one}; } return $res; } sub get_cgi_params (;$$$$){ my ($get_str, $post_str, $cookie_str, $force_get) = @_; my $res = {_names => [], get => {}, post => {}, cookies => {}, vars +=> {}}; $force_get = 0 unless defined $force_get; $cookie_str = $ENV{'HTTP_COOKIE'} unless defined $cookie_str; if ($cookie_str){ foreach (split /; /, $cookie_str){ m/^(.+?)(=(.*))?$/; $res->{cookies}->{$1} = $3; } } if ($force_get or $ENV{'REQUEST_METHOD'} eq "GET" ){ $get_str = $ENV{'QUERY_STRING'} unless defined $get_str; $res->{get} = get_cgi_params_parse $get_str, $res->{vars} if $ge +t_str; } if ($ENV{'REQUEST_METHOD'} eq "POST"){ read(STDIN, $post_str, $ENV{'CONTENT_LENGTH'}) unless defined $pos +t_str; $res->{post} = get_cgi_params_parse $post_str, $res->{vars} if $po +st_str; } return $res; }

      It is interesting to see how people are against the idea of writing a replacement of CGI.PM because new code is worse to begin with. However one big point in favor of this replacement code is that you can see how CGI works under the hood.

      365 days of using CGI.PM as a library can never teach you that. I may not want to read the CGI.PM library precisely because it is mature and full of error checks and utilities that do not let me see the "essence". so such code at least has a __teaching__ aspect and should be applauded for that.

      And maybe it is okay to reinvent the wheel once in a while otherwise we can never make it rounder..

        It is interesting to see how people are against the idea of writing a replacement of CGI.PM because new code is worse to begin with.

        Hi, you must be new :)

        For years after CGI.pm appeared, people were still brokenly reinventing it

        For years after CGI.pm was made a core perl module, people were still brokenly reinventing it, and then asking why their code is broken and how to fix it

        It is such a frequent and impressive waste of time that Ovid wrote use CGI or die; and Ovid's CGI Course 10 years ago, 6 years after CGI obsoleted all imitators

        And maybe it is okay to reinvent the wheel once in a while otherwise we can never make it rounder..

        Yes, its ok to reinvent a wheel, like reinventing aspirin, but it has to be an obvious improvement, and you have to be able to make it yourself and be prepared for the appropriate amount of criticism

        But, if you're advocating hitting yourself over the head with a rock as a replacement for aspirin , expect the worst :)

        Take a look through the bug reports and changes files some time -- anyone reinventing CGI.pm is smart enough to make thrice as many mistakes :)

        See CGI::Simple, its CGI.pm minus the html generation code (ie, its faster), written 10 years ago, through advanced reuse concept of "copy" and "paste"

        See also PSGI/Plack, written 2 years ago, it is the next generation , beyond the CGI (note, not CGI.pm )

        Sometimes it's a good thing, sometimes it isn't. See 935499 for a discussion on that topic.

        What you have to really understand by heart when attempting such a project: Those matters are far more complex than they look. I know, i wrote my own Webserver.

        While the basics - e.g. getting the most used stuff working - is most times not that difficult, the devil is often in the details. You'll have to make the more obscure features work (like handle multipart forms for file uploads in your case and also parse headers like if-modified-since). You'll do error handling, handle interrupted connections, intentionally misformed data (hacking attempts) and such.

        And - if you publish your solution and people base their projects on it, it would be your implied responsibility to keep the API stable while still continuing development.

        If you are willing to embark on that journey, you are very welcome.

        Don't count on it beeing an easy journey or a quick one. If you succeed, it will be a personal triumph for you that may include widespread adoption of your solution. But you better plan to work hard for months or years to get to that point.

        BREW /very/strong/coffee HTTP/1.1
        Host: goodmorning.example.com
        
        418 I'm a teapot