http://www.perlmonks.org?node_id=147874

Recently there have been multiple meditations about how CGI.pm could be made considerably faster. I've been emailing with David James, the author of CGI.pm version 3.0, which is available from the official CGI.pm site but is not (yet) maintained by Lincoln Stein. Like others here, I downloaded 3.0 and was confused about its merits over previous versions. It seemed faster, and it didn't seem to have any interface changes from the somewhat out-of-date CGI.pm version 2.7 (versus Lincoln's latest, 2.8), but I needed more info, and David was kind enough to fill me in.

In short, barring bugs, it has no interface changes at all from 2.7, and it is twice as fast. I think it's worth plugging as a candidate replacement; I imagine all it needs is thorough bug-testing and some reassurance for Lincoln that it has passed peer review. Is it too late for potential inclusion in perl 5.8? Maybe, but maybe not... I've copied the substance of David's email below.


On Sun, 3 Feb 2002, David James wrote:

> Hi Daniel,
>
> Sorry for the delay! But hopefully this email will be worth the wait.
>
> > 1) are there any interface changes at all?  The documentation is
> >    identical to CGI 2.7x.
> Nope, use it just like CGI.pm. Barring bugs, usage should be identical.
>
> > 2) in what ways is CGI3 better?  I see a different set of sub-modules,
> >    and CGI.pm itself is 1/2 the size.  Does it work identically with less
> >    code?  Are there other efficiency improvements?
> What's different? It's twice as fast, and runs on average way less code,
> thanks to the fact that the big CGI.pm is split off into many sub-modules,
> which are loaded as needed. The new CGI.pm itself in fact has very little
> code -- the bulk of the size of the file is due to the documentation.
>
> To improve efficiency, I replaced slow things (like procedure calls and
> copying arrays and sequential searches) with faster things (like inlined
> procedures, inplace editing, and hash lookups).
>
> The end result of all this is that CGI.pm 3.01 is at least twice as fast as
> CGI.pm 2.80. Here are a few of the major efficiency changes:
> - The heavily used rearrange routine is now passed a hash instead of an
> array. It still supports the old calling method, but the new calling method
> saves rearrange a lot of work!
> - The self_or_default and self_or_cgi routines are no longer called at the
> top of every function. Instead they are only called if needed, using the
> following criteria:
>     - Whenever a new CGI object is created, it is created as a "CGI::Object"
> object. In this case we know that the function is being called in an object
> oriented manner -- thus the function can be called directly with no wrapper.
>     - Whenever functions are exported, they are exported from the CGI::Func
> namespace. In this case we know that the function is being called in a
> functional manner, and thus we must always add the default object to the @_
> array.
>     - If functions are called in the CGI:: namespace, we do not know whether
> functions are being called in the object-oriented fashion or in the
> function-oriented fashion. Thus we must do the "self_or_default" check. This
> only applies to programs which directly call the CGI::* functions without
> exporting them.
> - Almost every function copies @_ into @p when calling self_or_default. That
> copy is unnecessary now that self_or_default has been deprecated, so we can
> just deal with @_ directly.
> - escapeHTML is slow. It has several global substitutions when only one
> substitution is needed. I've patched it to be faster.
> - CGI::Object::escapeHTML now does in-place editing for internal usage.
> However CGI::escapeHTML still copies its arguments for sake of compatibility.
>
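For anyone who wants to see the three dispatch cases from the list above in concrete form, here is a minimal sketch. The packages `Demo::Object`, `Demo::Func`, and `Demo`, and the `greet` routine, are hypothetical stand-ins for `CGI::Object`, `CGI::Func`, and `CGI`; this is not code from CGI.pm 3.01, just one way the idea could look.

```perl
use strict;
use warnings;

# Hypothetical stand-in for CGI::Object: always called as a method,
# so no self_or_default check is needed.
package Demo::Object;

sub new {
    my ($class, %args) = @_;
    my $name = defined $args{name} ? $args{name} : 'default';
    return bless { name => $name }, $class;
}

sub greet {
    my ($self, $greeting) = @_;
    return "$greeting, $self->{name}";
}

# Hypothetical stand-in for CGI::Func: what exported functions would point at.
# These are always called function-style, so the default object is always added.
package Demo::Func;

my $default;    # lazily created default object

sub default_object {
    $default ||= Demo::Object->new;
    return $default;
}

sub greet { return Demo::Object::greet(default_object(), @_) }

# Hypothetical stand-in for the plain CGI:: namespace: the calling style is
# unknown, so the first argument has to be inspected on every call.
package Demo;

use Scalar::Util qw(blessed);

sub self_or_default {
    return @_ if blessed($_[0]) && $_[0]->isa('Demo::Object');
    shift @_ if defined $_[0] && !ref $_[0] && $_[0] eq 'Demo';    # Demo->greet(...)
    return (Demo::Func::default_object(), @_);
}

sub greet { return Demo::Object::greet(self_or_default(@_)) }

package main;

my $obj = Demo::Object->new(name => 'object');
print $obj->greet('hello'), "\n";           # object-oriented: direct call
print Demo::Func::greet('hello'), "\n";     # what an exported function would do
print Demo::greet('hello'), "\n";           # ambiguous namespace, function-style
print Demo::greet($obj, 'hello'), "\n";     # ambiguous namespace, object passed in
```

Only the last package pays for the check on every call; the first two already know how they are being called.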
> Here's some raw data gathered on my machine to show how I improved CGI.pm's
> speed:
> CGI.pm (original) runs at 157.98 pages per second
> CGI.pm (with rearrange patch) runs at 182.82 pages per second
> CGI.pm (also with self_or_default patch) runs at 221.24 pages per second   
> CGI.pm (also with @p patch) runs at 228.31 pages per second
> CGI.pm (also with escapeHTML patch, but without inplace editing) runs at 240.38 pages per second
> The rest of the changes (not listed here), including inplace editing for
> escapeHTML, add another 50% to CGI.pm's speed, leaving the new CGI.pm 3.01
> running at 336.70 pages per second.
> 
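To put the pages-per-second figures quoted above on a common scale, here is a small sketch that turns them into ratios. The numbers are copied from the email; the `speedup` helper and the stage labels are only illustrative.

```perl
use strict;
use warnings;

# Pages per second, copied from the figures quoted in the email.
my @stages = (
    [ 'original',          157.98 ],
    [ '+ rearrange',       182.82 ],
    [ '+ self_or_default', 221.24 ],
    [ '+ @p',              228.31 ],
    [ '+ escapeHTML',      240.38 ],
    [ 'CGI.pm 3.01',       336.70 ],
);

# Ratio of the new rate to the old rate (hypothetical helper).
sub speedup {
    my ($old, $new) = @_;
    return $new / $old;
}

my $base = $stages[0][1];
my $prev = $base;
for my $stage (@stages) {
    my ($label, $rate) = @$stage;
    printf "%-20s %7.2f/s  x%.2f vs original, x%.2f vs previous step\n",
        $label, $rate, speedup($base, $rate), speedup($prev, $rate);
    $prev = $rate;
}
```

By these numbers the fully patched 2.80 comes out at roughly 1.5 times the original rate, and 3.01 at roughly 2.1 times the original, or about 1.4 times the patched 2.80.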
> The main problem with CGI.pm 3.01 is that it's missing a lot of Lincoln's   
> latest changes. He works on the 2.x line and not on the 3.x line. (It'd be 
> great if we could convince him to work on 3.x!) The raw data for most of my 
> observations above is included below. You can verify it yourself with the  
> benchmark-object.pl script included with CGI.pm 3.01. (I changed it to run
> 1000 iterations on my machine as 200 is insufficient to get a proper count on
> my machine). Obviously your numbers won't be comparable to mine since we have
> different speed machines, but you should see the same ratio of speeds between
> the different versions.
> 
> I've created a patched version of 2.80 that is 50% faster than Lincoln's 
> version. Even though it's still 50% slower than 3.01, perhaps it's the best
> thing to use since it includes Lincoln's latest changes and bugfixes. It's  
> also more conservative than CGI 3.01 (it's more similar to Lincoln's stuff)
> so maybe it'd be easier to convince Lincoln to adopt this as the latest
> version of CGI.pm. Let me know if you want this version.
> 
> Thanks for taking a look at my code! I hope you like it :)
> 
> David James
> 
> RAW DATA:
> CGI.pm (original) Benchmark
> timethis 1000:  7 wallclock secs ( 6.33 usr +  0.00 sys =  6.33 CPU) @ 157.98/s (n=1000)
> CGI.pm (with rearrange patch)
> timethis 1000:  5 wallclock secs ( 5.47 usr +  0.00 sys =  5.47 CPU) @ 182.82/s (n=1000)
> CGI.pm (also with self_or_default patch):
> timethis 1000:  5 wallclock secs ( 4.52 usr +  0.00 sys =  4.52 CPU) @ 221.24/s (n=1000)
> CGI.pm (also with @p patch):
> timethis 1000:  4 wallclock secs ( 4.37 usr +  0.01 sys =  4.38 CPU) @ 228.31/s (n=1000)
> CGI.pm (also with escapeHTML patch):
> timethis 1000:  4 wallclock secs ( 4.16 usr +  0.00 sys =  4.16 CPU) @ 240.38/s (n=1000)
> CGI.pm 3.01
> timethis 1000:  3 wallclock secs ( 2.97 usr +  0.00 sys =  2.97 CPU) @ 336.70/s (n=1000)
>  
> CGI.pm 2.80 (original) DProf Printout
> Total Elapsed Time = 6.470529 Seconds
>   User+System Time = 6.430529 Seconds
> Exclusive Times
> %Time ExclSec CumulS #Calls sec/call Csec/c  Name
>  27.6   1.778  2.046  24024   0.0001 0.0001  CGI::escapeHTML
>  25.0   1.609  1.574   8010   0.0002 0.0002  CGI::Util::rearrange
>  15.2   0.980  0.804  39048   0.0000 0.0000  CGI::self_or_default
>  11.2   0.724  2.184   1001   0.0007 0.0022  CGI::checkbox_group
>  6.48   0.417  1.628   1001   0.0004 0.0016  CGI::popup_menu
>  5.13   0.330  0.346   4009   0.0001 0.0001  CGI::param
>  4.96   0.319  0.410   1001   0.0003 0.0004  CGI::start_html
>  4.34   0.279  0.893   1001   0.0003 0.0009  CGI::hidden
>  3.27   0.210  0.192   4004   0.0001 0.0000  CGI::p
>  2.95   0.190  0.441   1001   0.0002 0.0004  CGI::_textfield
>  2.49   0.160  0.630   1001   0.0002 0.0006  CGI::url
>  2.33   0.150  0.132   4004   0.0000 0.0000  CGI::_checked
>  1.71   0.110  0.119     25   0.0044 0.0048  CGI::_compile
>  1.57   0.101  7.054      4   0.0254 1.7634  Benchmark::__ANON__
>  1.56   0.100  0.211   1001   0.0001 0.0002  CGI::submit
> 
> CGI.pm 2.80 (with all my patches) DProf Printout
> Total Elapsed Time = 4.664679 Seconds
>   User+System Time = 4.722357 Seconds
> Exclusive Times
> %Time ExclSec CumulS #Calls sec/call Csec/c  Name
>  27.7   1.310  1.226  24024   0.0001 0.0001  CGI::escapeHTML
>  15.2   0.720  0.692   8010   0.0001 0.0001  CGI::Util::rearrange
>  13.6   0.646  1.530   1001   0.0006 0.0015  CGI::checkbox_group
>  8.00   0.378  1.032   1001   0.0004 0.0010  CGI::popup_menu
>  5.51   0.260  0.355   1001   0.0003 0.0004  CGI::start_html
>  4.53   0.214  4.869      4   0.0536 1.2174  Benchmark::__ANON__
>  4.45   0.210  0.498   1001   0.0002 0.0005  CGI::hidden
>  3.18   0.150  0.474   1001   0.0001 0.0005  CGI::startform
>  3.18   0.150  0.215   1001   0.0001 0.0002  CGI::submit
>  2.96   0.140  0.305   1001   0.0001 0.0003  CGI::url
>  2.75   0.130  0.116   4009   0.0000 0.0000  CGI::param
>  2.54   0.120  0.129     25   0.0048 0.0052  CGI::_compile
>  2.54   0.120  0.106   4004   0.0000 0.0000  CGI::p
>  2.33   0.110  0.096   4004   0.0000 0.0000  CGI::_checked  
>  2.33   0.110  0.225   1001   0.0001 0.0002  CGI::_textfield
>  
> CGI.pm 3.01 DProf Printout
> Total Elapsed Time = 3.346381 Seconds
>   User+System Time = 3.346381 Seconds
> Exclusive Times
> %Time ExclSec CumulS #Calls sec/call Csec/c  Name
>  32.8   1.100  0.992  24024   0.0000 0.0000  CGI::Object::Html::escapeHTML
>  30.1   1.010  0.978   7007   0.0001 0.0001  CGI::Object::rearrange
>  10.1   0.338  0.846   1001   0.0003 0.0008  CGI::Object::Html::popup_menu
>  8.91   0.298  1.067   1001   0.0003 0.0011  CGI::Object::Html::checkbox_group
>  6.57   0.220  0.529   1001   0.0002 0.0005  CGI::Object::Html::hidden
>  5.38   0.180  0.153   6006   0.0000 0.0000  CGI::Object::Html::__ANON__
>  4.18   0.140  0.217   1001   0.0001 0.0002  CGI::Object::Html::start_html
>  3.88   0.130  0.279   1001   0.0001 0.0003  CGI::Object::Html::_textfield
>  3.02   0.101  3.631      4   0.0254 0.9078  Benchmark::__ANON__
>  2.99   0.100  0.095   1001   0.0001 0.0001  CGI::Object::Html::previous_or_default
>  2.99   0.100  0.186   1001   0.0001 0.0002  CGI::Object::Html::startform
>  2.69   0.090  0.177   1001   0.0001 0.0002  CGI::Object::Html::submit
>  2.39   0.080  0.071   2002   0.0000 0.0000  CGI::Object::Html::_set_values_and_labels
>  1.79   0.060  0.150      3   0.0200 0.0499  main::BEGIN
>  1.49   0.050  0.041   2003   0.0000 0.0000  CGI::Object::State::param
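
escapeHTML is the top entry in all three profiles above, and the email mentions replacing several global substitutions with a single one and editing in place for internal use. Here is a minimal sketch of that general technique. It is not David's actual patch: the routine names, the `%ESCAPE` table, and the exact set of escaped characters are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical character table; what the real CGI.pm escapes may differ.
my %ESCAPE = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
);

# Several global substitutions: the string is scanned once per character class.
sub escape_multi {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must run first so later entities are not re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# One substitution: a single scan, with a hash lookup per match.
sub escape_single {
    my ($text) = @_;
    $text =~ s/([&<>"])/$ESCAPE{$1}/g;
    return $text;
}

# In-place variant: edits the caller's variable through the @_ alias and so
# skips the copy. Only appropriate when the caller no longer needs the original.
sub escape_in_place {
    $_[0] =~ s/([&<>"])/$ESCAPE{$1}/g;
    return;
}

my $sample = q{<a href="x?a=1&b=2">link</a>};
print escape_single($sample), "\n";
```

Whether the single-pass form is actually faster depends on the input and the perl in use, so it is worth timing with the Benchmark module before relying on it.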

2002-02-27 Edit by Corion - fixed runaway <A> tag

2002-02-27 Small clarification by da

(kudra: changed timeline) Re: CGI.pm version 3
by kudra (Vicar) on Feb 27, 2002 at 13:06 UTC
    Not sure how much this has to do with it, but I just wanted to mention that the 5.8 schedule is revised. I don't follow the mailing list myself, but H. Merijn forwarded from p5p to Amsterdam.pm.
    2002-03-04	5.7.3
    2002-03-25	Code Freeze for RC1
    2002-04-08	RC1
    
    My guess is it won't be a matter of time so much as the fact that the module maintainer hasn't adopted it.
        No, that's the original timeline, which was linked in the root node. I'm talking about the revised timeline, which isn't on use perl; (as far as I know) but was posted on p5p. It puts more time between 5.7.3 and RC1.
Does Lincoln Recommend It? - Re: CGI.pm version 3
by metadoktor (Hermit) on Feb 27, 2002 at 13:45 UTC
    "CGI.pm Version 3.01: Version 3.0 of CGI.pm provides a modularized design and significant performance enhancements, courtesy David James. Please try and report any bugs or misfeatures to me."

    Hmm...if Lincoln is not maintaining it then why does he want people to report the problems with 3.0 to him?

    metadoktor

    "The doktor is in."

      That's a good question. I've emailed Lincoln asking about his thoughts on CGI v.3; my assumption is that he does recommend it, since he calls it CGI.pm version 3. However, he also added new features to version 2.

      ___ -DA
        This was the response I got back from Lincoln Stein:
        
        Actually I've scheduled version 3 to be removed from CPAN altogether.
        It has too many bugs for production use and although many people send
        me bug reports, no one has volunteered to help me fix them.
        

        ___ -DA