Re: CGI::Simple vs CGI.pm - Is twice as fast good enough?

by vladb (Vicar)
on Feb 16, 2002 at 02:13 UTC


in reply to CGI::Simple vs CGI.pm - Is twice as fast good enough?

I think, yes, denying the importance of using slimmer and faster variants of certain common tools/modules is not the wisest thing to do. For one, I already see quite a number of CGI scripts that need a performance boost, and one way I might get it is by simply converting from 'use CGI;' to 'use CGI::Simple;'. This will be especially easy to do since most of my CGI scripts don't require the extended features supplied with the standard CGI module.

However, as far as 'extended' features go, is it not true that CGI doesn't really load them until they are first requested inside the main code? Basically, CGI keeps a %SUBS hash which contains the source of a whole bunch of subroutines. Each one is compiled only the first time it is requested. I feel like the author(s) of the module tried hard to drive this point across with this comment (ripped from CGI.pm):

###############################################################################
################# THESE FUNCTIONS ARE AUTOLOADED ON DEMAND ####################
###############################################################################

This is followed by the infamous %SUBS hash:

%SUBS = (

'read_from_client' => <<'END_OF_FUNC',
# Read data from a file handle
sub read_from_client {
    my($self, $fh, $buff, $len, $offset) = @_;
    local $^W=0;                # prevent a warning
    return undef unless defined($fh);
    return read($fh, $$buff, $len, $offset);
}
END_OF_FUNC

### MANY OTHER EXCITING SUBS ###
);

So, say, even if I went the 'use CGI;' way, the only time wasted here (provided I have no interest in calling the CGI::read_from_client() method) is the time required to build the hash. Perl stores each sub's source as a plain string and never compiles it unless it is called. This is a huge time saver compared to defining the subs the standard way, outside the hash.
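
To make the mechanism concrete, here is a minimal sketch of compile-on-first-call from a hash of source strings. It is not CGI.pm's actual AUTOLOAD code; the package name Lazy::Demo and the sub shout() are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Lazy::Demo;    # hypothetical package name, for illustration only

our $AUTOLOAD;

# Source text of rarely used subs, stored as plain strings.
# Perl keeps these as data; nothing in here is compiled at load time.
my %SUBS = (
    'shout' => <<'END_OF_FUNC',
sub shout {
    my ($text) = @_;
    return uc($text) . '!';
}
END_OF_FUNC
);

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;               # strip the package qualifier
    return if $name eq 'DESTROY';    # never try to autoload destructors

    my $code = delete $SUBS{$name}
        or die "Undefined subroutine $AUTOLOAD called";

    # Compile the stored source on first use, inside this package.
    eval "package Lazy::Demo; $code; 1" or die "Cannot compile $name: $@";

    no strict 'refs';
    goto &{"Lazy::Demo::$name"};     # re-dispatch to the freshly defined sub
}

package main;

print Lazy::Demo::shout('loaded on demand'), "\n";    # first call goes through AUTOLOAD
print Lazy::Demo::shout('already compiled'), "\n";    # later calls hit the real sub directly
```

The cost that remains at load time is building the hash of strings, which is the part tachyon's reply below puts numbers on.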

I'm wondering if this would explain the fact that CGI::Simple is only 50% faster than CGI?

"There is no system but GNU, and Linux is one of its kernels." -- Confession of Faith

Replies are listed 'Best First'.
Re: Re: CGI::Simple vs CGI.pm - Is twice as fast good enough?
by tachyon (Chancellor) on Feb 16, 2002 at 12:01 UTC

    When you use CGI.pm, 700 lines of code get 'compiled' and another 2400 lines of code get put in the subs hash. While the code in the subs hash is not compiled, a fairly large hash must be generated, which takes both memory and time. CGI::Simple uses SelfLoader to avoid compiling methods that are rarely used. You do this by placing these methods below a __DATA__ token. Compilation stops at the __DATA__ token. As a result, when you use CGI::Simple only 300 lines of code actually get compiled.

    With SelfLoader, if you call one of the methods below the data token, then (from the docs):

    The SelfLoader will read from the CGI::Simple::DATA filehandle to load in the data after __DATA__, and load in any subroutine when it is called. The costs are the one-time parsing of the data after __DATA__, and a load delay for the _first_ call of any autoloaded function. The benefits (hopefully) are a speeded up compilation phase, with no need to load functions which are never used.

    One of the neat things about SelfLoader is that if you know that you will regularly use methods x, y, and z you can easily tune the module by placing these above the data token. As a result they will be available without using SelfLoader and the runtime overhead of using SelfLoader need never be paid.
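
For anyone who has not used SelfLoader, here is a minimal single-file sketch of the layout described above. The package name My::Tiny and both subs are hypothetical, and a real module would live in its own .pm file; this is only meant to show which side of the __DATA__ token each sub sits on.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package My::Tiny;    # hypothetical module name, for illustration only
use SelfLoader;      # installs an AUTOLOAD that reads from My::Tiny::DATA

# Frequently used: sits above __DATA__, so it is compiled when the file loads.
sub common { return 'compiled at load time' }

package main;

print My::Tiny::common(), "\n";
print My::Tiny::rare(), "\n";    # first call makes SelfLoader parse the DATA section

# Switch back so that the __DATA__ token below belongs to My::Tiny.
package My::Tiny;

__DATA__
# Rarely used: the compiler stops at __DATA__, so this sub is only
# parsed and compiled by SelfLoader when something first calls it.
sub rare { return 'loaded via SelfLoader' }
```

Tuning then amounts to moving a sub from below the token to above it once you know it gets called on most requests.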

    One of the not so neat things is that you have to load SelfLoader to use it, so there is a compile time penalty that you must pay. Fortunately SelfLoader.pm is only 100 lines of code. I was sorely tempted to 'roll my own', as you can do this with much less code provided you do not have to 'cover all the bases' as the module does. This, however, went against the Perl concept of using modular code when available. Similarly, CGI::Simple uses IO::File->new_tmpfile() to generate a self-destructing temp file for file uploads, leaving all the details to this module. IO::File is called via a require, so you only load it if and when you need it.
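
The require-on-demand idea can be sketched like this. The helper name upload_tmpfile() is made up and this is not CGI::Simple's source; it only shows that IO::File is pulled in by the first call rather than at compile time.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not taken from CGI::Simple's source.
sub upload_tmpfile {
    require IO::File;    # loaded the first time this sub runs, not at compile time
    my $fh = IO::File->new_tmpfile
        or die "Cannot create temp file: $!";
    return $fh;          # the temp file goes away when the handle is closed
}

my $before = exists $INC{'IO/File.pm'} ? 'yes' : 'no';

my $fh = upload_tmpfile();
print {$fh} "uploaded bytes\n";
seek $fh, 0, 0;          # rewind so the data can be read back
my $line = <$fh>;

my $after = exists $INC{'IO/File.pm'} ? 'yes' : 'no';
print "IO::File loaded before call: $before, after call: $after\n";
print "read back: $line";
```

A script that never handles an upload never calls the helper, so it never pays for loading IO::File.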

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

      tachyon wrote:

      CGI::Simple uses SelfLoader to avoid compiling methods that are rarely used. You do this by placing these methods below a __DATA__ token. At compile time compilation stops at the __DATA__ token.

      I didn't notice this part at first. mod_perl scripts cannot contain __DATA__ tokens. Do you have a solution for this? I suppose you can make a separate mod_perl implementation without the __DATA__ token. Since the performance issue you're resolving is load time, this really doesn't apply in this instance. However, then you have CGI::Simple, CGI::Simple::Standard, and CGI::Simple::mod_perl. I don't see a problem with that if you really need those namespaces to address these issues, but I wonder if others would object.

      Cheers,
      Ovid

      Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

        Ah, says he. Logically you just delete the use SelfLoader line and the __DATA__ tag. In load-time tests, a version of CGI::Simple without SelfLoader and the __DATA__ tag loads/compiles slower than the SelfLoader version but is still faster than CGI.pm, plus all the methods are compiled and ready to go.

        There would seem to be three options: a mod_perl version without the __DATA__ tag; cutting the module into two parts, with the less used functions in another module that gets required at runtime; or a CGI.pm-type solution.

        I would favour having a CGI::Simple::mod_perl module that is just the standard module without SelfLoader and the __DATA__ tag. This is easiest to maintain, and since most scripts need some tuning to run under mod_perl anyway, modifying the use CGI::blah line should be no big deal.

        In reality, under mod_perl there is no good reason not to just use CGI.pm, as the load time, size, etc. are not an issue and it is well proven.

        cheers

        tachyon

        s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

        I didn't notice this part at first. mod_perl scripts cannot contain __DATA__ tokens.

        Since that restriction only applies to scripts which run under Apache::Registry, there should be no problems with CGI::Simple and its usage of __DATA__ under mod_perl.

        --
        Ilya Martynov (http://martynov.org/)
