PerlMonks  

Re: FastCGI and mod_perl compared

by mugwumpjism (Hermit)
on Aug 27, 2001 at 17:23 UTC ( #108110=note )


in reply to FastCGI and mod_perl compared

Both FastCGI and mod_perl achieve their speed-up by avoiding the overhead of starting a process for each CGI request.

Yes and no. They both do that, but FastCGI achieves a greater speedup by maintaining a user-defined pool of processes that handles all requests. mod_perl, on the other hand, equips every web server process with a fully fledged copy of Perl. Unfortunately, this means that when a server process is just serving a flat file, its Perl instance is taking up memory but sitting idle.

Both FastCGI and mod_perl claim benefits beyond speed-up. As a novice I don't understand what many of these features are, but according to this chart, mod_perl has more of them.

I had a look at that list. Some of the statements are just plain incorrect, such as "need to reboot httpd when script changes on disk". For mod_perl this is a chronic problem - in some cases even a graceful restart just doesn't work - whereas in the worst case with FastCGI all you have to do is signal a FastCGI script to exit; it restarts and you didn't even miss a request. Or, if you prefer, you can have your script automatically check whether the critical files have changed and then exit, causing them all to be reloaded. In summary, this is more of a problem with mod_perl and less of a problem with FastCGI, but not according to that chart.
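
The self-restarting trick described above can be sketched in a few lines. This is only an illustration, assuming the CGI::Fast module (Debian: libcgi-fast-perl) and its FCGI backend are installed:

```perl
# Sketch: a FastCGI script that notices its own source has changed on
# disk and exits after finishing the current request, so the FastCGI
# process manager respawns it with the new code - no missed requests.
use strict;
use CGI::Fast;

my $mtime_at_start = ( stat $0 )[9];    # mtime of this script when loaded

while ( my $q = CGI::Fast->new ) {
    print $q->header, "served by pid $$\n";

    # The request has been fully answered; now it is safe to exit
    # and let the process manager start a fresh copy.
    last if ( stat $0 )[9] > $mtime_at_start;
}
```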

The other "features" I would mainly call "hacks". They do all sorts of things with bits of the HTTP request to figure out what to do with it, but why bother when you could do all of that in Perl? Treat the HTTP request as a remote procedure call, which is really what it is. Need authentication? Just write code to check it. Hand off every request to the FastCGI script to decide what to do with it, so you can specify the behaviour in Perl code rather than Apache's arcane and non-intuitive configuration syntax.
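
The dispatch-everything-in-Perl idea might look like this. A sketch only: the paths and handler names are hypothetical, and it assumes CGI::Fast is installed:

```perl
# Sketch: one FastCGI script receives every request, treats it as a
# remote procedure call, and decides everything - routing and
# authentication included - in plain Perl instead of Apache directives.
use strict;
use CGI::Fast;

my %route = (
    '/hello'  => \&hello,
    '/secret' => \&secret,
);

while ( my $q = CGI::Fast->new ) {
    my $path = $ENV{PATH_INFO} || '/hello';

    # Authentication is just Perl code, not a config directive.
    if ( $path eq '/secret' && !authenticated($q) ) {
        print $q->header( -status => '403 Forbidden' );
        next;
    }
    ( $route{$path} || \&not_found )->($q);
}

sub authenticated { my ($q) = @_; return defined $q->param('token') }
sub hello         { my ($q) = @_; print $q->header, "hello\n" }
sub secret        { my ($q) = @_; print $q->header, "members only\n" }
sub not_found     { my ($q) = @_; print $q->header( -status => '404 Not Found' ) }
```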

And what about memory leaks? mod_perl has no recovery. You have to restart - that is, kill and then restart - your web server, causing all current connections to be abruptly broken.

You can attribute this chart to pride (aka hubris) on Doug MacEachern's part; they put a lot of time and effort into what they did, and their pride makes them blind to the faults in mod_perl. Sometimes it takes an upstart to reject the status quo, who is branded a heretic and thrown out. Currently mod_perl simply does not scale and is a bitch to maintain and code under, but no one will say that because they don't want to face the rejection; as Carlos Castaneda would say, the first challenge of a man of knowledge is overcoming his fear. Larry Wall is wrong in stating that hubris is a virtue of a programmer. It is good to be self-confident, but to be proud implies that you do not know whether something is true and are accepting it as true based on what someone else said. This is also known as lying to oneself, which increases the level of "consensus with the status quo" in your existence and restricts your ability to see truth.

Let me make this clear to you - FastCGI is a technically superior design. The limitations of mod_perl (i.e. a hundred copies of Perl sitting idle) are only solved in mod_perl 2.0, which builds threading into mod_perl; great, but IMO entirely inappropriate for a dynamic web server. When will threading in Perl be stable? I wouldn't trust it until Parrot is considered stable. Because the web server and the Perl process do very little communication per request, threading is inappropriate. There are two communications per request: one, the detail of the HTTP request; two, the response. Why bother with threading for that? Have two processes communicating via sockets or even shared memory and be done with it.

The process for installing FastCGI looks a little scary. The docs begin something like: "...you need a FastCGI-savvy version of Perl..." There doesn't seem to be a Debian package for FastCGI...

Any recent version of Perl will do. There are Debian packages, too:

sam@fractal:~$ apt-cache search fastcgi
libcgi-fast-perl - CGI::Fast Perl module
libfcgi-perl - FastCGI Perl module
libapache-mod-fastcgi - FastCGI module for Apache
sam@fractal:~$
It might be easier to run the webserver and CGI programs on separate boxes; I'm not clear on how big a feature [this] actually is.

Depends how big your site is. You'd be surprised how much faster it can be; you get effects like the web server code never leaving the CPU's instruction cache, each part of the request being served within one CPU time slice, and so on. Adding an extra tier can make the biggest difference in performance once you reach a certain level. But the mod_perl team wouldn't know that - after all, they can't even do it.

Interestingly, I notice that PHP supports running as a FastCGI process.

If the teacher is not respected, and the student not cared for, confusion will arise, no matter how clever one is.  - Tao Te Ching

Replies are listed 'Best First'.
Re: Re: FastCGI and mod_perl compared
by tomhukins (Curate) on Aug 27, 2001 at 18:24 UTC

Whilst you raise some interesting points, you fail to mention the common workarounds deployed to deal with these problems. I have found mod_perl does a great job as a high-performance Web application server.

    FastCGI achieves greater speedup by keeping a user-defined number of processes for all requests to be processed by. mod_perl, on the other hand, tries to equip every web server with a fully fledged copy of Perl.

    On any operating system I've used, the fully fledged copy of Perl is shared between all httpd processes using copy-on-write memory. This isn't a problem.

    If a web server is just serving a flat file, that Perl instance is taking up memory but sitting idle.

    The mod_perl guide describes various strategies for dealing with this. Reverse proxying is commonly deployed using a lightweight Apache httpd or Squid. Or you can use a separate httpd, configured differently for serving flat files such as images. Furthermore, you can load your code into memory when the parent httpd starts up, ensuring that code is shared between child httpd processes.
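
    The reverse-proxy and preloading strategies mentioned above can be sketched in httpd.conf. The paths and port number here are assumptions, not a recommended configuration:

```apache
# Front-end (lightweight) Apache: serve static files itself, proxy only
# dynamic URLs to the heavy mod_perl back end listening on port 8080.
ProxyPass        /perl/ http://127.0.0.1:8080/perl/
ProxyPassReverse /perl/ http://127.0.0.1:8080/perl/

# Back-end httpd.conf: load Perl code in the parent process at startup,
# so the compiled modules stay shared (copy-on-write) across children.
PerlRequire /etc/apache/startup.pl
```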

    Some of the statements are just plain incorrect, such as "need to reboot httpd when script changes on disk" - for mod_perl this is a chronic problem

    On a development server, you can use Apache::Reload to avoid such problems. On a production server, your code shouldn't change frequently, anyway.
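
    The Apache::Reload setup is a couple of httpd.conf lines; a sketch for a development server (check the module's documentation for your version):

```apache
# Recompile changed Perl modules on each request instead of
# restarting httpd.
PerlModule      Apache::Reload
PerlInitHandler Apache::Reload
# Optionally reload only modules that opt in with "use Apache::Reload;":
PerlSetVar ReloadAll Off
```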

    And what about memory leaks? mod_perl has no recovery.

    True, but it's easy to avoid most memory leaks in Perl by using good programming practices (use strict, use warnings, etc.).

    Additionally, mod_perl integrates much more tightly with Apache than FastCGI does. If you're just using Apache::Registry, you won't notice these benefits, but if you write your own mod_perl handlers, which is simple enough, you can hook into the different request phases and other features of Apache. Now that I've written handlers, I find it hard to go back to the plain old CGI way of doing things for major projects.

Re: FastCGI and mod_perl compared
by echo (Pilgrim) on Sep 10, 2001 at 18:38 UTC
    Yes and no. They both do that, but FastCGI achieves greater speedup by keeping a user-defined number of processes for all requests to be processed by. mod_perl, on the other hand, tries to equip every web server with a fully fledged copy of Perl. Unfortunately this means that if a web server is just serving a flat file, that Perl instance is taking up memory but sitting idle.

    No one in their right mind serves flat files from a mod_perl server. Any decent production server will use a front-end reverse proxy to serve static resources and hand off requests for dynamic documents to the mod_perl back end. For example, I have such a server with 300 front-end processes and only 12 back-end mod_perl processes.

    The other "features" I would call mainly "hacks". They're doing all sorts of things with bits of the HTTP request to figure out what to do with it, but why bother when you could do all of that in Perl? Treat the HTTP request as a remote procedure call, which really is what it is. Need authentication? Just write code to check it. Just hand off every request to the FastCGI script to decide what to do with it, so you can specify it in Perl code rather than Apache's arcane and non-intuitive configuration syntax.

    You can certainly write all that code in Perl with mod_perl; that's precisely the point of having hooks into all request phases. Apache encourages modular code: although you can do your authentication, URL rewriting, logging, etc. in your content handler, just as you would with a regular CGI, it is advisable to use the proper request phase hooks instead. This makes for more reusability. Want to authenticate against a SQL database? Just plug in Apache::AuthenDBI and you're done. I'm not sure what you mean when you say that this could all be done in Perl instead: mod_perl lets you do precisely that, it just follows the Apache request model so that you don't mix authentication, content delivery, etc. in one script. One of the major benefits of this is that you can use a number of CPAN modules that deal with common tasks.
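
    A plug-in authentication handler like that is configured entirely in httpd.conf. This is only a sketch: the DSN, table and field names are placeholders, and the exact PerlSetVar names may differ between module versions, so check the module's documentation:

```apache
# SQL-backed Basic authentication, handled in Apache's authentication
# phase rather than in the content handler.
<Location /members>
    AuthName "Members"
    AuthType Basic
    PerlAuthenHandler Apache::AuthenDBI
    PerlSetVar Auth_DBI_data_source dbi:mysql:database=site
    PerlSetVar Auth_DBI_pwd_table   users
    PerlSetVar Auth_DBI_uid_field   login
    PerlSetVar Auth_DBI_pwd_field   passwd
    require valid-user
</Location>
```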

    Currently mod_perl simply does not scale and is a bitch to maintain and code under, but no-one will say that because they don't want to face the rejection.

    This is at best an unsubstantiated opinion. There are many large sites that use mod_perl. Once you learn to use strict, there's not much more to writing mod_perl code than there is to writing regular CGI code.

    Because the web server and the Perl process are not doing very much communication per request, threading is inappropriate. There are two communications per request; one - the detail of the HTTP request. Two - the response. Why bother with threading for that? Have two processes communicating via sockets or even shared memory and be done with it.

    Obviously you haven't done much work with mod_perl. A request goes through 11 phases during its lifetime, each of which can be handled by Perl code. That code has access to the full Apache API, which goes way beyond the handful of environment variables available under CGI. As for threading (not available yet; it will be under 2.0) - many think that superior performance can be expected of a threaded server, but the Apache developers still provide the 1.x prefork model in 2.0. mod_perl 2.0 supports threading because Apache 2.0 supports it; you don't have to use it if you don't like it. However, one might like the fact that multiple threads will share the Perl interpreter (through 5.6.x cloned interpreters), which will greatly reduce memory requirements. A threaded server's primary benefit isn't communication between threads; it's increased performance and reduced resource consumption compared to the multiple-process model. At any rate, suggesting sockets or shared memory as a way to improve performance is almost laughable, as these techniques don't scale very well.

    Adding an extra tier can make the biggest difference in performance once you reach a certain level. But the mod_perl team wouldn't know that - after all, they can't even do it.

    I am not sure what you're saying here - mod_perl does not impose a web server architecture on you. It's just a technique for handling dynamic requests. Any moderately busy site will use a multi-tier architecture, whether it's running mod_cgi, mod_perl or PHP. Apples and oranges.

    As to your comments on Doug MacEachern's supposed pride, I don't know where you got that from. Doug is one of the shyest and yet most helpful guys around. He does believe that his product is good, but I've never seen him make any bold statements about its superiority over other products that have the same goals. It's just a tool, choose the tool most suited for the job, and the one you feel most comfortable with.

    If you wish to compare technologies, you'd do your readers a service by acquiring similar knowledge of each technology you're evaluating, and restricting your comments to the technologies themselves. That way you won't have to resort to inflammatory comments about the authors.
