PerlMonks  

FastCGI and mod_perl compared

by mandog (Curate)
on Aug 27, 2001 at 03:08 UTC ( #108008=perlmeditation )

This recent thread on improving performance contained a node by mugwumpjism that got me thinking about CGI performance. So to feed my thoughts, I poked around the net a bit and came to the conclusion that:

  1. A good (relatively quick & easy, effective) way for me (a novice) to improve performance on my site & server is to use mod_perl and Apache::Registry.
  2. For my purposes, mod_perl and Apache::Registry are better than FastCGI

Since I started the afternoon predisposed to notice facts that supported my mod_perl prejudice, I thought it best to get a couple second opinions before installing mod_perl and modifying my scripts based on the information that follows.

Both FastCGI and mod_perl achieve their speed-up by avoiding the overhead of starting a process for each CGI request. This overhead is apparently more noticeable with perl CGI scripts than with compiled C programs. A Perl script has to be re-compiled each time it is run. Speed increases of 400 to 2000 percent are claimed.

Both FastCGI and mod_perl claim benefits beyond speed-up. As a novice I don't understand what many of these features are, but according to this chart, mod_perl has more of 'em.

mod_perl seems to have wider support and more active development.

A FastCGI super search turned up about (8) nodes. My down arrow finger got tired scrolling through all the mod_perl nodes on perlmonks.org

Via google I turned up a few external nodes comparing FastCGI to mod_perl. Several people remarked that they'd found FastCGI support lacking.

FastCGI is no longer bundled with Apache. At modules.apache.org, the page for the FastCGI module was last updated three years ago and the link to the module's page is broken.

The process for installing mod_perl looks clear to me.
  1. Double check the instructions and mod_perl traps
  2. Triple check that I'm running recent enough versions of Perl, Apache and CGI.pm
  3. su
  4. apt-get install libapache-mod-perl
  5. Edit my CGI scripts to replace print() statements with My_CGI.pm_Object->print()
  6. Copy and paste 5 lines into my httpd.conf
  7. apachectl graceful
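
For step 6, the handful of httpd.conf lines usually shown for mod_perl 1.x and Apache::Registry look roughly like the sketch below. The directory /var/www/perl and the /perl/ URL prefix are assumptions made for illustration; adjust them to wherever the scripts actually live.

```apache
# Assumed layout: scripts live in /var/www/perl and are served under /perl/
Alias /perl/ /var/www/perl/
<Location /perl>
    SetHandler  perl-script
    PerlHandler Apache::Registry
    Options     +ExecCGI
</Location>
```

Scripts outside that location keep running as ordinary CGI, which makes it possible to move them over one at a time.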

The process for installing FastCGI looks a little scary. The docs begin something like: "...you need a FastCGI-savvy version of Perl..." There doesn't seem to be a debian package for FastCGI...

While they aren't relevant to my situation, FastCGI does seem to have strengths that mod_perl lacks.

  1. You can't use mod_perl for C, TCL or Python programs
  2. You can only use mod_perl on apache
  3. It might be easier to run the webserver and CGI programs on separate boxes

I'm not clear on how big a feature the last point actually is. Is there a big difference between running (4) Apache/mod_perl boxes and a database box and running (2) FastCGI boxes, (2) Apache boxes and a database box?

--mandog

Replies are listed 'Best First'.
Re: FastCGI and mod_perl compared
by echo (Pilgrim) on Aug 27, 2001 at 03:27 UTC
    A few points about mod_perl:
    • Apache::Registry is not without its traps. Badly written CGI programs will fail in mysterious ways, because they don't expect to be persistent. For an easier transition mod_perl provides Apache::PerlRun, which will avoid most of these at some performance expense.
    • Besides the performance benefits, mod_perl has another great feature: it lets you go beyond CGI by providing hooks into all request phases, not just content delivery, and it provides access to the Apache API. This means writing Authentication or Logging handlers in Perl, dynamically dispatching to a specific handler at runtime, ... a whole new range of possibilities.
    • As a consequence of this, a lot of useful modules are available, such as Apache::DBI which keeps database connections persistent in a transparent fashion, or Apache::Session which manages... sessions.
    • mod_perl is very actively developed and supported, there's an extensive guide to help out newbies, and several books are soon to be published.
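
The persistence trap in the first bullet can be imitated in plain Perl, without Apache. This is only a sketch: the two subroutines are hypothetical stand-ins for a script body that Apache::Registry compiles once and then calls again for every request handled by the same process.

```perl
use strict;
use warnings;

# Stand-in for a CGI script body. Under Apache::Registry the body is compiled
# once and then called again for every request served by the same process.
our @cart;                       # package global: never reset between calls

sub sloppy_request {
    my ($item) = @_;
    push @cart, $item;           # silently assumes @cart starts out empty
    return scalar @cart;
}

sub careful_request {
    my ($item) = @_;
    my @items = ($item);         # lexical, rebuilt on every call
    return scalar @items;
}

# Two "requests" in the same process.
print "sloppy:  ", join(', ', map { sloppy_request($_) } qw(apple pear)), "\n";
print "careful: ", join(', ', map { careful_request($_) } qw(apple pear)), "\n";
```

Under plain CGI both versions would report one item per request, because each request gets a fresh process. In a persistent process the sloppy version keeps growing.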

    About your point 3, most production users of mod_perl use a dual server setup: a front-end Apache serves static documents, and reverse proxies requests for dynamic documents to a mod_perl enabled back-end Apache. These can be on the same, or on different boxes.

Re: FastCGI and mod_perl compared
by z0d (Hermit) on Aug 27, 2001 at 04:59 UTC
    One thing for mod_perl: if your script has a security hole, the attacker can own Apache. In mod_fastcgi, he can only own that process.

    Unfortunately mod_perl has some memory leaks. mod_perl and mod_fastcgi are different in the way they work. It is up to the developer to choose one of them. mod_perl is faster, mod_fastcgi is more secure.
    -- <-- z0d -->
      I don't understand what you are saying here. Apache forks to child processes. So 'owning' Apache only means you 'own' an Apache child process, which sounds a lot like 'owning' a mod_fastcgi process.

      If Apache has been set up by someone competent, it will be setuid-ing to an unprivileged user the moment it gets a connection request. On most systems this user will be the 'nobody' or 'www-data' user.

      Probably the best you could do with one of those accounts is read the source code to the CGI scripts, and if your scripts are programmed correctly then you are safe.

      In short, mod_perl is secure, just like mod_ anything is. The memory leaks are another story.

      ____________________
      Jeremy
      I didn't believe in evil until I dated it.

        I believe z0d was indirectly referring to the fact that FastCGI can be configured to run as different uids for different users. This is similar to the suexec functionality of regular CGI programs. I know someone who chose FastCGI over mod_perl for this reason at a webhosting site. Multiple unrelated users have less of a chance to step on each other's toes this way.

        So, it's not a question of compromising my mod_perl process vs. my fastcgi process... It's a matter of limiting others' security holes from becoming my problem.

        p.s. I'm a mod_perl guy myself... but I couldn't refute this aspect when my friend told me about his webhosting gig.

        -Blake

Re: FastCGI and mod_perl compared
by Maclir (Curate) on Aug 27, 2001 at 06:03 UTC
    ++mandog for an excellent summary. I looked into this about 2 years ago, and came to the conclusion that while mod_perl and FastCGI are probably technically similar, each with its own strengths and weaknesses, mod_perl appeared to have a far greater level of support and use in the community.

    The mod_perl documentation is excellent - even before you start to look beyond the basic distribution to commercial books.

    As for the point about some programs misbehaving under mod_perl, it is well documented early in the mod_perl documentation - and it is generally sloppy programming practices that lead to mod_perl problems.

    One inconvenience with mod_perl is that because you end up with the perl execution environment compiled into your apache executable, when you upgrade your version of perl, you need to reinstall the whole mod_perl / apache - or at least rebuild it. That is not a huge task, but it is a thing to remember. I don't know if FastCGI has a similar restriction.

Re: FastCGI and mod_perl compared
by Cine (Friar) on Aug 27, 2001 at 04:33 UTC
    Just a couple things from me ;)
    • The testing version of mod_perl (libapache-mod-perl) is broken in debian (it's compiled for 5.6.0, and the perl testing version is 5.6.1). So you need to get the source with apt-get --build source libapache-mod-perl
    • Your step 5 is unnecessary, since mod_perl overrides the core print so that it is transparent that you are using mod_perl.
    • There is a mod_python...


    T I M T O W T D I
Re: FastCGI and mod_perl compared
by mugwumpjism (Hermit) on Aug 27, 2001 at 17:23 UTC
    Both FastCGI and mod_perl achieve their speed-up by avoiding the overhead of starting a process for each CGI request.

    Yes and no. They both do that, but FastCGI achieves greater speedup by keeping a user-defined number of processes for all requests to be processed by. mod_perl, on the other hand, tries to equip every web server with a fully fledged copy of Perl. Unfortunately this means that if a web server is just serving a flat file, that Perl instance is taking up memory but sitting idle.

    Both FastCGI and mod_perl claim benefits beyond speed-up. As a novice I don't understand what many of these features are but according to this chart, mod_perl has more of em.

    I had a look at that list. Some of the statements are just plain incorrect, such as "need to reboot httpd when script changes on disk" - for mod_perl this is a chronic problem, in some cases a graceful restart just doesn't work, whereas in the worst case with FastCGI all you have to do is signal a FastCGI script, have it exit, it restarts and you didn't even miss a request. Or, if you prefer, you can have your script automatically check whether or not the critical files have changed and then exit, causing them all to be reloaded. In summary, this is more of a problem in Apache and less of a problem with FastCGI, but not according to that chart.
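
The "check whether the critical files have changed and then exit" idea can be sketched with nothing more than stat. The helper names mtime_of and changed_since are made up for this example, and a temporary file stands in for the script itself so the sketch runs without a FastCGI setup.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Modification time of a file, in epoch seconds.
sub mtime_of {
    my ($path) = @_;
    my @st = stat($path) or die "cannot stat $path: $!";
    return $st[9];
}

# True when the file on disk is newer than the copy that was loaded.
sub changed_since {
    my ($path, $loaded_mtime) = @_;
    return mtime_of($path) > $loaded_mtime;
}

# Demonstration with a throwaway file standing in for the script ($0).
my ($fh, $path) = tempfile(UNLINK => 1);
close $fh;
my $loaded = mtime_of($path);
print changed_since($path, $loaded) ? "reload\n" : "keep serving\n";

# Pretend the file was edited ten seconds after it was loaded.
utime($loaded + 10, $loaded + 10, $path) or die "utime failed: $!";
print changed_since($path, $loaded) ? "reload\n" : "keep serving\n";

# In a FastCGI accept loop the check would sit after each request, e.g.
#   while (my $q = CGI::Fast->new) { ...; exit 0 if changed_since($0, $loaded); }
```

Exiting between requests is what lets the process manager start a fresh copy that picks up the new code.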

    The other "features" I would call mainly "hacks". They're doing all sorts of things with bits of the HTTP request to figure out what to do with it, but why bother when you could do all of that in Perl? Treat the HTTP request as a remote procedure call, which really is what it is. Need authentication? Just write code to check it. Just hand off every request to the FastCGI script to decide what to do with it, so you can specify it in Perl code rather than Apache's arcane and non-intuitive configuration syntax.

    And what about memory leaks? mod_perl has no recovery. You have to restart - that is, kill then restart, your web server, causing all current connections to be abruptly broken.

    You can attribute this chart to pride (aka Hubris) on Doug MacEachern's part; they put a lot of time and effort into what they did, and their pride makes them blind to the faults in mod_perl. Sometimes it takes an upstart to reject the status quo, who is branded a heretic and thrown out. Currently mod_perl simply does not scale and is a bitch to maintain and code under, but no-one will say that because they don't want to face the rejection; as Carlos Castaneda would say, the first challenge of a man of knowledge is overcoming his fear. Larry Wall is wrong in stating that Hubris is a virtue of a programmer. It is good to be self-confident, but to be proud implies that you do not know that something is true and are accepting it to be true based on what someone else said. This is also known as lying to oneself, which increases the level of "consensus with the status quo" in your existence and restricts your ability to see truth.

    Let me get this clear to you - FastCGI is a technically superior design. The limitations of mod_perl (ie a hundred copies of Perl being idle) are only solved with mod_perl 2.0, which builds threading into mod_perl; great, but IMO entirely inappropriate for a dynamic web server. When will threading in Perl be stable? I wouldn't trust it until Parrot is considered stable. Because the web server and the perl process are not doing very much communication per request, threading is inappropriate. There are two communications per request: one - the detail of the HTTP request; two - the response. Why bother with threading for that? Have two processes communicating via sockets or even shared memory and be done with it.
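
The "two processes communicating via sockets" arrangement can be illustrated with a toy round trip over a socketpair. This is not the FastCGI wire protocol, only a sketch of the shape of the exchange: one message in, one message out. The function name round_trip and the reply format are invented for the example, and it assumes a Unix-like system.

```perl
use strict;
use warnings;
use Socket;
use IO::Handle;

# Send one request line to a forked "application" process, read one reply.
sub round_trip {
    my ($request) = @_;

    socketpair(my $web, my $app, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die "socketpair: $!";
    $_->autoflush(1) for $web, $app;

    my $pid = fork();
    die "fork: $!" unless defined $pid;

    if ($pid == 0) {                     # child: the application side
        close $web;
        my $line = <$app>;               # communication one: the request
        chomp $line;
        print {$app} "200 OK for $line\n";   # communication two: the response
        close $app;
        exit 0;
    }

    close $app;                          # parent: the web server side
    print {$web} "$request\n";
    my $response = <$web>;
    close $web;
    waitpid($pid, 0);
    return $response;
}

print round_trip("GET /hello");
```

A real FastCGI application stays alive and serves many requests over such a connection instead of forking per request; the fork here only keeps the example self-contained.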

    The process for installing FastCGI looks a little scary. The docs begin something like: "...you need a FastCGI-savvy version of Perl..." There doesn't seem to be a debian package for FastCGI...

    Any recent version of Perl will do. There is a debian package, too:

    sam@fractal:~$ apt-cache search fastcgi
    libcgi-fast-perl - CGI::Fast Perl module.
    libfcgi-perl - FastCGI Perl module
    libapache-mod-fastcgi - FastCGI module for Apache.
    sam@fractal:~$

    It might be easier to run the webserver and CGI programs on separate boxes; I'm not clear on how big a feature [this] actually is.

    Depends how big your site is. You'd be surprised how much faster it would be; you'd get effects like the web server code not leaving the CPU's internal code cache, each part of the request being served in one CPU time slice, and so on. Adding an extra tier can make the biggest difference in performance once you reach a certain level. But the mod_perl team wouldn't know that - after all, they can't even do it.

    Interestingly, I notice that PHP supports running as a FastCGI process.

    If the teacher is not respected, and the material not cared for, then confusion will result, no matter how smart one is.  - Tao Te Ching

      Whilst you raise some interesting points, you fail to mention common workarounds that are deployed to deal with the problems you mention. I have found mod_perl does a great job as a high-performance Web application server.

      FastCGI achieves greater speedup by keeping a user-defined number of processes for all requests to be processed by. mod_perl, on the other hand, tries to equip every web server with a fully fledged copy of Perl.

      On any operating system I've used, the fully fledged copy of Perl is shared between all httpd processes using copy-on-write memory. This isn't a problem.

      If a web server is just serving a flat file, that Perl instance is taking up memory but sitting idle.

      The mod_perl guide describes various strategies for dealing with this. Reverse proxying is commonly deployed using a lightweight Apache httpd or Squid. Or you can use a separate httpd, configured differently for serving flat files such as images. Furthermore, you can load your code into memory when the parent httpd starts up, ensuring that code is shared between child httpd processes.
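
A minimal sketch of the reverse-proxy arrangement on the lightweight front-end server, assuming mod_proxy is loaded and that the mod_perl back end listens on 127.0.0.1 port 8080 (both the address and the /perl/ prefix are assumptions for illustration):

```apache
# Front-end httpd.conf: static files are served locally,
# only /perl/ requests are handed to the mod_perl back end.
ProxyPass        /perl/ http://127.0.0.1:8080/perl/
ProxyPassReverse /perl/ http://127.0.0.1:8080/perl/
```

The back end can then run with a small number of heavy processes while the front end keeps many light ones for slow clients and flat files.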

      Some of the statements are just plain incorrect, such as "need to reboot httpd when script changes on disk" - for mod_perl this is a chronic problem

      On a development server, you can use Apache::Reload to avoid such problems. On a production server, your code shouldn't change frequently, anyway.

      And what about memory leaks? mod_perl has no recovery.

      True, but it's easy to avoid most memory leaks in Perl by using good programming practices (use strict, use warnings, etc.).

      Additionally, mod_perl integrates much more tightly with Apache than FastCGI does. If you're just using Apache::Registry, you won't notice these benefits, but if you write your own mod_perl handlers, which is simple enough, you can hook into the different request phases and other features of Apache. Now that I've written handlers, I find it hard going back to the plain old CGI way of doing things for major projects.

      Yes and no. They both do that, but FastCGI achieves greater speedup by keeping a user-defined number of processes for all requests to be processed by. mod_perl, on the other hand, tries to equip every web server with a fully fledged copy of Perl. Unfortunately this means that if a web server is just serving a flat file, that Perl instance is taking up memory but sitting idle.

      No one in their right mind serves flat files with a mod_perl server. Any decent production server will use a front-end reverse proxy to serve static resources, and hand off requests for dynamic documents to the mod_perl back end. For example I have such a server with 300 front-end processes and only 12 back-end mod_perl processes.

      The other "features" I would call mainly "hacks". They're doing all sorts of things with bits of the HTTP request to figure out what to do with it, but why bother when you could do all of that in Perl? Treat the HTTP request as a remote procedure call, which really is what it is. Need authentication? Just write code to check it. Just hand off every request to the FastCGI script to decide what to do with it, so you can specify it in Perl code rather than Apache's arcane and non-intuitive configuration syntax.

      You can certainly write all that code in Perl with mod_perl, that's precisely the point of having hooks into all request phases. Apache encourages modular code, so that although you can do your authentication, url rewriting, logging, etc, in your content handler, just as you would with a regular CGI, it is advisable to use the proper request phase hooks instead. This provides for more reusability. Want to authenticate against a SQL database? Just plug in Apache::AuthenDBI and you're done. I'm not sure what you're talking about when you say that this could all be done in Perl instead: mod_perl precisely lets you do all that in Perl, it just follows the Apache request model so that you don't mix authentication, content delivery, etc, in one script. One of the major benefits of this is that you can use a number of CPAN modules that deal with common tasks.
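
The benefit of keeping authentication out of the content code can be shown with a plain-Perl simulation. This is deliberately not the mod_perl API: the handler names, the request hash and the fixed user name are all hypothetical, and real Apache runs more phases (and still logs refused requests). Only the status values 0 and 403 mirror the usual OK and FORBIDDEN codes.

```perl
use strict;
use warnings;

use constant OK        => 0;
use constant FORBIDDEN => 403;

# Hypothetical authentication phase: knows nothing about content.
sub authenticate {
    my ($r) = @_;
    my $known = defined $r->{user} && $r->{user} eq 'alice';
    return $known ? OK : FORBIDDEN;
}

# Hypothetical content phase: can assume the user was already checked.
sub deliver {
    my ($r) = @_;
    $r->{body} = "hello $r->{user}";
    return OK;
}

my @phases = (\&authenticate, \&deliver);

# Run each phase in order; the first non-OK status ends the request.
sub run_request {
    my ($r) = @_;
    for my $handler (@phases) {
        my $status = $handler->($r);
        return $status unless $status == OK;
    }
    return OK;
}

for my $user (qw(alice mallory)) {
    my %request = (user => $user, uri => '/report');
    my $status  = run_request(\%request);
    if ($status == OK) { print "${user}: served '$request{body}'\n" }
    else               { print "${user}: refused with status $status\n" }
}
```

Swapping authenticate for a different check leaves deliver untouched, which is the reusability argument made above.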

      Currently mod_perl simply does not scale and is a bitch to maintain and code under, but no-one will say that because they don't want to face the rejection.

      This is at best an unsubstantiated opinion. There are many large sites that use mod_perl. Once you learn to use strict, there's not much more to writing mod_perl code than there is to writing regular CGI code.

      Because the web server and the perl process are not doing very much communication per request, threading is inappropriate. There are two communications per request: one - the detail of the HTTP request; two - the response. Why bother with threading for that? Have two processes communicating via sockets or even shared memory and be done with it.

      Obviously you haven't done much work with mod_perl. A request goes through 11 phases during its lifetime, each of which can be handled by Perl code. That code has access to the full Apache API, which goes way beyond just accessing the handful of environment variables available under CGI. As for threading (not available yet, will be under 2.0): many think that superior performance can be expected of a threaded server, but the Apache developers still provide the 1.x prefork model in 2.0. mod_perl 2.0 supports threading because Apache 2.0 supports it; you don't have to use it if you don't like it. However, one might like the fact that multiple threads will share the Perl interpreter (through 5.6.x cloned interpreters), which will greatly reduce memory requirements. A threaded server's primary benefit isn't communication between threads, it's increased performance and reduced resource consumption compared to the multiple process model. At any rate, suggesting sockets or shared memory as a way to improve performance is almost laughable, as these techniques don't scale very well.

      Adding an extra tier can make the biggest difference in performance once you reach a certain level. But the mod_perl team wouldn't know that - after all, they can't even do it.

      I am not sure what you're saying here--mod_perl does not impose a web server architecture for you. It's just a technique to handle dynamic requests. Any moderately busy site will use a multi-tier architecture, whether you're using mod_cgi, mod_perl or PHP. Apples and oranges.

      As to your comments on Doug MacEachern's supposed pride, I don't know where you got that from. Doug is one of the shyest and yet most helpful guys around. He does believe that his product is good, but I've never seen him make any bold statements about its superiority over other products that have the same goals. It's just a tool, choose the tool most suited for the job, and the one you feel most comfortable with.

      If you wish to compare technologies, you'd do a service to your readers if you acquired similar knowledge about each technology you're evaluating, and restricted your comments to the technologies themselves. This way you won't have to resort to inflammatory comments about the authors.
