PerlMonks  

Increasing Performance of mod_perl/CGI/MySQL programs

by dkubb (Deacon)
on Feb 12, 2001 at 02:26 UTC (#57792=note)


in reply to Speed of Template

mod_perl/CGI/MySQL programs are interesting applications to tune. There are so many moving parts that you need to take a holistic approach, and there is generally a combination of several factors affecting performance. Warning: This will be a long post (and possibly slightly OT), because I am fairly interested in this subject =)

Below is an outline of the areas I believe you need to consider when attempting to increase the performance of your perl database interface. Some may not be possible depending on the resources/time you have available, but I wanted to outline everything I could think of and let you choose what is possible:

  • Minimize Disk IO. Look at the physical hardware you are running the database and/or webserver on. Generally, IO problems are the biggest contributor to system slowdowns. Look at purchasing a faster hard drive, more RAM if your machine is swapping out to the hard drive, or, in the extreme, dedicating a server to house the database.

  • Look at the setup of your MySQL database. There is a great FAQ about getting maximum performance out of MySQL, here.

    • Analyze your queries and figure out which take the most time, then learn how to make indexes. Remember, don't just set these and think you're done. I've been using MySQL full-time for just over 2 years now, and I routinely make small tweaks to indexes I thought were perfect a month earlier. Set a monthly or bi-monthly schedule to check your indexes; it is worth it. This is the most important of all the tips I outline in this post. It's not uncommon for indexing to get you a 100% speed increase from a few minutes of tweaking.

    • Here's a neat one I've read about but not used (yet): Heap Tables. A heap table is an in-memory table, well suited to validation-type data. You put any information in here that is somewhat small and constant, such as a list of valid Countries.

      When mysql starts, it does so with the help of a start-up script. You can change the start-up script so that it performs a series of SQL commands after starting the server. This can be used to automatically create a HEAP table and fill it with data from other tables/outside sources. Queries that use the heap table could be greatly sped up. Any time you need a lot of speed from your "validation" tables, give this a shot.

  • Analyze all the SQL queries not only with EXPLAIN, but also using DBIx::Profile in your perl programs. It will give you a nice breakdown of each SQL query that was run and tell you which SQL queries are taking the largest amount of time.

  • Never use SELECT * inside an SQL query. Only fetch the columns & rows you need, nothing more. It's a huge waste of resources to ask for information you throw out and never use.

  • Web Server Setup. Look at Apache and see if there is anything you can do to speed it up. Try using Apache::DBI, which caches the database handle. This means that when your script runs under mod_perl it won't need to connect to the database, because the connection is kept open for you. Also see if there are any modules compiled into Apache that are not necessary to the functioning of your website. Consider recompiling with just what you need: go lean.

    Here are two FAQs on Apache and mod_perl performance you should look at:

  • Write the shortest, cleanest perl code you can to get the job done. The shorter the code, the easier it is to optimize and bug test, since you have less to keep in your head all at once.

  • Use Devel::DProf and Apache::DProf to profile the actual perl code to see where your program is spending the most time. It gives you a nice breakdown of each function in your program and tells you how long each one took. Without doing this, you'll just be guessing. I once read somewhere that programmers spend most of their time optimizing the wrong section of code. Don't fall into this trap; learn to profile.

  • Now, having said all that, do not be afraid to use a templating system. Yes, you will get a slight slowdown when compared to embedding the HTML right in the perl code, but the benefits are numerous. You get cleaner code, a separation of presentation from logic, and more avenues for extra optimizations (which I'll get into next).

    One of my personal favorites is HTML::Template, which allows a complete separation of logic and design. You can't embed perl code inside it; instead, there is a mini templating language you use. There's something about mixing two different syntaxes together (SQL and perl, HTML and perl, etc) that confuses me, which is why I like HTML::Template. Combine this with HTML::Pager for easy paging of database results, and it makes hard things simple.
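
As a rough illustration of the Apache::DBI suggestion above, here is a minimal sketch of a mod_perl startup file. The DSN, user name, password, and file path are all placeholders I made up, and this assumes a mod_perl 1.x style setup; treat it as a starting point rather than a finished configuration.

```perl
# startup.pl: pulled in from httpd.conf with a line such as
#   PerlRequire /path/to/startup.pl     (placeholder path)
use strict;

use Apache::DBI;   # load before DBI so it can take over connect()
use DBI;

# Optional: open the connection as each Apache child starts, so the
# first request served by that child does not pay the connect cost.
# The DSN and credentials below are hypothetical.
Apache::DBI->connect_on_init(
    'dbi:mysql:database=myapp;host=localhost',
    'webuser',
    'secret',
    { RaiseError => 1, AutoCommit => 1 },
);

1;
```

Your scripts keep calling DBI->connect as usual. As far as I understand it, the cached handle is only reused when the connect arguments match exactly, so it is worth keeping them in one place.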

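To make the profiling advice above concrete: Devel::DProf is normally run as `perl -d:DProf yourscript.pl` and the results are read with `dprofpp`. If you just want a quick look at a couple of suspect functions, a hand-rolled timer can give a first impression. The sketch below is that hand-rolled stand-in, not Devel::DProf itself; `timed`, `fetch_rows`, and `render_page` are names I invented for the example, and the two worker subs are dummies standing in for real database and template code.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

my %elapsed;   # label => total seconds spent
my %calls;     # label => number of calls

# Run a code ref, recording wall-clock time under the given label.
sub timed {
    my ($label, $code) = @_;
    my $t0     = [gettimeofday];
    my @result = $code->();
    $elapsed{$label} += tv_interval($t0);
    $calls{$label}++;
    return wantarray ? @result : $result[0];
}

# Dummy workloads (hypothetical stand-ins for real work).
sub fetch_rows  { my $n = 0; $n += $_ for 1 .. 200_000; return $n }
sub render_page { return join '', map { "<li>$_</li>" } 1 .. 1_000 }

timed(fetch_rows  => \&fetch_rows) for 1 .. 3;
timed(render_page => \&render_page);

# Report the slowest label first.
for my $label (sort { $elapsed{$b} <=> $elapsed{$a} } keys %elapsed) {
    printf "%-12s %d call(s) %.4fs\n", $label, $calls{$label}, $elapsed{$label};
}
```

This only measures what you remember to wrap, which is exactly why a real profiler is the better tool once you suspect a problem.
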
So, your database and web server are running at peak performance. Your perl code is optimized and profiled. Want more speed? Now you need to start looking at what happens after the information leaves your server.

  • Do some remote load testing. There are many great services that can do this for you, some even free or offering free trials. A quick search on Google turns up hundreds of related sites. Two key players in this area are Keynote and Service Metrics. I've used both of these, with good results.

    With this information you can pinpoint where the speed problem lies. They can tell you if it's your server or your hosting company. You can use this as ammo when negotiating for faster/better service. Also, consider getting an SLA (Service Level Agreement) from your hosting company to guarantee the speed of your pipe.

  • Don't forget the browser! The browser is an often forgotten piece of the puzzle. You want to make sure that the browser can download and render the HTML in the fastest possible time. How do you do this? A simple answer is to make sure your HTML is XHTML 1.0 compliant, a super clean version of HTML. Now that your HTML is completely inside HTML template files, it will be relatively painless to process them through HTML-Tidy.

    HTML-Tidy is a utility that takes any sort of HTML and outputs cleaned up XHTML compliant code. The theory is that if the browser doesn't have to "guess" at the sizes of images or close "p" tags itself, it can allocate more resources to parsing the HTML, and drawing the screen faster.

  • Look at Apache::GZIP. I've had no experience with using this module, but I hear it can increase download speeds for images and HTML (even dynamically generated HTML). Please be aware that it could cause performance issues on your web server, since it needs to do a lot of extra work compressing things on the fly. It's up to you to decide if the extra speed for your users is worth the trade-off.

  • Try using HTML::Clean to filter out any extras, such as excess whitespace. You can sometimes compress your output by a further 10-20% using this module. Use this module with caution; it is quite aggressive with its cleaning. I would suggest a lower level of optimization rather than full, as it's been known to play havoc with javascript.
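
To get a feel for the size trade-offs in the last two points before installing anything, you can measure a page yourself. The sketch below does not use Apache::GZIP or HTML::Clean. It assumes the core IO::Compress::Gzip module and a deliberately naive whitespace regex as stand-ins, and the sample page and the `gzip_size` helper are made up for the example.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);

# Hypothetical page with the kind of indentation templates tend to emit.
my $html = "<html>\n  <body>\n"
         . join('', map { "    <p>   Row $_ of the report   </p>\n\n" } 1 .. 200)
         . "  </body>\n</html>\n";

# Naive squeeze: collapse every run of whitespace to a single space.
# This would damage <pre> blocks and some scripts, so it is only for measuring.
(my $squeezed = $html) =~ s/\s+/ /g;

# Return the gzipped size of a string, in bytes.
sub gzip_size {
    my ($text) = @_;
    my $out;
    gzip(\$text => \$out) or die "gzip failed: $GzipError";
    return length $out;
}

printf "raw:               %6d bytes\n", length $html;
printf "squeezed:          %6d bytes\n", length $squeezed;
printf "gzipped:           %6d bytes\n", gzip_size($html);
printf "squeezed + gzip:   %6d bytes\n", gzip_size($squeezed);
```

Running it against one of your own generated pages should show whether whitespace cleaning still buys you much once compression is switched on.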

Whew! Sorry for the length of this post. Once I started writing I couldn't stop. Perhaps I should put this into a tutorial.

Anyone have other performance improvement suggestions?

Replies are listed 'Best First'.
Re (tilly) 1: Increasing Performance of mod_perl/CGI/MySQL programs
by tilly (Archbishop) on Feb 12, 2001 at 02:49 UTC
    Very nice, and all good advice.

    To this I would like to add a couple more relevant links from my home page. First of all, Code Complete has a sample chapter online on Optimization that is worth reading. Secondly, those who want to get into great depth on how to design web-servers and network protocols for maximum performance may find many things of interest in The C10K Problem. Another good reference is this rant on Latency vs bandwidth. Many slow applications are slow because of heavy-duty interactions across something with bad latency, and not because of lack of bandwidth.
