
Benchmarking Tests

by kidd (Curate)
on Apr 30, 2003 at 22:26 UTC ( #254504=perlquestion )
kidd has asked for the wisdom of the Perl Monks concerning the following question:


I spent all morning benchmarking my scripts on a customer's site to see whether there was any unnecessary server load.

To run the tests I used the Benchmark module. I tested all my CGIs and saved their benchmark output for the different tasks.

These tests led me to a question I had never asked myself before: "What should the optimal CPU usage for a script be?" I searched for posts on this without success, so I decided to post here.

Here are the different CPU usages I got from my CGIs:

/apromo/apromo.cgi  - 0.06 CPU
/baboon/myMail.cgi  - 0.15 CPU
/baboon/entrada.cgi - 0.14 CPU
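For reference, numbers like these can be gathered with the core Benchmark module along these lines (a minimal sketch; the sub body and iteration count are made up for illustration, standing in for whatever work a CGI does per request):

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Hypothetical stand-in for the per-request work a CGI does.
sub handle_request {
    my $total = 0;
    $total += $_ for 1 .. 10_000;
    return $total;
}

# Run the sub 1000 times; Benchmark reports wallclock time plus
# user and system CPU seconds, which is where figures like
# "0.15 CPU" come from.
timethese(1000, {
    'handle_request' => \&handle_request,
});
```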
I hope someone can guide me or point me to some links where I can read about this.

The main goal is to know whether my CGIs are efficient enough or whether I could make them more efficient.


Replies are listed 'Best First'.
Re: Benchmarking Tests
by pzbagel (Chaplain) on Apr 30, 2003 at 23:23 UTC

    Benchmarks are relative. 'Relative to what?', you ask. Well, to the machine you are on, the data you are working with, and the script you are running.

    Remember the Perl adage: there's more than one way to do it. That is how you have to look at your scripts. What are the different subs doing? Can you rewrite them using some other method? Only by rewriting sections and benchmarking again can you get an idea of what works better. The bottom line is that unless you want to start reading the Perl source code to see how it implements various operations (I sure don't, but some people out there do), figuring out what works better or worse on your system for a given dataset is going to be trial and error.

    Areas of your scripts you should study hard to see if there is a more efficient way are:

  • Regexes, which are notoriously easy to make overly complicated. Read Mastering Regular Expressions by Friedl to get a handle on the regex engine.
  • Loops over large chunks of data. What are they doing to the data? Can you unroll the loop or do some of the data transforms more efficiently?
  • Check out Effective Perl Programming by Hall (with Schwartz :) for tons of whiz-bang stuff about efficiency in Perl.

    Essentially, you want to look at the code that does the hard, tedious work on the most data; those are the sections that may be ripe for streamlining.
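    As a concrete illustration of "rewrite a section and benchmark again", Benchmark's cmpthese can compare two implementations of the same task side by side (the string-building task here is just a made-up example):

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @words = ('perl') x 500;

# Two ways to build the same string: repeated .= in a loop
# versus a single join over a map.
sub concat_loop {
    my $s = '';
    $s .= "$_ " for @words;
    return $s;
}

sub concat_join {
    return join('', map { "$_ " } @words);
}

# cmpthese runs each sub the given number of times and prints a
# table of rates plus how much faster one is than the other.
cmpthese(10_000, {
    loop => \&concat_loop,
    join => \&concat_join,
});
```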

Re: Benchmarking Tests
by hv (Parson) on May 01, 2003 at 02:18 UTC

    I have the same problem at work: we occasionally have high server loads, but I have no idea how to establish which scripts (and which aspects of those scripts) are causing the problems.

    One problem with benchmarking the scripts is that it may not tell you the whole story - if they connect to a database, for example, the efficiency of their database requests will primarily show up as processor use in the server process rather than in the script process.

    In my case I suspected database load was the main problem, so I hacked the database abstraction used by the Perl code to log the SQL of every request, then went through looking for requests that needed indexes and duplicate requests that could be cached. I found a surprisingly large number of opportunities for improvement, and the system administrator hasn't complained about server load since, so I'm hoping that means I've improved things. :)
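    A low-tech version of that kind of logging is to route queries through a wrapper that records each statement and how long it took. This sketch is hypothetical: the coderef stands in for whatever actually runs the query (in real code it would call DBI), and the log format is made up:

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Wrap any query-running coderef so every SQL statement and its
# elapsed wallclock time get logged to STDERR.
sub logged_query {
    my ($run_query, $sql, @bind) = @_;
    my $t0      = [gettimeofday];
    my @rows    = $run_query->($sql, @bind);
    my $elapsed = tv_interval($t0);
    printf STDERR "%.6fs  %s\n", $elapsed, $sql;
    return @rows;
}

# Stand-in for a real database call, so the sketch is self-contained.
my $fake_db = sub { my ($sql, @bind) = @_; return ('row1', 'row2') };

my @rows = logged_query($fake_db, 'SELECT name FROM users WHERE id = ?', 42);
```

    Sorting the resulting log by elapsed time is usually enough to spot the queries that need an index or a cache.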

Re: Benchmarking Tests
by perrin (Chancellor) on May 01, 2003 at 01:46 UTC
    CPU is not a comprehensive measure of efficiency. If your script runs for a long time or does a lot of I/O, it can tie things up and slow down the overall server. You would be better off testing your script with an HTTP load-generation tool.
Re: Benchmarking Tests
by Abigail-II (Bishop) on May 01, 2003 at 01:56 UTC
    CPU usage alone doesn't mean much. A program can use hardly any CPU and still be "slow". You also need to look at how much memory it uses (and in which segments), how many cache hits it gets, how many page faults it takes, how much I/O it does, how many (and which) system calls it makes, what other resources it uses, etc.

    Also note that some measurements are very system-dependent: they'll depend on the OS, the kernel configuration, the hardware installed, and what other processes are running on the box.


Re: Benchmarking Tests
by Anonymous Monk on Apr 30, 2003 at 23:10 UTC

    As little as possible and as much as is required.

    This really is a "how long is a piece of string" question.

    How much your script uses depends first on what it has to do, then on factors like how fast your server is.

    Your question is way too loose to get any useful answers.

    The questions you should be asking yourself:

    Does the server ever become overloaded?

    How long does the user have to wait for a page (excluding network time)?

    How is this affected by the number of concurrent users?

    Is anyone complaining?

      Thanks for your reply.

      The thing is that recently my client's server has been overloaded and they blamed one of my scripts; that's why I checked all of them.

      We are allowed to use 3% of the total CPU, which is something like 3 to 5 CPU, but the server load we are seeing is between 10 and 15.

      What I wanted to check was whether one of my scripts was generating the server load, or whether it was coming from one of the other hosts, because we are on a shared plan.

      I thought my CPU usage per script was quite normal, and we may have up to 20 users at the same time. So I did a little math and thought:

      "If my script is using 0.15 CPU and, let's say, 20 users run it at the same time (which is very unusual), then that would be 3 CPU, which is not close to 10 or 15."

      I just wanted to be sure that my scripts aren't generating a lot of server load...

        You appear to be mixing up CPU usage and load. In particular, your example of 0.15 CPU * 20 users = 3 load doesn't make sense, because 0.15 is a percentage while load is an absolute number.

        Load is the number of processes that are waiting for time on the CPU. That would be the 10 or 15

        %CPU is the amount of real time the process spent actually running, e.g. 0.15.

        So in your example, if your script takes 0.15 CPU over 1 second and you have 20 users running it simultaneously, you would be trying to use 20 * 0.15 = 3.00 CPU, which means you would be using 100% of the CPU for 3 seconds. That would translate into roughly a load of 20 at 0 seconds, 14 at 1 second, 7 at 2 seconds, and 0 at 3 seconds - assuming each request ran to completion one after the other, which they probably won't, what with the vagaries of I/O and scheduling.

        And of course, if you have a multi-processor box, the real time taken is %CPU / number of CPUs.

        I'm not sure whether %CPU is the percentage the process uses over a fixed length of time, e.g. 1 second, or the percentage used over the time it took to run to completion.

        Anyway, I hope that explains the difference between load and CPU usage (and I hope it's reasonably accurate), and that it helps you find the problem. Of course, as other people have already said, do an actual test; *this* is just to help you interpret the numbers a bit better :). Also, load is usually averaged over the last 1, 5, and 15 minutes.
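        The back-of-the-envelope arithmetic above can be written out as a few lines of Perl (the per-request CPU time and user count are just the figures from this thread, not measurements):

```perl
use strict;
use warnings;

# Figures from the discussion: CPU seconds per request, and
# how many users might hit the script simultaneously.
my $cpu_per_request = 0.15;
my $concurrent      = 20;

# Total demand is per-request CPU times concurrency.
my $total_cpu = $cpu_per_request * $concurrent;
printf "total demand: %.2f CPU-seconds\n", $total_cpu;

# On one CPU that is ~3 seconds at 100% busy; with N CPUs the
# wallclock time shrinks to total / N (ignoring I/O and scheduling).
for my $ncpu (1, 2, 4) {
    printf "%d CPU(s): ~%.2f s busy\n", $ncpu, $total_cpu / $ncpu;
}
```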

        Post the code. You're much more likely to get good answers if you do.

Node Type: perlquestion [id://254504]
Approved by benn