Beefy Boxes and Bandwidth Generously Provided by pair Networks
Syntactic Confectionery Delight
 
PerlMonks  

Why is this code so much slower than the same algorithm in C?

by wanna_code_perl (Pilgrim)
on Dec 09, 2008 at 02:42 UTC ( #729070=perlquestion: print w/ replies, xml ) Need Help??
wanna_code_perl has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

I am trying to figure out why this code performs so poorly against its C counterpart:

Perl (complete program)

#!/usr/bin/perl OUTER: for (my $i = 20; ; $i += 20) { foreach (my $j = 1; $j < 20; $j++) { next OUTER if ($i % $j); } print "Number: $i\n"; last; }

C (complete program)

#include <stdio.h> int main(void) { int i, j; for (i = 20; ; i += 20) { for (j = 1; j < 20; j++) { if (i % j) break; } if (j == 20) { printf("Number: %d\n", i); break; } } return 0; }

On my machine, the C variant (Linux GCC 4.3.2, no optimization flags, not stripped) runs in 1.36 +/- 0.02 user+system seconds.

The Perl variant (5.10.0) takes 48.7 user+system seconds on the same machine.

Why is Perl so much slower at running the same algorithm? (Actually the C algorithm is slightly worse due to the extra comparison).

Keep in mind, the purpose of the code samples is completely tangential to this discussion--I am seeing similar performance with all tight processing loops. I am aware of the ability to use native C libraries with Perl (and other optimization techniques), but my question is not about optimization. I just want to get a better idea of what is making Perl so terribly slow with simple loops like the above.

Comment on Why is this code so much slower than the same algorithm in C?
Select or Download Code
Re: Why is this code so much slower than the same algorithm in C?
by monarch (Priest) on Dec 09, 2008 at 03:01 UTC
    At first glance your two algorithms look different.

    But even after re-writing to match your original algorithm this takes 20 odd seconds on my machine. When I add use integer; it cuts it down to 13 seconds.

      Thanks for the comments. Here are a few more results, based on your suggestions and comments:

      Yes, I acknowledged in the original post that the C algorithm is actually a bit less efficient. Putting the missing comparison into the Perl version instead of the loop label, as you have done, actually had no measurable effect on runtime.

      Moving my ($i, j) outside the loop had no measurable effect on runtime. Also not too surprising.

      Running your version exactly as listed took on average 48.8 +/- 0.1 sec—again no measurable change from my previous results.

      True, I could use more rigorous methods to more accurately measure the effect of the above changes, but at best we'd be looking at a fraction of a percentage difference. Nowhere near the 3000+ % difference in the C version.

      Adding use integer; to your version caused it to run in 46.8 sec, which is a marginal improvement. (My test machine has an FPU).

      Not forgetting my original question of why this runs so slow, your optimization ideas do help shed a little light on a few of the factors that may be influencing performance.

      It's perhaps important to underscore that my question is not, "what can I do to speed up this random little demonstration program", but instead, "why is Perl so much slower than C on some arbitrary, computationally expensive algorithm". Perhaps more to the point, "what are the factors influencing the performance of tight loops in Perl". (Or where could I read more about that topic!)

      Hope that makes a bit more sense. Thanks again for your insights!

        Because the C code and the Perl code end up doing roughly equivalent things except that the C code runs through a few dozen machine-language instructions while the Perl code runs through a few dozen "opnodes". Machine-language instructions are the fastest unit of execution on a computer. While the slowness of opnode dispatch (each requires several dozen or even hundreds of machine-language instructions) is one of the motivations for some of the design changes in Perl 6.

        - tye        

Re: Why is this code so much slower than the same algorithm in C?
by GrandFather (Cardinal) on Dec 09, 2008 at 04:18 UTC

    At root Perl is interpreted not compiled. For things like string manipulation and regular expression parsing Perl performs very well because most of the time is spent in well crafted C code (in the interpreter). For code such as your sample which is CPU intensive but doesn't take advantage of Perl's strengths you simply hit the penalty for using an interpreter.

    For most applications that take advantage of Perl's strengths the performance is fine. Even for many applications that don't play so well to Perl's run time strengths, the time to craft a solution in Perl and execute it is very often shorter than the time to write and execute a similar application using a different language, even though the run time may be much shorter in the other language.


    Perl's payment curve coincides with its learning curve.
Re: Why is this code so much slower than the same algorithm in C?
by eye (Chaplain) on Dec 09, 2008 at 04:31 UTC
    Short answer: Perl is an interpreted language and C is a compiled language.

    Long answer: Perl code is executed by a virtual machine that is itself a binary executable for the native hardware of the host. C code is compiled to a binary executable for the native hardware of the host. That extra layer of software is inefficient. Perhaps the surprising thing is how often that inefficiency is irrelevant. It tends to become irrelevant when interacting with users, doing file I/O, doing almost anything with a network, and in a variety of other situations. Where it is relevant is in tasks that are purely CPU intensive, like that presented in your code.

      Perl is an interpreted language ... Perl code is executed by a virtual machine that is itself a binary executable for the native hardware of the host

      No, and No.
      Quote Tom Christiansen (FMTEYEWTK about Compilation vs Interpretation in Perl):

      The perl executable ... has two distinct stages ... It compiles your perl program source into a parse tree. This compiler then performs various optimizations such as ... loading in certain library definitions ...
      Next comes the backend, which is certainly an interpreter of sorts; let's call it a PP interpreter for now, just because. While what it actually executes is a parse tree and not byte code per se, still we would not go wrong in classifying this backend as a byte-code interpreter (like java or python). This is useful in particular when it comes to distinguishing these languages from "pure" interpreters.

      "A core tenant of the greater Perl philosophy is to trust that the developer knows enough to solve the problem" - Jay Shirley, A case for Catalyst.

      No, on both your short and your long answer. Perl isn't interpreted, in the sense that a program isn't executed by taking a statement, parsing/compiling it, running it, then take the next statement and do it over again.

      Perl is more like Java. It's compiled, turned into an internal format (called "the optree") and then executed in its own virtual machine. The difference with Java is that in Java, compiling is a separate action from running the program - in Perl, compiling is the first stage of running.

      Now, for the long answer, the fact that C gets compiled to a native binary and then executed and Perl gets compiled to an optree counts for some difference, but far from all. Even if there was a Perl compiler that turned out native binaries (say there would be a Perl front end to the gcc compiler), then the Perl program would still be much slower than the C program. It's the price you pay for flexibility. In C, if you have an 'int' variable, that you just have an integer. C can just allocate 4 (or 8) bytes and it will know there's always an integer there. Perl doesn't. In Perl, if you access a variable and use it like an int, Perl will first fetch the meta data. It'll test whether the value stored is valid as an integer. Getting that integer requires another fetch (pointer + offset). If the value isn't valid as an integer, it will first have to calculate the integer value, from the string value for instance.

      Perl values (and hence, variables) come will all kinds of neat tricks. Strings become integers/floats and the other way around magically. Strings can expand in length without the programmer having to create a new variable. Variables can be tied, blessed or have some other magic attached. For all this niceness, you have to pay a price. And that prices doesn't only come in the form of memory usage. You pay a hefty runtime price as well.

Re: Why is this code so much slower than the same algorithm in C?
by trwww (Priest) on Dec 09, 2008 at 04:43 UTC

    Hello,

    It would be interesting to see how well the code performs under perl 5.8 and perl 5.6 as well.

    Regards,

      Apples to oranges, but ActiveState 5.8.8 (Windows) on a significantly faster dual-core machine: 28 seconds.

      Unfortunately, I no longer have any good way to run 5.6.

        Apples to oranges...

        Well the script is a constant because all three perl binaries can run it unmodified. Actually, I found 5.6 runs much faster (not surprising, it is a much slimmer app). The top run is using a system provided 5.8, and the second run is a 5.6 install I have for an app I maintain:

        $ time /usr/bin/perl test.pl Number: 232792560 real 0m29.042s user 0m29.039s sys 0m0.004s $ time usr/bin/perl test.pl Number: 232792560 real 0m20.968s user 0m20.961s sys 0m0.006s

        I ran it a few times and the figures were about the same. But I agree, not really useful. It is a bit interesting though.

        Regards,

Re: Why is this code so much slower than the same algorithm in C?
by wanna_code_perl (Pilgrim) on Dec 09, 2008 at 04:49 UTC

    These are all excellent comments. Thank you all--this really helps me understand what is going on under the hood. I had no idea the bytecode had this much overhead.

Re: Why is this code so much slower than the same algorithm in C?
by BrowserUk (Pope) on Dec 09, 2008 at 05:05 UTC

    Let's look at what is really going on when these programs run.

    1. First the C program. It translates into the following assembler:
    2. Now the Perl code. It translates into the following ... um...I terrified to apply a term to this because one or more of the local language laywers is going to leap all over what ever term I use to say that it isn't really 'term' because blah, blah, ... more irrelavancies, ... blah!

      So, whatever terms suits you, which may include one of the following: <opcodes|bytecode|syntax tree|abstract syntax tree|other>.

      (Dodge issue!) This is the output from perl -MO=Concise YourScript.pl:

      So the answer to your question is: when you understand why those four instructions in the C version, require those 700+ lines of assembler for the perl version, then you'll understand why the performance difference exists.

      And also why it isn't a problem!


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Why is this code so much slower than the same algorithm in C?
by CountZero (Bishop) on Dec 09, 2008 at 05:42 UTC
    But you are comparing the execution of a compiled-into-machine-code program with the interpreting plus compiling plus running of a script written in text (ASCII or UNICODE or ...).

    That seems hardly a fair comparison, does it?

    How long did it take you to compile, link and execute this C-program (and you are not allowed to put it in a batch or shell script, because that is part of another language)? I bet the difference would be much less.

    CountZero

    A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      and you are not allowed to put it in a batch or shell script, because that is part of another language

      What? Well, OK, since it's pretty tough to automate compilation and execution without some sort of "script", the following example takes my corrected typing speed into account, too!

      Still less than four seconds, and most of that was thanks to my clumsy typing. Fair enough, no?

      Anyway, compile time is irrelevant for a bunch of reasons, but mainly: compile time will be negligible for any CPU-bound problem of significant size. Such problems are more or less the point of this node.

      Finally, to really look at the compilation step fairly, how many real-world programs (including real-world Perl programs) really need to be recompiled every time they are run? For many applications, an "interpreted" compilation step is just another nail in the comparative performance coffin.

        This is indeed a fair comparison.

        Of course, if you want high execution speed, Perl is not a wise choice. you should choose Assembler or pure machine code then. In almost all other cases though, Perl is "fast enough" and much more convenient to use.

        CountZero

        A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

        You don't have throw the baby out with the bath water in order to benefit from good performance:

        #! perl -slw use strict; use Inline C => Config => BUILD_NOISY => 1; use Inline C => <<'END_C', NAME => '_729090', CLEAN_AFTER_BUILD => 0; #include <stdlib.h> #include <string.h> #include <stdio.h> SV* thing() { int i, j; for (i = 20; ; i += 20) { for (j = 1; j < 20; j++) { if (i % j) break; } if (j == 20) { return newSViv( i ); break; } } } END_C print time; print thing(); print time; __END__ C:\test>729090-IC.pl 1228813945 232792560 1228813946

        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Why is this code so much slower than the same algorithm in C?
by LanX (Canon) on Dec 09, 2008 at 11:57 UTC
    Q: Why is Assembler much faster than C?

    A: The benefits of a higher abstraction level has to be paid by performance!

    Might be more interesting also to compare to Java which runs on a VM, too! (But perl is still more abstract)

    Cheers Rolf

    PS: Tuning effects like in Java's JIT -compiler are features Perl6/Parrot tries to implement for "for high-speed code execution".

    UPDATE: More benchmarks!

      Did a quick test of this:
      Perl 5.8.7 under cygwin, 21.5 seconds.
      Java 1.6.0_06 (based on OP C code) , just over a second.

      Col
      The abstractions in the language don't have to be a significant obstacle to performance. In fact, if the language provides the right abstractions, you can probably improve on most C implementations of an algorithm just by using some well-optimized built-in abstractions. Perl currently doesn't come very close to doing this, though it's relatively quick compared to the other popular dynamic "scripting" languages (Ruby, Python, PHP).

      C-style "fixed" typing, especially when dealing with basic numeric types is still hard to beat for a language as dynamic as perl, but as you hinted, a good enough JIT compiler (like Java's) could get pretty close to (and in some cases probably beat) the speed of a C implementation.

      A few existing dynamic languages that have good compilers (some need hinted with type declarations at the tight portions of code - which may be cheating, but good Common Lisp implementations use it, and AFAIK they don't even use a JIT compiler), can already come really close to C's speed.

      As for performance in dynamic languages in general, I'll also drop a link to clojure here - it's a lisp (dynamic typing included) implemented in Java, which for speed is probably close to the top of dynamic languages and has very interesting multithreading/concurrency properties.

      UPDATE It took some time to get it right; I'm new at clojure and the OP's program doesn't translate into idiomatic functional constructs (mainly because the OP's algorithm is stupidly inefficient) - but I've tried to keep it as close to the original constructs as possible:

      (defn inner [i] (loop [j 1] (if (= 0 (rem i j)) (if (< j 20) (recur (+ 1 j)) i) nil))) (defn loops [] (loop [i 20] (let [result (inner i)] (if result (print (format "Number %d\n" result)) (recur (+ i 20)))))) (defn time-me [] (time (loops)))
      Results:
      (time-me) Number 232792560 "Elapsed time: 10865.715 msecs" nil
      For comparison: this is on a 4 months old macbook, where the OPs perl code takes about 19 seconds (which includes compile time, but that's tiny in comparison to the total):
      joost-diepenmaats-macbook:~ joost$ time perl test.pl Number: 232792560 real 0m19.568s user 0m19.078s sys 0m0.060s

      updated again to convert tabs to spaces in pasted code, and again because the timing for the clojure code was way off due to stupid programmer syndrome My code is still way slow compared to the OP's reported C time, but about twice as fast as perl - not that bad for one of my first attempts at clojure.

        > A few existing dynamic languages that have good compilers (some need hinted with type declarations at the tight portions of code - which may be cheating, but good Common Lisp implementations use it, and AFAIK they don't even use a JIT compiler), can already come really close to C's speed.

        thanx for the insight... well another good reason to overcome my parens-allergy ; )

        ... maybe starting with elisp.

        Cheers Rolf

        PS: I suppose the dynamic structure of lisp (data == code) enforces a late compiling (comparable to JIT) and therefore the potential to high optimization ...

Re: Why is this code so much slower than the same algorithm in C?
by ptoulis (Scribe) on Dec 09, 2008 at 13:42 UTC
    BrowserUK gave a pretty precise and technical answer to this problem. What it counts is the actual machine code that is finally executed and not the source code size not even the algorithm.

    A quick and short answer is that there is a huge difference between the int i,j in C and the my $i and my $j of Perl. A scalar in Perl is not the int of C, but rather a complex data structure. It just happens that you treat it as an integer and Perl gracefully obeys, but you could use them even as strings, or references (functions, arrays, hashes) or even entire objects! And you could do that at any time in your code and Perl would not complain.

    Finally Perl being stack-oriented, it has a much slower implementation of loop structures. Even the most simple for(1..1000) {;} will run much slower than the C or Java equivalent. But again, this is the cost for Perl being so much flexible and dynamic.
Re: Why is this code so much slower than the same algorithm in C?
by sundialsvc4 (Abbot) on Dec 10, 2008 at 01:12 UTC

    A language like Perl is centered on the notion that "most of the things that we want to do in data-processing are not CPU-bound; they are I/O-bound." In other words, real-life programs usually spend a small amount of CPU-time setting up for the next I/O operation. The thing which really needs to be "optimized for" is the human time spent writing and debugging and maintaining the program.

    For those few truly CPU-intensive tasks that we must do from time to time, Perl allows you to define C-language extensions to itself or, more easily, to invoke external programs to do certain things. The times when we actually have to do that are in the minority, but they certainly do exist.

    When you devise a CPU-intensive algorithm in "straight Perl," don't be surprised when it takes much longer than "straight C." Also note that the opposite is definitely also true: write a hash-table algorithm in "C" and you sure are wasting your time, vis-a-vis doing the same thing in a couple of lines of Perl. "Tools for the job." Any software tool is going to be human-inefficient at doing a task it was not designed to do, because the design of every tool is a carefully calculated trade-off between opposing goals.

      Perl is really handy for text manipulation. That doesn't preclude that manipulation being CPU-intensive - it often is. For text manipulation tasks, regardless of any IO, Perl can generally easily keep up with compiled languages because the CPU-intensive work is actually done in compiled C (the Perl interpreter is compiled C). If you are using regular expressions, map, grep, index, substr and all the other Perl goodness then you really are not at a speed disadvantage compared with compiled languages.

      On the other hand there are good hash implementations for C++ - a good tool is a good tool in most languages. They aren't as much fun to use as Perl's hashes because strict typing means Perlish techniques are cumbersome, but they are there when you need em.


      Perl's payment curve coincides with its learning curve.
Re: Why is this code so much slower than the same algorithm in C?
by ruzam (Curate) on Dec 10, 2008 at 14:41 UTC
    ptoulis's comment about stack-oriented loop structures got me thinking. What if the inner loop were replaced with individual lines of code? Here's what I came up with:
    sub new { my $i = 0; while ($i += 20) { next if ($i % 1); next if ($i % 2); next if ($i % 3); next if ($i % 4); next if ($i % 5); next if ($i % 6); next if ($i % 7); next if ($i % 8); next if ($i % 9); next if ($i % 10); next if ($i % 11); next if ($i % 12); next if ($i % 13); next if ($i % 14); next if ($i % 15); next if ($i % 16); next if ($i % 17); next if ($i % 18); next if ($i % 19); print "Number: $i\n"; last; } }
    Then I benchmarked it against your original Perl code and got these results:
    s/iter original new original 25.7 -- -63% new 9.57 168% --
    Admittedly, it doesn't scale very well and this example is very case specific, but I'll be paying closer attention to where and how I use loops from now on.
      With a little analysis to omit test values that don't affect the answer and some improvement on the if..next..if..next logic, I was able to tighten that up slightly:
      #!/usr/bin/perl use integer; my $i = 0; while ($i += 20) { last unless ($i % 3 or $i % 6 or $i % 7 or $i % 8 or $i % 9 or $i % 11 or $i % 12 or $i % 13 or $i % 14 or $i % 15 or $i % 16 or $i % 17 or $i % 18 or $i % 19 or $i % 20); } print "Number: $i\n";
        snowhare,
        Can this be reduced even further?
        for values of x > 20 if x % 12 == 0 then x % 6 == 0
        Is there a reason to have them both?

        Update: In other words, order the list in descending order removing any exact multiples of previous items. When you reach 16, one of two things will happen. Either there will be a remainder and it will exit the loop or there won't be a remainder and it will continue. There is no need to test 8 after 16.

        Cheers - L~R

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://729070]
Approved by kyle
Front-paged by Arunbear
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others rifling through the Monastery: (8)
As of 2014-12-26 02:06 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Is guessing a good strategy for surviving in the IT business?





    Results (164 votes), past polls