
Re^4: Parrot, threads & fears for the future.

by Corion (Pope)
on Oct 23, 2006 at 12:51 UTC (#580015)

in reply to Re^3: Parrot, threads & fears for the future.
in thread Parrot, threads & fears for the future.

I look at thread management much like I now look at memory management. Of course, Perl sacrifices a lot of performance to the reference counting, but it buys a lot of flexibility, convenience and safety by managing memory automatically instead of leaving it to the programmer, even if hand-managed memory could be more efficient.


Re^5: Parrot, threads & fears for the future. (ref counting)
by tye (Sage) on Oct 23, 2006 at 15:04 UTC
    Of course, Perl sacrifices a lot of performance to the reference counting

    I've seen this claim made by the people who refused to reimplement reference counting in Parrot. But the only reference they offered in support of it said something more like: garbage collection really isn't terribly much slower than reference counting, as far as they noticed.

    Just because you've heard something repeated many times, doesn't mean that there necessarily is any evidence to support it.

    Incrementing and decrementing are two of the fastest things computers know how to do. Adding an increment to a call to malloc() is "in the noise" performance-wise. Certainly for Perl 5, the cost of reference counting to performance is insignificant. And I won't believe that ref counting has much of a performance cost even in a highly efficient system until I see some real evidence. Removing an increment assembly op is certainly a micro-optimization, and it is extremely hard to get micro-optimizations to have much impact on the performance of real code. Your code would have to be doing a ton of taking and removing references, and little other real work, for the inc/dec to add up to even a non-trivial fraction of the run-time.

    - tye        

      Certainly for Perl 5, the cost of reference counting to performance is insignificant.

      Twelve years after the initial release of Perl 5, there are still reference counting bugs -- and I don't just mean the circular reference problem.

        Yes, I intentionally didn't go into other issues of reference counting, they being outside the point of my node.

        Only recently did I look at how Perl 5 implemented reference counting enough to realize how horribly it was done. I'm not at all surprised that there are still bugs in that implementation. Perl 5 reference counting mostly requires matching each creation and destruction of references by hand. It is a huge manual effort that is very error prone.

        Of course, it isn't easy to design something in plain C that does a fantastic job of enforcing this type of thing. C++ offers great ways of dealing with this (destructors fire when you leave a scope, no matter how you left it; so doing such work in c'tors and d'tors ensures that your inc/decs are paired without having to match anything by-hand and even in the face of exceptions being thrown or early return).

        But you can certainly put more effort into making the system encourage and enforce proper habits, so that you don't have to work out proper ref count handling by hand every time you add or change just about any of Perl's source code, and so that you don't end up with a system so confusing that it is almost always a source of questions and bugs for new XS coders. But that ship has at least mostly already sailed for Perl 5.

        It wouldn't be trivial to implement a good system for reference counting in plain C, but the failure of Perl 5's implementation to become error-free has more to do with that particular implementation not making the effort up front to make future errors unlikely.

        And it has at least mostly already sailed for Parrot as well. And at this point I'm doubtful that Parrot will ever be successful enough for us to see how big of a mistake this particular choice was. I consider timely and well-ordered destructors one of the most important tools of C++, but the designers of Parrot had no understanding of this and little interest in trying to understand the arguments of those who did. Basing a design decision on ignorance (that was how the decision was made; it was a fiat of "I don't understand why you guys think reference counting is a good idea, but it doesn't matter because I've already decided and I'm writing it and that's that") in the face of several knowledgeable people criticizing that decision isn't usually a formula for success.

        - tye        

      Incrementing and decrementing are two of the fastest things computers know how to do. Adding an increment to a call to malloc() is "in the noise" performance-wise.

      True, but that's not the only place that reference counting can lose performance compared with a decent GC. You'll usually get much better cache locality, a reduced risk of pipeline hazards, and you avoid much of the messy issue of synchronising counts across threads.

      I gained about 4% speed with some Lisp-ish code a few years back when we switched from a (admittedly fairly naive) reference count to a decent ephemeral GC. Not order-of-magnitude land - but a respectable speed up. I put most of that down to the GC keeping all the live objects in cache.

      Also, it's not just in the call to malloc you're adding the increment. You're doing an increment every time you add a reference to something, and a decrement every time you remove a reference to something. That happens a lot more often than malloc/frees.

      That said - there isn't a "right" answer. Whether ref counting or some other GC variant will get you more speed depends on the code's allocation patterns, the hardware you're running on, and so forth. I just want a decent GC in perl so I can have self-referential structures without having to jump through hoops ;-)

        You gained a 4% increase in performance after removing what you don't consider a very good implementation of reference counting? Yep, you've pretty much made my point. Corion's wording was "Of course, Perl sacrifices a lot of performance to the reference counting", which your numbers certainly refute. I suspect Corion's wording closely matches what was claimed in support of not doing reference counting in Parrot.

        Since you describe the ref counting as "fairly naive" and the GC as "decent", the 4% could easily have more to do with quality of implementation than anything else. You've mostly placed an upper bound of 4% on what one might expect in performance improvement going from RC to GC in such a situation. It sounds likely that a "decent" RC could easily give better performance than your decent GC in that situation.

        But even 4% is peanuts. The original use of performance as one reason to justify not doing ref counting obviously didn't imply "RC has too high of a performance cost (4%) and therefore cannot be considered", because that'd just be laughable.

        I put most of that down to the GC keeping all the live objects in cache.

        That sounds like wishful thinking, sorry. One of the worst cache-related performance impacts that I recall was caused by declaring a large, multi-dimensional array in the wrong order, so that iterating through it jumped across the array and kept way too many of the "live" elements in the memory-resident working set of the process. I want items to be "in the cache" because I used them recently, not because the GC has yet again swapped in every single item that wasn't supposed to have already been free()d, forcing my working set to contain every single thing I've ever allocated, even if I have no use for 95% of it in the near future.

        The potential performance penalty for forcing the working set size up is several orders of magnitude. So getting a 4% performance boost in a best-case in exchange for huge performance penalties in other cases isn't a good trade-off in my book. I'm much more likely to optimize space utilization over CPU utilization because when you run up against the available cache size your performance penalty gets huge fast while decreasing CPU utilization only offers linear gains.

        I wasn't claiming that RC is wonderful and GC is horrid. I was just refuting the oft-repeated point that RC is "very slow". There are certainly advantages to GC. There are certainly advantages to RC. I'd personally rather have both with the option of choosing to not use one or the other (rebuilding the tool would probably be required to make such a choice). If I can only have one, I find it easier to deal with the limitations of RC than to lose its advantages.

        But, clearly, RC is not "very slow". And, now that you've reminded me, I don't find it hard to envision a scenario where GC would be very, very slow. So thanks. (:

        Also, it's not just in the call to malloc you're adding the increment.

        I never said otherwise. You can even see that I understand this, because I later say that you'd have to be making and destroying a lot of references without doing much other real work for the inc/dec to make much impact. As for your claim of "a lot more" (with an emphasized font for "lot"), I'm doubtful that such emphasis is justified. I wouldn't be surprised if the cost of malloc() still outweighs the cost of all those inc/decs.

        - tye        
