Beefy Boxes and Bandwidth Generously Provided by pair Networks
"be consistent"
 
PerlMonks  

Re^4: Are monks hibernating?

by BrowserUk (Pope)
on Feb 17, 2007 at 01:25 UTC ( #600563=note: print w/replies, xml ) Need Help??


in reply to Re^3: Are monks hibernating?
in thread Are monks hibernating?

...the first time I've heard that the major bottle neck in Perl is that sub/methods calls are slow It's common knowledge that Perl is significantly slower than straight C, but are you saying that the bulk of that slow down occurs due to sub/method calls being slow?

I never said "the major bottleneck". And of course it is slower than straight C. It could not be otherwise.

However, here is a partial breakdown of the costs of subroutine and method calls, and various forms of OO:

Rate OO_7 OO_8 OO_10 OO_9 OO_6 OO_5 OO_4 OO_3 OO_2 OO_1 sub2 s +ub1 inline OO_7 70.4/s -- -0% -43% -44% -69% -70% -81% -88% -89% -91% -92% - +95% -99% OO_8 70.5/s 0% -- -43% -44% -69% -70% -81% -88% -89% -91% -92% - +95% -99% OO_10 124/s 76% 76% -- -1% -45% -46% -66% -79% -81% -84% -85% - +91% -98% OO_9 126/s 79% 79% 1% -- -44% -46% -66% -79% -81% -84% -85% - +91% -98% OO_6 227/s 222% 222% 83% 80% -- -2% -38% -62% -65% -71% -73% - +84% -96% OO_5 232/s 229% 229% 87% 84% 2% -- -37% -61% -65% -70% -73% - +83% -96% OO_4 367/s 421% 420% 195% 191% 62% 58% -- -39% -44% -53% -57% - +74% -93% OO_3 600/s 752% 752% 383% 376% 165% 159% 64% -- -9% -23% -29% - +57% -88% OO_2 656/s 832% 831% 428% 421% 189% 183% 79% 9% -- -16% -23% - +53% -87% OO_1 782/s1010% 1009% 529% 520% 244% 237% 113% 30% 19% -- -8% - +44% -85% sub2 847/s1103% 1102% 582% 572% 273% 266% 131% 41% 29% 8% -- - +39% -84% sub1 1388/s1871% 1869%1017%1001% 512% 499% 278% 131% 112% 78% 64% + -- -73% inline5179/s7252% 7246%4067%4009%2182%2134%1312% 763% 689% 562% 511% 2 +73% --

Note the big step changes between

  1. inline code and sub1; the code runs nearly 3 times more slowly.

    The 'body' of the code, a single integer increment is not worth a sub, but it is designed to emphasis the cost of the calls, and OO support code.

  2. sub1 & sub2; name the parameters and your code runs 5 times slower.
  3. OO_3 & OO_4; use getters and setters and your code is now 13 times slower.
  4. OO_6 & OO_9; use Best Practices and your code is 40 times slower.
  5. OO_10 & OO_8; admittedly far better than traditional InsideOut objects which result in code that runs 72 times more slowly.

Of course this is 'worst case'. If your application is IO-bound, or DB-bound, or only creates a few big objects, with a few large methods and only runs for a few seconds; then none of this is of the slightest concern to you.

But if your applications are running cpu-bound over large volumes of data. If you create lots of small objects with small methods and call them lots of times. If your program runs for a few hours when coded inline, or as a few large subroutines. Then if you properly OO-ize it using Class::Std, it could end up running for a day or more for every hour the original ran.

Or to look at it another way, if you take an app that uses properly structured OO approach using Class::Std and you bastardize it to use simple subs and inline code blocks. If the original ran for 6 hours, the destructured version might end up taking 10 minutes!

The point is that even a few percent improvement in the costs of function calling would benefit every Perl application, and could have profound affect on the performance for some types of applications.

Benchmark code

#! perl -slw use strict; package OO_1; ## Blessed scalar; ## No named parameters; ## Direct attribute access (no get/set) sub new { return bless \$_[1], $_[0]; } sub plus1 { ++${ $_[ 0 ] } } package OO_2; ## Blessed scalar; ## Named parameters; ## Direct attribute access (no get/set) sub new { my( $class, $value ) = @_; return bless \$value, $class; } sub plus1 { my( $self ) = shift; ++$$self; } package OO_3; ## Blessed hash; ## Named parameters; ## Direct attribute access (no get/set) sub new { my( $class, $value ) = @_; my $self = { value => $value }; return bless $self, $class; } sub plus1 { my( $self ) = shift; ++$self->{ value }; } package OO_4; ## Blessed hash; ## Named parameters; ## lvalue get/setter sub new { my( $class, $value ) = @_; my $self = { value => $value }; return bless $self, $class; } sub value : lvalue{ my $self = shift; $self->{ value } } sub plus1 { my( $self ) = shift; $self->value()++; } package OO_5; ## Blessed hash; ## Named parameters; ## Separate Get & Set sub new { my( $class, $value ) = @_; my $self = { value => $value }; return bless $self, $class; } sub getValue { my $self = shift; $self->{ value } } sub setValue { my( $self, $newValue ) = @_; $self->{ value } = $newVal +ue } sub plus1 { my( $self ) = shift; $self->setValue( $self->getValue() + 1 ); } package OO_6; ## Inherited blessed hash; ## Named parameters; ## Separate Get & Set; No overides use base 'OO_5'; package OO_7; ## Inside-Out hash; ## Named parameters; ## Separate Get & Set my %values; sub new { my( $class, $value ) = @_; my $self = bless \$value; $values{ $self } = $value; return $self; } sub getValue{ my $self = shift; $values{ $self } } sub setValue{ my( $self, $newValue ) = @_; $values{ $self } = $newValu +e; } sub plus1{ my $self = shift; $self->setValue( $self->getValue() + 1 ); + } package OO_8; ## Inside-Out hash; ## Named parameters; ## Separate Get & Set; No overides. use base 'OO_7'; package OO_9; use Class::Std; ## Class::Std Inside Out hash; ## Named parameters; ## Separate Get & Set my %i :ATTR( :name<i> ); sub plus1 { my $self = shift; $self->set_i( $self->get_i() + 1 ); } package OO_10; ## Inherited Class::Std Inside Out hash; ## Named parameters; ## Separate Get & Set; No overides; use base 'OO_9'; package main; use Benchmark qw[ cmpthese ]; our $MAX ||= 1e3; print "MAX: $MAX"; sub plus1{ $_[ 0 ]++ } sub plus1a{ my( $i ) = @_; ++$i } cmpthese 1, { inline => q[ my $i= 0; ++$i while $i < + $MAX; print'inline:',$i;], sub1=> q[ my $i= 0; plus1( $i ) while $i < + $MAX; print'sub1: ', $i;], sub2=> q[ my $i= 0; $i= plus1a($i ) while $i < + $MAX; print'sub2: ', $i;], OO_1=> q[my $o= OO_1->new(0); my $i= 0; $i= $o->plus1() while $i < + $MAX; print'OO_1:', $i;], OO_2=> q[my $o= OO_2->new(0); my $i= 0; $i= $o->plus1() while $i < + $MAX; print'OO_2: ', $i;], OO_3=> q[my $o= OO_3->new(0); my $i= 0; $i= $o->plus1() while $i < + $MAX; print'OO_3: ', $i;], OO_4=> q[my $o= OO_4->new(0); my $i= 0; $i= $o->plus1() while $i < + $MAX; print'OO_4: ', $i;], OO_5=> q[my $o= OO_5->new(0); my $i= 0; $i= $o->plus1() while $i < + $MAX; print'OO_5: ', $i;], OO_6=> q[my $o= OO_6->new(0); my $i= 0; $i= $o->plus1() while $i < + $MAX; print'OO_6: ', $i;], OO_7=> q[my $o= OO_7->new(0); my $i= 0; $i= $o->plus1() while $i < + $MAX; print'OO_7: ', $i;], OO_8=> q[my $o= OO_8->new(0); my $i= 0; $i= $o->plus1() while $i < + $MAX; print'OO_8: ', $i;], OO_9=> q[my $o= OO_9->new({ i => 0 }); my $i= 0; $i= $o->plus1() while $i < + $MAX; print 'OO_9: ', $i; ], OO_10=>q[my $o= OO_10->new({ i => 0 }); my $i= 0; $i= $o->plus1() while $i < + $MAX; print 'OO_10:', $i; ], }; cmpthese -1, { inline=>q[ my $i=0; ++$i while $i < $ +MAX; ], sub1=>q[ my $i=0; plus1( $i ) while $i < +$MAX; ], sub2=>q[ my $i=0; $i= plus1a($i ) while $i < +$MAX; ], OO_1=>q[ my $o= OO_1->new(0); my $i=0; $i= $o->plus1() while $i < +$MAX; ], OO_2=>q[ my $o= OO_2->new(0); my $i=0; $i= $o->plus1() while $i < +$MAX; ], OO_3=>q[ my $o= OO_3->new(0); my $i=0; $i= $o->plus1() while $i < +$MAX; ], OO_4=>q[ my $o= OO_4->new(0); my $i=0; $i= $o->plus1() while $i < +$MAX; ], OO_5=>q[ my $o= OO_5->new(0); my $i=0; $i= $o->plus1() while $i < +$MAX; ], OO_6=>q[ my $o= OO_6->new(0); my $i=0; $i= $o->plus1() while $i < +$MAX; ], OO_7=>q[ my $o= OO_7->new(0); my $i=0; $i= $o->plus1() while $i < +$MAX; ], OO_8=>q[ my $o= OO_8->new(0); my $i=0; $i= $o->plus1() while $i < +$MAX; ], OO_9=>q[ my $o= OO_9->new({ i => 0 }); my $i=0; $i= $o->plus1() while $i < +$MAX; ], OO_10=> q[ my $o = OO_10->new( {i => 0}); my $i=0; $i= $o->plus1() while $i < +$MAX; ], };

Another way of looking at it is to take a heavily recursive function (Ackermann,) and re-write it to use exactly the same algorithm--repeating all the same motions, stacking and unstacking the same amount of data; building and unwinding the same number of code block frames--but do it without calling a subroutine. Ie. A recursive algorithm coded manually using a loop and a stack.

This shows that using the identical 'recursive' algorithm, but avoiding the sub calls, Ack( 3, 0 ) runs twice as fast. Ack( 3, 8 ) runs 4 times as fast. And by the time you get to Ack( 3, 9 ) it's a whopping 12 1/2 times faster. Note: There are no tricks here. No memoizations, or reduced recursion levels. When the truely recursive code stacks a parameter, this also stack a parameter. When it pops a parameter, this pops one. When it builds a block (sub) frame, this builds a block (while) frame. In as far as it is possible to emulate recursion through iteration this does so.

It really shouldn't be possible to emulate what Perl does when it calls a subroutine, in unoptimized, Perl code, and beat it quite so emphatically.

c:\test>b-recursive-nosub.pl Ack(3,0): 5 Ack(3,0): 5 1 trial of Recursive Ack 3,0 (227us total) 1 trial of Iterative Ack 3,0 (111us total) c:\test>b-recursive-nosub 8 Ack(3,8): 2045 Ack(3,8): 2045 1 trial of Recursive Ack 3,8 (13.698s total) 1 trial of Iterative Ack 3,8 (3.328s total) c:\test>b-recursive-nosub 9 Ack(3,9): 4093 Ack(3,9): 4093 1 trial of Recursive Ack 3,9 (162.104s total) 1 trial of Iterative Ack 3,9 (13.234s total)

The benchmark

use strict; use Benchmark::Timer; my $T = new Benchmark::Timer; no warnings 'recursion'; my($Ack,$Fib,$Tak); $Ack = sub { my ($x, $y) = @_; return $y + 1 if $x == 0; return $Ack->($x - 1, 1) if $y == 0; $Ack->($x - 1, $Ack->($x, $y - 1)); }; sub Ack { my( $x, $y ) = @_; my @stack; push @stack, $x; do { $x = pop @stack; if( $x == 0 ) { ++$y; } elsif( $y == 0 ) { push @stack, $x - 1; $y = 1; } else { push @stack, $x - 1, $x; --$y; } } while @stack; return $y; } my $n = ($ARGV[0] || 0) - 1; $T->start( 'Recursive Ack 3,' . ( $n+1 ) ); printf "Ack(%d,%d): %d\n", 3, $n + 1, $Ack->(3, $n + 1); $T->stop( 'Recursive Ack 3,' . ( $n+1 ) ); $T->start( 'Iterative Ack 3,' . ( $n+1 ) ); printf "Ack(%d,%d): %d\n", 3, $n + 1, Ack( 3, $n + 1 ); $T->stop( 'Iterative Ack 3,' . ( $n+1 ) ); $T->report;

At this point I had hoped to try and produce an annotated assembler level trace of the transitions into and out of a subroutine. I got part way (a couple of hours), through producing that and had finger trouble and managed to throw it all away. Forgive me for not starting over but it is incredibly tedious and requires a lot of concentration, and I'm simply not ready to face that again just yet.

What I will say is, based on my inspection (on my Win32/AS 811 combination), of the disassembled C that is run when a Perl level subroutine is called, there appears to be considerable scope for optimisation.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://600563]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others perusing the Monastery: (3)
As of 2020-02-23 05:45 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    What numbers are you going to focus on primarily in 2020?










    Results (102 votes). Check out past polls.

    Notices?