http://www.perlmonks.org?node_id=488453


in reply to Re^3: Performance, Abstraction and HOP
in thread Performance, Abstraction and HOP

I can't imagine trees implemented with arrays being that much more memory hungry than arrays alone.

Don't imagine--measure :)

P:\test>junk
Array[ 1..10000]: 200056
Sum @array = 50005000
Tree[ 1..10000 ]: 1120016
Sum tree = 50005000
        Rate  tree array
tree  17.1/s    --  -82%
array 95.1/s  455%    --

I think that 6x bigger and 5x slower pretty much makes my point.

#!/usr/bin/perl -lw
use strict;
use List::Util qw[ sum shuffle ];
use Benchmark qw[ cmpthese ];
use Devel::Size qw[ total_size ];

my $MAX = 10000;

our @array = 1 .. $MAX;
our $tree  = make_tree( $MAX/2, undef, undef );
$tree = insert( $_, $tree ) for shuffle 1 .. $MAX;

print "Array[ 1..$MAX]: ", total_size( \@array );
print "Sum \@array = ",    sum( @array );
print "Tree[ 1..$MAX ]: ", total_size( $tree );
print "Sum tree = ",       sum_tree( $tree );

cmpthese -1, {
    tree  => q[ my $sum = sum_tree( $tree ); ],
    array => q[ my $sum = sum @array; ],
};

exit;

sub sum_tree {
    my $tree = shift;
    return 0 if not defined $tree;
    return node( $tree )
         + sum_tree( right( $tree ) )
         + sum_tree( left(  $tree ) );
}

sub insert {
    my( $elem, $tree ) = @_;
    if( not defined $tree ) {
        return make_tree( $elem, undef, undef );
    }
    my $curr = node( $tree );
    if( $elem == $curr ) {
        return $tree;
    }
    elsif( $elem < $curr ) {
        return make_tree( $curr, insert( $elem, left( $tree ) ), right( $tree ) );
    }
    else {
        return make_tree( $curr, left( $tree ), insert( $elem, right( $tree ) ) );
    }
}

sub make_tree { [ $_[0], $_[1], $_[2] ] }
sub node      { $_[0]->[0] }
sub left      { $_[0]->[1] }
sub right     { $_[0]->[2] }

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
The "good enough" may be good enough for the now, and perfection may be unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.

Replies are listed 'Best First'.
Re^5: Perl memory usage
by Anonymous Monk on Sep 01, 2005 at 20:02 UTC
    I think that 6x bigger and 5x slower pretty much makes my point.
    Holy crap, now I'm curious. Only 5x slower is faster than I would have guessed, but 6x larger is crazy. Does anyone know how much boxing Perl does?

    I would have thought that a rough estimate for the size of a scalar number would be, say, 16 bytes (4 bytes for a pointer, 4 bytes for type/reference-count information, 8 bytes for a double-precision floating-point number), and the overhead for an array maybe 16 bytes (8 bytes for a length field, 4 bytes for type/reference-count info, 4 bytes for a pointer to the array of pointers). For the tree structure above (an array composed of one scalar and two arrays) that would be 16 (array overhead) + 3*4 (three elements in the first array, 4-byte pointers on a 32-bit machine) + 16 (the scalar) + 2*16 (the left and right branches) = 76 bytes. I guess that's starting to add up, but it is still shy of the 112 bytes measured above. Perl must preallocate space for each array to make growing it faster (maybe 12 elements initially?). Does that sound about right?

    Any way to get a more slimmed-down data structure in pure Perl?
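    One pure-Perl way to slim things down (a sketch of my own, not something from the post above) is to stop allocating one arrayref per node: keep all nodes in three parallel arrays and refer to children by integer index, so the whole tree carries only three array headers of overhead instead of one per node. The names `insert` and `sum_tree` here mirror the script above but are a different, index-based implementation:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parallel-array BST: node $i is ( $elem[$i], $left[$i], $right[$i] ),
# with -1 meaning "no child". No per-node arrayrefs are allocated.
my ( @elem, @left, @right );
my $root = -1;    # -1 means the tree is empty

sub insert {
    my ( $value, $i ) = @_;
    if ( $i < 0 ) {                      # empty slot: allocate a new node
        push @elem,  $value;
        push @left,  -1;
        push @right, -1;
        return $#elem;                   # index of the new node
    }
    if    ( $value < $elem[$i] ) { $left[$i]  = insert( $value, $left[$i] ) }
    elsif ( $value > $elem[$i] ) { $right[$i] = insert( $value, $right[$i] ) }
    return $i;                           # duplicates are ignored
}

sub sum_tree {
    my $i = shift;
    return 0 if $i < 0;
    return $elem[$i] + sum_tree( $left[$i] ) + sum_tree( $right[$i] );
}

$root = insert( $_, $root ) for 1 .. 100;
print sum_tree( $root ), "\n";    # 5050
```

    The trade-off is losing the convenience of passing subtrees around as independent references; whether the saved per-node overhead is worth that depends on the workload.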
      Is 6x really that crazy? Doesn't a typical generic C tree use about 4x? (If you're storing integers in the tree. If you are storing something larger, the overhead is smaller.) The perl array includes at least a reference count and a length that aren't in the C Node struct, so 6x quickly becomes plausible.
        Well, on a 32bit machine, it is 3x larger...
        #include <stdio.h>

        struct tree {
            int elem;
            struct tree *left, *right;
        };

        int main( int argc, char **argv ) {
            printf( "sizeof(int)=%zu, sizeof(struct tree)=%zu\n",
                    sizeof( int ), sizeof( struct tree ) );
            return 0;
        }
        ...of course, the 12 bytes from above is a tad bit slimmer than the apparently 112 byte structure on the perl side.
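        For what it's worth, you can emulate that 12-byte C layout in pure Perl by packing every node into one long string and addressing children by node index, so the only Perl-level overhead is a single scalar for the whole arena. This is just an illustration of the idea (the `new_node`/`get_node`/`set_child` helpers are invented for this sketch, not part of the code above):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Each node is 12 bytes in one string: a 32-bit elem plus two 32-bit
# child indices ('l3' = three signed 32-bit integers), -1 = no child.
my $arena = '';
my $NONE  = -1;

sub new_node {
    my $value = shift;
    $arena .= pack 'l3', $value, $NONE, $NONE;
    return length( $arena ) / 12 - 1;        # index of the new node
}

# Returns ( elem, left_index, right_index ) for node $i.
sub get_node { unpack 'l3', substr $arena, 12 * $_[0], 12 }

sub set_child {    # $which: 1 = left, 2 = right
    my ( $i, $which, $child ) = @_;
    substr $arena, 12 * $i + 4 * $which, 4, pack 'l', $child;
}

# Build a three-node tree: 2 with children 1 and 3.
my $root = new_node( 2 );
set_child( $root, 1, new_node( 1 ) );
set_child( $root, 2, new_node( 3 ) );

my ( $elem, $l, $r ) = get_node( $root );
print "root=$elem left=", ( get_node( $l ) )[0],
      " right=", ( get_node( $r ) )[0], "\n";    # root=2 left=1 right=3
```

        Access through substr/unpack is of course much slower than ordinary arrayrefs, so this only wins when memory, not speed, is the constraint.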