http://www.perlmonks.org?node_id=751048


in reply to Inline::C's AoA is much bigger than Perl's

Update: Actually, the problems with the code below, I just discovered, are far more serious than a mere memory leak. It blows up if I try to access the elements in the table... I guess it's time for me to call it a day!

OK, I'm replying to myself here...

After I posted the original query, I tried a different tack. I streamlined the C function make_aoa_c, as follows:

    SV *make_aoa_c( int n_rows, int n_cols ) {
      int i, j;
      char *foo = "foo";
      AV *table = newAV();
      AV *row;

      for ( i = 0; i < n_rows; ++i ) {
        row = ( AV * ) sv_2mortal( ( SV * ) newAV() );
        for ( j = 0; j < n_cols; ++j ) {
          av_push( row, newSVpv( foo, 0 ) );
        }
        av_push( table, sv_2mortal( newRV( ( SV * ) row ) ) );
      }

      return newRV( ( SV * ) table );
    }

This took care of the size problem for the most part, and greatly improved the speed (though it's still slower than Perl).

Unfortunately, the code has sprung a small memory leak that I can't identify! With the number of repetitions changed to 10, the output for the C case now looks like this:

    % perl test_aoa.pl 1
     1: 78844 (280041 us)
     2: 78884 (247865 us)
     3: 78892 (237725 us)
     4: 78900 (245755 us)
     5: 78908 (235251 us)
     6: 78916 (235712 us)
     7: 78924 (246926 us)
     8: 78932 (237128 us)
     9: 78940 (237369 us)
    10: 78948 (238528 us)

With every iteration, the memory grows by at least 8kb.

I have tried adding sv_2mortal around various items in the code (e.g. AV *table=(AV *)sv_2mortal((SV *)newAV())), but I get errors like:

    Attempt to free unreferenced scalar: SV 0x5b2c620, Perl interpreter: 0x603010 at test_aoa.pl line 22.

If anyone can spot the memory leak here, I'd much appreciate it!

the lowliest monk

Re^2: Inline::C's AoA is much bigger than Perl's
by almut (Canon) on Mar 16, 2009 at 23:59 UTC

    The following routine works:

        SV *make_aoa_c( int n_rows, int n_cols ) {
          int i, j;
          char *foo = "foo";
          AV *table = newAV();
          AV *row;

          for ( i = 0; i < n_rows; ++i ) {
            row = newAV();
            for ( j = 0; j < n_cols; ++j ) {
              av_push( row, newSVpv( foo, 0 ) );
            }
            /* newRV_noinc expects an SV *, hence the cast */
            av_push( table, newRV_noinc( ( SV * ) row ) );
          }

          return newRV_noinc( ( SV * ) table );
        }

        $ ./751041.pl 1
         1: 78836 (233391 us)
         2: 78872 (216509 us)
         3: 78872 (206672 us)
         4: 78872 (206775 us)
         5: 78872 (206308 us)
         6: 78872 (207777 us)
         7: 78872 (206677 us)
         8: 78872 (206739 us)
         9: 78872 (206675 us)
        10: 78872 (205979 us)

    No memory leak, and if I dump $table (e.g. using Data::Dumper, with a size of 3 x 3 or so), it holds the expected data...

    (I think you were just doing more mortalizing than necessary... newRV_noinc creates the reference without incrementing the referent's count, so the arrays' reference counts stay at 1 and they get freed when the respective outer structure is freed.)
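
To make the reference-count argument concrete, here is a small toy model in plain C. It is only a sketch: `Obj`, `obj_new`, `obj_dec`, `ref_new_inc`, `ref_new_noinc` and `leaked_after_drop` are hypothetical names invented for this illustration and are not part of the Perl API. The assumption is that `ref_new_inc` behaves like newRV (which bumps the referent's count) and `ref_new_noinc` like newRV_noinc (which takes over the count the caller already holds).

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a reference-counted value; NOT Perl's real SV layout. */
typedef struct Obj {
    int refcnt;          /* number of owners */
    struct Obj *target;  /* non-NULL when this object is a reference */
} Obj;

static int live_objects = 0;  /* objects allocated and not yet freed */

/* A new object starts with a count of 1, as newAV()/newSV() results do. */
static Obj *obj_new(void) {
    Obj *o = malloc(sizeof *o);
    assert(o != NULL);
    o->refcnt = 1;
    o->target = NULL;
    live_objects++;
    return o;
}

/* Drop one owner; free at zero and release whatever the object points at. */
static void obj_dec(Obj *o) {
    assert(o->refcnt > 0);
    if (--o->refcnt == 0) {
        if (o->target != NULL)
            obj_dec(o->target);
        free(o);
        live_objects--;
    }
}

/* Assumed analogue of newRV: the reference adds its own count. */
static Obj *ref_new_inc(Obj *target) {
    Obj *r = obj_new();
    target->refcnt++;
    r->target = target;
    return r;
}

/* Assumed analogue of newRV_noinc: the reference adopts the existing count. */
static Obj *ref_new_noinc(Obj *target) {
    Obj *r = obj_new();
    r->target = target;
    return r;
}

/* Create an array, wrap it in a reference, drop the reference,
 * and report how many objects are still alive afterwards. */
static int leaked_after_drop(int use_noinc) {
    int before = live_objects;
    Obj *array = obj_new();
    Obj *ref = use_noinc ? ref_new_noinc(array) : ref_new_inc(array);
    obj_dec(ref);  /* the caller is done with the reference */
    return live_objects - before;
}
```

In this model the incrementing variant leaves the array alive with a count of 1 and nobody left to release it, which is the same pattern as `return newRV( ( SV * ) table )` in the original routine. The non-incrementing variant frees everything once the reference goes away.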

    For comparison, the pure-Perl implementation (still slightly faster/smaller):

        $ ./751041.pl 0
         1: 78696 (213442 us)
         2: 78704 (208485 us)
         3: 78704 (175431 us)
         4: 78704 (175438 us)
         5: 78704 (175422 us)
         6: 78704 (175486 us)
         7: 78704 (175667 us)
         8: 78704 (175647 us)
         9: 78704 (175682 us)
        10: 78704 (175687 us)

      The remaining difference in memory is due to the Perl version knowing exactly how big the array will be from the start (because the whole list is assigned in one go):
      FILL = 999 MAX = 999

      The C version causes the arrays to grow and leaves space for growth:

      FILL = 999 MAX = 1021

      By pre-extending the arrays,

          SV *make_aoa_c( int n_rows, int n_cols ) {
            int i, j;
            char *foo = "foo";
            AV *table = newAV();

            av_extend(table, n_rows-1);   /* <---------- */
            for ( i = 0; i < n_rows; ++i ) {
              AV *row = newAV();
              av_extend(row, n_cols-1);   /* <---------- */
              for ( j = 0; j < n_cols; ++j ) {
                av_push( row, newSVpv( foo, 0 ) );
              }
              /* newRV_noinc expects an SV *, hence the cast */
              av_push( table, newRV_noinc( ( SV * ) row ) );
            }

            return newRV_noinc( ( SV * ) table );
          }

      both the Perl and the C data structures are identical.

      FILL = 999 MAX = 999
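
The effect of pre-extending can be sketched with a generic growable array in plain C. This is an illustration only: `IntVec`, `vec_reserve`, `vec_push` and `spare_after_fill` are hypothetical names, and the growth rule used here (half the current capacity plus four slots) is an assumption chosen for the example, not Perl's actual policy. Note also that `vec_reserve` takes an element count, whereas av_extend takes the highest index, which is why the routine above passes n_rows-1 and n_cols-1.

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal growable array of ints; a toy, not Perl's AV. */
typedef struct {
    int *data;
    size_t fill;  /* slots in use   */
    size_t cap;   /* slots allocated */
} IntVec;

/* Make sure at least `want` slots are allocated (never shrinks). */
static void vec_reserve(IntVec *v, size_t want) {
    if (want <= v->cap)
        return;
    int *p = realloc(v->data, want * sizeof *p);
    assert(p != NULL);
    v->data = p;
    v->cap = want;
}

/* Append one element, growing with headroom when the array is full. */
static void vec_push(IntVec *v, int x) {
    if (v->fill == v->cap) {
        /* Assumed growth rule for this sketch only. */
        vec_reserve(v, v->cap + v->cap / 2 + 4);
    }
    v->data[v->fill++] = x;
}

/* Push n elements, optionally reserving exactly n slots first,
 * and return how many allocated slots end up unused. */
static size_t spare_after_fill(size_t n, int pre_extend) {
    IntVec v = { NULL, 0, 0 };
    if (pre_extend)
        vec_reserve(&v, n);
    for (size_t i = 0; i < n; i++)
        vec_push(&v, (int)i);
    size_t spare = v.cap - v.fill;
    free(v.data);
    return spare;
}
```

With pre-extension the array ends up with no spare slots, matching the FILL = MAX case. Without it, the last growth step overshoots and leaves unused capacity behind, analogous to MAX = 1021 for FILL = 999.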

      and the process that calls the C version uses less memory (perhaps from reduced stack usage?)

          $ perl test_aoa.pl
          1: 78688 (184871 us)
          2: 78696 (298376 us)
          3: 78696 (196999 us)
          4: 78696 (204391 us)
          5: 78696 (225786 us)

          $ perl test_aoa.pl use_xs
          1: 78604 (321481 us)
          2: 78616 (360377 us)
          3: 78616 (219468 us)
          4: 78616 (211587 us)
          5: 78616 (209231 us)

      The times are comparable, but note that this is a busy machine.

      The remaining difference in speed is most likely due to strlen of "foo" being recomputed every time in the inner loop (i.e. the zero in newSVpv(foo, 0) ).

      Precomputing it once (as Perl can do too, because the "foo" in "foo" x $n_cols is by definition fixed), i.e.

          SV *make_aoa_c( int n_rows, int n_cols ) {
            int i, j;
            char *foo = "foo";
            AV *table = newAV();
            AV *row;
            STRLEN len = strlen(foo);   /* measured once, outside the loops */

            av_extend(table, n_rows-1);
            for ( i = 0; i < n_rows; ++i ) {
              row = newAV();
              av_extend(row, n_cols-1);
              for ( j = 0; j < n_cols; ++j ) {
                av_push( row, newSVpv( foo, len ) );  /* or newSVpvn(...) */
              }
              /* newRV_noinc expects an SV *, hence the cast */
              av_push( table, newRV_noinc( ( SV * ) row ) );
            }

            return newRV_noinc( ( SV * ) table );
          }

      makes any XS vs. Perl speed difference go away (or at least statistically insignificant).

      (Without this optimisation I did observe a small, but consistent difference — approx. 5% on average.)
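
As a rough illustration of where the extra work comes from, the following plain-C sketch mimics the newSVpv convention that a length of 0 means "measure the string with strlen". The names `copy_str`, `strlen_calls` and `measure_calls` are hypothetical and exist only for this example. It counts strlen calls rather than timing anything.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static unsigned long strlen_calls = 0;  /* how often the helper had to measure */

/* Hypothetical helper following the newSVpv convention:
 * len == 0 means "compute the length with strlen". */
static char *copy_str(const char *s, size_t len) {
    if (len == 0) {
        len = strlen(s);
        strlen_calls++;
    }
    char *out = malloc(len + 1);
    assert(out != NULL);
    memcpy(out, s, len);
    out[len] = '\0';
    return out;
}

/* Copy "foo" n times, either passing 0 each time or a precomputed length,
 * and return how many strlen calls the helper made. */
static unsigned long measure_calls(size_t n, int precompute) {
    const char *foo = "foo";
    size_t len = precompute ? strlen(foo) : 0;
    strlen_calls = 0;
    for (size_t i = 0; i < n; i++) {
        char *copy = copy_str(foo, len);
        free(copy);
    }
    return strlen_calls;
}
```

Passing 0 costs one strlen per element, while passing the precomputed length costs none inside the loop. For a three-byte string each call is cheap, which fits the small difference reported above.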