edan has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, I need a bit of input here...

I am writing my first perl (V5.6.1) extension...
I am calling a C-function which returns a string in pre-allocated buffer,
like this:

// my_func() puts up to len bytes into buffer int my_func(void *p_struct, char *buffer, int len)

Now, I want to call the function and return the buffer to my perl program.
My first shot at the XS code looked like this:

int my_xs_func(p_struct, OUTLIST buffer, len) void *p_struct; char *buffer; int len; CODE: buffer = (char *)malloc(len); RETVAL = my_func(p_struct, buffer, len); OUTPUT: RETVAL
This returned my int return value and the buffer to my perl program...

Of course, calling malloc() without calling free() rang
the memory-leak bells. So I poked around for a way to
write the XS code so perl will know about the buffer I malloc'd
and GC the buffer automagically...

So now, my code looks like this:

int my_xs_func(p_struct, OUTLIST buffer, len) void *p_struct; SV *buffer; int len; PREINIT: char *tmp_buffer; CODE: tmp_buffer = (char *)malloc(len); RETVAL = my_func(p_struct, tmp_buffer, len); buffer = sv_2mortal(newSVpv(tmp_buffer, 0)); free(tmp_buffer); OUTPUT: RETVAL

I think this solves my memory-leak problem, but
I am by no means sure that this is the right way to do it.
I got to this solution by poking around in comp.lang.perl, and
in perlxs, perlguts, etc., with a healthy dose of trial-and-error.

Can someone tell me if I am going in the right direction, and give a few pointers?



Replies are listed 'Best First'.
Re: Perl XS: garbage-collecting my malloc'd buffer
by Elian (Parson) on Mar 03, 2003 at 17:07 UTC
    This is almost the right way to do it. Altering parameters isn't a particularly perlish thing to do--it's a habit many people have because it's the only way to return multiple values from languages with more limited return semantics, such as C.

    What you should do instead is switch from CODE to PPCODE and XPUSH() both the status and the return SV (which you built almost correctly--calling sv_2mortal on a value in the input list is dicey, but works correctly for return values) so that your function returns two values rather than one.

    In general, you're on the right track--if you need to return a string, return an SV. (I'd go so far as to say always return an SV, but many folks would disagree)

    Anyway, the code probably ought to look like:

    SV * my_xs_func(p_struct, len) void *p_struct; IV len PPCODE: { char *tmp_buffer = (char *)malloc(len); XPUSHs(sv_2mortal(newSViv(my_func(p_struct, tmp_buffer, len) +))); XPUSHs(sv_2mortal(newSVpv(tmp_buffer, 0))); free(tmp_buffer); }

      I'd think that ($len,$buf)= my_xs_func(...) would often be inconvenient and thus not very "perlish" (Perl places a large value on programmer convenience) and I consider the interface provided by Perl's own read and related functions to be "perlish". So I'd go with the original design.

      You should also note that your code assumes that the buffer gets populated with a '\0'-terminated string. [ Update: as does the original node's second code block, so that is probably a safe assumption (: ]

      If I was sure that the values being created were very unlikely to be very large and that they didn't contain internal pointers, then I might go with copying the generated value but only if the integer returned by the function was just the length so I could use:

      my $result= my_xs_funct($struct,$len); my $outlen= length($result);
      Mileage will certainly vary.

                      - tye
        I didn't put that much thought into the process, though I really despise routines that alter their parameters needlessly. (And in this case it really seems that way) The XS code could potentially be cleaned up some, with the length getting subsumed into the length of the resulting SV string, which'd be the better way to do it, I expect. Checking the want status is also reasonable, and having different returns based on it. Either of those (multiple returns, or want-based returns) are 'better' than altering the input, or at least so go my preferences.

        Still, I very much mistrust code that uses returned buffers from libraries. While it's OK in some cases, I've found myself burned often enough with library buffer reuse or freeing at odd times that I much prefer making a copy of the data. It's very rarely large enough to be a performance issue, and those cases can be handled specially. It's generally better, especially for folks who aren't used to writing XS code, to make the copy and optimize it away later, since that gets them the safe case by default.

Re: Perl XS: garbage-collecting my malloc'd buffer (grow)
by tye (Sage) on Mar 03, 2003 at 17:09 UTC

    Your original code didn't have a memory leak. [ Update: Rather, it needn't have a memory leak. Just remove the "buffer = (char *)malloc(len);" which I skipped over when I read it. (: ]

    When you tell XS that you have a "char *" input buffer, it pulls out a pointer to the string value (if any) stored in the scalar you pass in. So that buffer is allocated just like any other scalar string value buffer in Perl and is free()d in the same situations. So you don't need to worry about a memory leak.

    You could think of your original code as "inconvenient" or even "dangerous" because it requires the caller to pass in length($buffer) or at least a value that is smaller than the size of the buffer allocated to the "buffer" scalar that is passed in.

    But I'd probably fix that by forcing the scalar to contain a buffer of the specified size before calling the C function:

    int my_xs_func(p_struct, OUTLIST buffer, len) void *p_struct; SV *buffer; int len; CODE: { char *buf = sv_grow( buffer, len ); /* was: char *buf = SvGROW( buffer, len ); */ RETVAL = my_func( p_struct, buf, len ); } OUTPUT: RETVAL
    see perlguts for more on SvGROW() (see also perlxstut, perlxs, etc.) (:

    Updated to match what I'd use based on the feedback below.

                    - tye
      I think that before you SvGROW you should call SvPV_force. Otherwise (at least this was once true) SvGrow will choke if there is not already a buffer. So, the code would be:
      CODE: { int dummylen; SvPV_force(buffer, dummylen); char *buf = SvGROW( buffer, len ); RETVAL = my_func( p_struct, buf, len ); }
        Yep, that's still the case. SvGROW checks xpv_len, which'll be some bizarre value if the SV's just an IV or NV. Undef might well give it fits too, at least so reads the 5.8.0 sources.

        That was how I had originally written the code in my head then I checked the Perl v5.6.0 source code for SvGROW() and saw that such wasn't needed. It might be a problem for Perl versions prior to that, however. I know I jumped through more hoops when I wrote similar XS code ages ago -- I was hoping it had just been my lack of understanding. (:

        Update: I was mostly checking sv_grow() which is what the SvGROW() macro calls. See elsewhere in this thread for why I'm wrong above.

                        - tye
Re: Perl XS: garbage-collecting my malloc'd buffer
by pg (Canon) on Mar 03, 2003 at 16:37 UTC
    Not about Perl, but for the c code, it is better to check whether tmp_buffer is NULL, after the malloc call. If it is NULL, then the malloc failed, and there is no point to continue, otherwise you would see core dumps somewhere down the road. It is also better to check whether it is a NULL pointer, before you free it.

    (As this is demo, I guess you might have the checking in your real code ;-), just in case…)
Re: Perl XS: garbage-collecting my malloc'd buffer
by meetraz (Hermit) on Mar 03, 2003 at 17:01 UTC
    What I usually do when I have questions like this is to examine the C source that gets generated from your XS file to make sure it all looks ok.

    As a side note, when I think of GC, I think of variables/objects tracked via reference counts, where their memory is free'd automatically by perl when their reference counts reach zero (usually from being destroyed or going out of scope). In your case, you are defining a function, not a variable, and are explicitly free()ing the buffer yourself when its no longer needed, so this isn't really garbage collection.

Re: Perl XS: garbage-collecting my malloc'd buffer
by gmpassos (Priest) on Mar 04, 2003 at 04:25 UTC
    Well, I'm wrapping wxSocket* for wxPerl. Take a look in this function. Is the wxSocketBase::Read, that write directly to a buffer point of a Perl Scalar (SV):
    long wxSocketBase::Read( buf , size , leng = 0 ) SV* buf long size long leng CODE: // Tell that the scalar is string only (not integer, double, utf8. +..) SvPOK_only(buf) ; // Get the original size: int org_size = SvCUR(buf) ; // Grow the scalar to receive the data and return a char* point: char* buffer = SvGROW( buf , size + 1 ) ; // Get the char* point of the end if you want to append data (len +> 0): if ( leng > 0 ) { buffer = SvEND(buf) ;} else { org_size = 0 ;} THIS->Read( buffer , size ) ; // Get the amount of data read in the socket: int nread = THIS->LastCount() ; if ( nread > 0 ) { // null-terminate the buffer, not necessary, but does not hurt buffer[nread] = 0 ; // tell Perl how long the string is SvCUR_set( buf , org_size + nread ) ; } // Return the amount of data read, like Perl read(). RETVAL = nread ; OUTPUT: RETVAL
    The good of this is that you can see how to write or append to a buffer. And is more Perl based.

    Note that "SvPOK_only(buf)" is important! Since if your variable is set for integer, like this:

    my $var = 123 ;
    .. you can't use SvGROW, SvCUR and SvCUR_set!

    Take a look in the POD perlapi and perlguts too.

    Graciliano M. P.
    "The creativity is the expression of the liberty".

Re: Perl XS: garbage-collecting my malloc'd buffer
by edan (Curate) on Mar 04, 2003 at 12:30 UTC

    First, I want to thank everyone for your great replies! I really learned a lot from reading what was written here.

    Okay, so I've tried to absorb what I read, and now my XS code looks something like this:

    SV * my_xs_func(p_struct, len) void *p_struct; IV len PREINIT: int bytes_read; char *tmp_buffer; PPCODE: tmp_buffer = (char *)malloc(len + 1); // + 1 for NULL if (tmp_buffer == NULL) XSRETURN_UNDEF; bytes_read = my_func(p_struct, tmp_buffer, len); // need to NULL-terminate the buffer tmp_buffer[bytes_read] = '\0'; XPUSHs(sv_2mortal(newSVpv(tmp_buffer, bytes_read))); free(tmp_buffer);

    I have incorporated the advice of Elian here, tye here, and pg here.

    I am manipulating the return stack myself, only returning the buffer (the caller can call length), and I'm checking the malloc() return value, and returning undef if it fails. my_func() returns the number of bytes written to tmp_buffer, which is guaranteed to be less than len, so the line that NULL-terminates the buffer should be safe, unless my_func() seriously misbehaves. And I'm using that value for the second argument to newSVpv() instead of zero...

    Any new critiques or comments? Does it look better?



      You don't need to '\0'-terminate the buffer. Perl doesn't care about the '\0' when reading a string value since you specified a length and Perl will append a '\0' onto the end "just in case" after it copies it.

      You could probably simplify the code slightly by using CODE: instead of PPCODE: (you are only ever returning one value) so that you just deal with RETVAL instead of manipulating the stack.

      So it looks fine.

                      - tye