I didn't put that much thought into the process, though I really despise routines that alter their parameters needlessly (and in this case it really does seem needless). The XS code could be cleaned up some, with the separate length subsumed into the length of the resulting SV string, which I expect would be the better way to do it. Checking the want status and returning different things based on it is also reasonable. Either of those (multiple returns or want-based returns) is 'better' than altering the input, or at least so go my preferences.

Still, I very much mistrust code that uses returned buffers from libraries. While it's OK in some cases, I've found myself burned often enough with library buffer reuse or freeing at odd times that I much prefer making a copy of the data. It's very rarely large enough to be a performance issue, and those cases can be handled specially. It's generally better, especially for folks who aren't used to writing XS code, to make the copy and optimize it away later, since that gets them the safe case by default.
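To make the "copy by default" point concrete, here is a minimal plain-C sketch. The library routine `lib_format_id` is hypothetical and stands in for any routine that hands back a pointer into a buffer it reuses; `copy_result` is likewise an illustrative helper, not part of any real API. In XS the equivalent copy is typically made with `newSVpvn(src, len)`, which gives Perl its own copy of the bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical library routine (an assumption for illustration): it
 * returns a pointer into a static buffer that the next call overwrites. */
const char *lib_format_id(int id, size_t *len_out)
{
    static char buf[32];
    int n = snprintf(buf, sizeof buf, "item-%d", id);
    *len_out = (size_t)n;
    return buf;
}

/* Copy the library's bytes into memory the caller owns, so later
 * library calls cannot change or free them. Returns NULL if malloc
 * fails; the caller releases the copy with free(). */
char *copy_result(const char *src, size_t len)
{
    char *dst;
    assert(src != NULL);
    dst = malloc(len + 1);
    if (dst == NULL)
        return NULL;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return dst;
}
```

A caller that keeps only the pointer returned by `lib_format_id(1, ...)` sees different contents after `lib_format_id(2, ...)`; a caller that kept the result of `copy_result` does not.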

Re^4: Perl XS: garbage-collecting my malloc'd buffer (perlish?)
by tye (Sage) on Mar 03, 2003 at 18:03 UTC

    No, you can't get Perl to use buffers allocated by other libraries (I don't think it is OK in any cases -- and I've tried and been impressed with how hard Perl makes it to even use magic to get such working). But I wasn't doing that. I was having Perl allocate the buffer and having the C routine use it.

    If you are worried about the C routine (or some other routine from that library) reallocating or free()ing the buffer passed to it, then you'll have problems in those cases even with your solution. To avoid that you'd have to copy the data out of the Perl buffer before calling subsequent routines (something that isn't even shown to be happening). If the C routine in question reallocated or freed the buffer, then the interface to the C routine would simply be broken and there would be no safe way to use it. So I don't see any problem with my approach.
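The caller-allocates pattern described above might look roughly like this on the C side. It is only a sketch: `fill_message` is a hypothetical routine, not anything from the original module. The point is that the routine writes into memory it was handed and never frees or reallocates it. On the XS side, the buffer would usually come from `SvGROW(sv, cap)`, with `SvCUR_set(sv, len)` called afterwards to record how many bytes were written.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical C routine (an assumption for illustration). It fills a
 * caller-owned buffer and returns the number of bytes the message needs,
 * not counting the trailing NUL. It writes only when the buffer is large
 * enough, and it never frees or reallocates the buffer. */
size_t fill_message(char *buf, size_t cap)
{
    static const char msg[] = "hello from C";
    size_t need = sizeof msg - 1;

    assert(buf != NULL || cap == 0);
    if (cap > need)
        memcpy(buf, msg, need + 1);   /* copy the text plus its NUL */
    return need;
}
```

Because the return value is the required length, a caller whose buffer was too small can grow it and call again, and ownership of the buffer never changes hands.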

    I despise XS code that is complex. If I wanted to go to the point of checking wantarray etc. then I'd have a Perl subroutine wrapper for the XS code and do such interface massaging there rather than in the XS code. (:

                    - tye
      "No, you can't get Perl to use buffers allocated by other libraries"
      Oh, sure you can. It's not that big a deal, though you need to mark the SV as readonly or throw some TIE magic on it to make sure the buffer doesn't go moving around without something being able to handle it. You need to be mildly careful to make sure the code's not too fragile, but that's about it.

      As for complex XS code... It's always been my view that if you're bopping back and forth between C and perl, you're making a mistake. The XS code should handle everything it needs to, and should present a simple interface. Dealing with perl's stack, return values, hashes, and other things isn't at all difficult unless you make it so, and it's been my experience that trying to not do things in XS is the easiest way to make things far more difficult than you have to.

      The C/perl boundary should be crossed only once, and the presented interface should be clean, proper, and perlish. Fear or dislike of XS or C is no reason to not do things properly, because to do so leads to messy and buggy code.

      If you had to drop to XS, you might as well do it right, rather than do a half-assed hack job that you patch up in a wrapper perl routine.

        I didn't go very far down the "TIE magic" road because, as I recall, it prevented Perl from directly using the buffer anyway (tied variables have to be FETCH()ed each time they are used). So I'd be happy to hear how to make a Perl scalar that contains an externally allocated buffer, where read-only access simply goes directly to the buffer just as it does for a regular Perl scalar.

        I really think there should be an "alloc magic" that, in the case of read-only scalars, would only require that an external "free" be provided.

        As for our opposite ends of the XS design spectrum... I've had tons of problems with XS modules that tried to do everything in the XS code. In one paragraph you talk about TIE magic and in the next you say that "dealing with perl's ... hashes ... isn't at all difficult". I have yet to see XS code that tries to deal with hashes that manages to deal with tied hashes. The subroutines that are made easily available to XS are for dealing with vanilla things and using them makes your code break in the face of nearly trivial Perl code (such as tied or magic variables).

        I've also seen tons of broken XS code. XS code is very, very easy to get wrong. It is even easy to get it wrong in such a way that it appears to work for you but is still trivial to break (or even quite hard to use without breakage). The less XS code, the less chance of bugs. Also, the less XS code, the easier it is to work around bugs and design flaws in the module, because the interesting work is done in Perl, where it is much easier to understand, works in many more cases (it isn't bothered by magic and is much less likely to be bothered by changes in Perl version), and is tons easier to debug, extend, etc.

        I've also seen XS modules that would be very useful except that they insist on only crossing the Perl/C border once. So I'm left with the tiny scraps that the author managed to predict I might want, instead of a general-purpose interface that gets the full data over to the Perl side, where I could use the full power of Perl (including other modules) to get something useful done that wasn't exactly what the original module author had in mind.

        Nearly all XS modules I've run into have this problem. They cross from Perl to C and provide a rather dull, inconvenient, unperlish interface because it is hard to do otherwise when using just XS. When they do try, they usually don't get it quite right, and you get a sorta perlish interface that is very fragile, such as requiring \%hash when providing $hashref often makes more sense and would "just work" if they'd done the hash stuff in Perl instead of in C (plus not tolerating any magic). Then they do a poor job of trying to write C code that translates the C data into Perl data so that they only have to cross from C to Perl once.

        So I end up with an imperfectly translated representation of parts of the C data and no way to get more/better without modifying their C code, recompiling the module, etc. That usually is more work than just writing my own module (in part because I write most of my module in Perl where I can be very productive).

        If you pretend that you are using Inline::C and avoid being seduced into thinking that learning how to write non-trivial XS code is useful (much less "sexy"), then you write Perl/C interfaces that use data in formats that are very easy for C to deal with and that don't manipulate Perl data structures from C at all. This means that you get pretty raw data back from C and then have separate steps that let you pick and choose which parts you want translated into Perl data, and in what ways. And then you have some real power.

        And modules by different authors that interface to the same types of routines become possible to use together because both Perl modules use data that is very close to what the C side expects and so both modules use very similar data (perhaps identical). So you can use a module that was written to allow manipulation of the permissions on Win32 files to manipulate the permissions on Win32 Registry keys.

        I suspect that "cross the line only once" is in part motivated by efficiency desires. Yet I find that it leads to inefficient designs because you translate from Perl to C and then back every time you cross. I much prefer to have the "translate Perl data to C data", "call C code", and "translate C data to Perl data" steps separate because I often want to translate Perl-to-C once and then call multiple C routines on that same data and the XS modules with complex XS code usually allow me to call multiple C routines but force me to translate twice for each such call.

        Similarly, the C routine often returns lots of data of which I only care about a few parts. If I "cross the line only once" then all of the data must be converted right away, which likely wastes time and memory. I'd rather make a call to get the data and then convert just the parts I need. If it will be common to want to convert all of the data, then feel free to also provide an efficient call that does all of the conversion with a single Perl/C line crossing. But don't make that the only option.
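As a rough illustration of keeping "fetch the raw data" separate from "convert the parts I need", here is a small plain-C sketch. Everything in it is hypothetical: the 12-byte record layout, the field offsets, and the names `fetch_record` and `record_u32` are made up for the example. On the Perl side the raw bytes would simply be a string, and the selective decoding step would usually be an unpack() call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record layout: three 32-bit fields in native byte order. */
enum { REC_SIZE = 12, OFF_UID = 0, OFF_MODE = 4, OFF_SIZE = 8 };

/* Step 1: one call fetches the whole record as raw bytes. Nothing is
 * translated yet. Returns the number of bytes written, or 0 if the
 * caller's buffer is too small. The field values are stand-ins. */
size_t fetch_record(unsigned char *buf, size_t cap)
{
    const uint32_t fields[3] = { 1000u, 0644u, 4096u };
    if (cap < REC_SIZE)
        return 0;
    memcpy(buf, fields, REC_SIZE);
    return REC_SIZE;
}

/* Step 2: decode a single field on demand. Only the fields the caller
 * asks for are ever converted. */
uint32_t record_u32(const unsigned char *buf, size_t offset)
{
    uint32_t v;
    assert(offset + sizeof v <= REC_SIZE);
    memcpy(&v, buf + offset, sizeof v);
    return v;
}
```

A caller that only cares about the mode fetches the record once and calls `record_u32(buf, OFF_MODE)`; the other fields are never converted. A convenience routine that converts everything in one go can still be layered on top for the common case.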

        I seriously doubt I'll "convert" you to my position on this. I'm not really trying to. I just wanted to make my position clearer (and perhaps convince some other readers). (:

                        - tye