http://www.perlmonks.org?node_id=1085049

petermogensen has asked for the wisdom of the Perl Monks concerning the following question:

I thought this was an FAQ, but since I've not been able to find a single mention of the problem, I'll try here.

Suppose I have a C library like this:
typedef struct _person { char* name, ... } *person;
typedef struct _record { int a, person *p ....} *record;
int new_person(person *out_person_freeable, char *name....);
int free_person(person p);
int new_record(record *out_record, ....);
int free_record(record r); /* free the contained person too */
int record_get_person(record r, person *out_person_notfreeable);
/* out_person_notfreeable is owned by the record object and must not be freed */

In other words, there are several API calls. Sometimes I get a pointer to an object I must free myself later, and at other times I just get to borrow a pointer to a "person" and must NOT free it.

I want to expose "person" and "record" objects to Perl through XS.

So I have an XS wrapper with a typemap converting the returned pointers to IV and returning a blessed reference to the pointer:

OUTPUT
T_MYOBJPRT
    sv_setref_pv($arg, \"${ntype}\", (void*)$var);

This all works fine. But when such an object reaches refcount 0 and DESTROY is called, I have to decide whether to call the C library free_*() functions.

And the problem is that I can't see when I get a "person" pointer I get back in DESTROY whether I'm allowed to free it.

Free'ing all "person" pointers would corrupt "record" objects still alive. Not free'ing them would leak. I would have thought there was an idiomatic way of doing this, but I haven't found one. One solution I could imagine is to use PERL_MAGIC_ext and make a note about whether to free or not. But Magic doesn't survive assignments, so it's easy to lose that information as the object is handled in Perl code.

Also, it has to take into account both scenarios: the one where "record" survives the "person" and the one where "person" survives the "record". I know the latter is not how you would normally use this in C, but Perl code usually expects to be able to take any reference it gets with it and let reference counting handle postponing any cleanup.

So: The hackerish solution... What if the XS function returning a "person" which must not be free'd incremented the reference count on the "record" SV (thus postponing its deletion) and at the same time put a pointer to the "record" SV into the NV slot of the "person" SV. ... so when I get a "person" SV in DESTROY, I check whether it has a NV value and if it does, I don't delete it, but instead decrement the reference count on the SV pointed to by the NV value?

It's a hack ... but I don't see where else to store the back-reference so that reference counting can handle this. (Weakening doesn't survive copy either.) NV values seem to survive assignment.

But surely, there's an official recommended way to do this? ;-)

Replies are listed 'Best First'.
Re: Managing C library memory in XS
by Corion (Patriarch) on May 05, 2014 at 13:16 UTC

    As long as you always know that something returned by record_get_person may not be freed by your program, you can attack the problem by having two classes, Person::OwnedByPerl and Person::OwnedByLibrary. The destructor for Person::OwnedByPerl would call free_person(), while Person::OwnedByLibrary would have no destructor.

    Of course, you could also store that information of whether to free the person or not on the XS level by having a struct hold the person pointer and a flag whether the XS owns that person or not.
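
    A minimal C sketch of the flag idea, assuming the blessed reference holds a pointer to a small wrapper struct rather than to the person itself. The names person_handle, wrap_person, destroy_person_handle and the live_persons counter are hypothetical, and new_person()/free_person() are mocked here (with a simplified person struct) so the block stands alone; the real library's signatures may differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the library's person type (assumption). */
typedef struct _person { char *name; } *person;

int live_persons = 0;   /* bookkeeping for this sketch only */

/* Mock of the library constructor. Returns 0 on success. */
int new_person(person *out, const char *name) {
    person p = malloc(sizeof *p);
    if (!p) return -1;
    p->name = malloc(strlen(name) + 1);
    if (!p->name) { free(p); return -1; }
    strcpy(p->name, name);
    live_persons++;
    *out = p;
    return 0;
}

/* Mock of the library destructor. */
int free_person(person p) {
    free(p->name);
    free(p);
    live_persons--;
    return 0;
}

/* What the blessed SV would point to instead of the raw person. */
typedef struct {
    person p;
    int    owned;   /* 1: we must call free_person(); 0: borrowed from a record */
} person_handle;

person_handle *wrap_person(person p, int owned) {
    assert(p != NULL);
    person_handle *h = malloc(sizeof *h);
    if (!h) return NULL;
    h->p = p;
    h->owned = owned;
    return h;
}

/* What Person::DESTROY would call: free the person only if we own it. */
void destroy_person_handle(person_handle *h) {
    if (h->owned)
        free_person(h->p);
    free(h);
}
```

    The flag only prevents freeing memory the wrapper does not own. It does nothing to keep a borrowed person valid once its record has been freed.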

      The destructor for Person::OwnedByPerl would call free_person(), while Person::OwnedByLibrary would have no destructor.

      That would solve the problem of not freeing memory which doesn't belong to you. It wouldn't solve the problem of holding a Perl reference to an object which uses memory already freed by a Record::DESTROY call.

      Of course, you could also store that information of whether to free the person or not on the XS level

      I think that has the same problem. If I create Perl code like:
      sub find_person {
          my $record = new Record();
          my $person = $record->get_person();
          return $person;
      }
      my $p = find_person();
      my $name = $p->name();
      ... then I would have an invalid-read. ... and Valgrind confirms it for me.

        Aaah - I didn't understand that connection. The easy approach here would be to let Perl do all your reference counting.

        Make the person a hash entry in your (Perl) Person class, and have ->get_person also store the (Perl) Record object in the Person hash. This is most easily done on the Perl side.

        This way, as long as the Person is alive, the Record object belonging to the Person is also alive. The person data structure will then also stay valid through the Record object.

      Having a flag rather than using two classes would make more sense.
Re: Managing C library memory in XS
by dmitri (Priest) on May 05, 2014 at 12:32 UTC
    And the problem is that I can't see when I get a "person" pointer I get back in DESTROY whether I'm allowed to free it.

    This is the key paragraph and decision point in the code. You should wrap the real free_person() in an XS function with logic like this:

    if (person_refcnt(person) == 0)
        free_person(person);
    else
        ; /* Do nothing: free_record() will free the person as well */
    Then from Perl just call this wrapper and let the lower-level logic worry about it.

      I'm not sure I follow you here ... Are you suggesting to make perl code explicitly call the wrapper (instead of relying on DESTROY) and keeping a separate (non perl) refcount for every person pointer in the module?

      That's how I read your suggestion, since the Perl refcount is always zero when DESTROY is called, right...
        Are you suggesting to make perl code explicitly call the wrapper (instead of relying on DESTROY) and keeping a separate (non perl) refcount for every person pointer in the module?

        Call the wrapper from DESTROY.

        The library you are using knows about the references and counts, so you should be able to figure out in your XS code whether to really free an object (person, in this case). In Perl code (DESTROY), simply call your XS function, which will know what to do.

Re: Managing C library memory in XS
by RonW (Parson) on May 05, 2014 at 16:31 UTC

    At first pass, I would probably go with the idea of copying the data to/from Perl variables as needed.

    However, I admit I like the idea of just being able to pass a blessed pointer to Perl. So, how about a middle ground? For the free-able objects, go ahead and pass the blessed pointer. For the non-free-able ones, allocate a struct, memcpy the data, then pass the new pointer as a blessed pointer. Since you allocated the memory, it is free-able.

      My primary problem with the copying solution is that it depends on the library providing complete deep copy functions for all relevant structs. memcpy() is not guaranteed to work.
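
      To illustrate why memcpy() alone is not enough, here is a small sketch with a hypothetical struct person_data (the age field and both function names are assumptions, not part of the library). A shallow copy duplicates the struct bytes, so its name pointer still refers to the buffer owned by the original; a deep copy has to know about every nested pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for a library struct (hypothetical layout). */
struct person_data { char *name; int age; };

/* Shallow copy: the struct is duplicated, but name still points
   at the original's buffer. */
struct person_data *shallow_copy(const struct person_data *src) {
    assert(src != NULL);
    struct person_data *dst = malloc(sizeof *dst);
    if (!dst) return NULL;
    memcpy(dst, src, sizeof *dst);
    return dst;
}

/* Deep copy: the buffer behind name is duplicated as well. */
struct person_data *deep_copy(const struct person_data *src) {
    assert(src != NULL);
    struct person_data *dst = malloc(sizeof *dst);
    if (!dst) return NULL;
    *dst = *src;
    dst->name = malloc(strlen(src->name) + 1);
    if (!dst->name) { free(dst); return NULL; }
    strcpy(dst->name, src->name);
    return dst;
}
```

      If the record frees the original, the shallow copy is left with a dangling name pointer, which is the same invalid read as before, only one level deeper.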

        Ok. You didn't mention there was deep nesting.

        Somewhat less "hacky" than relying on the NV slot to not get overwritten by Perl, could you live with an extra layer of indirection? That is, each time you get an unfree-able pointer, allocate a control structure to point to the unfree-able structure. This control structure would also have a field to hold the pointer you were going to put in the NV slot (plus any other fields you might need).
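
        A rough sketch of such a control structure in plain C. Everything here is a simplified model: the refcnt field stands in for the Perl-side reference count of the Record object (in real XS one would presumably use SvREFCNT_inc and SvREFCNT_dec on the record's SV instead), and person_ref, record_new, record_release, record_get_person_ref and person_ref_destroy are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins for the library types (assumptions). */
typedef struct _person { int id; } *person;
typedef struct _record { person p; int refcnt; } *record;

int freed_records = 0;   /* bookkeeping for this sketch only */

/* A record that owns one person. */
record record_new(int person_id) {
    record r = malloc(sizeof *r);
    if (!r) return NULL;
    r->p = malloc(sizeof *r->p);
    if (!r->p) { free(r); return NULL; }
    r->p->id = person_id;
    r->refcnt = 1;
    return r;
}

/* Drop one reference; the last one frees record and contained person. */
void record_release(record r) {
    assert(r->refcnt > 0);
    if (--r->refcnt == 0) {
        free(r->p);
        free(r);
        freed_records++;
    }
}

/* Control structure: what the Perl Person object would point to. */
typedef struct {
    person p;
    record owner;   /* non-NULL: p is borrowed, owner must outlive us */
} person_ref;

/* Borrow the person and pin its record. */
person_ref *record_get_person_ref(record r) {
    person_ref *ref = malloc(sizeof *ref);
    if (!ref) return NULL;
    ref->p = r->p;
    ref->owner = r;
    r->refcnt++;
    return ref;
}

/* What Person::DESTROY would call. */
void person_ref_destroy(person_ref *ref) {
    if (ref->owner)
        record_release(ref->owner);   /* may free record and person together */
    else
        free(ref->p);                 /* stand-alone person: ours to free */
    free(ref);
}
```

        With this layout the person can outlive every Perl reference to the record, because the record is only torn down once the last borrowed person has been destroyed too.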

        I would go so far as to predict rather-emphatically that it wouldn’t.   And, in any case, it would be highly dependent upon (read: “would introduce a whale of a dependency upon ...”) the exact internal implementations of that library.   Warning-bells going off all over the place here.

        Also worth heavy consideration here is the temporal issue:   having been given a live memory-address, how long is that address actually going to be “good?”   Your Perl app could now potentially modify (read: “muck around with”) that memory at any future-time.   How predictably-stable do we consider that situation to be?   In other words, how bad are the nasty-bugs going to be, if and when they start to occur?   These are design considerations that, to my way of thinking, absolutely overwhelm most all other concerns most of the time.   So to speak, “I don’t really care if it screws-up ‘very quickly.’ ”   I care that the solution is robust and thoroughly diagnosable.   (And yes, it depends entirely upon the library.)

Re: Managing C library memory in XS
by sundialsvc4 (Abbot) on May 05, 2014 at 14:43 UTC

    To be quite honest, these approaches smell really bad to me, because they are basically threatening to put the stability of the interpreter environment in jeopardy by obliging you, in classic “C” style, to have to constantly muck-around with memory issues ... to “have to do things in exactly the right way, throughout your program, ‘or else.’ ”

    To put it another way, you are obliging the Perl programmer to know and to correctly account-for the present [memory ...] state of the API library, instead of being able to rely upon Perl-managed, garbage-collected objects as you are ordinarily able to do.   I would therefore design the API wrapper in a way that ruggedly eliminates this hazard.   In other words, don’t pass-around “pointers to” anything.   Have Perl allocate an ordinary string-type or hash-type variable and pour the values you need into it.   (Or, the reverse.)

    If the Perl-side code needs to make persistent references to some C-library object, let your wrapper code maintain its own list of those objects and assign a short random-but-unique “moniker” string to each.   These monikers are meaningless, are known only to the wrapper and to the Perl application, and are kept in a list maintained by the wrapper.   The wrapper provides the means by which C-library objects can be created and destroyed, and, through the mechanism of monikers, to make specific references to them, but the wrapper never exposes their addresses to anyone.   (If the Perl program fails to supply a valid moniker, say, a nice Perl-world exception will be thrown by the wrapper.)   When a memory-address actually gets passed to the C-library, though, the wrapper is confident that this address, which is known only to itself, is “good,” because the wrapper does know the present memory-state of the API library.   The wrapper now has the ability to clean-up any lingering library-objects reliably when the Perl environment ends, because it has a complete list of them in its monikers-table.
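
    One way such a monikers-table might look, as a small C sketch. This is an assumption-laden illustration: the monikers here are integers built from a slot index and a generation counter rather than random strings, the table has a fixed size, and handle_register, handle_lookup and handle_release are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLES 64

/* One slot per live C object. The generation counter makes stale
   monikers detectable after a slot has been reused. */
typedef struct { void *ptr; unsigned gen; } slot;

slot table[MAX_HANDLES];

/* A moniker packs generation (high bits) and slot index + 1 (low 8 bits).
   The value 0 is never a valid moniker. */
typedef unsigned long moniker;

/* Store a pointer and return its moniker, or 0 if the table is full. */
moniker handle_register(void *ptr) {
    assert(ptr != NULL);
    for (unsigned i = 0; i < MAX_HANDLES; i++) {
        if (table[i].ptr == NULL) {
            table[i].ptr = ptr;
            table[i].gen++;
            return ((moniker)table[i].gen << 8) | (i + 1);
        }
    }
    return 0;
}

/* Return the pointer, or NULL if the moniker is unknown or stale.
   The wrapper would turn NULL into a Perl exception. */
void *handle_lookup(moniker m) {
    unsigned idx = (unsigned)(m & 0xFF);
    if (idx == 0 || idx > MAX_HANDLES) return NULL;
    slot *s = &table[idx - 1];
    if (s->ptr == NULL || s->gen != (unsigned)(m >> 8)) return NULL;
    return s->ptr;
}

/* Forget the moniker and hand the pointer back so the caller can free it. */
void *handle_release(moniker m) {
    void *p = handle_lookup(m);
    if (p) table[(m & 0xFF) - 1].ptr = NULL;
    return p;
}
```

    Because every pointer passes through the lookup, a moniker that has been released or never existed is rejected before anything reaches the C library, and walking the table at shutdown finds every object still alive.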

    In short, instead of very-dangerously exposing the C-library environment to the Perl environment, use the wrapper as a diplomat:   even though the C-world and the Perl-world do not natively speak with one another, the diplomat undertakes to provide a rugged environment that both of them can understand and which protects them from one another.   The diplomat is also the one that tries to wear the “acid-proof underwear,” absorbing any runtime exceptions that may be thrown and translating those exceptions, if they occur, into appropriate Perl-environment terms.

      It seems you are not only questioning the hack I proposed using the NV field, but the entire idea of returning C objects to perl as blessed references to SVs holding the pointer value (in this case as an IV), correct? But isn't that a rather common approach? I just checked the JSON::XS library, which IMHO is one of the best working CPAN modules. It goes even a step further. It returns an entire native C JSON struct byte-for-byte as a PV string ... and the typemap INPUTs it as the SvPVX pointer: the (char *) pointer to the Perl string, only cast as a JSON*. Isn't that also pointer pass-around and very-dangerously exposing the C-library environment to the Perl environment?

        Basically, “yes it is.”   And JSON::XS is a wonderful high-performance encoder/decoder for that reason.   But you really aren’t dealing with “multiple persistent internal library-states” and you really aren’t updating anything, either.   Furthermore, that package is intentionally all-about-speed.   The situation suggested to me in the OP appeared to be a little bit different; hence my suggestions for what I thought might work better in the OP’s specific case.   (And they are meant to be nothing more than that.)   There is “no one way to do this sort of thing.”   The best way to “manage C-library memory in XS” really depends altogether upon the library in question.