Perl XS binding to a struct with an array of chars*

by MaxPerl (Acolyte)
on Nov 20, 2022 at 10:18 UTC

MaxPerl has asked for the wisdom of the Perl Monks concerning the following question:

Hello Seekers of Perl Wisdom,

I am trying to write a Perl binding for the following struct:

    struct _Edje_Message_String_Set {
        int count;
        char *str[1];
    };

At the moment my attempt at an implementation looks like this (only the important parts):

    [...]
    typedef Edje_Message_String_Set EdjeMessageStringSet;

    MODULE = pEFL::Edje::Message::StringSet    PACKAGE = pEFL::Edje::Message::StringSet

    EdjeMessageStringSet *
    _new(class,count, val_arr)
        char *class
        int count
        AV *val_arr
    PREINIT:
        EdjeMessageStringSet *message;
        int index;
        char *string;
        STRLEN len;
    CODE:
        message = malloc(sizeof(Edje_Message_String) + count * sizeof(char *));
        message->count = count+1;
        for (index = 0; index <= count; index++) {
            SV *tmp = *av_fetch(val_arr,index,0);
            string = SvPVutf8(tmp,len);
            message->str[index] = savepvn(string,len);
        }
        RETVAL = message;
    OUTPUT:
        RETVAL

    MODULE = pEFL::Edje::Message::StringSet    PACKAGE = EdjeMessageStringSetPtr

    [...]

    void
    str(message)
        EdjeMessageStringSet *message
    PREINIT:
        int count;
        char **vals;
        int index;
    PPCODE:
        count = message->count;
        vals = message->str;
        EXTEND(SP,count);
        for (index = 0; index <count; index ++) {
            PUSHs( sv_2mortal( newSVpv( vals[index], 0 ) ));
        }

    void
    DESTROY(message)
        EdjeMessageStringSet *message
    CODE:
        free(message);

Unfortunately I get various errors (e.g. "corrupted size vs. prev_size", "double free or corruption (out)", segfault, "invalid pointer", etc.). Here is a simple test script:

    use pEFL::Edje::Message::StringSet;

    my $i = 0;
    while ($i<100) {
        my @str = ("Hello", "Wordl", "from Perl");
        my $str_msg = pEFL::Edje::Message::StringSet->new(@str);
        my @strings = $str_msg->str();
        print "COLORS @strings\n";
        $i++;
    }
    print "The script goes to the end\n";

Where is my misunderstanding?

PS:

I tried to use Perl's XS memory allocation functions, too, but that doesn't help. For example, the following doesn't work:

    PREINIT:
        [...]
        char **val;
    CODE:
        New(0,val,count+1, char*);
        New(0,message,1,EdjeMessageStringSet);
        message->count = count+1;
        for (index = 0; index <= count; index++) {
            SV *tmp = *av_fetch(val_arr,index,0);
            string = SvPVutf8(tmp,len);
            New(0,val[index],len,char)
            val[index] = savepvn(string,len);
        }
        Move(val,message->str,count+1,char*);
        RETVAL = message;
    OUTPUT:
        RETVAL

By the way, how can one allocate memory in Perl XS with a size calculated from different types (here Edje_Message_Signal_Set and char *)?

Thank you so much for your help!!!

Max

Replies are listed 'Best First'.
Re: Perl XS binding to a struct with an array of chars*
by syphilis (Archbishop) on Nov 20, 2022 at 13:06 UTC
    Hi MaxPerl,

    It looks to me that:
        typedef Edje_Message_String_Set EdjeMessageStringSet;
    should instead be:
        typedef struct _Edje_Message_String_Set EdjeMessageStringSet;
    Also the following looks wrong to me:
        message = malloc(sizeof(Edje_Message_String) + count * sizeof(char *));
    The variable "message" is a pointer to a EdjeMessageStringSet type, and memory should be assigned to it as:
        message = malloc(sizeof(EdjeMessageStringSet));
    or
        New(0, message, 1, EdjeMessageStringSet);
    or (in modern perl)
        Newx(message, 1, EdjeMessageStringSet);
    I recommend working your way through the various issues using Inline::C.
    It's not a quick fix but, as you become familiar with it, it enables you to test out various options.
    It also enables you to post a demo script of problems you are facing. Others can then run that demo to reproduce (and hopefully assist with) the issue.

    Here's a little Inline::C demonstrating some basics:
        # struct.pl #
        use strict;
        use warnings;

        use Inline C => Config =>
          BUILD_NOISY => 1,
          CLEAN_AFTER_BUILD => 0,
          USING => 'ParseRegExp',
        ;

        use Inline C => <<'EOC';

        struct _Edje_Message_String_Set {
          int count;
          char *str[1];
        };

        typedef struct _Edje_Message_String_Set EdjeMessageStringSet;

        void struct_size(void) {
          /* sizeof yields a size_t, so cast it to match the %d format */
          printf("Size of _Edje_Message_String_Set struct: %d\n",
                 (int)sizeof(EdjeMessageStringSet) );
        }

        EdjeMessageStringSet * _new(char * class, int count, AV * val_arr) {
          /* Can't be accessed directly from perl unless a typemap is provided */
          EdjeMessageStringSet *message;
          int index;
          char *string;
          STRLEN len;

          Newx(message, 1, EdjeMessageStringSet);
          if(message == NULL)
            croak("Failed to allocate memory in _new function");

          /* do other stuff ... */
          printf("returning EdjeMessageStringSet* from _new\n");
          return message;
        }

        void DESTROY(EdjeMessageStringSet * x) {
          /* Can't be accessed directly from perl unless a typemap is provided.
             Must currently be explicitly called as the EdjeMessageStringSet*
             object is currently "unblessed". */
          Safefree(x);
          printf("destroyed _new EdjeMessageStringSet*\n");
        }

        void foo(char * pv, int in, AV * arref) {
          EdjeMessageStringSet *m;
          m = _new(pv, in, arref);
          DESTROY(m);
        }

        EOC

        struct_size();
        foo("hello world", 2, [1, 2]);
    After the script has been compiled, it outputs (for me):
        Size of _Edje_Message_String_Set struct: 16
        returning EdjeMessageStringSet* from _new
        destroyed _new EdjeMessageStringSet*
    If you look in the ./_Inline/build/ directory you'll find a folder that contains (amongst other things) the XS file that Inline::C automatically generated and used.

    Cheers,
    Rob

      Dear Rob,

      Thank you so much for the hint about Inline::C. I will try it at the next opportunity.

      I don't know why, but I solved my problem with the following:

          _new(class,count, val_arr)
              char *class
              int count
              AV *val_arr
          PREINIT:
              EdjeMessageStringSet *message;
              int index;
              SV *tmp;
              char *string;
              STRLEN len;
          CODE:
              Newx(message,1,EdjeMessageStringSet);
              Renewc(message,count+2, char*,EdjeMessageStringSet);
              if (message == NULL)
                  croak("Failed to allocate memory in _new function\n");
              message->count = count+1;
              for (index = 0; index <= count; index++) {
                  tmp = *av_fetch(val_arr,index,0);
                  string = SvPVutf8(tmp,len);
                  message->str[index] = savepvn(string,len);
              }
              RETVAL = message;
          OUTPUT:
              RETVAL

      I have to allocate memory for the Edje_Message_String_Set struct and for the array of strings, because the number of strings in the array is known only at runtime. In C this is done with malloc(sizeof(Edje_Message_String_Set) + count * sizeof(char*)). In Perl it works with the Renewc function. The only thing I don't understand is why I had to reallocate count+1 items of char*. It should be count+1 :-S (because count starts at 1, and the index of the given Perl array starts at 0). (One C example even mallocs (count-1) * sizeof(char*)...) But whatever, I am happy that it works now...

        I also don't understand this issue with count. If you have something like this:
            typedef struct {
                int count;
                char *str[1];
            } Edje_Message_String_Set;
        You would call malloc for needed memory thusly:
        malloc( sizeof(Edje_Message_String_Set) + (count-1)*sizeof(char *) );
        The sizeof(Edje_Message_String_Set) includes enough space for one integer and one pointer to char, so that is all you need for count==1. If you need an array of 2 pointers to char, then you have to allocate space for one more char*. One pointer to char is included in the smallest struct that you are able to allocate space for. You put a dimension of [1] on the array of pointers. I am not sure that you need that; a blank dimension (no number) may work, in which case sizeof() no longer counts any pointer and you would add count*sizeof(char *) instead. This has nothing to do with whether program indices start at zero or one; it is just about how much memory you need for X strings.
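
To make the two layouts concrete, here is a minimal plain-C sketch of the size arithmetic described above. The type and function names (StringSetOne, StringSetFlex, bytes_one, bytes_flex) are made up for illustration and are not part of Edje or Perl; the [1] version mirrors the struct from the original post, and the [] version is a C99 flexible array member.

```c
#include <assert.h>
#include <stddef.h>

/* Layout from the original post: one pointer slot is part of the struct. */
typedef struct {
    int   count;
    char *str[1];
} StringSetOne;            /* hypothetical name, illustration only */

/* C99 flexible array member: sizeof counts no pointer slot at all. */
typedef struct {
    int   count;
    char *str[];
} StringSetFlex;           /* hypothetical name, illustration only */

/* Bytes needed for n strings with the [1] layout (n >= 1):
   the struct already holds one slot, so add n - 1 more. */
size_t bytes_one(size_t n)
{
    assert(n >= 1);
    return sizeof(StringSetOne) + (n - 1) * sizeof(char *);
}

/* Bytes needed for n strings with the [] layout: add all n slots. */
size_t bytes_flex(size_t n)
{
    return sizeof(StringSetFlex) + n * sizeof(char *);
}
```

Either helper gives the byte count to hand to malloc() (or, presumably, safemalloc() in XS); which one applies depends only on how the array is declared in the header you are binding to.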

        I don't see who manages the destruction of one of these things? Also who manages the memory for the strings themselves? I guess you are doing a shallow copy instead of a clone. Also, the index "for" loop looks pretty weird to me because it looks like it exceeds allocated memory bounds.

        I am curious - what sort of problem are you trying to solve with your XS code?

        Update:
        This memory allocation code looks completely wrong to me:
        I haven't used these Perl memory allocation functions, but from looking at the docs...

            Newx(message,1,EdjeMessageStringSet);
            Renewc(message,count+2, char*,EdjeMessageStringSet);
            if (message == NULL)
                croak("Failed to allocate memory in _new function\n");

        Ok, with comments:

            void  Newx(void* ptr, int nitems, type)
            void* safemalloc(size_t size)
            void  Renewc(void* ptr, int nitems, type, cast)
            void  Safefree(void* ptr)

            Newx(message,1,EdjeMessageStringSet);
            // allocate space for 1 EdjeMessageStringSet
            // this is enough for just 1 pointer to char
            // i.e. ok if count==1
            // memory is not initialized.

            Renewc(message,count+2, char*, EdjeMessageStringSet);
            // Resize the block to (count+2) * sizeof(char*) bytes.
            // As far as I can tell from handy.h, this is a macro around
            // saferealloc() that stores the (possibly moved) address
            // back into message, cast to EdjeMessageStringSet*.
            // The contents of the Newx() block are preserved.
            // For extra confusion, the size is counted in char* units
            // and has nothing to do with sizeof(EdjeMessageStringSet).

            if (message == NULL)
                croak("Failed to allocate memory in _new function\n");
            // Newx() and Renewc() croak by themselves when out of memory,
            // so this check should never fire.

        Ok, so after this, message points to a block of (count+2)*sizeof(char*) bytes. On a typical 64-bit build, where sizeof(EdjeMessageStringSet) is 16 and sizeof(char*) is 8, that happens to equal sizeof(EdjeMessageStringSet) + count*sizeof(char*), which is what the loop needs for str[0] through str[count]. What is saving the day here is that coincidence of sizes, not the logic of the call: the request says "count+2 pointers" where it means "one struct plus count extra pointers".

        One issue here is that we are "cheating" by declaring a type whose size can and does actually change! You can't allocate memory for this thing using a method that expects to allocate X number of Y things. So, something like this is needed:

            EdjeMessageStringSet* m = (EdjeMessageStringSet*) safemalloc(
                sizeof(Edje_Message_String_Set) + (count-1)*sizeof(char *) );
            if (m == NULL)
                croak("Failed to allocate memory in _new function\n");
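
For the questions about who owns the strings and who destroys the thing, here is a hedged sketch in plain C (no Perl API) of a matching constructor/destructor pair. StringSet, string_set_new and string_set_free are hypothetical names standing in for the real Edje type and the XS _new/DESTROY; the sketch assumes the set owns private copies of its strings, and it relies on the usual "struct hack" of indexing past str[0].

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int   count;
    char *str[1];
} StringSet;               /* stand-in for Edje_Message_String_Set */

/* Free a set and every string it owns. Safe to call with NULL. */
void string_set_free(StringSet *set)
{
    if (set == NULL)
        return;
    for (int i = 0; i < set->count; i++)
        free(set->str[i]);
    free(set);
}

/* Build a set holding private copies of n strings (n >= 1).
   Returns NULL if any allocation fails. */
StringSet *string_set_new(const char **src, int n)
{
    assert(n >= 1);
    /* One slot lives inside the struct, so add n - 1 more. */
    StringSet *set = malloc(sizeof(StringSet)
                            + (size_t)(n - 1) * sizeof(char *));
    if (set == NULL)
        return NULL;
    set->count = 0;
    for (int i = 0; i < n; i++) {
        size_t len = strlen(src[i]) + 1;   /* include the trailing NUL */
        char *copy = malloc(len);
        if (copy == NULL) {
            string_set_free(set);          /* frees the copies made so far */
            return NULL;
        }
        memcpy(copy, src[i], len);
        set->str[i] = copy;
        set->count = i + 1;                /* keep count in step for cleanup */
    }
    return set;
}
```

In XS the same shape would presumably use safemalloc()/Safefree() for the struct and savepvn()/Safefree() for the strings, so that every block is released by the allocator family that created it.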

        Sorry, some typos. That should read: "The only thing I don't understand is, why I had to reallocate count+2 items of char*. It should be count-1 :-S (because count starts at 1, and the index of the given Perl array starts at 0)..."
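
On the count+2 puzzle, a small sketch of the byte counts may help. It assumes, as I read perlapi and handy.h, that Renewc(ptr, nitems, type, cast) requests nitems * sizeof(type) bytes; the function names below are made up for illustration, and count is the XS parameter from the post (the loop fills str[0] through str[count]).

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int   count;
    char *str[1];
} StringSet;               /* stand-in for Edje_Message_String_Set */

/* Bytes requested by Renewc(message, count + 2, char *, ...),
   assuming the macro asks for nitems * sizeof(type). */
size_t bytes_from_renewc(size_t count)
{
    return (count + 2) * sizeof(char *);
}

/* Bytes the loop needs: count + 1 pointers in total,
   one of which is already inside the struct. */
size_t bytes_needed(size_t count)
{
    return sizeof(StringSet) + count * sizeof(char *);
}

/* Nonzero when the struct is exactly two pointers wide,
   which makes the two sizes above identical. */
int sizes_coincide(void)
{
    return sizeof(StringSet) == 2 * sizeof(char *);
}
```

Where the struct is two pointers wide (16 bytes with 8-byte pointers), count+2 pointers is the same number of bytes as one struct plus count pointers, which would explain why that value works and count+1 does not. It is still clearer to ask for the byte count directly.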

      Hi Rob!

      I think both the OP and you are not quite right about allocating memory for EdjeMessageStringSet. In my post below, see:

          EdjeMessageStringSet* m = (EdjeMessageStringSet*) safemalloc(
              sizeof(Edje_Message_String_Set) + (count-1)*sizeof(char *) );
          if (m == NULL)
              croak("Failed to allocate memory in _new function\n");
      First, we have to talk about what this means:
          typedef struct {
              int count;
              char *str[1];  // may be fine with [] or even just char* str (no subscript at all)
          } Edje_Message_String_Set;
      Normally a type has a fixed size that is known at compile time and sizeof() works just fine. That is not true in this case. Normally, I would have expected to use a fixed size Edje_Message_String_Set, like this:
          typedef struct {
              int count;
              char** string_array;
          } Edje_Message_String_Set;
      Now Edje_Message_String_Set is a fixed size. To instance one, you allocate memory for the type with sizeof(). Then you allocate memory for an array of pointers to strings with size of count. You put the address of this dynamically allocated array into the variable string_array.
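
A minimal sketch of that two-allocation variant, in plain C. FixedStringSet, fixed_set_new and fixed_set_free are hypothetical names; this is not how Edje declares its struct, only an illustration of the fixed-size alternative described above.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int    count;
    char **string_array;   /* points to a separately allocated array */
} FixedStringSet;          /* hypothetical name, illustration only */

/* Allocate the fixed-size struct, then an array of `count` pointers.
   The pointers start out NULL; the caller fills them in. */
FixedStringSet *fixed_set_new(int count)
{
    assert(count >= 0);
    FixedStringSet *set = malloc(sizeof *set);
    if (set == NULL)
        return NULL;
    set->string_array = calloc((size_t)count, sizeof(char *));
    if (set->string_array == NULL && count > 0) {
        free(set);
        return NULL;
    }
    set->count = count;
    return set;
}

/* Release the array and the struct. The strings themselves are
   assumed to be owned (and freed) by whoever stored them. */
void fixed_set_free(FixedStringSet *set)
{
    if (set == NULL)
        return;
    free(set->string_array);
    free(set);
}
```

Here sizeof(FixedStringSet) really is all the struct ever needs, so an "N items of type T" allocator works for both steps.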

      In theory, you can save one call to malloc() for dynamic array allocation and potentially the space for the char**, by allocating the dynamic array directly inside the Edje_Message_String_Set type rather than having a pointer to the dynamically allocated array.

      This means that you can't use a memory allocation interface that says "give me, say, 5 of type X". A number of items plus a type is not sufficient here, because the real size of the object is unknown at compile time.

      To put some numbers on this: Rob, you have a 64-bit machine, so for a count of 1 we get 16 bytes: 8 bytes for the int (4 bytes plus 4 bytes of padding) and 8 bytes for the pointer to char. If, say, count==3, then we need to allocate an additional 16 bytes for 2 more char*'s. Hence my safemalloc() math shown above. The first 16 bytes come from sizeof(Edje_Message_String_Set), but that is only the minimum size, covering a count of 1. If we had instead allocated 2 * sizeof(Edje_Message_String_Set), that would give us enough space for count==3, not 2!
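
Those numbers can be checked with a few lines of plain C. slots_in is a made-up helper, and the figures 16 and 32 assume a typical 64-bit build (4-byte int, 8-byte pointers); other platforms will give other values.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int   count;
    char *str[1];
} StringSet;               /* stand-in for Edje_Message_String_Set */

/* Number of char* slots that fit in a block of `bytes` bytes,
   counting from the offset where str[] begins. */
size_t slots_in(size_t bytes)
{
    assert(bytes >= sizeof(StringSet));
    return (bytes - offsetof(StringSet, str)) / sizeof(char *);
}
```

On such a build str[] starts at offset 8, so a 16-byte block holds 1 slot and a 32-byte block (2 * sizeof) holds 3, matching the count==3 figure above.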

        Normally a type has a fixed size that is known at compile time and sizeof() works just fine. That is not true in this case.

        Well, I think that the struct, as presented in the original post, does have a definite size (of 16 bytes).
        AFAIK specifying char *str[1] is equivalent to char * str.
        I just went with the spec provided, even though it looked rather odd.

        It did occur to me that the OP might have intended char ** string_array (as you've suggested), and I probably should have pressed MaxPerl about that.
        But, either way, the struct has a definite size - and we can assign memory to it based on that size (which is 16 bytes, on my Windows 11 64-bit system).

        I've no experience with structs that might require varying amounts of memory that can't be known until runtime. (I don't assume that such cases never arise.)

        I expect that the OP's strings have been created separately.
        Therefore, the number and size of them has no impact on the struct's memory allocation - because the struct just takes a pointer to the array of strings, no matter how large that array is.

        Cheers,
        Rob
Re: Perl XS binding to a struct with an array of chars*
by swl (Vicar) on Nov 20, 2022 at 20:43 UTC
