PerlMonks  

how to unpack a C struct with length array preceding data array

by johnlumby (Initiate)
on May 22, 2013 at 19:28 UTC (#1034810=perlquestion)
johnlumby has asked for the wisdom of the Perl Monks concerning the following question:

My Perl app receives a buffer in this format:

{ short lengths[2]; char data[???]; }

where data holds two items, of length lengths[0] and lengths[1] respectively.

I have tried and tried to unpack this into 4 perl variables in a single unpack() call and failed. I had to resort to two unpack()'s, the first to extract the lengths followed by something like

eval('($junk1, $junk2, $data0, $data1) = unpack "ssA' . $data0_length . "A" . $data1_length . '",$rcvd_buf;');

which works fine, but can some kind Monk tell me how to do this in one unpack?

Reaped: Re: how to unpack a C struct with length array preceding data array
by NodeReaper (Curate) on May 22, 2013 at 20:01 UTC
Re: how to unpack a C struct with length array preceding data array
by AnomalousMonk (Abbot) on May 22, 2013 at 21:34 UTC
    I had to resort to two unpack()'s ... how to do this in one unpack.

    Doing this in one statement and with one unpack, if it's even possible (and I don't think it is (Update: but I think wrong: see ig's reply)), might qualify as a neat hack, but it's also a hack which you should then look at, sigh, and set aside in favor of a maintainable two-statement solution. What is the practical advantage of doing this in one statement?

    Update: I think I would take a slightly different approach to unpack-ing the final data strings:

    >perl -wMstrict -le
    "my $struct = qq{\x0d\x00\x0b\x00Just Another Perl Hacker};
     ;;
     my @len = unpack 'ss', $struct;
     print qq{@len};
     ;;
     my @data = unpack qq{(x[s])2 A$len[0] A$len[1]}, $struct;
     print qq{'$data[0]' '$data[1]'};
    "
    13 11
    'Just Another' 'Perl Hacker'

    (Note that  '(x[s])2' could just as well be  'x[s] x[s]'.)

      sorry for delay acknowledging your and others' posts - I mistook the first response to mine as some kind of disapproval

      As to your

      What is the practical advantage ...
      I view the task of performing the unpacking of the string as logically one task, not two, and so perhaps it is clearer to any reader how this task is being accomplished if done in one reasonably intuitive statement rather than two. If by practical you were thinking more of performance, then none.

      And I omitted to say originally that what encouraged me to believe that such a statement might exist is this from perlfunc "pack"

      For "unpack", an internal stack of integer arguments unpacked so far is used. You write "/"sequence-item and the repeat count is obtained by popping off the last element from the stack. The sequence-item must not have a repeat count.
      I was not sure what this meant but it seemed relevant. I tried using it in various ways but all failed. I did not find any elucidation of this in perltut. I *think* some of the suggestions offered in this thread are using such sequence-items but I'm not sure; and if so, I'm also not sure how the argument numbers are being pushed onto the stack.

      If anyone can explain what the quoted text means (specifically the integer arguments) or have a simpler example, I would appreciate that.

        If anyone can explain what the quoted text means (specifically the integer arguments) or have a simpler example, I would appreciate that.

        Re:"For "unpack", an internal stack of integer arguments unpacked so far is used. You write "/"sequence-item and the repeat count is obtained by popping off the last element from the stack."

        It means that the template character before the '/' must be one that unpacks an integer. That value is then used as the repeat count for the template character after the '/'.

        So, if you use a template of 'C/A' the 'C' will unpack the first character of the string as a number and then its value will determine how many characters of the string are used by the 'A'.

        E.g., below, the 'C' unpacks the first character "\x05", which is then used as the repeat count for the 'A', resulting in the next 5 characters being unpacked.

        say unpack "C/A", "\x05hellofred";;
        hello

        Note. The reference to "the stack" refers to Perl's argument stack. As unpack processes each element of its template, the values extracted from the input string are pushed onto that stack, so that when the function returns, they are returned to the caller.

        When the '/' is encountered, the last value pushed onto the stack is popped off again and used as the repeat count for the next template character. Hence, in the example above, the value '5' generated by the 'C' is not returned to the caller.
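A minimal sketch of that last point, using made-up sample data: the trailing 'C' is only there to show that unpacking carries on normally after the counted string, and that the length byte itself is not among the returned values.

```perl
use strict;
use warnings;

# 'C/A' reads one length byte, then that many characters.
# The trailing 'C' picks up whatever byte follows the string.
my @fields = unpack 'C/A C', "\x05hello\x07";

# The length (5) was popped by '/', so only two values come back.
print scalar(@fields), " values: @fields\n";
```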

        The <t1>/<t2> combination (where t1 can be any template char that results in a numeric value) is very useful for streaming communications protocols where you frequently have length-prefixed data items: <len1><data1><len2><data2>.
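As an illustration of the length-prefixed case, here is a small round trip. The 16-bit big-endian 'n' length prefix and the sample strings are assumptions chosen for the example, not anything from the OP's protocol.

```perl
use strict;
use warnings;

my @items = ('Just', 'Another', 'Perl', 'Hacker');

# 'n/a*' writes a 16-bit big-endian length, then the string itself;
# the group with '*' repeats that for every item in the list.
my $stream = pack '(n/a*)*', @items;

# The same template reads the len/data pairs back until the data runs out.
my @back = unpack '(n/a*)*', $stream;

print join('|', @back), "\n";
```

Because each length sits directly in front of its own data, the template needs no positioning tricks at all.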

        Unfortunately, there is no sensible way to use it for your struct format, where you have <len1><len2><data1><data2>. (That is, I don't consider the absolute positioning solution elsewhere in this thread a "sensible" approach, as it is not really extensible to more than 2 len/data pairs.)

        It will almost certainly always be quicker and cleaner to do that in two steps:

        my( $len1, $len2 ) = unpack "SS", $packed;
        my( $data1, $data2 ) = unpack "x[SS] A$len1 A$len2", $packed;
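The two-step approach extends to any number of fields with little effort. The following is only a sketch: the helper name unpack_counted and the field-count parameter are hypothetical, and it uses 'a' rather than 'A' so that trailing spaces in the data are kept instead of stripped.

```perl
use strict;
use warnings;

# Hypothetical helper (name assumed): split a buffer laid out as
#   short lengths[$n]; char data[...];
sub unpack_counted {
    my ($buf, $n) = @_;

    # Step 1: read the array of native shorts holding the lengths.
    my @len = unpack "s${n}", $buf;

    # Step 2: skip the length array, then take one 'a<len>' per field.
    my $tmpl = "x[s${n}] " . join(' ', map { "a$_" } @len);
    return unpack $tmpl, $buf;
}

# Build the sample with pack so it is correct on any byte order.
my $buf = pack('s2', 13, 11) . 'Just Another Perl Hacker';
my ($first, $second) = unpack_counted($buf, 2);
print "'$first' '$second'\n";
```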

        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: how to unpack a C struct with length array preceding data array
by ig (Vicar) on May 22, 2013 at 23:32 UTC

    If you really want to do it in one unpack you can...

    use strict;
    use warnings;
    use Data::Dumper;

    my @arr = unpack('(W W @0Wx/a @1W@0Wx/x/a)', "\005\006Hello World");
    print Dumper(\@arr);

    Which produces:

    $VAR1 = [
              5,
              6,
              'Hello',
              ' World'
            ];

    But I would never do that, except to prove that I could.
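For readers trying to follow that template, here is the same thing spread out with comments (pack templates allow whitespace and # comments). The annotations are one reading of how the pieces fit together, not ig's own explanation. Inside a ()-group, '@' counts from the start of the group, which here is also the start of the buffer.

```perl
use strict;
use warnings;

my $buf = "\005\006Hello World";

my $template = q{
    (
      W W       # return both length bytes: 5 and 6
      @0 W x    # reread length 0, then step over length 1
      /a        # pop that 5 and read 5 bytes: Hello
      @1 W      # reread length 1 and leave the 6 on the stack
      @0 W x    # reread length 0 and move to the start of the data
      /x        # pop the 5 and skip over the first field
      /a        # pop the 6 and read 6 bytes: space plus World
    )
};

my @fields = unpack $template, $buf;
print join('|', @fields), "\n";
```

The 'x' items push nothing, so each '/' still finds the most recently reread length on top of the stack.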

Re: how to unpack a C struct with length array preceding data array
by Anonymous Monk on May 23, 2013 at 03:12 UTC
Re: how to unpack a C struct with length array preceding data array
by bulk88 (Priest) on May 23, 2013 at 11:17 UTC
    pack

    The / template character allows packing and unpacking of a sequence of items where the packed structure contains a packed item count followed by the packed items themselves. This is useful when the structure you're unpacking has encoded the sizes or repeat counts for some of its fields within the structure itself as separate fields.

      Problem with that is it only works(*) where the structure has len/data len/data; but the OP has len1/len2 data1/data2.

      (*Barring playing games with relative or absolute positioning per ig's reply, not AnomalousMonk's.)



        I'm playing games with relative or absolute positioning?!? Have you seen ig's reply? I haven't even begun to understand what's going on there, although to be fair, I haven't really had time yet to study it: I don't know the significance of the  'W' specifier, and the  '@1W@0Wx/x/a' bit looks interesting, and the whole thing is enclosed in a  '(...)' group to which no quantification seems to be applied... Hmmm, interesting...

Re: how to unpack a C struct with length array preceding data array
by AnomalousMonk (Abbot) on May 28, 2013 at 04:25 UTC

    From Re^4: how to unpack a C struct with length array preceding data array (emphases added):

    I do agree with ig though, you should avoid [ig's] solution if possible, it's hard to read, and it would be difficult to expand it for a string of three parts or more.

    In the spirit of ig's ingenious reply and of Anonymonk's excellent explication, here's a version that unpacks four data items in a single unpack statement (the initial  s4 is optional if you don't care about all the length info):

    >perl -wMstrict -MData::Dump -le
    "my $str = qq{\x05\x00\x08\x00\x05\x00\x06\x00Just Another pack Hacker};
     ;;
     my $template = q{
       (
         s4
         @0 x[s0] s  @0 s0 x[s4]          /a
         @0 x[s1] s  @0 s1 x[s3]       /x /a
         @0 x[s2] s  @0 s2 x[s2]    /x /x /a
         @0 x[s3] s  @0 s3 x[s1] /x /x /x /a
       )
       };
     ;;
     my @ra = unpack $template, $str;
     dd \@ra;
    "
    [5, 8, 5, 6, "Just ", "Another ", "pack ", "Hacker"]

    As you can see, this is pretty orthogonal and easily generalized to any number of data items. (I can post the code to generate the unpack template for any array 'type' and any number of data items on my scratch pad if anyone's interested.) It also involves a lot of shuttling back and forth to pick up and use the various unpack elements, and this has the odor of wasted motion. One can also see how such a template could grow rather unwieldy if one were dealing with 400 or 4000 data items rather than just four.
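The regularity of the template can be sketched as a small generator. This is a hypothetical illustration for short-typed lengths only, not the code mentioned above; the name one_shot_template is made up here.

```perl
use strict;
use warnings;

# Hypothetical generator (name and structure assumed, not from the thread).
# Builds a one-call template for: short lengths[$n]; char data[...];
sub one_shot_template {
    my ($n) = @_;
    my @parts = ("s${n}");    # first return all $n lengths
    for my $i (0 .. $n - 1) {
        push @parts,
            # reread length $i and leave it on the stack, then reread
            # lengths 0 .. $i-1 and step over the rest of the length array
            sprintf('@0 x[s%d] s @0 s%d x[s%d]', $i, $i, $n - $i),
            ('/x') x $i,      # pop each earlier length, skipping that field
            '/a';             # pop length $i and read the field itself
    }
    return '(' . join(' ', @parts) . ')';
}

# Build the sample with pack so it is correct on any byte order.
my $buf = pack('s4', 5, 8, 5, 6) . 'Just Another pack Hacker';
my @out = unpack one_shot_template(4), $buf;
print join('|', @out), "\n";
```

The generated template grows quadratically with the number of items, which is the wasted motion described above made concrete.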

    All the admonitions remain in force: this is a trick you should only try at home, never in any public, much less any occupational, venue.

    I suspect that a much more concise version of this exists. Ideally, I would like to wind up with a template looking something like
        ((clever unpack template)n)
    where n is the number of data items to be unpacked; i.e., to roll up what is effectively an unrolled loop. Unfortunately, I just can't see my way clear to this solution. I'll keep trying...

Node Type: perlquestion [id://1034810]
Approved by Paladin
Front-paged by davido