PerlMonks  

Re: how to unpack a C struct with length array preceding data array

by AnomalousMonk (Monsignor)
on May 22, 2013 at 21:34 UTC (#1034820)


in reply to how to unpack a C struct with length array preceding data array

I had to resort to two unpack()'s ... how to do this in one unpack.

Doing this in one statement and with one unpack, if it's even possible (and I don't think it is; Update: I think I was wrong, see ig's reply), might qualify as a neat hack, but it's also a hack you should look at, sigh, and then set aside in favor of a maintainable two-statement solution. What is the practical advantage of doing this in one statement?

Update: I think I would take a slightly different approach to unpack-ing the final data strings:

>perl -wMstrict -le "my $struct = qq{\x0d\x00\x0b\x00Just Another Perl Hacker}; ;; my @len = unpack 'ss', $struct; print qq{@len}; ;; my @data = unpack qq{(x[s])2 A$len[0] A$len[1]}, $struct; print qq{'$data[0]' '$data[1]'}; "
13 11
'Just Another' 'Perl Hacker'

(Note that '(x[s])2' could just as well be 'x[s] x[s]'.)
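
The equivalence of the skip forms can be checked directly. What follows is only a sketch, written as a standalone script rather than a one-liner. The helper name extract_with is hypothetical, and the sample struct is built with pack (rather than from literal bytes) so that the native short 's' matches whatever platform runs it.

```perl
use strict;
use warnings;

# Build the sample struct: two native shorts (13, 11) followed by the data.
# 'A13' pads 'Just Another' with one trailing space, as in the original bytes.
my $struct = pack 's s A13 A11', 13, 11, 'Just Another', 'Perl Hacker';

# Hypothetical helper: read both lengths, then skip over them using the
# given template fragment and pull out the two strings.
sub extract_with {
    my ($packed, $skip) = @_;
    my @len = unpack 'ss', $packed;
    return unpack "$skip A$len[0] A$len[1]", $packed;
}

# 'x[s]' skips as many bytes as one packed 's' occupies, so each of these
# fragments skips exactly the two leading shorts.
for my $skip ('(x[s])2', 'x[s] x[s]', 'x[ss]') {
    my @data = extract_with($struct, $skip);
    print "$skip: '$data[0]' '$data[1]'\n";
}
```

All three fragments should report the same pair of strings; 'A' strips the trailing space from the first one.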


Re^2: how to unpack a C struct with length array preceding data array
by johnlumby (Initiate) on Jun 05, 2013 at 20:36 UTC
    Sorry for the delay in acknowledging your and others' posts; I mistook the first response to mine for some kind of disapproval.

    As to your

    What is the practical advantage ...
    I view the task of performing the unpacking of the string as logically one task, not two, and so perhaps it is clearer to any reader how this task is being accomplished if done in one reasonably intuitive statement rather than two. If by practical you were thinking more of performance, then none.

    I also omitted to say originally that what encouraged me to believe such a statement might exist is this passage from perlfunc, under "pack":

    For "unpack", an internal stack of integer arguments unpacked so far is used. You write "/"sequence-item and the repeat count is obtained by popping off the last element from the stack. The sequence-item must not have a repeat count.
    I was not sure what this meant but it seemed relevant. I tried using it in various ways but all failed. I did not find any elucidation of this in perltut. I *think* some of the suggestions offered in this thread are using such sequence-items but I'm not sure; and if so, I'm also not sure how the argument numbers are being pushed onto the stack.

    If anyone can explain what the quoted text means (specifically the integer arguments) or has a simpler example, I would appreciate that.

      If anyone can explain what the quoted text means (specifically the integer arguments) or has a simpler example, I would appreciate that.

      Re:"For "unpack", an internal stack of integer arguments unpacked so far is used. You write "/"sequence-item and the repeat count is obtained by popping off the last element from the stack."

      It means that the template character before the '/' must be one that unpacks an integer. That value is then used as the repeat count for the template character after the '/'.

      So, if you use a template of 'C/A' the 'C' will unpack the first character of the string as a number and then its value will determine how many characters of the string are used by the 'A'.

      E.g., below, the 'C' unpacks the first character "\x05", which is then used as the repeat count for the 'A', resulting in the next 5 characters being unpacked.

      say unpack "C/A", "\x05hellofred";;
      hello

      Note. The reference to "the stack" refers to Perl's argument stack. As unpack processes each element of its template, the values extracted from the input string are pushed onto that stack, so that when the function returns, they are returned to the caller.

      When the '/' is encountered, the last value pushed onto the stack is popped off again and used as the repeat count for the next template character. Hence, in the example above, the value '5' generated by the 'C' is not returned to the caller.
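
      To make the "popped off again" behaviour visible, here is a small sketch comparing the same input unpacked with and without the '/'. The variable names are only illustrative.

```perl
use strict;
use warnings;

my $packed = "\x05hellofred";

# With '/', the count unpacked by 'C' is popped and used as the length
# for 'A', so only the string is returned.
my @with_slash = unpack 'C/A', $packed;

# Without '/', 'C' and 'A5' are independent items, so both values are returned.
my @without_slash = unpack 'C A5', $packed;

print scalar(@with_slash), " item(s): @with_slash\n";        # 1 item(s): hello
print scalar(@without_slash), " item(s): @without_slash\n";  # 2 item(s): 5 hello
```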

      The <t1>/<t2> combination (where t1 can be any template char that results in a numeric value) is very useful for streaming communications protocols where you frequently have length-prefixed data items: <len1><data1><len2><data2>.
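
      As a sketch of that length-prefixed layout, the following round-trips a list of strings through a <len><data><len><data>... stream. It assumes one-byte lengths ('C'), so each item is limited to 255 bytes; encode_items and decode_items are hypothetical helper names, not anything from the thread.

```perl
use strict;
use warnings;

# Hypothetical helpers for a stream of <len><data> records.
# In pack, 'C/a*' writes the string's length as one byte, then the string.
# In unpack, the same template reads the byte and uses it as the length.
# Wrapping it in '( ... )*' repeats the pair for every item.
sub encode_items { return pack '(C/a*)*', @_ }
sub decode_items { return unpack '(C/a*)*', $_[0] }

my $stream = encode_items('Just Another', 'Perl', 'Hacker');
my @items  = decode_items($stream);

print join('|', @items), "\n";   # Just Another|Perl|Hacker
```

      Note that this only works because each length sits immediately before its own data.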

      Unfortunately, there is no sensible way to use it for your struct format where you have <len1><len2><data1><data2>. (That is, I don't consider the absolute positioning solution above a "sensible" approach as it is not really extensible to more than 2 len/data pairs.)

      It will almost certainly always be quicker and cleaner to do that in two steps:

      my( $len1, $len2 ) = unpack "SS", $packed;
      my( $data1, $data2 ) = unpack "x[SS] A$len1 A$len2", $packed;
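
      The same two steps extend to any number of len/data pairs if the count is known. This is only a sketch under that assumption: unpack_len_block is a hypothetical name, the lengths are taken to be native unsigned shorts ('S') as above, and 'A' will strip trailing whitespace from each string.

```perl
use strict;
use warnings;

# Hypothetical helper: unpack $n strings laid out as
# <len1>..<lenN><data1>..<dataN>, with native unsigned short lengths.
sub unpack_len_block {
    my ($packed, $n) = @_;

    # Step 1: read the N lengths.
    my @len = unpack "S$n", $packed;

    # Step 2: skip the N lengths, then take one 'A<len>' per string.
    my $template = "(x[S])$n " . join(' ', map("A$_", @len));
    return unpack $template, $packed;
}

# Build a sample block with four strings to exercise the helper.
my @words  = ('Just', 'Another', 'Perl', 'Hacker');
my $n      = scalar @words;
my @len    = map { length $_ } @words;
my $packed = pack "S$n" . ' A*' x $n, @len, @words;

print join('|', unpack_len_block($packed, $n)), "\n";   # Just|Another|Perl|Hacker
```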

      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        ... the absolute positioning solution ... is not really extensible to more than 2 len/data pairs.

        I disagree, with a potential caveat dependent on the exact meaning of the word 'really'. I think I have shown below that an absolute positioning solution can easily (for some definition of 'easy') be generalized to any number of data items using any data length 'type'.

        But not all things that are easy are wise, and I continue to agree with you and ig that a two-step approach is almost certainly best.
