PerlMonks  

How much was unpack()ed?

by Stevie-O (Friar)
on May 21, 2004 at 14:28 UTC ( #355284=perlquestion )
Stevie-O has asked for the wisdom of the Perl Monks concerning the following question:

I thought this was a simple problem, but apparently it's tougher than I first thought.

I need to unpack() a pile of data that's in a flexible format. This data has a lot of metadata, e.g. length-of-string, type-of-next-value, etc. One of the problems is that I often don't know *what* I'm unpacking until just before I need to unpack it.

For example, let's say one of the values was a 'type' byte. 0=byte, 1=short, 2=long, 3=ASCIIZ. I need to do this:

    $t = unpack('C', $foo);
    if    ($t == 0) { $v = unpack('C',  substr($foo, 1)) }
    elsif ($t == 1) { $v = unpack('S',  substr($foo, 1)) }
    elsif ($t == 2) { $v = unpack('L',  substr($foo, 1)) }
    elsif ($t == 3) { $v = unpack('Z*', substr($foo, 1)) }

Now, that's not so hard. The problem arises when I have to fetch the NEXT value. For the first three cases it's easy (fixed lengths), but the fourth (ASCIIZ) is variable-length. How can I easily find out where unpack() finished, so I can pick up where it left off?

Some might offer 'well, use length($v) + 1 for Z*'. Well, that works for Z*. But what about things like $v = unpack('w', $foo)? 'w' is a variable-length encoded integer value. How do I know whether it was encoded with 1, 2, 3, or 4 bytes? Better yet, try several things at once: unpack('(wZ*w)5', $foo)

--Stevie-O
$"=$,,$_=q>|\p4<6 8p<M/_|<('=> .q>.<4-KI<l|2$<6%s!<qn#F<>;$, .=pack'N*',"@{[unpack'C*',$_] }"for split/</;$_=$,,y[A-Z a-z] {}cd;print lc

Re: How much was unpack()ed?
by ambrus (Abbot) on May 21, 2004 at 16:11 UTC

    The problem here seems to be that unpack does not have a formatter similar to %n of scanf. You might try using a regexp to match a zero-terminated string, while still using unpack for the other types. Then you can use pos or $+[0] to find out how much you've read.
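A minimal sketch of that regexp idea, assuming the type codes from the original question (0=byte, 1=short, 2=long, 3=ASCIIZ) and native-endian 'S' and 'L'. The name read_value is hypothetical; the position is kept in pos() of the buffer, so each call picks up where the previous one stopped.

```perl
use strict;
use warnings;

# Hypothetical helper: read one type-tagged value from the buffer referenced
# by $buf_ref, starting at pos($$buf_ref), and advance pos() past the value.
sub read_value {
    my ($buf_ref) = @_;
    my $start = pos($$buf_ref) || 0;
    my $type  = unpack 'C', substr($$buf_ref, $start, 1);
    $start++;    # step over the type byte

    if ($type == 3) {
        # ASCIIZ: let the regex engine find the NUL and move pos() for us.
        pos($$buf_ref) = $start;
        $$buf_ref =~ /\G([^\0]*)\0/gc or die "unterminated string";
        return $1;
    }

    # Fixed-width types: template and byte length (sizes assumed as in the OP).
    my %fixed = (0 => ['C', 1], 1 => ['S', 2], 2 => ['L', 4]);
    my ($fmt, $len) = @{ $fixed{$type} or die "unknown type $type" };
    my $v = unpack $fmt, substr($$buf_ref, $start, $len);
    pos($$buf_ref) = $start + $len;
    return $v;
}
```

Note that assigning to the buffer resets pos(), so this only works while the string itself is left unmodified between calls.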

Re: How much was unpack()ed?
by NetWallah (Abbot) on May 21, 2004 at 16:17 UTC
    This problem has 2 aspects:
    • program structure
    • Unpacking variable data
    Take a look at how the NetPacket::* modules reduce programming complexity - they address the same 2 issues.

    Offense, like beauty, is in the eye of the beholder, and a fantasy.
    By guaranteeing freedom of expression, the First Amendment also guarantees offense.
Re: How much was unpack()ed?
by Anonymous Monk on May 21, 2004 at 17:09 UTC
    To "easily find out the place where unpack() finished" you could simply consume the string as you go:
        (my($t),$foo) = unpack('Ca*', $foo);
        if ( $t == 3 ) {
            ($v,$foo) = unpack('Z*a*', $foo);
        }

    Unfortunately, Z* doesn't finish at the zero byte - it actually consumes the remainder of the input and discards everything beyond the zero.

      If the fields are prefixed with length bytes, then you can prevent 'a*' (or 'A*' or 'Z*') from consuming the rest of the string by telling the format that the length byte is there.

          $x = pack 'c/a*N', '12345123451234512345', 99999;
          print for unpack 'c/a* N', $x;

          12345123451234512345
          99999

      The downside is that the length byte is then consumed. To work around that, you have to unpack the length byte twice.

          print for unpack 'cXc/a* N', $x;

          20
          12345123451234512345
          99999

      However, I don't see how this helps the OP as his data has a 'type' byte but no 'length' byte.


      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "Think for yourself!" - Abigail

      What? It seems to me that Z* does stop at the zero byte, it's only Z with a finite number that does not stop. Look.

          $ perl -we '($a,$b)= unpack "Z*a*", "nine\0eight"; warn ">>$a<< >>$b<<";'
          >>nine<< >>eight<< at -e line 1.
          $ perl -we '($a,$b)= unpack "Z8a*", "nine\0eight"; warn ">>$a<< >>$b<<";'
          >>nine<< >>ht<< at -e line 1.

      This was with perl, v5.8.1 built for i686-linux.

      Update: just checked, with perl 5.6.1, I get the wrong behaviour, that is, Z* consumes all the string.
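A minimal sketch of the consume-as-you-go approach discussed above, assuming a perl where Z* stops at the NUL (5.8.1 per the test just shown) and the type codes from the original question. The names decode_all and @template are illustrative only.

```perl
use strict;
use warnings;

# Type byte => unpack template, following the OP's example encoding.
my @template = ('C', 'S', 'L', 'Z*');

# Hypothetical helper: decode every type-tagged value in $buf by repeatedly
# unpacking one field and keeping the unread remainder via a trailing 'a*'.
sub decode_all {
    my ($buf) = @_;
    my @values;
    while (length $buf) {
        (my $type, $buf) = unpack 'C a*', $buf;
        my $fmt = $template[$type];
        defined $fmt or die "unknown type $type";
        (my $v, $buf) = unpack "$fmt a*", $buf;
        push @values, $v;
    }
    return @values;
}
```

The number of bytes a field used is simply the difference in length($buf) before and after the call, which also covers 'w'. The cost is that the remainder is copied on every step, so this is probably best kept to modest buffers.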

Re: How much was unpack()ed?
by meredith (Friar) on May 21, 2004 at 18:52 UTC

    Slightly off-topic: Your code block here:

        $t = unpack('C', $foo);
        if    ($t == 0) { $v = unpack('C',  substr($foo, 1)) }
        elsif ($t == 1) { $v = unpack('S',  substr($foo, 1)) }
        elsif ($t == 2) { $v = unpack('L',  substr($foo, 1)) }
        elsif ($t == 3) { $v = unpack('Z*', substr($foo, 1)) }

    could also be this:

        my @unpack_types = qw( C S L Z* );
        $t = unpack('C', $foo);
        $v = unpack($unpack_types[$t], substr($foo, 1)); #Is $t always a number?

    You would probably do the same thing later on, when you take another look at this code :) I know it's not much savings for only four types, but I don't know what you're working on -- you could have 12! :)

    mhoward - at - hattmoward.org
Re: How much was unpack()ed? (stream)
by tye (Cardinal) on May 22, 2004 at 04:34 UTC

    Whenever I do much with pack/unpack, I'm reminded that these really need to support stream operations.

    unpack needs to be able to read from a stream or, more likely, to start reading at pos($input) and to set pos($input) to note where it left off (in other words, treat the input string as a stream).

    You can already do print STREAM pack ... so pack doesn't really need to be able to write to a stream. But it'd be cool if pack supported starting at pos($string) and overwriting as many bytes after that as needed and setting pos($string) to where it left off at.

    I even started writing a module to implement such. But I didn't finish and now pack/unpack have gotten quite a bit fancier such that this either needs to be patched directly into pack/unpack or (at least) they need some introspection features added to make implementing these in a module reasonable/possible.

    For example "." could be the format for "current seek offset" which you could use like:

        my( $z, $zEnd, $i, $iEnd )= unpack "z.I.", $buf;
        # $zEnd == length($z)
        # $iEnd == 4 + $zEnd

        my( $z, $i )= getData();
        my $buf= pack "z.I.", $z, my $zEnd, $i, my $iEnd;
        # $zEnd == length($z)
        # $iEnd == 4 + $zEnd
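As far as I know, later perls (5.10 and up) accept something close to this on the unpack side: a '.' in the template returns the current offset into the string. A minimal sketch under that assumption follows; unpack_at is a hypothetical name, and since 'z' in the example above is not an actual format letter, the sketch uses 'Z*' and 'w' instead.

```perl
use strict;
use warnings;

# Hypothetical helper: unpack $template from $buf starting at byte $offset.
# Returns the decoded values followed by the offset just past them.
# Assumes Perl 5.10+, where '.' in an unpack template yields the current
# offset (measured from the start of the string when used at top level).
sub unpack_at {
    my ($template, $buf, $offset) = @_;
    # '@N' moves to absolute position N before decoding begins.
    return unpack('@' . $offset . " $template .", $buf);
}
```

Usage would be along the lines of `my ($n, $next) = unpack_at('w', $buf, $here);`, feeding $next back in as the offset for the following field. That covers the 'w' case from the original question without having to know how many bytes the encoding used.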

    - tye        
