PerlMonks  

pack/unpack 6 bit fields.

by Anonymous Monk
on Aug 18, 2004 at 02:43 UTC (#383837)
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

what's the most efficient way to unpack 16, 6 bit fields from a 12 byte string? and vice versa? thanks.

Re: pack/unpack 6 bit fields.
by Errto (Vicar) on Aug 18, 2004 at 03:42 UTC

    Assuming you are starting with a bit string $str and want to end up with an array of bit strings @arr, I would try:

    my @arr = map { pack "b6", $_ } unpack "A6"x16, unpack "b96", $str;

    I'm not sure how best to test the efficiency of that.

    Update: tye's explanation is most likely to be more accurate. I was just taking a stab.

      To me, a "bit string" means a string composed of the characters "0" and "1", encoding one bit of value per byte (character) of the string. And the author of the pack documentation agrees with me.

      Of course, any contiguous data structure could be considered to be "a string of bits" so I much prefer to call such things "base-2" ("representations" or "strings" depending on context), which also conveniently tells us that we have most-significant bit first.

      But some people conceive of "bit strings" that aren't base-2 representations, including (out of necessity) the author(s) of pack (though they should have used "b" for "base-2" and "B" for "reversed base-2" instead of the opposite!).

      Also note that pack and unpack always start at the first byte (or sometimes word, etc) of the string and work toward the end, so "B32" is great for dealing with big-endian (most-significant byte first) d-words (4-byte unsigned integer values) while "b32" and little-endian multi-byte values usually require awkward insertions of reverse here or there (or both, if only to increase the fun).
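      The point about "B32" and byte order can be illustrated with a small sketch. The sample value 0x12345678 is an assumption chosen only for this example, and the "N" and "V" templates are used just to produce a big-endian and a little-endian packed string to work on.

```perl
use strict;
use warnings;

# Assumed sample value, chosen only for illustration.
my $value = 0x12345678;

# "N" packs a 32-bit unsigned integer most-significant byte first (big-endian).
my $big = pack "N", $value;

# "V" packs the same integer least-significant byte first (little-endian).
my $little = pack "V", $value;

# "B32" walks the bytes in order, most-significant bit first, so the
# big-endian string yields the ordinary base-2 representation directly.
my $bits_big = unpack "B32", $big;

# The little-endian string needs its bytes reversed before "B32" gives
# the same result.
my $bits_little = unpack "B32", scalar reverse $little;

# Alternatively, "b32" reads each byte least-significant bit first;
# reversing the resulting characters also gives the base-2 representation.
my $bits_little_b = scalar reverse unpack "b32", $little;

print "$bits_big\n$bits_little\n$bits_little_b\n";
```

      All three lines printed should be identical, which shows where the extra reverse calls come from when the data is little-endian.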

      "from a 12-byte string" (yes, I added that important hyphen and I'll do it again, just you watch!) tells me we aren't starting with a "bit string" but it doesn't tell me whether we are to treat those 12 bytes as most- or least-significant byte first.

      In one respect, we shouldn't have to worry about bit order, unless someone designed a protocol so broken as to pack the more-significant bits of the input (6-bit) fields into the less-significant bits of the bytes in the packed string [1] -- in which case they need to be fired and/or publicly chastised.

      It also doesn't tell me if we want the 6-bit fields returned with the most- or least-significant field first. A more subtle concern is whether the first 6-bit value is packed into the low 6 bits or the high 6 bits of the first byte.

      Alternatively, you can combine these byte-order, field-order, and high-/low-first questions together and rephrase them as a question about bit-order and presume that 1) the first 6-bit field gets encoded into the first byte of the packed string and 2) that the same bit-order is used for the 6-bit values and for the 8-bit bytes (else more firings/chastising).

      So the two sane possibilities (illustrated as a choice between bit order) are:

      bytes:  765432 10 7654 3210 76 543210
      fields: 543210 54 3210 5432 10 543210

      or

      bytes:  012345 67 0123 4567 01 234567
      fields: 012345 01 2345 0123 45 012345

      Which also hints at how to use un/pack to get either translation (and Errto did a fine job demonstrating one of them).
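      The two bit orders in the diagram can be compared on a small packed string. This is only a sketch: the 3-byte sample "\xFC\x00\x00" and the variable names are assumptions made for illustration, and oct("0b...") is used here to turn each 6-character chunk into a number.

```perl
use strict;
use warnings;

# Hypothetical 3-byte sample: only the top six bits of the first byte are set.
my $packed = "\xFC\x00\x00";

# First layout: bytes and fields both read most-significant bit first ("B").
my @msb_first = map { oct "0b$_" }
    unpack "A6" x 4, unpack "B24", $packed;

# Second layout: bytes and fields both read least-significant bit first ("b").
# Each 6-character chunk is reversed before conversion because oct("0b...")
# expects the most-significant bit on the left.
my @lsb_first = map { oct("0b" . reverse($_)) }
    unpack "A6" x 4, unpack "b24", $packed;

print "MSB first: @msb_first\n";   # 63 0 0 0
print "LSB first: @lsb_first\n";   # 60 3 0 0
```

      The same three bytes give four different fields depending on the bit order, which is why the question needs to say which layout it means.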

      Another question is whether each 6-bit value should be packed into a one-byte string or be a numeric value. My first guess would be numeric values (Errto guessed packed one-byte strings, perhaps correctly).

      Even if we restrict ourselves to one interpretation of the question, there are quite a few ways to go about the task and it is a bit hard to pick between the Ways To Do It.

      But here's one way:

      my @fields=
          unpack "C*",      # 16 6-bit numeric values
          pack "B6"x16,     # string of 16 bytes, each holding a 6-bit value
          unpack "a6"x16,   # 16 6-character base-2 strings
          unpack "B*",      # 96-character base-2 string
          "twelve bytes";   # 12-byte packed string

      my $string=
          pack "B*",        # 12-byte packed string
          pack "a6"x16,     # 96-character base-2 string
          unpack "B6"x16,   # 16 6-character base-2 strings
          pack "C*",        # string of 16 bytes, each holding a 6-bit value
          @fields;          # 16 6-bit numeric values

      And you can replace "B*" and "B6" with "b*" and (or!) "b6" to get some less-sane translations.

      - tye        

      1 Note that I call our 12-byte string a "packed string". Some would call this a "binary string", but "binary" is so horribly ambiguous when dealing with pack that I just avoid using it at all. "Packed" means that the string can contain arbitrary byte values, that it isn't necessarily just "text" a.k.a. "just printable characters".

        Neat solution... but with 'B', it produces numbers greater than 6 bits can hold, because pack "B6" puts the six bits in the high end of each byte:

        P:\test>perl
        print join'|',
            unpack "C*",      # 16 6-bit numeric values
            pack "B6"x16,     # string of 16 bytes, each holding a 6-bit value
            unpack "a6"x16,   # 16 6-character base-2 strings
            unpack "B*",      # 96-character base-2 string
            "twelve bytes";   # 12-byte packed string
        ^Z
        116|28|116|148|108|28|100|148|32|24|36|228|116|24|84|204

        Switching to 'b' keeps every value within 6 bits, though it also reads each byte least-significant bit first:

        P:\test>perl
        print join'|',
            unpack "C*",      # 16 6-bit numeric values
            pack "b6"x16,     # string of 16 bytes, each holding a 6-bit value
            unpack "a6"x16,   # 16 6-character base-2 strings
            unpack "b*",      # 96-character base-2 string
            "twelve bytes";   # 12-byte packed string
        ^Z
        52|29|23|25|44|25|23|25|32|8|22|30|52|21|54|28
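        If the most-significant-bit-first layout is the one wanted, one possible way to keep the values in the range 0..63 is to convert each 6-character base-2 chunk to a number directly instead of packing it with "B6". This is a sketch under that assumption, not necessarily what the original poster needs; it uses oct("0b...") and sprintf "%06b" in place of the inner pack/unpack steps.

```perl
use strict;
use warnings;

my $packed = "twelve bytes";   # the 12-byte sample string used in this thread

# Unpack: 96-character base-2 string -> 16 six-character chunks -> numbers.
my @fields = map { oct "0b$_" }
    unpack "a6" x 16, unpack "B*", $packed;

# Pack: render each number as exactly six base-2 digits, join them into a
# 96-character string, and pack that back into 12 bytes.
my $repacked = pack "B*", join "", map { sprintf "%06b", $_ } @fields;

print join("|", @fields), "\n";
# 29|7|29|37|27|7|25|37|8|6|9|57|29|6|21|51
print $repacked eq $packed ? "round trip ok\n" : "round trip FAILED\n";
```

        Each value here is the corresponding 'B' result divided by 4, that is, the same six bits moved from the high end of the byte to the low end.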

        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "Think for yourself!" - Abigail
        "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
Re: pack/unpack 6 bit fields.
by etcshadow (Priest) on Aug 18, 2004 at 05:01 UTC
    There's also vec (perldoc -f vec). For example:
    # to turn your $str into an array
    @array = map { vec($str, $_*6, 6) } 0..15;

    # to turn an array of 6-bit items into a packed string
    vec($str, $_*6, 6) = $array[$_] for 0..$#array;

    # or simply to access the $i-th item in the $str, directly:
    $read_item = vec($str, $i*6, 6) = $write_item; # vec is an lvalue!
    Enjoy.
    ------------ :Wq Not an editor command: Wq

      My first guess is that you didn't try this code. For me, vec only allows field sizes that are powers of 2 (and this restriction is still in 5.8's vec docs -- the latest PM currently links to).

      Illegal number of bits in vec at ...

      - tye        
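      The restriction can be checked with eval, and vec can still be used for 6-bit fields by going one bit at a time, since a BITS value of 1 is allowed. The sketch below is only one possible workaround; get6 and set6 are hypothetical helper names, not anything from the vec documentation. Note that vec numbers bits from the least-significant bit of the first byte, so this corresponds to the least-significant-bit-first ("b") layout.

```perl
use strict;
use warnings;

# Hypothetical helper: read the $i-th 6-bit field by assembling it from
# single-bit vec() reads (bit 0 of the field comes first in the string).
sub get6 {
    my ($str, $i) = @_;
    my $value = 0;
    $value |= vec($str, $i * 6 + $_, 1) << $_ for 0 .. 5;
    return $value;
}

# Hypothetical helper: write the $i-th 6-bit field one bit at a time.
# vec() is an lvalue, so each bit can be assigned directly.
sub set6 {
    my ($str_ref, $i, $value) = @_;
    vec($$str_ref, $i * 6 + $_, 1) = ($value >> $_) & 1 for 0 .. 5;
    return;
}

my $packed = "twelve bytes";   # the 12-byte sample string used in this thread

# A BITS argument of 6 is rejected at run time.
my $ok = eval { my $x = vec($packed, 0, 6); 1 };
print "vec with 6 bits failed: $@" unless $ok;

# Extract all 16 fields, then rebuild the packed string from them.
my @fields = map { get6($packed, $_) } 0 .. 15;

my $rebuilt = "";
set6(\$rebuilt, $_, $fields[$_]) for 0 .. $#fields;

print join("|", @fields), "\n";
print $rebuilt eq $packed ? "round trip ok\n" : "round trip FAILED\n";
```

      Sixteen fields times six single-bit calls is unlikely to beat a single pack/unpack pipeline, so this is more a demonstration of what vec permits than a recommendation.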

Re: pack/unpack 6 bit fields.
by shenme (Priest) on Aug 19, 2004 at 01:07 UTC
    Fieldata? .oO(The memories!)
