Re^2: pack/unpack 6-bit fields. (precision)
by tye (Cardinal) on Aug 18, 2004 at 07:10 UTC
To me, a "bit string" means a string composed of the characters "0" and "1", encoding one bit of value per byte (character) of the string. And the author of the pack documentation agrees with me.
Of course, any contiguous data structure could be considered to be "a string of bits" so I much prefer to call such things "base-2" ("representations" or "strings" depending on context), which also conveniently tells us that we have most-significant bit first.
But some people conceive of "bit strings" that aren't base-2 representations, including (out of necessity) the author(s) of pack (though they should have used "b" for "base-2" and "B" for "reversed base-2" instead of the opposite!).
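A quick check of the two template letters, as a minimal sketch (the sample byte "A" and the variable names are just illustrative choices): "B" yields the base-2 digits most-significant bit first, while "b" yields the same digits least-significant bit first.

```perl
use strict;
use warnings;

# "A" is chr(0x41), i.e. 01000001 in base-2.
my $byte = "A";

my $base2    = unpack "B8", $byte;   # most-significant bit first
my $reversed = unpack "b8", $byte;   # least-significant bit first

print "B8: $base2\n";      # B8: 01000001
print "b8: $reversed\n";   # b8: 10000010
```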
Also note that pack and unpack always start at the first byte (or sometimes word, etc) of the string and work toward the end, so "B32" is great for dealing with big-endian (most-significant byte first) d-words (4-byte unsigned integer values) while "b32" and little-endian multi-byte values usually require awkward insertions of reverse here or there (or both, if only to increase the fun).
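To make the byte-order point concrete, here is a small sketch assuming an arbitrary sample value of 0x12345678. The big-endian d-word reads straight off with "B32"; the little-endian one needs "b32" plus a reverse to end up as the same base-2 string.

```perl
use strict;
use warnings;

my $n = 0x12345678;   # arbitrary sample value

# Big-endian d-word ("N"): "B32" gives the base-2 digits directly.
my $be_bits = unpack "B32", pack("N", $n);

# Little-endian d-word ("V"): "b32" gives the bits least-significant
# first, so the whole string must be reversed to read as base-2.
my $le_bits = scalar reverse unpack("b32", pack("V", $n));

print "$be_bits\n";   # 00010010001101000101011001111000
print "$le_bits\n";   # 00010010001101000101011001111000
```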
"from a 12-byte string" (yes, I added that important hyphen and I'll do it again, just you watch!) tells me we aren't starting with a "bit string" but it doesn't tell me whether we are to treat those 12 bytes as most- or least-significant byte first.
In one respect, we shouldn't have to worry about bit order, unless someone designed a protocol so broken as to pack the more-significant bits of the input (6-bit) fields into the less-significant bits of the bytes in the packed string[1] -- in which case they need to be fired and/or publicly chastised.
That description also doesn't tell me whether we want the 6-bit fields returned with the most- or least-significant field first. A more subtle concern is whether the first 6-bit value is packed into the low 6 bits or the high 6 bits of the first byte.
Alternatively, you can combine these byte-order, field-order, and high-/low-first questions together and rephrase them as a question about bit order, presuming that 1) the first 6-bit field gets encoded into the first byte of the packed string and 2) the same bit order is used for the 6-bit values and for the 8-bit bytes (else more firings/chastising).
So the two sane possibilities (illustrated as a choice between bit order) are:
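One way to picture the two readings, as a rough sketch: dump the packed string as a run of bits in each order and cut it every 6 characters. The 3-byte sample "Man" and the variable names are assumptions made for illustration, not part of the original question.

```perl
use strict;
use warnings;

my $packed = "Man";   # hypothetical sample: bytes 0x4D 0x61 0x6E

# Reading 1: most-significant bit first throughout ("B").
my @msb_first = unpack("B*", $packed) =~ /(.{6})/g;

# Reading 2: least-significant bit first throughout ("b").
my @lsb_first = unpack("b*", $packed) =~ /(.{6})/g;

print "B*: @msb_first\n";   # B*: 010011 010110 000101 101110
print "b*: @lsb_first\n";   # b*: 101100 101000 011001 110110
```

In the first reading each group is an ordinary base-2 number; in the second, each group lists its bits starting from the least-significant one.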
Which also hints at how to use un/pack to get either translation (and Errto did a fine job demonstrating one of them).
Another question is whether each 6-bit value should be packed into a one-byte string or be a numeric value. My first guess would be numeric values (Errto guessed packed one-byte strings, perhaps correctly).
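The difference between the two output forms is small but real. A minimal sketch, assuming four made-up 6-bit values:

```perl
use strict;
use warnings;

my @numeric  = (19, 22, 5, 46);        # hypothetical 6-bit values as numbers
my @one_byte = map { chr } @numeric;   # the same values as one-byte strings

print "@numeric\n";   # 19 22 5 46
print join(" ", map { sprintf "0x%02X", ord } @one_byte), "\n";   # 0x13 0x16 0x05 0x2E
```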
Even if we restrict ourselves to one interpretation of the question, there are quite a few ways to go about the task and it is a bit hard to pick between the Ways To Do It.
But here's one way:
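A sketch of one such approach, assuming the most-significant-bit-first reading and numeric results. The helper name `sixbit_fields` and the 12-byte sample are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split a packed string into numeric 6-bit fields,
# most-significant bit first.
sub sixbit_fields {
    my ($packed) = @_;
    # pack "B6" puts the six bits into the HIGH bits of a byte,
    # so shift right by 2 to get the numeric value.
    return map { ord(pack("B6", $_)) >> 2 }
           unpack("B*", $packed) =~ /(.{6})/g;
}

my $packed = "Man" x 4;   # 12-byte sample string
my @fields = sixbit_fields($packed);

print scalar(@fields), " fields: @fields\n";
# 16 fields: 19 22 5 46 19 22 5 46 19 22 5 46 19 22 5 46
```

Twelve bytes are 96 bits, so the regex yields exactly sixteen 6-character groups with nothing left over.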
And you can replace "B*" and "B6" with "b*" and (or!) "b6" to get some less-sane translations.
[1] Note that I call our 12-byte string a "packed string". Some would call this a "binary string", but "binary" is so horribly ambiguous when dealing with pack that I just avoid using it at all. "Packed" means that the string can contain arbitrary byte values, that it isn't necessarily just "text" a.k.a. "just printable characters".