Re: Using pack to evaluate text strings as hexadecimal values

by Anonymous Monk
on Mar 07, 2006 at 16:26 UTC

in reply to Using pack to evaluate text strings as hexadecimal values

you could use unpack instead of hex; pack is definitely wrong.
remember: hex is a string (representation) of an (integer) value.
pack creates a (one) string out of (multiple) values.
unpack creates (multiple) values out of a (one) string.

Therefore it is better to wrap the var in () in a call to unpack.
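
A minimal sketch of the distinction drawn above, assuming a two-digit input string (`$text` is a made-up example value) and reading "wrap the var in ()" as assigning the result of unpack in list context, which is an interpretation rather than something the post spells out:

```perl
use strict;
use warnings;

my $text = '1f';                       # hypothetical input: hex digits as text

# hex() turns the text representation into the integer it denotes.
my $value = hex $text;

# pack 'H2' turns the two hex digits into one raw byte (a string).
my $byte = pack 'H2', $text;

# unpack returns a list, so the targets are wrapped in () for list context.
my ($from_byte) = unpack 'C',  $byte;  # the byte read back as an integer
my ($digits)    = unpack 'H2', $byte;  # the byte read back as hex digits

print "$value $from_byte $digits\n";   # prints "31 31 1f"
```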

Re^2: Using pack to evaluate text strings as hexadecimal values
by jpl (Monk) on Mar 21, 2011 at 18:11 UTC

    With apologies for resurrecting a dead horse (and mangling metaphors), pack/unpack give us the capability of going back and forth between strings and hex digits (H and h), and between strings and integers (i and n and friends), but not, as far as I can tell, between hex digits and integers, which I think was the original intent of davis. Yes, hex() and sprintf() let us do these, but there's no reason in principle why there couldn't be a pack/unpack format item to do the job.

    Why bother, one might ask? I want to keep the "external form" of my records as pure ascii, so I can use all my favorite UNIX utilities on them. So how do I represent integer bit masks? I cannot use formats like n blindly, lest some bit mask results in a byte that looks like a newline or a null. I can "or in" 6 bits to 0x20 (ord(' ')), avoiding the high bit to stay ascii and the constant 0x20 bit, to avoid all the nasty control characters, but A) it's unlikely to port well to a non-ascii system and, more importantly, B) it's not at all easy to see what bits are on and what bits are off. The same applies to uuencoded masks using format u. I could represent the mask as a fixed width digit string, but that's B) still difficult to see what bits are on and off, and C) it takes a lot of characters to encode a few bits (5 digit numbers to accommodate 16 bits). I'd be happy to trade off the extra characters to represent the bits in hex, for the benefit of easy determination of what's on and what's off.

    I can do this using hex() and sprintf(), but then I need a way to fiddle records before packing and after unpacking. Am I alone in wishing there were a pack format item for the conversion, so I could do the whole job with pack and unpack?
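
For illustration, the "fiddling" described above might look roughly like this. The record layout (`A8 A4`) and the helper names `encode_record` and `decode_record` are assumptions made up for the sketch, not anything taken from the post:

```perl
use strict;
use warnings;

# Hypothetical layout: 8-char name, then a 16-bit mask as 4 hex digits.
my $format = 'A8 A4';

sub encode_record {
    my ($name, $mask) = @_;
    # The fiddle before packing: integer -> 4 hex digits.
    return pack $format, $name, sprintf('%04x', $mask);
}

sub decode_record {
    my ($record) = @_;
    my ($name, $hex) = unpack $format, $record;
    # The fiddle after unpacking: 4 hex digits -> integer.
    return ($name, hex $hex);
}

my $record = encode_record('alpha', 0x00aa);
my ($name, $mask) = decode_record($record);

print "[$record] $name $mask\n";   # prints "[alpha   00aa] alpha 170"
```

The record stays plain ascii and the set bits can be read off the hex digits, but every field of this kind needs its own conversion step outside pack and unpack.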

      What doesn't this do that you'd like it to do?

      $n = 0; $n |= 1 << $_ for 1,3,5,7;;
      print $n, unpack 'H2', pack 'v', $n;;
      170 aa

      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        It doesn't give me a single format string that I can use to convert back and forth between "records" and "fields".

        @fields = unpack($format, $record);
        $record = pack($format2, @fields);

        (I actually use hash slices rather than arrays, but that's not germane here.) I don't mind having distinct formats. For all-alpha fields, I sometimes do

          ($format3 = $format) =~ s/A/a/g;

        so I can generate a non-trimming unpack format from the default, a convenience, but not absolutely essential, since setting up the formats is a one-time thing, but conversion is a per-record thing.
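
A small sketch of that substitution at work, assuming a made-up two-field layout (`A6 A4`) and sample values. `A` strips trailing spaces on unpack, while `a` returns the field as stored:

```perl
use strict;
use warnings;

my $format = 'A6 A4';                       # hypothetical trimming format
(my $no_trim = $format) =~ s/A/a/g;         # derived non-trimming format

my $record  = pack $format, 'ab', 'cd';     # space-padded to 6 + 4 chars
my @trimmed = unpack $format,  $record;     # trailing spaces stripped
my @raw     = unpack $no_trim, $record;     # padding preserved

print "[$_]" for @trimmed;                  # prints "[ab][cd]"
print "\n";
print "[$_]" for @raw;                      # prints "[ab    ][cd  ]"
print "\n";
```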

      I'm going to "pop this up a few levels", because the exchange between BrowserUk and me isn't easily visible any more. Here's a (somewhat edited) version of the last couple exchanges. Sorry, my html smarts don't include making the quotations evident, and the Formatting Tips didn't help a lot.

      (jpl) Why couldn't we have a format item "Y" (where "Y" is some unused format character, finding which may be the real problem) such that

      pack("Y4", 32);        # produces "0020"
      unpack("Y4", "0020");  # produces 32

      There are many format characters, v among them, that pack integers into strings and unpack strings into integers. I believe the only practical difference between "Y" and "v" is the contents of the string. The one produced by "v" may be unsuitable for display, the one produced by "Y" would be both displayable and, when displayed, indicative of the bits in the integer from which it was produced.
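
No such "Y" item exists, but its intended behaviour can be approximated outside pack and unpack. In this sketch `pack_y` and `unpack_y` are hypothetical helper names, and the width argument stands in for the repeat count in "Y4":

```perl
use strict;
use warnings;

# Hypothetical stand-in for pack("Y$width", $n): zero-padded hex digits.
sub pack_y {
    my ($width, $n) = @_;
    return sprintf '%0*x', $width, $n;
}

# Hypothetical stand-in for unpack("Y$width", $s): hex digits -> integer.
sub unpack_y {
    my ($width, $s) = @_;
    return hex substr($s, 0, $width);
}

my $packed   = pack_y(4, 32);
my $unpacked = unpack_y(4, '0020');

print "$packed $unpacked\n";   # prints "0020 32"
```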

      (BrowserUk:) It could work and would be useful. It would be a departure from the norm of converting numerics to and from their binary representation.

      There are some issues as to what should happen if you specified pack 'Y4', 100000; or pack 'Y4', 32.0 but actually that perhaps suggests a way around the lack of remaining letters that also has an existing precedent.

      To skip a complex structure--say consisting of 2 shorts and a float--the syntax is x[vvf], meaning skip enough bytes to cover 2 shorts and a float. I.e. 2+2+4 = 8 bytes.
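
A short sketch of the bracketed-template skip in unpack. In the pack documentation the lowercase `x[...]` form skips forward, while uppercase `X` backs up. The sample values and the trailing `C` field are assumptions for the example:

```perl
use strict;
use warnings;

# Two 16-bit shorts, a native float, then the one byte we actually want.
my $data = pack 'v v f C', 1, 2, 3.5, 42;

# Bytes covered by x[vvf]: the packed length of the bracketed template.
my $skipped = length(pack('v v f', 0, 0, 0));

# x[vvf] steps over the shorts and the float without unpacking them.
my ($last) = unpack 'x[vvf] C', $data;

print "skipped $skipped bytes, got $last\n";
```

The skip length is computed rather than hard-coded because `f` is a native float, whose size is platform-dependent (4 bytes on common platforms, giving the 8 bytes mentioned above).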

      To get your hexified numeric, the syntax could be pack 'H[v]', 32, meaning treat the number as a 16-bit int and hexify it; thereby producing your 4 bytes of output.

      The nice thing is that this then extends naturally to pack 'H[V]', 32; to produce 8 bytes of hex. And pack 'H[Q]', 123456789012345; and even pack 'H[f]', 1234.56e78; and so on.

      And once that is accepted, this further extends to the other bane of pack/unpack: binary, with 'B[v] b[V] B[d]' etc. And a quick perusal of the docs suggests that 'O' isn't currently used, so maybe 'O[v] O[Q]' might be useful also.
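
The proposed `H[v]` and `B[v]` syntax does not exist, but the effect can be approximated today by nesting pack inside unpack. One caveat this sketch assumes matters to the reader: `v` is little-endian, so its hex digits come out byte-swapped, and the big-endian `n` or `N` is what yields "0020" for 32:

```perl
use strict;
use warnings;

my $le   = unpack 'H*', pack('v', 32);   # little-endian 16-bit: low byte first
my $be   = unpack 'H*', pack('n', 32);   # big-endian 16-bit: reads naturally
my $wide = unpack 'H*', pack('N', 32);   # big-endian 32-bit: 8 hex digits
my $bits = unpack 'B*', pack('n', 32);   # same idea for binary digits

# The reverse direction: hex digits -> raw bytes -> integer.
my ($n) = unpack 'n', pack('H4', '0020');

print "$le $be $wide $bits $n\n";
# prints "2000 0020 00000020 0000000000100000 32"
```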

      Now all you've got to do is: knock up the patch; get it by p5p; and wait for it to make it into a build :)

      (End of thread summary).

      Perhaps, rather than having to reproduce much of the complexity of the [v] [Q] etc. notation, the format item could "functionally compose" with the item that followed. (Not so different from BrowserUk's Re^3 suggestion, but all carried out within pack/unpack.) So, whatever string the following pack format item produces gets turned into printable hex (or octal or binary, I suppose), and unpack would turn the printable string into the (usually unprintable string) that the next item expects to unpack. I haven't actually looked at the pack/unpack code, but this has the "feel" of something that should be able to piggy-back off existing code to do all the "hard work".
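
One possible reading of that composition idea, emulated outside pack and unpack. The helper names `pack_hexified` and `unpack_hexified` are hypothetical, and the sketch only guesses at the intended semantics: whatever string the wrapped format produces is turned into printable hex, and the reverse on the way back:

```perl
use strict;
use warnings;

# Hypothetical helper: pack with $item, then render the result as hex digits.
sub pack_hexified {
    my ($item, @values) = @_;
    return unpack 'H*', pack($item, @values);
}

# Hypothetical helper: turn hex digits back into bytes, then unpack with $item.
sub unpack_hexified {
    my ($item, $hex) = @_;
    return unpack $item, pack('H*', $hex);
}

my $hex    = pack_hexified('n', 32);
my ($mask) = unpack_hexified('n', $hex);
my $wide   = pack_hexified('N n', 1, 170);
my @pair   = unpack_hexified('N n', $wide);

print "$hex $mask $wide @pair\n";   # prints "0020 32 0000000100aa 1 170"
```

The existing format items do the numeric work here and the wrapper only changes how the resulting string is rendered, which is roughly the piggy-backing the post has in mind.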

        You've done no one any favours by "popping a few levels". This is a hierarchical forum and, used correctly, all posts are "visible" regardless of how deep they get.

        As for the idea of functional composition of pack formats: you'd need to post a couple of examples of your intent before I would be able to interpret your meaning.

