PerlMonks  

Handling Hex data with Dynamic unpack

by PerlJedi (Novice)
on Jul 05, 2012 at 08:51 UTC (#979995=perlquestion)
PerlJedi has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I have a binary file with the header content (in hex) as below (one single line).
0001000000004f914f1c0b01004253434130300000000500004341303030303000000001434130303031340000000243413030303232000000034341303030323400000004434130303032390000000100224e43656c6c3000000002004253434130310000000500054341303130303100000006434130313030340000000743413031303130000000084341303130313500000009434130313031360000000100224e43656c6c30000000030042534341303200000006000a
This data should be unpacked to populate the following structure.
Header Record Id : 1 Byte
File Format Version : 1 Byte
Timestamp : 8 Bytes
No. of BSCs : 1 Byte
For each BSC ...
    BSC Id : 1 Byte
    Application Version : 1 Byte
    BSC Name : 2 Bytes
    Number of Cells : 2 Bytes
    For each Cell ...
        Cell Pointer : 2 Bytes
        Cell Name : 9 Bytes
    Number of Neighbour Cells : 2 Bytes
    For each Neighbour Cell to this BSC ...
        Cell Pointer : 2 Bytes
        Cell Name : 9 Bytes

Here is what I have been trying...

#! /usr/bin/perl -w
open(FILE, "<A20120420.1257+0100-1302+0100_group0.bin");
binmode(FILE);
my $headder = <FILE>;
($headderRecordId, $fileFormatId, $timeStamp, $noOfBSCs, $specOfBSCs)
    = unpack ("H2 H2 H16 H2 H*", $headder);

No doubt, I can decode the binary file this way. But here is what I prefer:

1. I need to convert the $noOfBSCs to Decimal, and then use it in a loop and then decode the data for the rest of the BSCs. Same would be followed for the Cells also.

> Is there a way that I can work without converting the read hex value in loops?

2. I see that a dynamic unpack is also possible, but I could not understand how it works.

Since performance is a criterion, I don't want to code with C kind of logic. Please let me know how this can be done better in Perl. Thanks in advance, PerlJedi

Re: Handling Hex data with Dynamic unpack
by Anonymous Monk on Jul 05, 2012 at 09:59 UTC

    No doubt, I can decode the binary file this way.

    I don't know many binary files represented as hex-text, so that won't work

    Also, the format specification doesn't specify endianness, so the format spec seems incomplete

    I don't want to code with C kind of logic

    :) But but but but, aren't you dealing with C-kind of data? Maybe Convert::Binary::C can help?

      It is a BCD encoded binary file which is read in binary() mode.

      It follows a Big Endian notation.

      By C kind of logic, I meant the usual way of using loops etc.

        It is a BCD encoded binary file which is read in binary() mode.

        What does that mean?

        By C kind of logic, I meant the usual way of using loops etc.

        I think first you have to write some loops :)

        You might find the Template Grouping section of perlpacktut interesting, though I think writing some loops would be easier.
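For reference, a minimal sketch of what template grouping could look like for a single cell list. It assumes big-endian 16-bit counts and pointers (the poster says the file is big-endian) and uses only the first BSC's cell list taken from the sample header. The variable names are illustrative and not from the thread.

```perl
use strict;
use warnings;

# First cell list from the sample header: a 2-byte count, then 5 records
# of (2-byte pointer, 9-byte NUL-padded name).
my $hex = '0005'
        . '0000' . '434130303030300000'
        . '0001' . '434130303031340000'
        . '0002' . '434130303032320000'
        . '0003' . '434130303032340000'
        . '0004' . '434130303032390000';
my $raw = pack 'H*', $hex;

# n/(...) reads a big-endian 16-bit count and repeats the group that many
# times. Z9 takes 9 bytes and keeps the text up to the first NUL.
my @flat = unpack 'n/(n Z9)', $raw;

# The count itself is consumed by unpack, so pair the flat list up again.
my @cells;
while (my ($pointer, $name) = splice @flat, 0, 2) {
    push @cells, { pointer => $pointer, name => $name };
}

printf "%d %s\n", $_->{pointer}, $_->{name} for @cells;
```

Because the counts are consumed and the result is one flat list, nesting several variable-length groups this way makes it hard to tell where one BSC ends and the next begins, which is probably why a plain loop is the easier route.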

Re: Handling Hex data with Dynamic unpack
by Anonymous Monk on Jul 05, 2012 at 11:06 UTC

    This is how I would start, but I'd need a more complete spec

    #!/usr/bin/perl --
    use strict;
    use warnings;
    use Data::Dump;

    my $rawAsHex = q'0001000000004f914f1c0b01004253434130300000000500004341303030303000000001434130303031340000000243413030303232000000034341303030323400000004434130303032390000000100224e43656c6c3000000002004253434130310000000500054341303130303100000006434130313030340000000743413031303130000000084341303130313500000009434130313031360000000100224e43656c6c30000000030042534341303200000006000a';
    my $raw = pack 'H*', $rawAsHex;

    my @stack = unpack q{
        A2   ### Header Record Id : 1 Byte
        A2   ### File Format Version : 1 Byte
        A16  ### Timestamp : 8 Bytes
        A2   ### No. of BSCs : 1 Byte
             ### For each BSC ...
        A2   ### BSC Id : 1 Byte
        A2   ### Application Version : 1 Byte
        A4   ### BSC Name : 2 Bytes
        A4   ### Number of Cells : 2 Bytes
             ### For each Cell ...
        A4   ### Cell Pointer : 2 Bytes
        A18  ### Cell Name : 9 Bytes
        A4   ### Number of Neighbour Cells : 2 Bytes
             ### For each Neighbour Cell to this BSC ...
        A4   ### Cell Pointer : 2 Bytes
        A18  ### Cell Name : 9 Bytes
    }, $rawAsHex;

    my $id      = unpack 'C*', pack 'H*', $stack[0];
    my $version = unpack 'C*', pack 'H*', $stack[1];
    my $time    = unpack 'H*', pack 'H*', $stack[2]; ## WHAT?!
    dd [ $id, $version, $time , \@stack ];
    __END__
    [
      0,
      1,
      "000000004f914f1c",
      [
        "00",
        "01",
        "000000004f914f1c",
        "0b",
        "01",
        "00",
        4253,
        4341,
        3030,
        "000000050000434130",
        3030,
        3030,
        "000000014341303030",
      ],
    ]

      Hey, thanks Anonymous Monk ... It certainly is a good starting point. I appreciate your replies :)

      Thanks and Regards,

      PerlJedi

      Somehow Perl doesn't allow me to unpack like this (look at H2 H2 H16 H2 A2 ...):
      my @stack = unpack q{
          H2   ### Header Record Id : 1 Byte
          H2   ### File Format Version : 1 Byte
          H16  ### Timestamp : 8 Bytes
          H2   ### No. of BSCs : 1 Byte
               ### For each BSC ...
          A2   ### BSC Id : 1 Byte
          A2   ### Application Version : 1 Byte
          A4   ### BSC Name : 2 Bytes
          A4   ### Number of Cells : 2 Bytes
               ### For each Cell ...
          A4   ### Cell Pointer : 2 Bytes
          A18  ### Cell Name : 9 Bytes
          A4   ### Number of Neighbour Cells : 2 Bytes
               ### For each Neighbour Cell to this BSC ...
          A4   ### Cell Pointer : 2 Bytes
          A18  ### Cell Name : 9 Bytes
      }, $rawAsHex;

      Any idea how this could be done?

      I get an output like this:

      [
        48,
        48,
        "3031303030303030",
        [
          30,
          30,
          "3031303030303030",
          30,
          "04",
          "f9",
          "14f1",
          "c0b0",
          1004,
          "253434130300000000",
          5000,
          "0434",
          "130303030300000000",
        ],
      ]

        Somehow Perl doesn't allow me to unpack like this (look at H2 H2 H16 H2 A2 ...) ... Any idea how this could be done?

        I showed you how. What you're dealing with is text strings, so if you want bytes, you have to pack them. First pack them to get bytes (pack 'H*'), then unpack them to get what you're really after (C, an unsigned char (octet) value).

        Commands

        perl -le " print unpack q{H*}, q{Y} "
        perl -le " print pack q{H*}, q{59} "
        perl -le " print ord q{Y} "
        perl -le " print pack q{H*}, q{59} "
        perl -le " print unpack q{C}, pack q{H*}, q{59} "

        Session

        $ perl -le " print unpack q{H*}, q{Y} "
        59
        $ perl -le " print pack q{H*}, q{59} "
        Y
        $ perl -le " print ord q{Y} "
        89
        $ perl -le " print pack q{H*}, q{59} "
        Y
        $ perl -le " print unpack q{C}, pack q{H*}, q{59} "
        89

        Y encoded as hex is 59

        The numeric value ( ord ) of Y is 89

        The C (an unsigned char (octet) value, 8 bits, 1 byte) of Y is 89
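Applied to the fixed part of the header, the same pack-then-unpack idea might look like the sketch below. It assumes big-endian fields and reads the 8-byte timestamp as two 32-bit halves so that it does not depend on 64-bit integer support. Treating the low half as Unix epoch seconds is a guess that happens to be consistent with the date in the file name. The variable names are illustrative.

```perl
use strict;
use warnings;

my $hex = '0001000000004f914f1c0b';   # first 11 bytes of the sample header
my $raw = pack 'H*', $hex;            # hex text -> 11 raw bytes

# C = unsigned 8-bit value, N = unsigned 32-bit big-endian value.
my ($record_id, $version, $ts_high, $ts_low, $num_bscs)
    = unpack 'C C N N C', $raw;

# The fields come back as ordinary numbers, so no separate
# hex-to-decimal conversion is needed before using them in a loop.
print "record id $record_id, version $version, BSCs $num_bscs\n";

# Assumption: the low word is Unix epoch seconds.
print "timestamp (if epoch seconds): ", scalar gmtime($ts_low), "\n";
```

With a real file read in binmode, the `pack 'H*'` step drops out and the `unpack` can be run on the buffer directly.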

Re: Handling Hex data with Dynamic unpack
by Marshall (Prior) on Jul 05, 2012 at 11:16 UTC
    By printing this out in ASCII hex, you have actually changed the problem.

    When you do a binmode read to a scalar, you get a $buffer where each "character" is a byte (0-255 unsigned). For actual bytes there is "nothing to be done" - it is already a byte value. For multi-byte fields some sort of unpack() is usually necessary. Use substr() to get a range of byte values. Use unpack() to convert sequences of bytes into some other representation (from little endian to big endian or whatever).

    my $HeaderRecordId    = substr($buf,0,1);
    my $FileFormatVersion = substr($buf,1,1);
    my $TimeStamp         = substr($buf,2,8); #some kind of unpack needed here!
    my $NumBSC            = substr($buf,10,1);
    $HeaderRecordId, $FileFormatVersion and $NumBSC are just bytes, and nothing more is needed past substr().

    Update: I looked back at some ancient code (I don't deal with binary very often) that had to do with .WAV files. My point is that substr() will get you the sequence of bytes. Here, I look for "RIFF" and "data" with string compares. The V unpack is for little-endian conversion.

    code snippet...
    read(IN, my $buff, 1 * 2**10);
    (substr($buff,0,4) eq "RIFF") || die "not a valid RIFF file";
    my $size = unpack ("V",substr($buff,4,4));   # one 32-bit little-endian value
    myprint (" RIFF segment size = $size");
    (substr($buff,50,4) eq "data")|| die "data segment not found";
    my $dsize = unpack ("V",substr($buff,54,4));
    myprint (" DATA Segment size = $dsize");
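Putting substr() and unpack() together for the header in the question, a loop-based reader could look roughly like the sketch below. It is not a definitive implementation, and it rests on these assumptions: all fields are big-endian, and the BSC name is 8 bytes wide. The posted layout says 2 bytes, but the sample bytes only line up with 8. The helper names `take` and `take_cells` are made up for this example. The sample announces 11 BSCs but is cut off partway through the third, so the sketch stops at the first incomplete record.

```perl
use strict;
use warnings;

# Sample header from the question, as hex text (truncated in the post).
my $hex = join '', qw(
    0001000000004f914f1c0b010042534341303000000005000043413030303030000000
    014341303030313400000002434130303032320000000343413030303234000000044
    34130303032390000000100224e43656c6c3000000002004253434130310000000500
    054341303130303100000006434130313030340000000743413031303130000000084
    341303130313500000009434130313031360000000100224e43656c6c300000000300
    42534341303200000006000a
);
my $raw = pack 'H*', $hex;   # with a real file, read() into $raw instead
my $pos = 0;                 # current offset into $raw

# Unpack $size bytes at $pos with $template, then advance $pos.
sub take {
    my ($template, $size) = @_;
    die "data ends at byte $pos\n" if $pos + $size > length $raw;
    my @fields = unpack $template, substr($raw, $pos, $size);
    $pos += $size;
    return @fields;
}

# A 2-byte count followed by that many (2-byte pointer, 9-byte name) records.
sub take_cells {
    my ($count) = take('n', 2);
    my @cells;
    for (1 .. $count) {
        my ($pointer, $name) = take('n Z9', 11);
        push @cells, { pointer => $pointer, name => $name };
    }
    return \@cells;
}

my ($record_id, $version, $ts_high, $ts_low, $num_bscs)
    = take('C C N N C', 11);

my @bscs;
my $ok = eval {
    for (1 .. $num_bscs) {
        # Z8 is an assumption read off the sample, not the stated layout.
        my ($id, $app_version, $name) = take('C C Z8', 10);
        my $cells      = take_cells();
        my $neighbours = take_cells();
        push @bscs, { id => $id, app_version => $app_version, name => $name,
                      cells => $cells, neighbours => $neighbours };
    }
    1;
};
print "stopped early: $@" unless $ok;

for my $bsc (@bscs) {
    printf "BSC %d %s: %d cells, %d neighbours\n", $bsc->{id}, $bsc->{name},
        scalar @{ $bsc->{cells} }, scalar @{ $bsc->{neighbours} };
}
```

The counts come straight out of unpack as numbers, so the loops never touch hex text. Keeping the offset in one place also makes it easy to change a field width if the real spec turns out to differ from what the sample suggests.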

      For actual bytes there is "nothing to be done" - it is already a byte value.

      Except unpacking -- like if you want to get an actual number (big-endian 8-bit signed integer)

      binmode does not work on a scalar, only a file handle. substr can be either in utf8 mode or byte mode, and substr doesn't guarantee the returning scalar is in one or the other mode.
