Handling Hex data with Dynamic unpack

PerlJedi has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I have a binary file whose header content (in hex) is shown below (one single line).

0001000000004f914f1c0b01004253434130300000000500004341303030303000000001434130303031340000000243413030303232000000034341303030323400000004434130303032390000000100224e43656c6c300000000200425343413031000000050005434130313030310000000643413031303034000000074341303130313000000008434130313031350000000943413031303136000000010022 4e43656c6c30000000030042534341303200000006000a

This data should be unpacked to populate the following structure.

Header Record Id : 1 Byte
File Format Version : 1 Byte
Timestamp : 8 Bytes
No. of BSCs : 1 Byte
    For each BSC ...
    BSC Id : 1 Byte
    Application Version : 1 Byte
    BSC Name : 2 Bytes
    Number of Cells : 2 Bytes
        For each Cell ...
        Cell Pointer : 2 Bytes
        Cell Name : 9 Bytes
        Number of Neighbour Cells : 2 Bytes
            For each Neighbour Cell to this BSC ...
            Cell Pointer : 2 Bytes
            Cell Name : 9 Bytes

Here is what I have been trying...

#! /usr/bin/perl -w

open(FILE, "<A20120420.1257+0100-1302+0100_group0.bin");

binmode(FILE); 

my $headder = <FILE>;

($headderRecordId, $fileFormatId, $timeStamp, $noOfBSCs, $specOfBSCs) = unpack ("H2 H2 H16 H2 H*", $headder);

No doubt, I can decode the binary file this way. But here is what I prefer:

1. I need to convert the $noOfBSCs to Decimal, and then use it in a loop and then decode the data for the rest of the BSCs. Same would be followed for the Cells also.

> Is there a way that I can work without converting the read hex value in loops?

2. I do see that a dynamic unpack is also possible, but I could not understand how it works.

Since performance is a criterion, I don't want to code with C-style logic. Please let me know how it can be done better in Perl. Thanks in advance, PerlJedi
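On the first question (looping without a hex-to-decimal step), a minimal sketch follows. It assumes the count and id fields are plain unsigned integers and that multi-byte fields are big-endian, which the thread only confirms later. The variable names are illustrative, and the input is just the first 11 bytes of the hex dump above, rebuilt with pack.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# First 11 bytes of the sample header, rebuilt from the hex dump in the question.
my $header = pack 'H*', '0001000000004f914f1c0b';

# C = unsigned 8-bit integer, N = unsigned 32-bit big-endian integer.
# The 8-byte timestamp is read as two 32-bit halves so the sketch does not
# depend on a perl built with 64-bit integer support.
my ($record_id, $format_version, $ts_high, $ts_low, $bsc_count)
    = unpack 'C C N N C', $header;

# $bsc_count is already an ordinary number, so it can drive a loop directly.
printf "record=%d version=%d timestamp=%d bscs=%d\n",
    $record_id, $format_version, $ts_low, $bsc_count;
```

With the numeric codes there is no intermediate hex string at all; the H codes are only needed when the goal is a printable hex dump.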


Replies are listed 'Best First'.
Re: Handling Hex data with Dynamic unpack by Marshall (Canon) on Jul 05, 2012 at 11:16 UTC
By printing this out in ASCII hex, you have actually changed the problem. When you do a binmode read into a scalar, you get a $buffer where each "character" is a byte (0-255 unsigned). For actual bytes there is "nothing to be done": it is already a byte value. For multi-byte fields some sort of unpack() is usually necessary. Use substr() to get a range of byte values. Use unpack() to convert sequences of bytes into some other representation (from little endian to big endian or whatever).

my $HeaderRecordId = substr($buf,0,1);
my $FileFormatVersion = substr($buf,1,1);
my $TimeStamp = substr($buf,2,8); #some kind of unpack needed here!
my $NumBSC = substr($buf,10,1);

$HeaderRecordId, $FileFormatVersion and $NumBSC are just bytes and nothing more is needed past substr().

Update: I looked back at some ancient code (I don't deal with binary very often), but this had to do with .WAV files. My point is that substr() will get you the sequence of bytes. Here, I look for "RIFF" and "data" with string compares. The V4 unpack is for little endian conversion.

# code snippet...
read(IN, my $buff, 1 * 2**10);
(substr($buff,0,4) eq "RIFF") || die "not a valid RIFF file";
my $size = unpack ("V4",substr($buff,4,4));
myprint (" RIFF segment size = $size");
(substr($buff,50,4) eq "data") || die "data segment not found";
my $dsize = unpack ("V4",substr($buff,54,4));
myprint (" DATA Segment size = $dsize");
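To make the substr()-then-unpack() step concrete, here is a small sketch of how the same bytes decode under big-endian (N, n) and little-endian (V, v) templates. The buffer contents are made up for illustration and are not taken from the poster's file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Six raw bytes: two filler bytes, then 00 00 00 05. Values are invented.
my $buf = pack 'H*', 'ffff00000005';

# substr() selects the byte range; unpack() turns it into a number.
my $field  = substr $buf, 2, 4;
my $big    = unpack 'N', $field;    # 32-bit big-endian    -> 5
my $little = unpack 'V', $field;    # 32-bit little-endian -> 83886080

# Two raw bytes 00 22 read as a 16-bit value in each byte order.
my $short        = pack 'H*', '0022';
my $short_big    = unpack 'n', $short;    # 34
my $short_little = unpack 'v', $short;    # 8704

print "N=$big V=$little n=$short_big v=$short_little\n";
```

Picking the wrong byte order does not fail; it silently yields a different number, which is why the format spec needs to state it.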
Re^2: Handling Hex data with Dynamic unpack by Anonymous Monk on Jul 05, 2012 at 11:20 UTC
> For actual bytes there is "nothing to be done" - it is already a byte value.
Except unpacking, for example if you want to get an actual number (big-endian 8-bit signed integer).
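A short sketch of the distinction being made here: substr() hands back a one-character string, and ord() or unpack() is what turns it into a number. The two byte values are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Two raw bytes, 0x0b and 0xff, chosen only for illustration.
my $buf = pack 'H*', '0bff';

my $first    = substr $buf, 0, 1;              # a one-character string, not a number
my $count    = ord $first;                     # 11
my $unsigned = unpack 'C', substr $buf, 1, 1;  # 255 (unsigned 8-bit)
my $signed   = unpack 'c', substr $buf, 1, 1;  # -1  (signed 8-bit)

print "count=$count unsigned=$unsigned signed=$signed\n";
```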
Re^2: Handling Hex data with Dynamic unpack by patcat88 (Deacon) on Jul 05, 2012 at 18:36 UTC
binmode does not work on a scalar, only a file handle. substr can be either in utf8 mode or byte mode, and substr doesn't guarantee the returning scalar is in one or the other mode.
Re: Handling Hex data with Dynamic unpack by Anonymous Monk on Jul 05, 2012 at 09:59 UTC
> No doubt, I can decode the binary file this way.
I don't know many binary files represented as hex text, so that won't work. Also, the format specification doesn't specify endianness, so the format spec seems incomplete.
> I dont want to code with C kind of logic
:) But but but but, aren't you dealing with C kind of data? Maybe Convert::Binary::C can help?
Re^2: Handling Hex data with Dynamic unpack by PerlJedi (Novice) on Jul 05, 2012 at 10:11 UTC
It is a BCD encoded binary file which is read in binary() mode. It follows big-endian notation. By C kind of logic, I meant the usual way of using loops etc.
Re^3: Handling Hex data with Dynamic unpack by ig (Vicar) on Jul 05, 2012 at 10:42 UTC
You might find the Template Grouping section of perlpacktut interesting, though I think writing some loops would be easier.
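For reference, a minimal sketch of what a counted group looks like in an unpack template. The data is synthetic: a 1-byte count followed by that many (1-byte id, 1-byte version, 2-byte name) records, loosely following the BSC fields in the question. Only one level of repetition is shown.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Synthetic data: count = 2, then (1, 0, "B0") and (2, 0, "B1").
my $data = pack 'H*', '020100423002004231';

# "C/(...)" reads the count with C and repeats the group that many times.
# The count itself is consumed and is not part of the returned list.
my @fields = unpack 'C/(C C a2)', $data;

# The result is one flat list, so regroup it three fields at a time.
my @bscs;
while (my ($id, $version, $name) = splice @fields, 0, 3) {
    push @bscs, { id => $id, version => $version, name => $name };
}

print "$_->{id} $_->{version} $_->{name}\n" for @bscs;
```

Because the counts are dropped and the list is flat, a layout with counts nested inside counts gives no obvious way to tell where one record ends, which may be why explicit loops were suggested as the easier route.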
Re^4: Handling Hex data with Dynamic unpack by PerlJedi (Novice) on Jul 05, 2012 at 11:11 UTC
Re^5: Handling Hex data with Dynamic unpack by ig (Vicar) on Jul 05, 2012 at 19:07 UTC
Re^3: Handling Hex data with Dynamic unpack by Anonymous Monk on Jul 05, 2012 at 10:36 UTC
> It is a BCD encoded binary file which is read in binary() mode.
What does that mean?
> By C kind of logic, I meant the usual way of using loops etc.
I think first you have to write some loops :)
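A sketch of what "some loops" could look like, using the field widths from the layout in the question and assuming big-endian 16-bit counts and neighbour records nested under each cell. The input is synthetic and every name and value in it is made up; the helper take() is hypothetical, not part of any module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Synthetic body: one BSC with one cell that has two neighbour cells.
my $body = pack('C C a2 n', 1, 0, 'B0', 1)
         . pack('n a9 n', 5, 'CELL00001', 2)
         . pack('n a9', 7, 'NBR000001')
         . pack('n a9', 8, 'NBR000002');

my $pos = 0;    # current read offset into $body

# Unpack $len bytes at the current offset with $template, then advance.
sub take {
    my ($template, $len) = @_;
    my @values = unpack $template, substr($body, $pos, $len);
    $pos += $len;
    return @values;
}

my $bsc_count = 1;    # in a real file this would come from the header
my @bscs;
for (1 .. $bsc_count) {
    my ($id, $version, $name, $cell_count) = take('C C a2 n', 6);
    my @cells;
    for (1 .. $cell_count) {
        my ($pointer, $cell_name, $neighbour_count) = take('n a9 n', 13);
        my @neighbours;
        for (1 .. $neighbour_count) {
            my ($n_pointer, $n_name) = take('n a9', 11);
            push @neighbours, { pointer => $n_pointer, name => $n_name };
        }
        push @cells, { pointer => $pointer, name => $cell_name,
                       neighbours => \@neighbours };
    }
    push @bscs, { id => $id, version => $version, name => $name,
                  cells => \@cells };
}

printf "%s has %d cell(s); first cell has %d neighbour(s)\n",
    $bscs[0]{name}, scalar @{ $bscs[0]{cells} },
    scalar @{ $bscs[0]{cells}[0]{neighbours} };
```

Each count arrives as an ordinary number from unpack, so it feeds the next loop directly and the nesting of the result mirrors the nesting of the layout.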
Re^4: Handling Hex data with Dynamic unpack by PerlJedi (Novice) on Jul 05, 2012 at 11:04 UTC
Re^5: Handling Hex data with Dynamic unpack by Anonymous Monk on Jul 05, 2012 at 11:09 UTC
Re^4: Handling Hex data with Dynamic unpack by PerlJedi (Novice) on Jul 05, 2012 at 10:39 UTC
Re^5: Handling Hex data with Dynamic unpack by Anonymous Monk on Jul 05, 2012 at 10:41 UTC
Re: Handling Hex data with Dynamic unpack by Anonymous Monk on Jul 05, 2012 at 11:06 UTC
This is how I would start, but I'd need a more complete spec.

#!/usr/bin/perl --
use strict;
use warnings;
use Data::Dump;

my $rawAsHex = q'0001000000004f914f1c0b01004253434130300000000500004341303030303000000001434130303031340000000243413030303232000000034341303030323400000004434130303032390000000100224e43656c6c300000000200425343413031000000050005434130313030310000000643413031303034000000074341303130313000000008434130313031350000000943413031303136000000010022 4e43656c6c30000000030042534341303200000006000a';
my $raw = pack 'H*', $rawAsHex;

my @stack = unpack q{
    A2   ### Header Record Id : 1 Byte
    A2   ### File Format Version : 1 Byte
    A16  ### Timestamp : 8 Bytes
    A2   ### No. of BSCs : 1 Byte
         ### For each BSC ...
    A2   ### BSC Id : 1 Byte
    A2   ### Application Version : 1 Byte
    A4   ### BSC Name : 2 Bytes
    A4   ### Number of Cells : 2 Bytes
         ### For each Cell ...
    A4   ### Cell Pointer : 2 Bytes
    A18  ### Cell Name : 9 Bytes
    A4   ### Number of Neighbour Cells : 2 Bytes
         ### For each Neighbour Cell to this BSC ...
    A4   ### Cell Pointer : 2 Bytes
    A18  ### Cell Name : 9 Bytes
}, $rawAsHex;

my $id      = unpack 'C', pack 'H*', $stack[0];
my $version = unpack 'C', pack 'H*', $stack[1];
my $time    = unpack 'H*', pack 'H*', $stack[2]; ## WHAT?!

dd [ $id, $version, $time, \@stack ];
__END__
[
  0,
  1,
  "000000004f914f1c",
  [
    "00",
    "01",
    "000000004f914f1c",
    "0b",
    "01",
    "00",
    4253,
    4341,
    3030,
    "000000050000434130",
    3030,
    3030,
    "000000014341303030",
  ],
]
Re^2: Handling Hex data with Dynamic unpack by PerlJedi (Novice) on Jul 05, 2012 at 11:17 UTC
Hey, thanks Anonymous Monk ... It certainly is a good starting point. I appreciate your replies :) Thanks and Regards, PerlJedi
Re^2: Handling Hex data with Dynamic unpack by PerlJedi (Novice) on Jul 05, 2012 at 11:28 UTC
Somehow perl doesn't allow me to unpack like this (look at H2 H2 H16 H2 A2 ...):

my @stack = unpack q{
    H2   ### Header Record Id : 1 Byte
    H2   ### File Format Version : 1 Byte
    H16  ### Timestamp : 8 Bytes
    H2   ### No. of BSCs : 1 Byte
         ### For each BSC ...
    A2   ### BSC Id : 1 Byte
    A2   ### Application Version : 1 Byte
    A4   ### BSC Name : 2 Bytes
    A4   ### Number of Cells : 2 Bytes
         ### For each Cell ...
    A4   ### Cell Pointer : 2 Bytes
    A18  ### Cell Name : 9 Bytes
    A4   ### Number of Neighbour Cells : 2 Bytes
         ### For each Neighbour Cell to this BSC ...
    A4   ### Cell Pointer : 2 Bytes
    A18  ### Cell Name : 9 Bytes
}, $rawAsHex;

Any idea how this could be done? I get an output like this:

[
  48,
  48,
  "3031303030303030",
  [
    30,
    30,
    "3031303030303030",
    30,
    "04",
    "f9",
    "14f1",
    "c0b0",
    1004,
    "253434130300000000",
    5000,
    "0434",
    "130303030300000000",
  ],
]
Re^3: Handling Hex data with Dynamic unpack by Anonymous Monk on Jul 05, 2012 at 11:38 UTC
> Some how perl doesn't allow me to unpack like this (look at H2 H2 H16 H2 A2 ...) ... Any idea how this could be done?
I showed you how. What you're dealing with is text strings, so if you want bytes, you have to pack them. First pack them to get bytes ( pack 'H*' ), then unpack them to get what you're really after ( C An unsigned char (octet) value. ) .....

Commands

perl -le " print unpack q{H*}, q{Y} "
perl -le " print pack q{H*}, q{59} "
perl -le " print ord q{Y} "
perl -le " print pack q{H*}, q{59} "
perl -le " print unpack q{C}, pack q{H*}, q{59} "

Session

$ perl -le " print unpack q{H*}, q{Y} "
59
$ perl -le " print pack q{H*}, q{59} "
Y
$ perl -le " print ord q{Y} "
89
$ perl -le " print pack q{H*}, q{59} "
Y
$ perl -le " print unpack q{C}, pack q{H*}, q{59} "
89

Y encoded as hex is 59.
The numeric value ( ord ) of Y is 89.
The C (an unsigned char (octet) value, 8 bits, 1 byte) of Y is 89.
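The same round trip as a small script, which also shows why the repeat count matters: the count on H is in hex digits, so 'H*' converts the whole string while a bare 'H' uses only the first digit. The variable names are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = '59';    # two characters of hex text

my $bytes     = pack 'H*', $text;      # one byte, 0x59, the character "Y"
my $number    = unpack 'C', $bytes;    # 89
my $hex_again = unpack 'H*', $bytes;   # "59"

# Without a count only the first hex digit is used: byte 0x50, "P".
my $first_digit_only = pack 'H', $text;

print "bytes=$bytes number=$number hex=$hex_again bare-H=$first_digit_only\n";
```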
Re^4: Handling Hex data with Dynamic unpack by PerlJedi (Novice) on Jul 05, 2012 at 11:43 UTC

