I am trying to read data from a binary file. It consists of a sequence identifier followed by two sequences of unsigned integers. The format of this file is like this:
The first 4 bytes are general information,
then the data comes in blocks like this:
the next 4 bytes are the sequence identifier
the next 2 bytes are the length y of the sequence
the next y * 2 bytes are the first sequence
the next 2 bytes are a separator before the second sequence
the next y * 2 bytes are the second sequence
the next 2 bytes are a separator before the following block
The number of these blocks varies and is not known when opening the file.
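To make the layout concrete, here is a minimal sketch that decodes one block from an in-memory string with unpack. The helper name parse_block and the sample data are made up for illustration, and the byte order is an assumption: the code treats the 2-byte values as little-endian ('v'), so swap in 'n' if the file is big-endian.

```perl
use strict;
use warnings;

# Hypothetical helper: decode one block from $data, starting at byte offset $pos.
# ASSUMPTION: 2-byte values are little-endian ('v'); use 'n' for big-endian.
sub parse_block {
    my ($data, $pos) = @_;
    my $id  = substr($data, $pos, 4);                   # 4-byte sequence identifier, kept raw
    my $len = unpack('v', substr($data, $pos + 4, 2));  # length y, counted in 2-byte values
    my @first  = unpack("v$len", substr($data, $pos + 6, 2 * $len));
    # 2-byte separator sits at $pos + 6 + 2*$len and is skipped
    my @second = unpack("v$len", substr($data, $pos + 8 + 2 * $len, 2 * $len));
    my $next   = $pos + 10 + 4 * $len;                  # 4 + 2 + 2y + 2 + 2y + 2 bytes per block
    return ($id, \@first, \@second, $next);
}

# Build one sample block (y = 3) and decode it again.
my $sample = pack('a4 v v3 v v3 v', 'ABCD', 3, 1, 2, 3, 0, 4, 5, 6, 0);
my ($id, $first, $second, $next) = parse_block($sample, 0);
print "$id: @$first | @$second (next block at offset $next)\n";
```

Each block therefore occupies 10 + 4 * y bytes, which is why a wrong length value throws off every read that follows it.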
I wrote some code to get the sequences out of the binary file:
open(DATFILE, '<', $datfilename) or die "Cannot open $datfilename: $!";
binmode(DATFILE);
my $buf;
read(DATFILE, $buf, 4); # Read 4 bytes of the general information
foreach my $block (0..110) {
    read(DATFILE, $buf, 4); # Read 4 bytes of the profile ID
    read(DATFILE, $buf, 2); # Read 2 bytes of the sequence length
    my $profilelength = unpack('v', $buf); # Assumes little-endian; use 'n' for big-endian
    ReadData($profilelength); # Read the first sequence
    read(DATFILE, $buf, 2); # Read 2 bytes of the trailing zero (separator)
    ReadData($profilelength); # Read the second sequence
    read(DATFILE, $buf, 2); # Read 2 bytes of the trailing zero (separator)
}
This particular file has 111 data blocks and this code works. The subroutine ReadData puts the data in an array. No problems here.
For the real thing I want to replace foreach (0..110) with while (<DATFILE>) so that reading continues until the end of the file, since I do not know the number of blocks in advance. When I do this, the read behaviour changes. With foreach, the first read inside the loop starts at byte 5, as expected; with while, it starts at byte 15. That is within the first sequence, which means the sequence length is wrong and the data that comes out is corrupt. Could any of the wise monks here kindly explain this while behaviour to me, and perhaps suggest the proper way to do it?
Kind regards, Hans
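
A possible explanation and a sketch of one way around it: while (<DATFILE>) is not an end-of-file test. On every iteration it calls readline, which reads and consumes everything up to and including the next newline byte (0x0A) into $_, so the read calls that follow start wherever that "line" happened to end. In binary data a 0x0A byte can turn up anywhere, which would account for the jump to byte 15. One alternative is to loop on the return value of read, which is 0 at end of file (testing eof(DATFILE) at the top of the loop is another). The helper names read_exact and read_blocks and the sample file below are made up for illustration, and the little-endian byte order ('v') is an assumption; use 'n' for big-endian data.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read exactly $n bytes from $fh.
# Returns undef at a clean end of file when $eof_ok is true; dies on short reads.
sub read_exact {
    my ($fh, $n, $eof_ok) = @_;
    return '' if $n == 0;
    my $got = read($fh, my $buf, $n);
    die "read failed: $!\n" unless defined $got;
    return undef if $got == 0 && $eof_ok;
    die "truncated file: wanted $n bytes, got $got\n" if $got != $n;
    return $buf;
}

# Read all blocks from a file with the layout described in the question.
# ASSUMPTION: 2-byte values are little-endian ('v'); use 'n' for big-endian.
sub read_blocks {
    my ($filename) = @_;
    open(my $fh, '<:raw', $filename) or die "Cannot open $filename: $!\n";
    read_exact($fh, 4);                                  # 4 bytes of general information
    my @blocks;
    while (defined(my $id = read_exact($fh, 4, 1))) {    # undef here means no more blocks
        my $len    = unpack('v', read_exact($fh, 2));    # length y of both sequences
        my @first  = unpack("v$len", read_exact($fh, 2 * $len));
        read_exact($fh, 2);                              # separator before the second sequence
        my @second = unpack("v$len", read_exact($fh, 2 * $len));
        read_exact($fh, 2);                              # separator before the next block
        push @blocks, { id => $id, first => \@first, second => \@second };
    }
    close($fh);
    return @blocks;
}

# Write a small sample file with two blocks, then read it back.
# The value 10 is stored as the bytes 0x0A 0x00, i.e. it contains a newline byte.
my ($out, $tmpname) = tempfile(UNLINK => 1);
binmode($out);
print {$out} pack('a4', 'HEAD');
print {$out} pack('a4 v v2 v v2 v', 'ID01', 2, 10, 20, 0, 30, 40, 0);
print {$out} pack('a4 v v1 v v1 v', 'ID02', 1, 7, 0, 8, 0);
close($out);

my @blocks = read_blocks($tmpname);
printf "%s: %s | %s\n", $_->{id}, "@{ $_->{first} }", "@{ $_->{second} }" for @blocks;
```

The loop ends only when the read of a block identifier hits end of file, so the number of blocks never has to be known in advance, and a file that stops in the middle of a block is reported as an error instead of silently producing corrupt data.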