
Re: Unpack an 8 bit unsigned char matrix

by BrowserUk (Patriarch)
on Jul 15, 2013 at 20:22 UTC ( [id://1044454] )


in reply to Unpack an 8 bit unsigned char matrix

The Python code doesn't show the value of nRow or from where it is obtained.

And your words:

reading file 05398.bin with size 90942 with nCols 3954 and nRows 23

nCols stays static for all of the files, but nRows changes for each file.

contradict the evidence of the Python code, which has nRow preset (somewhere) and calculates the value of nCols.

But, taking you at your word, and assuming the files consist of N rows of 3954 columns, you could do this:

    use constant NCOLS => 3954;

    my $file     = $matrixPath . '/' . $info{$transcriptName}{'PATH'};
    my $fileSize = -s $file;
    die 'Bad filesize' unless $fileSize % NCOLS == 0;

    open IN, '<', $file or die "Can't open inputfile $file $!";
    binmode(IN);

    ## Note: You're reading the whole file in a single read. NO NEED FOR A LOOP.
    sysread(IN, my $buffer, $fileSize) or die "Can't read inputfile $file $!";
    close IN;

    ## The first (rightmost) unpack splits the buffer into NCOLS length sections of bytes.
    ## The second unpack breaks each of those into its individual (uchar) integers
    ## and puts them in an anonymous array.
    ## The map assigns all the anonymous arrays to @matrix.
    # Update: template corrected, see posts below.
    my @matrix = map[ unpack 'C*', $_ ], unpack '(a' . NCOLS . ')*', $buffer;
    ...
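To try the two-stage unpack without a real matrix file, here is a minimal sketch that works on an in-memory buffer. The 3 x 4 matrix, the variable names, and the row-after-row, one-byte-per-value layout are assumptions made for illustration; they are not taken from the original files.

```perl
use strict;
use warnings;

# Hypothetical small matrix: 3 rows of 4 unsigned bytes each.
my $ncols = 4;
my @rows  = ( [ 0, 1, 2, 3 ], [ 10, 20, 30, 40 ], [ 252, 253, 254, 255 ] );

# Build a buffer laid out as the file is assumed to be:
# row after row, one unsigned byte per value.
my $buffer = join '', map { pack 'C*', @$_ } @rows;

# The row count follows from the buffer length, just as it would from the file size.
die 'Bad size' unless length($buffer) % $ncols == 0;
my $nrows = length($buffer) / $ncols;

# Same two-stage unpack: split into $ncols-byte chunks, then each chunk into integers.
my @matrix = map [ unpack 'C*', $_ ], unpack "(a$ncols)*", $buffer;

print "$nrows rows\n";
print join( ' ', @$_ ), "\n" for @matrix;
```

The same arithmetic applies to the figures quoted above: 90942 bytes divided by 3954 columns is exactly 23 rows, so the row count can be derived from the file size alone.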

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^2: Unpack an 8 bit unsigned char matrix
by Ineffectual (Scribe) on Jul 15, 2013 at 21:19 UTC
    Hi BrowserUk! Thanks for the help!

    The python code gets the number of rows from a text file that was written when the matrix was written. So for each file, I have a text file that tells me the file size and the number of rows in that file (along with other metadata).

    When I use the code above, it gives back one row with 3954 columns. However, most of the files have more than one row of information. Is it truncating the rest? Do I need to use a while loop to only get fileSize/$nRows bytes and process those?

      A quick (not binary, but same difference) demo:

      #! perl -slw
      use strict;
      use Data::Dump qw[ pp ];

      sysread( DATA, my $buffer, 80 ) or die $!;

      my @matrixX10 = map[ unpack 'C*', $_ ], unpack '(a10)*', $buffer;
      pp\@matrixX10;

      my @matrixX5 = map[ unpack 'C*', $_ ], unpack '(a5)*', $buffer;
      pp\@matrixX5;

      __DATA__
      12345678901234567890123456789012345678901234567890123456789012345678901234567890

      Produces:

      C:\test\primes>..\junk94
      [
        [49, 50, 51, 52, 53, 54, 55, 56, 57, 48],
        [49, 50, 51, 52, 53, 54, 55, 56, 57, 48],
        [49, 50, 51, 52, 53, 54, 55, 56, 57, 48],
        [49, 50, 51, 52, 53, 54, 55, 56, 57, 48],
        [49, 50, 51, 52, 53, 54, 55, 56, 57, 48],
        [49, 50, 51, 52, 53, 54, 55, 56, 57, 48],
        [49, 50, 51, 52, 53, 54, 55, 56, 57, 48],
        [49, 50, 51, 52, 53, 54, 55, 56, 57, 48],
      ]
      [
        [49, 50, 51, 52, 53],
        [54, 55, 56, 57, 48],
        [49, 50, 51, 52, 53],
        [54, 55, 56, 57, 48],
        [49, 50, 51, 52, 53],
        [54, 55, 56, 57, 48],
        [49, 50, 51, 52, 53],
        [54, 55, 56, 57, 48],
        [49, 50, 51, 52, 53],
        [54, 55, 56, 57, 48],
        [49, 50, 51, 52, 53],
        [54, 55, 56, 57, 48],
        [49, 50, 51, 52, 53],
        [54, 55, 56, 57, 48],
        [49, 50, 51, 52, 53],
        [54, 55, 56, 57, 48],
      ]

      The broken code originally posted here, replaced by the version above:

      #! perl -slw
      use strict;
      use Data::Dump qw[ pp ];

      sysread( DATA, my $buffer, 80 ) or die $!;

      my @matrixX10 = map[ split 'C*', $_ ], unpack '(a10)*', $buffer;
      pp\@matrixX10;

      my @matrixX5 = map[ split 'C*', $_ ], unpack '(a5)*', $buffer;
      pp\@matrixX5;

      __DATA__
      12345678901234567890123456789012345678901234567890123456789012345678901234567890

      Produces:

      C:\test>junk94
      [
        [1, 2, 3, 4, 5, 6, 7, 8, 9, 0],
        [1, 2, 3, 4, 5, 6, 7, 8, 9, 0],
        [1, 2, 3, 4, 5, 6, 7, 8, 9, 0],
        [1, 2, 3, 4, 5, 6, 7, 8, 9, 0],
        [1, 2, 3, 4, 5, 6, 7, 8, 9, 0],
        [1, 2, 3, 4, 5, 6, 7, 8, 9, 0],
        [1, 2, 3, 4, 5, 6, 7, 8, 9, 0],
        [1, 2, 3, 4, 5, 6, 7, 8, 9, 0],
      ]
      [
        [1, 2, 3, 4, 5],
        [6, 7, 8, 9, 0],
        [1, 2, 3, 4, 5],
        [6, 7, 8, 9, 0],
        [1, 2, 3, 4, 5],
        [6, 7, 8, 9, 0],
        [1, 2, 3, 4, 5],
        [6, 7, 8, 9, 0],
        [1, 2, 3, 4, 5],
        [6, 7, 8, 9, 0],
        [1, 2, 3, 4, 5],
        [6, 7, 8, 9, 0],
        [1, 2, 3, 4, 5],
        [6, 7, 8, 9, 0],
        [1, 2, 3, 4, 5],
        [6, 7, 8, 9, 0],
      ]


      I apologise. I omitted an essential part of the first unpack template (the repeat *). Please try substituting this:

      my @matrix = map[ unpack 'C*', $_ ], unpack '(a' . NCOLS . ')*', $buffer;
      The template has become: '(a3954)*' which tells unpack to split the buffer into as many 3954-byte chunks as are available.
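The effect of the repeat count is easy to see on a short string. This is only a sketch: the 12-byte buffer and the 4-byte chunk size are made up for illustration, and the exact form of the earlier, uncorrected template is an assumption based on the description above.

```perl
use strict;
use warnings;

my $buffer = 'abcdefghijkl';    # 12 bytes, to be read as chunks of 4

# Without the repeat, the group is applied once: only the first chunk comes back.
my @once = unpack '(a4)', $buffer;

# With '*', the group is repeated until the buffer is used up.
my @all = unpack '(a4)*', $buffer;

print scalar(@once), " chunk without the repeat: @once\n";
print scalar(@all),  " chunks with the repeat: @all\n";
```

This matches the symptom reported earlier in the thread: a template without the trailing * yields a single row, and the rest of the buffer is silently ignored.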

