PerlMonks  

Convert binary file to ascii

by richz (Beadle)
on Jun 22, 2007 at 17:57 UTC (#622870=perlquestion)
richz has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

I have a binary file that looks like the following in a hex editor (these are the first 8 bytes):

02 00 00 00 f7 ff f4 ff ....

What I would like to do is convert every 2 bytes into an ASCII string that I can then write out to a new file. So for the above example, I want my output file to look like this (bytes flipped due to endianness issues):

0x0002, 0x0000, 0xfff7, 0xfff4

What I have found so far is that I need to set the filehandle to binary mode using binmode() after I open it, and then also use ord() to convert a binary value to its ascii equivalent. The problem I have right now is trying to group every set of 2 binary bytes into 4 hex digits but reversed for the endianness reason. My code so far:

    $ctr = 0;

    # get first argument, i.e. filename
    $in_filename = shift;
    print "You chose input <$in_filename>\n";
    $out_filename = shift;
    print "You chose output <$out_filename>\n";

    open(INFILE, $in_filename) or die "can't open input file: $!";
    # set infile to binary mode
    binmode(INFILE);
    open(OUTFILE, ">$out_filename") or die "can't open output file: $!";

    while (<INFILE>) {
        @samples = split(//, $_);
        foreach $byte (@samples) {
            print OUTFILE "0x" . sprintf("%02hx", ord($byte)) . ", ";
            $ctr += 1;
            if (($ctr % 8) == 0) {
                print OUTFILE "\n";
                $ctr = 0;
            }
        }
    }

As output this will give me:

 0x02,0x00,0x00,0x00,0xf7,0xff,0xf4,0xff

Any suggestions on how to get the behavior I would like? Furthermore, when I call split on one line of the binary file, will it split the line into individual bytes? This is what it appears to be doing, but then how does ord work exactly? If it sees a binary value of ff what does it convert that to? Thanks.

Re: Convert binary file to ascii
by BrowserUk (Pope) on Jun 22, 2007 at 18:23 UTC

    Once you've read the data from the binmoded file, use this:

    Update: added join for commas.

        print join ',', map{ sprintf '0x%04x', $_ } unpack 'v*', $bin;

    which prints:

        0x0002,0x0000,0xfff7,0xfff4

    Depending on which way you're going (big- to little-endian or vice versa), you might need to swap 'v*' for 'n*'. See pack for details.
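
To make the difference between the two templates concrete, here is a minimal sketch that runs both over the eight sample bytes from the question. The helper name hex_words is made up for illustration; only unpack, sprintf, map and join are standard Perl.

```perl
use strict;
use warnings;

# The eight sample bytes from the question.
my $bin = "\x02\x00\x00\x00\xf7\xff\xf4\xff";

# Hypothetical helper (not from the thread): format the 16-bit words
# that the given unpack template extracts from $data.
sub hex_words {
    my ($template, $data) = @_;
    return join ', ', map { sprintf '0x%04x', $_ } unpack $template, $data;
}

print hex_words('v*', $bin), "\n";   # little-endian: 0x0002, 0x0000, 0xfff7, 0xfff4
print hex_words('n*', $bin), "\n";   # big-endian:    0x0200, 0x0000, 0xf7ff, 0xf4ff
```

Here 'v' reads each pair of bytes with the low byte first, which matches the output the question asks for, while 'n' reads the high byte first.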


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      I want to print a newline after every 8 values. Is there any better way of doing this with your method besides the following?

      @converted_list = map { sprintf '0x%04x', $_ } unpack('v*', $line);
      foreach $sample (@converted_list) {
          print OUTFILE "$sample, ";
          $ctr += 1;
          if (($ctr % 8) == 0) {
              print OUTFILE "\n";
              $ctr = 0;
          }
      }

        Well, the $ctr = 0 doesn't do anything useful, and the $ctr += 1 could be merged into the if conditional.

        my @converted_list = map { sprintf('0x%04x', $_) } unpack('v*', $line);
        my $ctr;
        foreach (@converted_list) {
            print OUTFILE "$_, ";
            print OUTFILE "\n" unless ++$ctr % 8;
        }

        You could merge all of that together if you so desired, but that's probably a bit odd.

        my $ctr;
        print map { ++$ctr % 8 ? "$_, " : "$_\n" }
              map { sprintf('0x%04x', $_) }
              unpack('v*', $line);

        The simplest solution is probably to omit the counter altogether by only reading in 16 bytes at a time. As a bonus, it doesn't leave a trailing comma, even when the file isn't an exact multiple of 16 bytes long.

        # Each line has 8 16-bit words, so 16 bytes.
        local $/ = \16;
        while (<INFILE>) {
            print join ', ', map { sprintf('0x%04x', $_) } unpack('v*', $_);
            print("\n");
        }

        Probably the easiest way would be to set $/ = \16 so that you read the file in 16 bytes chunks.

        I.e. $/ = \16; while( my $line = <INFILE> ) { will result in $line containing 16 bytes each time, which will give you your 8 values per output line.

        Simplistically, that makes the program something like:

        #! perl -lw
        use strict;
        ## Note the -l above which makes print add newlines.

        open IN, '<:raw:perlio', $ARGV[0] or die $!;
        open OUT, '>', 'junk.out' or die $!;

        $/ = \16; ## read 16 bytes at a time;

        ## Print to OUT so the result lands in junk.out rather than on STDOUT.
        print OUT join ',', map{ sprintf '0x%04x', $_ } unpack 'v*', $_ while <IN>;

        close OUT;
        close IN;

        If there was (still) some way to binmode *ARGV, it could be reduced to a one-liner:

        perl -nle"BEGIN{$/=\16}print join',',map{sprintf'0x%04x',$_}unpack'v*',$_" binfile >outfile

        but without binmode, that fails if the file contains a ^Z (control-Z; ascii 26) character.

        Or you can go the other way and 'PBP-up' the above.


      There is a problem I am running into, and that is that if there is a byte with the value "0a" in the binary file, it is not making it through the conversion process above. I guess something thinks it is a newline. Any idea on how to address that?

      Edit: Upon reading ikegami's reply above, this may have to do with the input record separator, i.e. $/, which I just learned about from ikegami's post.

        That makes no sense. Nothing in the snippet I posted will care, or even notice whether there is an '0a' character in the data it is processing.

        The only time characters get 'converted' is if the file is read with CRLF translation in effect, which it won't be if you have binmoded the file as you did in the OP (and as I mentioned above).
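
One possible explanation for the missing 0a byte, assuming the file was still being read line by line with the default $/ of "\n": a chunk that ends at the 0a byte can have an odd length, and unpack 'v*' silently ignores a leftover single byte. The sketch below reproduces this on an in-memory filehandle. The helper name records and the six-byte test string are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): read $data through an in-memory
# filehandle using the given record separator, and return one formatted
# string per record.
sub records {
    my ($data, $separator) = @_;
    open my $fh, '<', \$data or die "can't open in-memory file: $!";
    binmode $fh;
    local $/ = $separator;
    my @out;
    while (my $chunk = <$fh>) {
        push @out, join ',', map { sprintf '0x%04x', $_ } unpack 'v*', $chunk;
    }
    close $fh;
    return @out;
}

# Six bytes; the third one is 0x0a.
my $data = "\x02\x00\x0a\x00\xf7\xff";

print "$_\n" for records($data, "\n");   # line mode: 0x0002, then 0xf700
print "$_\n" for records($data, \6);     # 6-byte records: 0x0002,0x000a,0xfff7
```

In line mode the first record is the three bytes 02 00 0a, so the 0a is dropped and every later word is misaligned. With fixed-size records of even length, as suggested above, the 0a byte is treated like any other.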


Re: Convert binary file to ascii
by shmem (Canon) on Jun 22, 2007 at 18:33 UTC
    You were close:
        ...
        @samples = split(//,$_);
        while (@samples) {
            print OUTFILE sprintf("0x%02x%02x", map{ord} reverse splice @samples,0,2 ).", ";
        }

    ord returns the ordinal number (ASCII value) of a char in decimal, i.e. a number between 0..255. If there's a 'f' in that stream, that will be converted into 102. sprintf("0x%02x",ord "f") gives 0x66.
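
As a small illustration of the point about ord, the sketch below contrasts the raw byte 0xff with the printable character 'f', then applies the same reverse-and-format idea to one two-byte pair. The helper name pair_to_hex is made up for illustration.

```perl
use strict;
use warnings;

# ord maps a one-byte string to its numeric value, whether or not that
# byte happens to be a printable ASCII character.
printf "%d\n", ord "\xff";   # the raw byte 0xff gives 255
printf "%d\n", ord "f";      # the character 'f' gives 102

# Hypothetical helper (not from the thread): format a two-byte
# little-endian pair as one four-digit hex word.
sub pair_to_hex {
    my ($pair) = @_;
    return sprintf '0x%02x%02x', map { ord } reverse split //, $pair;
}

print pair_to_hex("\xf7\xff"), "\n";   # 0xfff7
```

So a byte with the value ff in the binary file comes back from ord as the number 255, which sprintf with %02x then renders as "ff".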

    But there are more elegant ways to do this. See pack and unpack.

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re: Convert binary file to ascii
by jwkrahn (Monsignor) on Jun 22, 2007 at 19:56 UTC
    You can get what you want with something like this:
        use warnings;
        use strict;

        @ARGV == 2 or die "usage: $0 in_filename out_filename\n";

        # get first argument, i.e. filename
        my $in_filename = shift;
        print "You chose input <$in_filename>\n";
        my $out_filename = shift;
        print "You chose output <$out_filename>\n";

        # set infile to binary mode
        open INFILE, '<:raw', $in_filename or die "can't open $in_filename: $!";
        open OUTFILE, '>', $out_filename or die "can't open $out_filename: $!";

        # read 8 bytes at a time
        $/ = \8;

        while ( <INFILE> ) {
            print OUTFILE join( ', ', map sprintf( '0x%04x', $_ ), unpack 'S*', $_ ), "\n";
        }
Re: Convert binary file to ascii
by moritz (Cardinal) on Jun 22, 2007 at 20:26 UTC
    Where does this file come from?

    The last time I saw a question like this was when somebody had a UTF-16 encoded file and wanted to recode it to something different, without knowing what it was.

    In that case you shouldn't just drop every second byte ;-)

      Just a guess, but it's more likely the file is something like 16-bit audio samples, not any sort of actual character data.
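
If the data really is signed 16-bit little-endian samples (only a guess, as noted above), the same bytes can be read as signed numbers instead of hex words. This is a minimal sketch under that assumption; the 's<' template needs Perl 5.10 or later, and the helper name signed_samples is made up for illustration.

```perl
use 5.010;
use strict;
use warnings;

# The eight sample bytes from the question.
my $bin = "\x02\x00\x00\x00\xf7\xff\xf4\xff";

# Hypothetical helper (not from the thread): decode a byte string as
# signed 16-bit little-endian integers.
sub signed_samples {
    my ($data) = @_;
    return unpack 's<*', $data;
}

print join(', ', signed_samples($bin)), "\n";   # 2, 0, -9, -12
```

Read this way, 0xfff7 and 0xfff4 become the small negative values -9 and -12, which would be plausible for quiet audio.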
