If I think there's something hinky about a file because it contains "unexpected" byte values, I would check its inventory of byte values, with something like this:
#!/usr/bin/perl
use strict;
use warnings;

die "Usage: $0 file.name\n" unless ( @ARGV == 1 and -f $ARGV[0] );
open( my $fh, '<', shift ) or die "open failed: $!\n";
binmode $fh;
local $/ = undef;    # slurp the whole file
$_ = <$fh>;
my %char_hist;
for my $c ( split // ) {
    $char_hist{ sprintf( "%02x", ord( $c )) }++;
}
for my $c ( sort keys %char_hist ) {
    printf "%s\t%d\n", $c, $char_hist{$c};
}
(That's just a toy version to try it out on files that aren't seriously large. I'd do it differently for general use.)
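For what it's worth, one way to make it friendlier to large files is to read fixed-size chunks instead of slurping; a minimal sketch of that idea (the 64 KB chunk size is an arbitrary choice):

```perl
#!/usr/bin/perl
use strict;
use warnings;

die "Usage: $0 file.name\n" unless ( @ARGV == 1 and -f $ARGV[0] );
open( my $fh, '<', shift ) or die "open failed: $!\n";
binmode $fh;

my %char_hist;
my $buf;
# Read 64 KB at a time so the whole file never has to fit in memory.
while ( read( $fh, $buf, 65536 ) ) {
    $char_hist{ sprintf "%02x", ord $_ }++ for split //, $buf;
}
printf "%s\t%d\n", $_, $char_hist{$_} for sort keys %char_hist;
```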
It's sometimes surprising what you can learn about a file just by looking at a histogram of its byte values - seeing which values occur, and which ones don't.
(If you happen to know that a file contains utf8-encoded text, you can learn a lot by looking at a histogram of its Unicode characters - I posted a script for that too: unichist -- count/summarize characters in data.)
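The core of that idea (this is just a sketch, not the posted unichist script) is to decode the bytes as UTF-8 on input and count code points instead of raw byte values:

```perl
#!/usr/bin/perl
use strict;
use warnings;

die "Usage: $0 file.name\n" unless ( @ARGV == 1 and -f $ARGV[0] );
# The :encoding(UTF-8) layer decodes bytes to characters as we read,
# so ord() gives Unicode code points rather than byte values.
open( my $fh, '<:encoding(UTF-8)', shift ) or die "open failed: $!\n";

my %char_hist;
while ( my $line = <$fh> ) {
    $char_hist{ sprintf "U+%04X", ord $_ }++ for split //, $line;
}
printf "%s\t%d\n", $_, $char_hist{$_} for sort keys %char_hist;
```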