This is certainly a reasonable guess
Yes, it's just a guess, but I felt I wanted to provide a possible alternative to the guess that the OP doesn't know their specifications and doesn't know what hex is.
anomalies, likewise, could now be tackled in the context of that now-successfully-decoded integer (not text ...) data stream. In general, does not make good sense to me to attack the file with regular expressions
Well, in my hypothetical situation of a serial data stream corrupted by noise, decoding into integers first and then inspecting those integers for bad values unfortunately won't work. The reason is that corruption on such streams can include inserted or dropped bytes, so it's entirely possible that none of the incoming data is aligned on 32-bit boundaries. In such a case, one needs a state machine to reacquire synchronization with the data stream, and Perl's regular expressions are actually a decent tool for that job. Note how none of the valid values in the following stream (written as hex digits, eight per 32-bit value) are aligned on 4-byte boundaries:
my $datastr = "BEEF00000001AB0000000200000700000003F00D";
print "$_\n" for $datastr=~/0{7}[0-9]/g;
__END__
00000001
00000002
00000003
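If you also want to know where the stream lost synchronization, the same idea can be extended to report the offset of each match and whatever was skipped before it. This is only a sketch under the same made-up assumption as above (a valid record is seven zeros followed by one decimal digit); the `resync` helper and its field names are hypothetical, and trailing data after the last match (here `F00D`) is not reported.

```perl
use strict;
use warnings;

# Hypothetical helper: scan a hex string for valid records and return, for
# each one, its offset, its value, and the unrecognized data skipped before it.
sub resync {
    my ($datastr) = @_;
    my @records;
    my $last_end = 0;
    while ( $datastr =~ /0{7}[0-9]/g ) {
        my $start = $-[0];    # offset where this match begins
        push @records, {
            offset  => $start,
            value   => substr( $datastr, $start, 8 ),
            skipped => substr( $datastr, $last_end, $start - $last_end ),
        };
        $last_end = $+[0];    # offset just past this match
    }
    return @records;
}

printf "%2d %s (skipped '%s')\n", @$_{qw/offset value skipped/}
    for resync("BEEF00000001AB0000000200000700000003F00D");
__END__
 4 00000001 (skipped 'BEEF')
14 00000002 (skipped 'AB')
28 00000003 (skipped '000007')
```

A real protocol would of course need a stricter pattern than this one, since noise can happen to look like a valid record.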
the latter proposition would require a bizarre explanation to say the least
What you call "bizarre" is, in my experience, completely normal. I myself would not design a data format this way, but I have worked with plenty of binary data formats that make somewhat strange choices, for example storing a value from 0 to 9 in a 32-bit field. Just a month ago I finished implementing a driver for a proprietary network protocol that, among other things, has a 32-bit-wide "flag" field in which only the lowest 3 bits are used. As for how the corruption might have gotten there, I already explained one possibility, which, despite being avoidable, is unfortunately still completely normal in the ECE world.
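To illustrate the kind of field I mean, here is a rough sketch of reading such a flag word and treating any set bit outside the low 3 as a sign of corruption. The big-endian byte order, the `parse_flags` name, and the mask are my assumptions for the example, not details of the actual protocol.

```perl
use strict;
use warnings;

# Assumption for illustration: only bits 0..2 of the 32-bit field are defined.
my $FLAG_MASK = 0x7;

# Hypothetical helper: returns the flag value, or undef if the input is not
# exactly 4 bytes or if any reserved (upper) bit is set.
sub parse_flags {
    my ($bytes) = @_;
    return undef if length($bytes) != 4;
    my $raw = unpack "N", $bytes;    # "N" = 32-bit unsigned, big-endian (assumed)
    return undef if $raw & ~$FLAG_MASK;
    return $raw;
}

for my $hex (qw/00000005 00000105/) {
    my $flags = parse_flags( pack "H*", $hex );
    print "$hex: ", defined $flags ? "flags=$flags" : "reserved bits set, likely corrupt", "\n";
}
__END__
00000005: flags=5
00000105: reserved bits set, likely corrupt
```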
So as I said, given that the OP seemed to be clear on the expected format, I just wanted to provide a different perspective for the explanation.