Is your code a shorthand for the above?

Yes (kinda :), but more flexible and quicker.

The template n/a* says: pack as many arbitrary binary bytes as are contained in the argument, counting them as you go, and then prepend that data with that count as a network-order unsigned short. C/a* would pack the count as a single byte; N/a* as a network-order unsigned long; and so on.
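A minimal sketch of those three length-prefixed templates, using only the built-in pack and unpack. The payload string and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical payload: five arbitrary bytes.
my $payload = "hello";

# n/a*: a 16-bit network-order length, then the bytes themselves.
my $short = pack 'n/a*', $payload;    # "\x00\x05hello"

# C/a*: the same idea with a single-byte length (payloads up to 255 bytes).
my $tiny  = pack 'C/a*', $payload;    # "\x05hello"

# N/a*: the same idea with a 32-bit network-order length.
my $long  = pack 'N/a*', $payload;    # "\x00\x00\x00\x05hello"

# unpack with the same template reads the count, then that many bytes.
my ($back) = unpack 'n/a*', $short;   # "hello"
```

The choice of prefix letter just trades header size against the largest payload you can describe.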

The really powerful template is N/(n/a*)*, which I use for arrays and hashes. It says: pack each input argument as bytes, each prefixed with its length as a network-order ushort; and then prefix the whole result with a single network-order ulong that is the count of the fields packed.
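Roughly, a round trip with that template could look like the sketch below. The list and hash contents are hypothetical, and on the unpack side I write the group without the trailing *, since the leading N already supplies the repeat count.

```perl
use strict;
use warnings;

# Hypothetical data: a flat list and a flat (non-nested) hash.
my @list = ('a', '', 'bc');
my %hash = (host => 'alpha', port => '8080');

# N/(n/a*)*: a 32-bit field count, then each field as 16-bit length + bytes.
my $packed_list = pack 'N/(n/a*)*', @list;

# A hash flattens to key, value, key, value, ... so it packs the same way.
my $packed_hash = pack 'N/(n/a*)*', %hash;

# Unpacking: the N is read first and used as the repeat count for the group.
my @list_back = unpack 'N/(n/a*)', $packed_list;
my %hash_back = unpack 'N/(n/a*)', $packed_hash;
```

Because each field length is a 16-bit value, this particular template only suits fields of up to 65,535 bytes; swap the inner n for N if you need more.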

If I were to add the following: binmode Socket, ":raw"; to both the client and server code, would I be in 'binary' mode on Windows, *nix, etc., or would I need to have different client code for each?

On *nix it will do nothing; on Windows it will turn off crlf modifications (+ prevent the oft-forgotten ^Z == EOF).

My understanding is that if you use just binmode SOCKET; on all platforms, no translation will be done anywhere and you'll recv exactly what you send.
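A small sketch of that, assuming a platform where socketpair with AF_UNIX is available; the pair of handles just stands in for a real client and server, and the test bytes are chosen to include the ones text mode would mangle.

```perl
use strict;
use warnings;
use IO::Handle;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

# A connected pair of sockets standing in for the client and the server.
socketpair(my $client, my $server, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";

# Plain binmode on both ends, identical code on every platform.
binmode $client;
binmode $server;

# Bytes that text mode on Windows would alter: CR, LF and ^Z (0x1A).
my $sent = "line1\r\nline2\n\x1a\x00\xff";

print {$client} $sent or die "print failed: $!";
$client->flush;

# Buffered read blocks until it has all the requested bytes (or hits EOF).
my $received = '';
my $got = read($server, $received, length($sent));
die "read failed: $!" unless defined $got;
```

If $received and $sent compare equal, nothing was translated on the way through.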

When ':raw' first came around, it disabled all PerlIO layers; then they changed it for no good reason and without documentation. Last time I investigated it (on Windows only!), it still removed :crlf, but didn't remove all layers. To my knowledge there is no explanation available of which layers get left behind, or why.

Third, 'Storable' does not produce 'network neutral' results, so can't be used in this case.

Storable does have nfreeze (network order freeze) which is defined as a "portable format"; though I've never tried using it between 32/64-bit platforms.
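For reference, a minimal sketch of nfreeze and thaw from the core Storable module, combined with the N/a* length prefix so the receiver knows how many bytes to expect. The data structure is hypothetical, and this says nothing either way about 32/64-bit interoperability.

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

# Hypothetical nested structure, the kind pack alone cannot describe.
my %config = (name => 'node1', ports => [8080, 8081]);

# nfreeze serialises in network byte order.
my $frozen = nfreeze(\%config);

# Length-prefix the frozen image so it can be framed on a byte stream.
my $message = pack 'N/a*', $frozen;

# Receiving side: strip the prefix, then thaw back into a structure.
my ($image) = unpack 'N/a*', $message;
my $copy    = thaw($image);
```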

That said, there are many horror stories of people being bitten by Storable; though at least half of them can be traced back to misunderstanding or incompetence.

Still, having 'discovered' the pack 'N/(n/a*)*' method of packing simple arrays and hashes, I would use that in preference to Storable for non-nested hashes and arrays.

Fourth, if someone passes a ':utf8' key/value pair to my application and I store the variables in an external file as ":raw", will they be able to use the data as utf8 when they receive the key/value pair back?

Until you apply some form of encode/decoding operation to a file or data stream, anything you read is just a bunch of bytes.

If you read bytes and transmit bytes, the receiver gets the data in the same state as if it had read bytes from the original source. If those bytes constitute data encoded in some form you will need to decode it before using it -- but it doesn't matter where (which end of the connection) that decoding happens -- so long as it is done only once and correctly.
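As a rough illustration of "bytes in, bytes out, decode once", here is a sketch using the core Encode module. The sample text is made up, and the pack/unpack pair simply stands in for whatever transport or file sits in between.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical character string: e-acute (U+00E9) and a smiley (U+263A).
my $text = "caf\x{e9} \x{263A}";

# Sender: turn characters into UTF-8 bytes, then frame them for the wire.
my $bytes   = encode('UTF-8', $text);
my $message = pack 'n/a*', $bytes;

# In transit (or on disk as raw bytes) nothing is interpreted at all.

# Receiver: recover the same bytes, then decode exactly once.
my ($raw)   = unpack 'n/a*', $message;
my $decoded = decode('UTF-8', $raw);
```

The middle of the pipeline never needs to know the data is UTF-8; only the end that finally wants characters does.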

Of course, the definition of 'correct' requires thought. If you transmit utf16le to a big-endian machine, then that machine will need to decode it as 'utf16le' (not just 'utf16' which locally might default to 'utf16be').
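A short sketch of why the byte order has to be named explicitly, again with the core Encode module (which spells the names 'UTF-16LE' and 'UTF-16BE'). The sample string is hypothetical.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical text: two ASCII letters and a smiley (U+263A).
my $text = "Hi\x{263A}";

# Sender encodes as little-endian UTF-16, with no byte-order mark.
my $wire = encode('UTF-16LE', $text);

# Decoding with the byte order that was actually used recovers the text.
my $right = decode('UTF-16LE', $wire);

# Decoding the same bytes as big-endian yields three unrelated characters.
my $wrong = decode('UTF-16BE', $wire);
```

Nothing in the bytes themselves says which of the two readings is intended, so the receiver has to be told.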

But "The Unicode Problem" -- how the f*** do you know which of the many Unicode standards was used to encode the data??? -- exists wherever you do the decoding. If the receiving machine had read the same bytes from a file, it would still have to either know or guess which of the Unicode standards was used to encode the data, because the file could have come from anywhere (e.g. the internet).

Unicode is a f*** up! And will remain that way until they finally require that each of the various binary formats that are encompassed by the Unicode (non)Standard, prefix all encoded data with something that identifies the encoding.

From your perspective, if you will be transmitting (say) hashes built from input that has previously been decoded, then you will need to understand the Perl Unicode handling tools. I wish I could point you to a definitive reference, but no such animal seems to exist yet.

In reply to Re^5: Best technique to code/decode binary data for inter-machine communication? by BrowserUk
in thread Best technique to code/decode binary data for inter-machine communication? by flexvault

