Re: Best technique to code/decode binary data for inter-machine communication?

by BrowserUk (Pope)
on Aug 15, 2012 at 18:49 UTC


in reply to Best technique to code/decode binary data for inter-machine communication?

Different operating systems change or delete characters in the stream.

Why not just binmode the socket (pipe/filehandle) that you do the transfer over?

Then you don't need to do any encoding or decoding.
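As a rough illustration of the suggestion above, here is a minimal sketch that uses a temporary file in place of the socket or pipe (the scratch file and the sample bytes are assumptions made for the example, not part of the original question). With binmode applied on both the writing and the reading handle, bytes such as CR, LF and Ctrl-Z should come back unchanged:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Bytes that text-mode layers may alter on some platforms (CR, LF, Ctrl-Z).
my $payload = "\x0d\x0a\x1a\x00\xff";

# Hypothetical scratch file standing in for the socket or pipe.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;                      # no translation layers on the writer
print {$out} $payload;
close $out or die "close failed: $!";

open my $in, '<', $path or die "open failed: $!";
binmode $in;                       # and none on the reader
my $got = do { local $/; <$in> };  # slurp the whole file
close $in;

print $got eq $payload ? "round trip ok\n" : "bytes were altered\n";
```

On Unix-like systems binmode changes nothing for a plain handle, so the same code should be usable unchanged on both sides of the connection.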


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

The start of some sanity?


Re^2: Best technique to code/decode binary data for inter-machine communication?
by flexvault (Parson) on Aug 15, 2012 at 22:26 UTC

    Hello BrowserUk,

    That was my first try, but I had buffer problems. I may have done something wrong, so I will look at that solution. I'm glad you reminded me, since that would be the best solution, but some clients just hung. Again, I'll revisit that and let you know.

    I implemented your earlier solution using 'fork', which I'm much more familiar with. This week I wanted to read up on 'threads', but the new Camel book removed chapter 17 on threads...what a disappointment. Are there any good 'paper books' on the subject? Old habits: I like to mark up the pages, which helps me when I go back for reference.

    Regards...Ed

    "Well done is better than well said." - Benjamin Franklin

      Were you using \n as a record separator in your protocol by any chance? Binmode would prevent conversions in transmission, but the clients would be parsing the input differently, and some would never see a "\n" since they're really looking for "\r\n".

        some would never see a "\n" since they're really looking for "\r\n"

        No, lack or presence of a "\r" is not going to mess up line-oriented I/O on any version of Perl [1]. Perl on Windows has no problem reading files that lack "\r" characters. Perl on Unix has no problem reading files that contain "\r" characters (it just includes the "\r" in the returned string).

        But a common mistake when using sockets with Perl is using <$sock>, which blocks until a newline or end-of-file arrives. (Using print on a socket shouldn't be a problem, as sockets shouldn't default to buffered mode.)

        [1] Now that ancient Mac Perl's mistake of pseudo-ASCII is history. But avoiding binmode wouldn't help in that case anyway.

        - tye        
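To make the point about "\r" concrete, here is a small sketch, assuming a Unix-like system with default I/O layers. It reads from an in-memory filehandle (the sample data is invented for the example) and shows that the carriage return simply stays in the returned string until it is stripped explicitly:

```perl
use strict;
use warnings;

# In-memory filehandle holding one CRLF-terminated and one LF-terminated line.
my $data = "first\r\nsecond\n";
open my $fh, '<', \$data or die "open failed: $!";

my @raw = <$fh>;              # both lines are found; readline splits on "\n"
close $fh;

chomp(my @chomped = @raw);    # removes "\n" only, so "first\r" remains

my @clean = @chomped;
s/\r\z// for @clean;          # drop a trailing CR if one is present

print join('|', @clean), "\n";
```

Stripping an optional trailing "\r" like this is one way to accept either line ending in a text protocol.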

      As SuicideJunkie suggests, you were probably trying to use line-oriented transfer functions (i.e. print and readline) on a binmoded socket.

      My recommendation would be to use pack/unpack & send/recv like this:

      $to->send( pack 'n/a*', $binData );
      ...
      $from->recv( my $len, 2 );
      $from->recv( my $binData, unpack 'n', $len );

      That's good for packets up to 64k in length. Switch to 'N' to handle up to 4GB.

      The nice thing about this is that the receiver always knows how much to ask for; and can verify that he got it (length $binData) which avoids the need for delimiters and works just as well with non-blocking sockets if you need to go that way.
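The length-prefix idea can be sketched end to end as follows. This is only an illustration under stated assumptions: it swaps send/recv for sysread/syswrite, uses a local socketpair in place of two machines, and the helper names read_exact, send_frame and recv_frame are hypothetical. The loops are there because a single read on a stream socket may return fewer bytes than requested:

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

# Read exactly $want bytes, looping because a stream read may return fewer.
sub read_exact {
    my ($sock, $want) = @_;
    my $buf = '';
    while (length($buf) < $want) {
        my $n = sysread($sock, $buf, $want - length($buf), length($buf));
        die "read error: $!" unless defined $n;
        die "peer closed after " . length($buf) . " of $want bytes\n" if $n == 0;
    }
    return $buf;
}

# Send one frame: 16-bit big-endian length, then the payload.
sub send_frame {
    my ($sock, $payload) = @_;
    die "payload too long for 'n' header\n" if length($payload) > 0xFFFF;
    my $frame = pack 'n/a*', $payload;
    my $off = 0;
    while ($off < length($frame)) {
        my $n = syswrite($sock, $frame, length($frame) - $off, $off);
        die "write error: $!" unless defined $n;
        $off += $n;
    }
}

# Receive one frame: read the 2-byte header, then exactly that many bytes.
sub recv_frame {
    my ($sock) = @_;
    my $len = unpack 'n', read_exact($sock, 2);
    return read_exact($sock, $len);
}

# Local socket pair standing in for the two machines (an assumption).
socketpair(my $to, my $from, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";

my $bin_data = pack 'C*', 0 .. 255;   # every byte value once
send_frame($to, $bin_data);
my $received = recv_frame($from);

print length($received), " bytes received\n";
```

Switching both the pack template and the header read to 'N' (and 4 bytes) would give the larger frame size mentioned above.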

      Important update: If using this method to transmit data between machines, see also the thread at Mystery! Logical explanation or just Satan's work?

      I also found that when it comes to transmitting arrays and hashes, using pack/unpack is usually more compact (and therefore faster) than using Storable, because (for example) an integer always requires 4 or 8 bytes in binary, but for many values it is shorter in ASCII:

      use Storable qw[ freeze ];;

      @a = 1..100;;
      $packed = pack 'n/(n/a*)', @a;;
      print length $packed;;
      394

      $ice = freeze \@a;;
      print length $ice;;
      412

      @b = unpack 'n/(n/a*)', $packed;;
      print "@b";;
      1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 ...

      %h = 'aaaa'..'aaaz';;
      $packed = pack 'n/(n/a*)', %h;;
      print length $packed;;
      158

      $ice = freeze \%h;;
      print length $ice;;
      202

      %h2 = unpack 'n/(n/a*)', $packed;;
      pp \%h2;;
      {
        aaaa => "aaab",
        aaac => "aaad",
        aaae => "aaaf",
        aaag => "aaah",
        aaai => "aaaj",
        aaak => "aaal",
        aaam => "aaan",
        aaao => "aaap",
        aaaq => "aaar",
        aaas => "aaat",
        aaau => "aaav",
        aaaw => "aaax",
        aaay => "aaaz",
      }

      It doesn't always work out smaller, but it is usually faster and platform independent.

      Of course, Storable wins if your data structures can contain references to others.



        I have had some interesting experiences with Storable, in the form of data which, once frozen, could not be thawed! This was on an AS/400, and it was very data-specific, and I do not know if it was a momentary bug in whatever-it-was version of the CPAN module. But as it was, I had to quickly scramble and store the data in the database in a different format. (Fortunately, this was an SQLite file that didn’t have to be shared with anyone, but the occurrence of the problem surprised me greatly, nonetheless.)

        BrowserUk,

        First I'm interested in the code sample you gave:

        $to->send( pack 'n/a*', $binData );
        Currently I write that as:
        $binData = pack('N',length( $data ) ) . $data; $to->send( $binData );
        Is your code a shorthand for the above?
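One way to check the question directly is to compare the two forms on sample data (the sample string is invented for the example). In a pack template, '/' packs the length of the following item in front of it, so 'N/a*' should match the manual concatenation, while the 'n/a*' template from the earlier post uses a 2-byte length instead of 4:

```perl
use strict;
use warnings;

my $data = "binary\x00payload\xff";

# Template form: '/' packs the length of the following item ahead of it.
my $template_form = pack 'N/a*', $data;

# Manual form from the post: 32-bit length, then the raw bytes.
my $manual_form = pack('N', length($data)) . $data;

print $template_form eq $manual_form ? "identical\n" : "different\n";

# With 'n' the header is 2 bytes instead of 4, so the frames differ in size.
my $short_form = pack 'n/a*', $data;
print length($manual_form) - length($short_form), " byte(s) difference\n";
```

So the two snippets as posted are the same technique, but the receiver has to read a header of the matching width ('n' and 2 bytes, or 'N' and 4 bytes).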

        Second, as you and others have pointed out, I did not use 'binmode' after opening the socket. If I were to add the following:

        binmode Socket, ":raw";
        to both the client and server code, would I be in 'binary' mode on Windows, *nix, etc., or would I need different client code for each? Reading the latest 'binmode' documentation, it sounds like the function is ignored on some systems and takes effect only where the binary and text definitions differ.

        Third, 'Storable' does not produce 'network neutral' results, so can't be used in this case.

        Fourth, if someone passes a ':utf8' key/value pair to my application and I store the variables in an external file as ":raw", will they be able to use the data as utf8 when they receive the key/value pair back? Until I read the 'binmode' documentation, I didn't think of that possibility!
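A sketch of one possible approach to the fourth question, assuming the application treats stored values as opaque octets: encode character strings to UTF-8 before they go through a ":raw" handle, and decode them after reading them back. The sample value is invented for the example, and the storage step itself is left out:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# A character string with non-ASCII characters (written with escapes).
my $value = "caf\x{e9} \x{263a}";

# Encode to UTF-8 octets before writing through a :raw handle.
my $octets = encode('UTF-8', $value);

# ... $octets is what would be stored or sent as raw bytes ...

# Decode after reading the octets back to recover the characters.
my $restored = decode('UTF-8', $octets);

printf "%d characters, %d octets\n", length($value), length($octets);
```

Whether the caller sees characters again depends on someone performing the decode step; the raw layer itself only moves octets.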

        Thank you

