<?xml version="1.0" encoding="windows-1252"?>
<node id="968985" title="Re^10: UDP connection" created="2012-05-04 17:03:25" updated="2012-05-04 17:03:25">
<type id="11">
note</type>
<author id="171588">
BrowserUk</author>
<data>
<field name="doctext">
&lt;blockquote&gt;&lt;i&gt;You are correct. I must have misread the numbers.&lt;/i&gt;&lt;/blockquote&gt;
&lt;p&gt;It happens. Thank you for acknowledging it.

&lt;blockquote&gt;&lt;i&gt;
However, there is no problem. Here are a sample server and client that do bidirection communication over 4 socket pairs. 
&lt;/i&gt;&lt;/blockquote&gt;

&lt;p&gt;Hm. There are still problems you are not dealing with. I'm not talking about error handling or niceties here.

&lt;ol&gt;&lt;li&gt;Your server is not acknowledging (responding to) the client's heartbeats, per the OP's description:
&lt;blockquote&gt;&lt;i&gt;
And once it receives a "heartbeat" (a byte containing "01") it sends a response to the IP and port the hb came from (byte containing "25").  
&lt;/i&gt;&lt;/blockquote&gt;

&lt;p&gt;It may sound a minor omission, but (I believe) it does complicate the design of the client (and server) somewhat.

&lt;/li&gt;&lt;li&gt;Your server is using the receipt of the heartbeat as a trigger for the start of server to client data flow.

&lt;p&gt;In the OP's description:
&lt;ul&gt;&lt;li&gt;the hexdumps are only sent after the initial heartbeat. 
&lt;p&gt;(Obviously, how else would the server know there was a client to send to:)
&lt;P&gt;

&lt;/li&gt;&lt;li&gt;&lt;b&gt;But they are not triggered by the heartbeat!&lt;/b&gt;

&lt;P&gt;He goes on to say:

&lt;blockquote&gt;&lt;i&gt;
If there are test results ready on the server it sends them to ....  
&lt;/i&gt;&lt;/blockquote&gt;

&lt;p&gt;Which suggests, though it doesn't actually state -- and he never came back to answer the question -- that the sending of the hexdumps is initiated by the server, when they become available; provided that is within 10 seconds of a heartbeat.
&lt;/li&gt;&lt;/ul&gt; 

&lt;/li&gt;&lt;/ol&gt;

&lt;p&gt;The significance of those (perhaps apparently small) differences is that your client treats all inbound communications as data. The OP cannot do this, as he also has to distinguish between the heartbeat response and the actual data.

&lt;p&gt;The single byte packet size of the response should make that easy... except when you take the uncertain timing of the availability of the hexdump data into consideration.

&lt;p&gt;Here is the black hole in the timing scenario I was trying to resolve with the OP, which your client does not -- and (I believe) as written, cannot -- resolve:

&lt;code&gt;
time ascending relative
  ...
  0.000
   At this point, a client has (just) sent a HB, and the server has acknowledged it.
   The server had no data to send
  ...
  9.99999... seconds expire

 The server discovers that it has a hexdump to send and starts transmitting ...
 10.000   
 The client sends its next heartbeat and awaits the response.
&lt;/code&gt;

&lt;P&gt;The hexdump doesn't have to be large; a single packet becoming available (at exactly the wrong time) will trigger the problem. The client is expecting a single-byte response. The server hasn't seen the heartbeat yet, so it isn't going to send that response until (at least) after it finishes sending the current packet. If, after sending the heartbeat, the client goes into a read state for the single-byte response, it will get the first byte of the data packet, and the rest will be discarded. This is the exact scenario that the OP described in his first post.
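
&lt;p&gt;A minimal sketch of that truncation, in Python rather than Perl purely for illustration. This is not the OP's code: the loopback address, the 4-byte payload and the helper name &lt;c&gt;read_with_buffer&lt;/c&gt; are all assumptions. It sends one UDP datagram to itself, reads it with a buffer of the given size, then checks whether anything of that datagram is left to read.

&lt;code&gt;
import socket

def read_with_buffer(payload, bufsize):
    """Send one UDP datagram over loopback and read it with a bufsize-byte buffer.

    Returns (first_read, anything_left_over).
    """
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)
        sender.sendto(payload, receiver.getsockname())
        first = receiver.recv(bufsize)
        # A UDP read consumes the whole datagram, however small the buffer,
        # so a second read should find nothing and time out.
        receiver.settimeout(0.2)
        try:
            rest = receiver.recv(65535)
        except socket.timeout:
            rest = b""
        return first, rest
    finally:
        sender.close()
        receiver.close()

if __name__ == "__main__":
    hexdump = b"\xde\xad\xbe\xef"            # made-up stand-in for a data packet
    print(read_with_buffer(hexdump, 1))      # 1-byte read, as if awaiting the ack
    print(read_with_buffer(hexdump, 65535))  # full-size read
&lt;/code&gt;

&lt;p&gt;On Linux and similar systems I would expect the 1-byte read to return only the first byte, with nothing left over; Windows, as far as I know, reports an error for the short read instead. Either way the remainder of the datagram is gone.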

&lt;p&gt;Using blocking reads -- as would normally be used in conjunction with threads as indicated by the OP -- the client cannot be in a read state in anticipation of the arrival of a data packet that could come at any time, and also send a regular heartbeat, because you cannot send() via a socket if that socket is currently in a recv() state. (At the same end!)

&lt;p&gt;Using [select] obviates that, by avoiding entering the read state until it knows (by polling) that data is ready to be read. &lt;b&gt;But that alone does not completely close the window for failure.&lt;/b&gt;
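
&lt;p&gt;A rough sketch of such a polling client, again in Python for illustration only. The byte values come from the OP's description; everything else (the function names, the 0.2 second interval, the stand-in server that deliberately sends a data packet &lt;i&gt;before&lt;/i&gt; the ack) is an assumption. The client always reads a whole datagram and only then decides whether it was the ack or data.

&lt;code&gt;
import select
import socket
import threading
import time

HEARTBEAT = b"\x01"  # heartbeat byte, per the OP's description
ACK = b"\x25"        # heartbeat response byte, per the OP's description

def run_client(sock, server_addr, interval, duration):
    """Send a heartbeat every `interval` seconds for `duration` seconds.

    Returns (ack_count, data_packets).
    """
    acks, data = 0, []
    stop_at = time.monotonic() + duration
    next_hb = time.monotonic()  # first heartbeat is due immediately
    while True:
        now = time.monotonic()
        if now >= stop_at:
            break
        if now >= next_hb:
            sock.sendto(HEARTBEAT, server_addr)
            next_hb = now + interval
            continue
        # Wait in select until a datagram arrives or the next deadline passes.
        timeout = min(next_hb, stop_at) - now
        readable, _, _ = select.select([sock], [], [], timeout)
        if readable:
            # Read the whole datagram first, then decide what it was.
            packet, _ = sock.recvfrom(65535)
            if packet == ACK:
                acks += 1
            else:
                data.append(packet)
    return acks, data

def demo():
    """Run the client against a stand-in server that sends data BEFORE the ack."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    client.bind(("127.0.0.1", 0))
    server.settimeout(2.0)

    def fake_server():
        _, addr = server.recvfrom(16)      # wait for the first heartbeat
        server.sendto(b"hexdump-1", addr)  # data that was already on its way
        server.sendto(ACK, addr)           # the ack only follows it

    worker = threading.Thread(target=fake_server)
    worker.start()
    try:
        return run_client(client, server.getsockname(), interval=0.2, duration=0.5)
    finally:
        worker.join()
        server.close()
        client.close()

if __name__ == "__main__":
    print(demo())
&lt;/code&gt;

&lt;p&gt;Note the limitation: a data packet that happened to consist of the single byte 0x25 would be indistinguishable from an ack in this sketch.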

&lt;p&gt;If the 10 second timeout at the server occurs exactly whilst the client is in the process of receiving the latest packet -- and is therefore unable to transmit its next heartbeat -- the server will (may) conclude that the client has 'gone away' and discard (fail to send) any subsequent data until the client re-heartbeats. And that leads to data loss, which the OP explicitly denounced.

&lt;p&gt;Whilst the window for such a failure is small, such is the nature of communications protocol failures.

&lt;p&gt;The usual solution(s) to this problem include:

&lt;ol&gt;&lt;li&gt;Have the client send the heartbeat at twice the frequency implied by the server's timeout (that is, every half timeout period).
&lt;/li&gt;&lt;li&gt;Have the server declare a timeout of half what it will actually accept.
&lt;/li&gt;&lt;li&gt;Have the server reset the timeout whenever it &lt;i&gt;sends&lt;/i&gt; -- be it heartbeat ack; or data packet -- rather than when it &lt;i&gt;receives&lt;/i&gt;.
&lt;/li&gt;&lt;/ol&gt;
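
&lt;p&gt;The two server-side options in the list above can be sketched as a small bookkeeping class. This is an illustration under assumptions, not anything from the OP's system: the class name &lt;c&gt;ClientLease&lt;/c&gt;, its parameters and the injected clock values are all made up. (Option 1 needs no server code; the client simply uses half the timeout as its heartbeat interval.)

&lt;code&gt;
class ClientLease:
    """Tracks whether the server should still treat one client as alive."""

    def __init__(self, advertised_timeout, grace_factor=2.0, reset_on_send=True):
        # Option 2: tell clients one timeout, but tolerate a longer one.
        self.advertised = advertised_timeout
        self.accepted = advertised_timeout * grace_factor
        # Option 3: restart the clock when the server sends, not only on receipt.
        self.reset_on_send = reset_on_send
        self.last_activity = None  # no heartbeat seen yet

    def on_receive(self, now):
        """Call when a heartbeat arrives from the client."""
        self.last_activity = now

    def on_send(self, now):
        """Call when the server sends an ack or a data packet to the client."""
        if self.reset_on_send and self.last_activity is not None:
            self.last_activity = now

    def is_alive(self, now):
        """True while the idle time has not exceeded the accepted timeout."""
        if self.last_activity is None:
            return False
        return self.accepted >= now - self.last_activity

if __name__ == "__main__":
    lease = ClientLease(10.0, grace_factor=1.0, reset_on_send=True)
    lease.on_receive(0.0)        # heartbeat at t = 0
    lease.on_send(9.9)           # data packet goes out just before the deadline
    print(lease.is_alive(15.0))  # still inside the window restarted at t = 9.9
&lt;/code&gt;

&lt;p&gt;Passing the time in as an argument, rather than reading a clock inside the class, is just a convenience for trying out the awkward timings by hand.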

&lt;p&gt;And it was these details I was trying to establish with the OP before he got scared away by our ...um... discussion.




&lt;div class="pmsig"&gt;&lt;div class="pmsig-171588"&gt;
&lt;hr /&gt;
&lt;font size=1 &gt;
&lt;div&gt;With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'&lt;/div&gt;
&lt;div&gt;Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.&lt;/div&gt;
&lt;div&gt;"Science is about questioning the status quo. Questioning authority". &lt;/div&gt;
&lt;div&gt;In the absence of evidence, opinion is indistinguishable from prejudice.
&lt;p align=right&gt;[http://www.theregister.co.uk/2011/11/29/sas_versus_world_programming/|The start of some sanity?]&lt;/p&gt;&lt;/div&gt;
&lt;/font&gt;

&lt;/div&gt;&lt;/div&gt;</field>
<field name="root_node">
968647</field>
<field name="parent_node">
968909</field>
</data>
</node>
