We have a socket connection to a server over TCP/IP, using SSL. Sometimes I lose packets.
I write several packets, which are successfully received by the server. Then comes at least one that gets lost.
wip_write returns the number of bytes actually written, but the data is not received by the server. The next write returns error -999, followed by a WIP_CEV_PEER_CLOSE event.
Why does wip_write return successfully when the data is not received by the server?
In the tcpdump on the server I can see that the lost packet contains more data bytes than the successfully transmitted packets. The extra bytes are the last bytes of the previous, successfully transmitted packet.
The server software (Python) crashes with the error ‘SSL3_GET_RECORD:wrong version number’. It seems the server software cannot decrypt the SSL data because of the extra bytes. So the OpenAT firmware appears to be sending corrupt data.
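To illustrate why a few stray bytes in front of a record would produce this particular error, here is a minimal sketch of an SSL/TLS record header check. Every record starts with a 5-byte header: content type (1 byte), protocol version (2 bytes), length (2 bytes). The function name and the error texts are made up for this example and this is not OpenSSL's actual code; it only mimics the kind of check that fails.

```python
import struct

# Content types defined for the SSL/TLS record layer.
CONTENT_TYPES = {
    20: "change_cipher_spec",
    21: "alert",
    22: "handshake",
    23: "application_data",
}


def parse_record_header(stream: bytes):
    """Parse the 5-byte record header at the start of `stream`.

    Returns (content_type_name, (major, minor), payload_length).
    Raises ValueError if the header does not look like SSL 3.x / TLS.
    Hypothetical helper for illustration only.
    """
    if len(stream) < 5:
        raise ValueError("need at least 5 bytes for a record header")
    ctype, major, minor, length = struct.unpack("!BBBH", stream[:5])
    # SSL 3.0 and TLS 1.x all use major version 3 in the record header.
    if major != 3:
        raise ValueError(f"wrong version number: {major}.{minor}")
    if ctype not in CONTENT_TYPES:
        raise ValueError(f"unknown content type: {ctype}")
    return CONTENT_TYPES[ctype], (major, minor), length


# A well-formed application-data record header followed by 38 payload bytes.
good = bytes([23, 3, 1, 0, 38]) + bytes(38)
print(parse_record_header(good))

# The same record with two leftover bytes in front: the parser now reads
# the stray bytes as type and version, so the version check fails.
try:
    parse_record_header(b"\xaa\xbb" + good)
except ValueError as err:
    print("rejected:", err)
```

Once the stream is shifted by even one byte, the receiver reads the header from the wrong offset, which would be consistent with the error seen on the server.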
We tried different GPRS networks and frequency channels. We repeatedly get corrupted data.
We also checked the transmission without SSL. The tcpdump showed repeated bytes there as well.
The 6th packet starts with the last bytes of the 5th packet. When this happens with SSL, the SSL header is corrupt and the data cannot be decoded.
Could there be a bug in the TCP stack?
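For anyone who wants to check a capture for this pattern automatically, here is a small sketch that measures how many trailing bytes of one payload reappear at the start of the next. The function name and the sample payloads are invented for the example; the payloads would have to be extracted from the tcpdump by some other means.

```python
def repeated_prefix_len(prev: bytes, cur: bytes) -> int:
    """Return the length of the longest suffix of `prev` that is also a
    prefix of `cur` (0 if there is no overlap).

    Hypothetical helper for inspecting captured payloads.
    """
    for n in range(min(len(prev), len(cur)), 0, -1):
        if prev[-n:] == cur[:n]:
            return n
    return 0


# Made-up payloads: the second one starts with the tail of the first.
packet5 = b"HEADER-0005-abcdef"
packet6 = b"cdefHEADER-0006-xyz"
print(repeated_prefix_len(packet5, packet6))
```

An overlap of one or two bytes can happen by chance, so only longer overlaps are a meaningful sign of repeated data.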
It sounds like you are filling/overflowing the TCP buffer in the module.
Keep calling wip_write() until the value it returns is less than the number of bytes you asked it to write. A short write indicates that the internal TCP write buffer is full and is waiting to be transmitted.
When wip_write() returns, it doesn’t mean that the data has been transmitted. It only means that the data has been copied from your buffer into the internal TCP TX buffer.
You then have to stop writing until you get another WIP_CEV_WRITE event, which indicates that there is space available again to queue the next data.
This is explained (but not very clearly) in the WIP documentation. I hit this myself when I first used WIP, and you have to follow the example to handle the case where the buffer is not fully accepted. In my experience, this issue does not become apparent until you start transmitting large chunks of data, bigger than the Ethernet MTU (around 1500 bytes).
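The write-then-wait pattern described above can be sketched as a small simulation. This is not WIP code: FakeChannel, Sender and all method names are hypothetical stand-ins, with write() playing the role of wip_write() and on_write_event() playing the role of a WIP_CEV_WRITE handler. The point it shows is that the unaccepted remainder must be kept and resent after the next write event, never dropped or written again from the start.

```python
class FakeChannel:
    """Hypothetical stand-in for a TCP channel with a bounded TX buffer."""

    def __init__(self, tx_capacity: int):
        self.tx_capacity = tx_capacity
        self.tx_buffer = bytearray()

    def write(self, data: bytes) -> int:
        """Copy as much of `data` as fits; return the number of bytes accepted."""
        room = self.tx_capacity - len(self.tx_buffer)
        accepted = data[:room]
        self.tx_buffer += accepted
        return len(accepted)

    def drain(self, n: int) -> bytes:
        """Simulate the stack transmitting up to `n` bytes, freeing buffer space."""
        sent = bytes(self.tx_buffer[:n])
        del self.tx_buffer[:n]
        return sent


class Sender:
    """Queues application data and respects short writes."""

    def __init__(self, channel: FakeChannel):
        self.channel = channel
        self.pending = bytearray()  # bytes not yet accepted by the channel
        self.writable = True        # False after a short write

    def send(self, data: bytes) -> None:
        self.pending += data
        if self.writable:
            self._flush()

    def on_write_event(self) -> None:
        """Call when the channel reports free space again."""
        self.writable = True
        self._flush()

    def _flush(self) -> None:
        if not self.pending:
            return
        n = self.channel.write(bytes(self.pending))
        del self.pending[:n]        # keep only the unaccepted remainder
        if self.pending:            # short write: buffer is full, stop writing
            self.writable = False


channel = FakeChannel(tx_capacity=100)
sender = Sender(channel)
packets = [bytes([i]) * 38 for i in range(10)]
for p in packets:
    sender.send(p)

received = b""
while channel.tx_buffer or sender.pending:
    received += channel.drain(50)
    sender.on_write_event()

print(received == b"".join(packets))
```

In this sketch the receiver sees every byte exactly once and in order, even though several writes were only partly accepted.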
I start writing after a WIP_CEV_WRITE event and continue until wip_write returns a number smaller than the number of bytes I requested to write (or an error event occurs). After that I wait for another WIP_CEV_WRITE event before writing more data.
In this case I am not transmitting large buffers. One data packet consists of 38 bytes (I tested with different packet sizes). I write the data (if the channel is ready for data) every 500 ms (a transfer interval of 2 s also produces the error).
The internal TCP TX buffer seems to collect 2 or 3 of my data packets (3 * 38 bytes = 114 bytes) before sending, and the next transmitted packet is corrupt.