Lost data over TCP/IP (SSL)

We have a socket connection to a server over TCP/IP using SSL. Sometimes I lose packets.
I write several packets, which the server receives successfully. Then comes at least one that gets lost.
wip_write returns the number of bytes actually written, but the data is never received by the server. The next write returns the error -999, followed by a WIP_CEV_PEER_CLOSE event.

Why does wip_write return successfully when the data is not received by the server?

Thanks for helping :wink:

In the tcpdump on the server I see that the lost packet contains more data bytes than the successfully transmitted packets. The extra bytes are the last bytes of the previous, successfully transmitted packet.

The server software (Python) crashes with the error ‘SSL3_GET_RECORD:wrong version number’. It seems that the server software cannot decrypt the SSL data because of the extra bytes. So the OpenAT firmware sends corrupt data.

How do you know that it’s the OAT sending corrupt data - rather than the data getting corrupted at some other point(s) along the link… :question:

We used different GPRS networks and frequency channels, and we repeatedly get corrupted data.
We also checked the transmission without SSL. The tcpdump again showed repeated bytes.

Simplified payload received on the server:
Packet1: 0x00 0x11 0x22 0x33 0x00 0x01
Packet2: 0x00 0x11 0x22 0x33 0x00 0x02
Packet3: 0x00 0x11 0x22 0x33 0x00 0x03
Packet4 & 5: 0x00 0x11 0x22 0x33 0x00 0x04 0x00 0x11 0x22 0x33 0x00 0x05
Packet6: 0x00 0x05 0x00 0x11 0x22 0x33 0x00 0x06
Packet7: 0x00 0x11 0x22 0x33 0x00 0x07

The 6th packet starts with the last bytes of the 5th packet. When this happens with SSL, the SSL header is corrupt and the data cannot be decoded.
Could there be a bug in the TCP stack?
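
One note on reading the dump above: TCP is a byte stream, so packets 4 and 5 arriving in one segment is normal and harmless. The anomaly is only the two stray bytes in front of packet 6. A minimal sketch of a check for this, assuming the 6-byte record layout of the simplified payload (4-byte header 0x00 0x11 0x22 0x33 plus a 2-byte counter); the names scan_stream and scan_result are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed record layout, taken from the simplified payload:
 * 4 header bytes followed by a 2-byte counter. */
#define REC_LEN 6
static const unsigned char REC_HDR[4] = {0x00, 0x11, 0x22, 0x33};

typedef struct {
    size_t records;      /* complete records found */
    size_t stray_bytes;  /* bytes that do not belong to any record */
} scan_result;

/* Walk the received byte stream, ignoring segment boundaries.
 * A record is accepted when the header matches and 6 bytes are available;
 * any other byte is counted as stray and skipped. */
static scan_result scan_stream(const unsigned char *buf, size_t len)
{
    scan_result r = {0, 0};
    size_t i = 0;

    assert(buf != NULL || len == 0);
    while (i < len) {
        if (len - i >= REC_LEN &&
            memcmp(buf + i, REC_HDR, sizeof REC_HDR) == 0) {
            r.records++;
            i += REC_LEN;
        } else {
            r.stray_bytes++;
            i++;
        }
    }
    return r;
}
```

Fed the seven packets above as one continuous stream, this counts 7 records and 2 stray bytes (the 0x00 0x05 in front of packet 6). An incomplete record at the end of the buffer is also counted as stray bytes, so it is only meant for checking a finished capture.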

Hiya,

It sounds like you are filling/overflowing the TCP buffer in the module.

You have to call wip_write() until the value written is not equal to the number that you requested to write. This indicates that the internal TCP write buffer is full and is waiting to be transmitted.

When wip_write() returns, it doesn’t mean that the data has been transmitted, it just means that it has been transferred from your buffer into the internal TCP TX buffer…

You then have to stop writing until you get another WIP_CEV_WRITE event which indicates that there is more space available to queue the next data to write.

This is explained (but not very clearly) in the WIP doco. I struck this myself when first using WIP, and you have to follow the example to deal with the buffer not being fully transmitted. In my experience, this issue does not become apparent until you start transmitting large chunks of data - bigger than the Ethernet MTU packet size (roughly 1500 bytes).
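
The write / wait-for-event pattern described above can be sketched roughly as follows. This is not the real WIP API: mock_channel, mock_write and mock_drain are hypothetical stand-ins for the module's TX buffer, wip_write() and the stack emptying the buffer (after which the real stack would raise WIP_CEV_WRITE), so that the logic can be run on a PC. The point it illustrates is keeping an offset of bytes already accepted, so a short write is resumed from the first unsent byte.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the module's internal TCP TX buffer. */
typedef struct {
    size_t capacity;  /* total buffer size in bytes */
    size_t used;      /* bytes queued but not yet transmitted */
} mock_channel;

/* Stand-in for wip_write(): accepts as many bytes as fit, returns the count. */
static int mock_write(mock_channel *ch, const unsigned char *data, size_t len)
{
    size_t room = ch->capacity - ch->used;
    size_t n = len < room ? len : room;
    (void)data;  /* the model only tracks byte counts */
    ch->used += n;
    return (int)n;
}

/* Stand-in for the stack transmitting everything that was queued. */
static void mock_drain(mock_channel *ch) { ch->used = 0; }

typedef struct {
    const unsigned char *data;
    size_t len;
    size_t sent;    /* bytes already accepted by the channel */
    int can_write;  /* set on a write event, cleared on a short write */
} sender;

/* Call this from the handler for the write event. */
static void sender_on_write_event(sender *s) { s->can_write = 1; }

/* Queue pending bytes until everything is accepted or a short write occurs.
 * Returns the number of bytes accepted in this call, or a negative error. */
static int sender_pump(sender *s, mock_channel *ch)
{
    int total = 0;

    assert(s->sent <= s->len);
    while (s->can_write && s->sent < s->len) {
        size_t want = s->len - s->sent;
        int n = mock_write(ch, s->data + s->sent, want);
        if (n < 0)
            return n;              /* error: let the caller handle it */
        s->sent += (size_t)n;      /* never re-queue accepted bytes */
        total += n;
        if ((size_t)n < want)
            s->can_write = 0;      /* buffer full: wait for the next event */
    }
    return total;
}
```

With a 10-byte mock buffer and a 25-byte message, sender_pump() accepts 10 bytes, then nothing until mock_drain() and sender_on_write_event() are called, then 10 more, then the last 5.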

I start writing after a WIP_CEV_WRITE event and continue until wip_write returns a number smaller than the number of bytes I requested (or an error event occurs). After this I wait for another WIP_CEV_WRITE event before writing more data.
In this case I do not transmit large buffers. One data packet consists of 38 bytes (I tested with different packet sizes). I write the data (if the channel is ready for data) every 500 ms (a transfer interval of 2 s also produces the error).
The internal TCP TX buffer seems to collect 2 or 3 of my data packets (3 * 38 bytes = 114 bytes) before sending, and the next transmitted packet is corrupt.

Hiya,

Hmm, interesting.

There are a number of TCP parameters that you can tune using the wip_setOpts() function. These include buffer sizes and TCP timeouts.

Not in front of my dev PC at the moment, so can’t tell you exactly which settings might be of use.

ciao, Dave