Multiple race conditions and bugs in SOCKD

mafredri
Associate II
Posted on November 20, 2016 at 18:43

The SOCKD implementation suffers from many race conditions and bugs, which make it impossible to build robust applications without forcibly rebooting the WiFi module whenever it does not behave as expected.

Problems & bugs:

1. Data from client can appear before WIND (Client connected, now in Data Mode) if the client disconnects quickly:

    Hello from socket!\r\n

    \r\n+WIND:61:Incoming Socket Client:192.168.0.10\r\n

    \r\n+WIND:60:Now in Data Mode\r\n

    \r\n+WIND:59:Back to Command Mode\r\n

    \r\n+WIND:62:Socket Client Gone:192.168.0.10\r\n

2. Large messages can cause hard fault when in command mode with a connected client and entering data mode (+WIND:8:Hard Fault:CW1200RxPrs: r0 00000070, r1 00000078, r2 00000068, r3 B3E51B1F, r12 00000002, lr 08016365, pc 080163A4, psr 21000000)

3. Closing SOCKD server (AT+S.SOCKD=0) when a large message is pending can cause SOCKD to not respond even when enabled (AT+S.SOCKD=32000) until WiFi module is rebooted.

4. Entering data mode at the same time a client disconnects can cause a loss of WIND 62 (Socket client gone)

5. Entering command mode as a client connects leaves "at+s." in the buffer and prevents WINDs from arriving. "at+s." also takes ~500 ms to take effect, which means you have to wait an arbitrary amount of time and check whether you have actually exited command mode or not, taking care to clear the "at+s." with a "\r\n" if you did not receive a WIND.

6. Client connection during AT command prevents AT command from completing (see my other thread: Incoming Socket Client during AT command (HTTPGET))
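A host application can at least detect the ordering problem from item 1 by splitting the UART stream into data and WIND events. The following is a minimal sketch, assuming every indication is framed as `\r\n+WIND:<code>:<text>\r\n` exactly as in the log above; the names `split_stream` and `data_before_data_mode` are hypothetical helpers, not part of any ST library.

```python
import re

# Assumed framing, taken from the log in item 1: \r\n+WIND:<code>:<text>\r\n
WIND_RE = re.compile(rb"\r\n\+WIND:(\d+):([^\r\n]*)\r\n")

def split_stream(raw):
    """Split raw UART bytes into ("data", bytes) and ("wind", code, text) events."""
    events = []
    pos = 0
    for m in WIND_RE.finditer(raw):
        if m.start() > pos:
            # Anything between two indications is treated as socket data.
            events.append(("data", raw[pos:m.start()]))
        text = m.group(2).decode("ascii", "replace")
        events.append(("wind", int(m.group(1)), text))
        pos = m.end()
    if pos < len(raw):
        events.append(("data", raw[pos:]))
    return events

def data_before_data_mode(events):
    """Return True if data shows up before WIND 60 (Now in Data Mode)."""
    for ev in events:
        if ev[0] == "wind" and ev[1] == 60:
            return False
        if ev[0] == "data":
            return True
    return False

# The byte sequence reported in item 1.
raw = (b"Hello from socket!\r\n"
       b"\r\n+WIND:61:Incoming Socket Client:192.168.0.10\r\n"
       b"\r\n+WIND:60:Now in Data Mode\r\n"
       b"\r\n+WIND:59:Back to Command Mode\r\n"
       b"\r\n+WIND:62:Socket Client Gone:192.168.0.10\r\n")
events = split_stream(raw)
out_of_order = data_before_data_mode(events)
```

For the log from item 1, `out_of_order` is true because the payload precedes WIND 61 and WIND 60. This only detects the symptom; it cannot repair the ordering.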

Suggestion: Remove (or disable with option) Data/Command mode

This mode results in nondeterministic behavior. Instead, notifying through the UART that there is pending data and letting the application read/write N bytes of data through AT commands would solve many issues. This means we can trust the guarantees of AT command behavior instead of trying to handle all the race conditions of the Data/Command mode.

This would also fix the trust problem of Data Mode: there is no way to know whether data is coming from the socket client or from the WiFi module (a socket client could be sending forged +WIND:xx messages).
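To illustrate why a length-delimited read would remove the forgery problem, here is a small sketch. It assumes a hypothetical interface in which the module first announces how many bytes are pending and the host then reads exactly that many; `read_payload` and the exchange below are illustrative only and do not correspond to an existing SPWF01 command.

```python
import io

def read_payload(uart, n):
    """Read exactly n payload bytes; the content is never scanned for +WIND."""
    data = uart.read(n)
    if len(data) != n:
        raise EOFError("short read")
    return data

# Hypothetical exchange: the client sent a forged indication as ordinary data,
# and the module follows it with a genuine one.
forged = b"\r\n+WIND:62:Socket Client Gone:192.168.0.10\r\n"
genuine = b"\r\n+WIND:59:Back to Command Mode\r\n"
uart = io.BytesIO(forged + genuine)

payload = read_payload(uart, len(forged))  # announced length, assumed known
rest = uart.read()                         # only this part is parsed as WIND
```

Because the host knows the payload length in advance, the forged bytes end up in `payload` and only `rest` is interpreted as module output. In Data Mode both would arrive in the same undelimited stream.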

#spwf01sa-socket-sockd-server
8 REPLIES
Posted on December 10, 2016 at 15:29

Hi,

Thanks for the detailed log. The problems (hard fault and wrong WIND/data order) have been solved, and the fix will be part of the next FW release.

About #5: I suspect the 500 ms delay is related to the default ip_sockd_timeout.

Regards

jerry

Posted on December 17, 2016 at 11:07

Thanks Jerry, looking forward to the next firmware release!

Regarding #5, that seems plausible. I read the user manual, but it's not entirely clear to me what the practical implications of decreasing `ip_sockd_timeout` are.

Socket server - buffer timeout management (from 5 ms to 250 ms)

Warning: 250 ms is suggested to avoid data loss

I'm not sure what a buffer timeout in this context means, and why less than 250 ms might result in data loss.

Posted on December 18, 2016 at 10:07

Hi,

Since you have a full picture of the current issues, do you want to join the beta-tester crew for the next FW release?

About sockd_timeout, that sentence needs to be deleted from the UM. To avoid data loss, simply use flow control.

It is the timeout that data_mode_console needs in order to understand that no other bytes are going to be sent to the remote.

A summary slide:

  • Concerning Buffer usage, it is sent out when:

    • Timeout expires (no byte is received over UART for "ip_sockd_timeout" ms)

    • Buffer is full (note that the buffer is 1 KB in size)

  • Concerning timings:

    • Every byte is placed in the Buffer while no Timeout is found. When a Timeout occurs, the Buffer is sent to the Client.

    • Every Escape Sequence must be followed by a Timeout. If the Escape is followed by further bytes, the whole "Escape+bytes" sequence is sent to the Client.

    • In order to free the buffer for the next operation, a Timeout should go before every Escape Sequence. In this case, the bytes before the Escape Sequence are sent, the buffer is freed, and so the Escape Sequence is correctly received and understood. Otherwise, if the Escape Sequence is sent at the end of the buffer, around the 1 KB size limit, the whole "bytes+Escape" sequence is sent to the client in 2 different packets.
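The rules on the slide can be turned into a small host-side model. This is a sketch of the behavior as described above, not of the actual firmware: it assumes the buffer holds 1024 bytes, that the escape sequence is the "at+s." string mentioned earlier in the thread, and that the escape is only recognized when it is the sole content of the buffer at the moment a timeout expires. The function name `simulate` is hypothetical.

```python
ESCAPE = b"at+s."   # escape sequence quoted earlier in the thread
BUF_SIZE = 1024     # assumption: the slide's 1 KB buffer means 1024 bytes

def simulate(writes, timeout_ms=250, buf_size=BUF_SIZE):
    """Model the data-mode buffer.

    writes: list of (time_ms, bytes) UART writes in time order.
    Returns a list of ("send", payload) and ("escape", b"") actions.
    """
    actions = []
    buf = bytearray()
    last_rx = None

    def on_timeout():
        # Timeout expired: either the buffer is exactly the escape
        # sequence, or its content goes to the client.
        if not buf:
            return
        if bytes(buf) == ESCAPE:
            actions.append(("escape", b""))
        else:
            actions.append(("send", bytes(buf)))
        buf.clear()

    for t, chunk in writes:
        if last_rx is not None and t - last_rx >= timeout_ms:
            on_timeout()
        for byte in chunk:
            buf.append(byte)
            if len(buf) == buf_size:
                # Buffer full: content is sent without waiting for a timeout.
                actions.append(("send", bytes(buf)))
                buf.clear()
        last_rx = t
    on_timeout()  # the silence after the last write
    return actions

# Escape preceded and followed by a timeout: recognized.
ok = simulate([(0, b"hello"), (300, b"at+s.")])
# Escape sent right after data: forwarded to the client as payload.
glued = simulate([(0, b"hello"), (10, b"at+s.")])
# Escape straddling the buffer limit: split across two packets.
split = simulate([(0, b"a" * 1021), (1, b"at+s.")])
```

In this model `ok` ends with an escape action, `glued` is a single send containing `helloat+s.`, and `split` produces two sends, which matches the three cases the slide describes. It also suggests why the host should stay silent for at least `ip_sockd_timeout` ms both before and after writing the escape sequence.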

Posted on December 18, 2016 at 12:26

gallucci.gerardo wrote:

Hi,

Since you have a full picture of the current issues, do you want to join the beta-tester crew for the next FW release?

Yes, that would be great!

Also, thank you for the explanation about the sockd timeout and the summary slide. It clarifies a lot!

Posted on December 18, 2016 at 20:15

Please send me a private message, containing your email address.

Thanks

jerry

Christopher Karle
Associate
Posted on June 25, 2017 at 14:42

The SOCKD interface is kinda broken.  The entire command/data mode stuff could be supported/integrated with the other socket functions (SOCKQ, SOCKR, SOCKW).

I am working with version '160129-c5bf5ce-SPWF01S' which still exhibits these problems.  Has a firmware version that addresses these issues been released?

Posted on June 26, 2017 at 14:13

The next FW version has not been released yet.

SOCKD will not be aligned to the socket client AT commands (on, write, read, query and close). Command/data mode will still be there.

If you want the server aligned to the client, you can switch to the SPWF04.

Posted on June 26, 2017 at 14:28

Would it be possible to receive early access to the beta firmware?