
[STM32H563] NetX Duo slow TLS handshake

Sander_UP
Associate III

Hello!

I have an MQTT project using NetX Duo that publishes MQTT messages to a remote server. Before it can publish, it must establish a secure connection using TLS v1.2, which takes about 18-20 seconds (measured while debugging and stepping through the connect function). I am looking for a way to make that process faster.

I found this post on the forums: "porting Mbed TLS to the STM32H5 platform with hardware crypto acceleration". It raised some questions for me. First, how exactly would I implement this in NetX Duo's TLS setup, if that is even possible? Second, the AES option is missing from my CubeMX settings. Is that expected, or am I missing something? I only have the PKA, HASH, RNG and GTZC options.

I looked at the files included in the post linked above on how to port Mbed TLS, but I was a bit overwhelmed as to how one would integrate it into NetX Duo's TLS connection setup. I am using RSA keys for the secure connection. I was hoping someone could suggest some initial steps, or share an example of how they implemented and used it with NetX Duo.

1 ACCEPTED SOLUTION


Hello @mbarg.1 

Thank You for the suggestion, but I found another fix. The problem was not related to the Internet connection, but to the calculations that were done during the handshake. In the debug build, the code for those calculations was too "heavy" for the MCU. When I compiled the files inside the crypto_libraries folder with the -Os flag, the handshake went from about 20 seconds to about 1-3 seconds. Optimizing those files helped a lot.

While debugging, I saw that the function that stalled for so long was _nx_secure_tls_send_certificate_verify, which deals with encrypting a hash of the received messages. After optimizing with the -Os flag, it is now quite fast.
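
For readers who want to keep the rest of a debug build unoptimized, here is a minimal sketch of a related technique. It is not what was done above (there, -Os was applied to the crypto_libraries files through the build settings). It assumes a GCC-based toolchain and uses GCC's function-specific optimization pragmas to compile only the code between push_options and pop_options for size. The function demo_modpow is a hypothetical stand-in for the kind of multiply-and-reduce loop that RSA operations spend their time in; it is not part of NetX Duo.

```c
#include <assert.h>
#include <stdint.h>

/* Everything between push_options and pop_options is compiled as if
 * -Os had been given, even when the rest of the file is built at -O0.
 * Compilers that do not know these pragmas are expected to ignore them. */
#pragma GCC push_options
#pragma GCC optimize("Os")

/* Hypothetical stand-in for a crypto hot loop: square-and-multiply
 * modular exponentiation on 32-bit operands. */
uint32_t demo_modpow(uint32_t base, uint32_t exponent, uint32_t modulus)
{
    assert(modulus != 0u);

    uint64_t result = 1u % modulus;   /* handles modulus == 1 */
    uint64_t b = base % modulus;

    while (exponent > 0u) {
        if (exponent & 1u) {
            result = (result * b) % modulus;  /* multiply step */
        }
        b = (b * b) % modulus;                /* square step */
        exponent >>= 1u;
    }
    return (uint32_t)result;
}

#pragma GCC pop_options
```

For example, demo_modpow(4, 13, 497) returns 445. GCC's documentation describes the optimize pragma and attribute as intended mainly for debugging, so per-file or per-folder compiler flags, as used in the fix above, are probably the better choice for a release build.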


4 REPLIES
mbarg.1
Senior III

Before looking for the holy grail, try to understand the basics of your problem.

Many processes rely on NO-REPLY validation: the system sends challenges and waits to see whether anybody replies. A reply means the choice is bad; no reply means the choice is good. An example is SLAAC.

TLS requires access to remote resources via the network. On weak, slow or unreliable connections, the system will wait for a timeout before retrying.

Computation overhead can cost some milliseconds of CPU time, not seconds. Obviously, if you have a near-real-time process, it is preferable to offload the computation.

Crypto code is very large and easy to corrupt. Having tested, certified code running is always welcome; it will shorten the validation process and make your product more reliable.

For a 20-second delay, look at the NetX code and the choices you made: use DHCPv6 instead of SLAAC, try multiple IP addresses instead of waiting for a timeout, and so on. Hardware solutions are great but will not help in this case.

Sander_UP
Associate III

@mbarg.1  Thank You for the suggestion.

But what exactly do you mean by "try multiple IP addresses"? And how exactly does using DHCPv6 help if obtaining an IP address is not the problem here?

From the Wireshark logs I can see that the IP address has been configured and the Internet connection has been established (I check by pinging 8.8.8.8). I can also see the connection being established (MQTT connect commands sent by the IoT device and by the remote server, along with TCP SYN and ACK packets), and then suddenly there is a delay of approximately 20 seconds. After that, another MQTT connect command is sent from the remote server, and then everything is OK.

After the first MQTT connect command, I see a TCP Spurious Retransmission packet sent from the remote server. Maybe the IoT device is slow to answer, and that causes the 20-second delay. While debugging, I can see that the program is stalling in the _nx_secure_tls_handshake_process function.
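
One way to tell a network timeout from slow computation without single-stepping is to time individual steps in code. The following is a minimal sketch under stated assumptions: measure_elapsed_ticks and the demo_ helpers are hypothetical names, not NetX Duo or ThreadX APIs. On the target, the tick source could be a thin wrapper around ThreadX's tx_time_get(), and the timed step could be the TLS connect call; here a fake tick source is used so the logic can be checked on a host machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the current tick count from some free-running counter. */
typedef uint32_t (*tick_source_fn)(void);
/* The piece of work whose duration should be measured. */
typedef void (*timed_step_fn)(void *context);

/* Runs step(context) and returns how many ticks it took.
 * Unsigned subtraction gives the right answer even if the counter
 * wraps around once while the step is running. */
uint32_t measure_elapsed_ticks(tick_source_fn now, timed_step_fn step,
                               void *context)
{
    assert(now != NULL && step != NULL);

    uint32_t start = now();
    step(context);
    return (uint32_t)(now() - start);
}

/* Host-side fakes for demonstration only. */
static uint32_t demo_fake_ticks;

uint32_t demo_fake_now(void)
{
    return demo_fake_ticks;
}

/* Pretends to work for *context ticks by advancing the fake counter. */
void demo_fake_step(void *context)
{
    demo_fake_ticks += *(const uint32_t *)context;
}
```

If the ticks reported for the handshake stay roughly the same on a fast and a slow network, the time is most likely being spent in computation on the device rather than in waiting for the remote side.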

If the problem is that the server or the client falls asleep after the TCP connection is made, it is a code problem, unless something is waiting for some other task to complete.

I have seen many systems wait for SLAAC procedures to complete before continuing the dialogue, or wait for non-existent DNS/SNTP/... servers to reply. In that case, trying a secondary DNS server (with another IP address) instead of waiting for the primary to time out can help; the same goes for NTP, Internet data connection validation, ...
