STM32F767 + AWS IoT expansion fails when publishing larger data

skhem
Associate

We are using an STM32F767 with the AWS IoT expansion package. Because the expansion cannot be generated from STM32CubeMX directly, we generated an empty project in STM32CubeMX with FreeRTOS, LwIP and mbedTLS enabled, and then applied parts of the AWS IoT expansion to it. Where files conflicted, we used the linker script, syscalls.c, net.c, lwip_net.c, net_tcp_lwip.c, mbedtls_net.c and net_tls_mbedtls.c from the AWS IoT expansion. The example code runs successfully: we can control the LED from the AWS cloud without any problem.

However, we would like to stream much more data over MQTT (2.5 KB per message, 8 messages per second, or about 20 KB/s). That is where the problems start.

First, when we increase the publish message size beyond TCP_MSS (this is our guess at the threshold), oversize problems start to occur, with different assertions and errors from run to run. (Publishing only, no subscription.)
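As a rough check of the numbers above, here is a minimal sketch of the arithmetic. It assumes 2.5 KB means 2500 bytes and that TCP_MSS is 1460 (the usual Ethernet maximum; the real value is whatever lwipopts.h defines). The helper names are ours, not part of LwIP, and MQTT and TLS overhead is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value: the real TCP_MSS comes from the project's lwipopts.h. */
#define ASSUMED_TCP_MSS 1460u

/* Payload bytes per second for a fixed message size and message rate. */
static size_t payload_bytes_per_second(size_t msg_bytes, size_t msgs_per_second)
{
    return msg_bytes * msgs_per_second;
}

/* Minimum number of TCP segments needed to carry len bytes
 * (ceiling division by the maximum segment size). */
static size_t min_tcp_segments(size_t len, size_t mss)
{
    assert(mss > 0);
    return (len + mss - 1) / mss;
}
```

Under these assumptions one 2500-byte message no longer fits in a single segment, so every publish has to be split by the stack, which is consistent with the errors only appearing above a size threshold.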

# Code

for(;;){
    event = osMessageGet(awsQueueHandle,0);
    if(event.status==osEventMessage)
    {
       void * data = event.value.p;
       memcpy(cPayload,data,DATA_LEN);
       vPortFree(data);
       rc = aws_iot_mqtt_publish(&client, cPTopicName, strlen(cPTopicName), &paramsQOS0);
       if (rc != AWS_SUCCESS)
       {
         msg_error("publish error %d:\n", rc);
       }
    }
    else if(event.status==osOK || event.status==osEventTimeout)
    {
    }
    else
    {
       msg_error("dequeue error: %d\r\n",event.status);
    }
}

# Error from the first run

Assertion "unsent_oversize mismatch (pcb vs. last_unsent)" failed at line 469 in ../Middlewares/Third_Party/LwIP/src/core/tcp_out.c

Assertion "last_unsent->oversize_left >= oversize_used" failed at line 686 in ../Middlewares/Third_Party/LwIP/src/core/tcp_out.c

# Error from the second run

Assertion "last_unsent->oversize_left >= oversize_used" failed at line 686 in ../Middlewares/Third_Party/LwIP/src/core/tcp_out.c

Assertion "state!" failed at line 1673 in ../Middlewares/Third_Party/LwIP/src/api/api_msg.c

Assertion "already writing or closing" failed at line 1667 in ../Middlewares/Third_Party/LwIP/src/api/api_msg.c

Assertion "unsent_oversize mismatch (pcb vs. last_unsent)" failed at line 469 in ../Middlewares/Third_Party/LwIP/src/core/tcp_out.c

Assertion "inconsistent oversize vs. space" failed at line 473 in ../Middlewares/Third_Party/LwIP/src/core/tcp_out.c

Assertion "inconsistent oversize vs. len" failed at line 481 in ../Middlewares/Third_Party/LwIP/src/core/tcp_out.c

# Error from the third run

Assertion "state!" failed at line 1673 in ../Middlewares/Third_Party/LwIP/src/api/api_msg.c

Assertion "tcp_receive: valid queue length" failed at line 1191 in ../Middlewares/Third_Party/LwIP/src/core/tcp_in.c

Assertion "tcp_write: pbufs on queue => at least one queue non-empty" failed at line 342 in ../Middlewares/Third_Party/LwIP/src/core/tcp_out.c

Second, when we publish smaller messages, it runs without any problem for at least 20 minutes (we didn't test for longer). But when we also call aws_iot_mqtt_yield so that we publish and subscribe at the same time, we get new errors, which change from run to run.

# Code

for(;;){
    rc = aws_iot_mqtt_yield(&client, 10);
    if (rc != AWS_SUCCESS)
    {
      msg_error("yield error %d:\n", rc);
    }
    event = osMessageGet(awsQueueHandle,0);
    if(event.status==osEventMessage)
    {
       void * data = event.value.p;
       memcpy(cPayload,data,DATA_LEN);
       vPortFree(data);
       rc = aws_iot_mqtt_publish(&client, cPTopicName, strlen(cPTopicName), &paramsQOS0);
       if (rc != AWS_SUCCESS)
       {
         msg_error("publish error %d:\n", rc);
       }
    }
    else if(event.status==osOK || event.status==osEventTimeout)
    {
    }
    else
    {
       msg_error("dequeue error: %d\r\n",event.status);
    }
}

# Error from the first run

Assertion "state!" failed at line 1673 in ../Middlewares/Third_Party/LwIP/src/api/api_msg.c

Assertion "already writing or closing" failed at line 1667 in ../Middlewares/Third_Party/LwIP/src/api/api_msg.c

../Middlewares/Third_Party/mbedTLS/library/ssl_tls.c:4061: is a fatal alert message (msg 80)

../Middlewares/Third_Party/mbedTLS/library/ssl_tls.c:3739: mbedtls_ssl_read_record_layer() returned -30592 (-0x7780)

../Middlewares/Third_Party/mbedTLS/library/ssl_tls.c:6842: mbedtls_ssl_read_record() returned -30592 (-0x7780)

# Error from the second run

 Assertion "state!" failed at line 1673 in ../Middlewares/Third_Party/LwIP/src/api/api_msg.c

Assertion "already writing or closing" failed at line 1667 in ../Middlewares/Third_Party/LwIP/src/api/api_msg.c

Assertion "conn->current_msg != NULL" failed at line 1509 in ../Middlewares/Third_Party/LwIP/src/api/api_msg.c

We believe the problems in these two cases are related.

We tried our code on an STM32F7 Nucleo board to check that it is not a hardware-related problem.

We updated the AWS IoT Embedded C SDK to the latest version to check that the problem is not in the AWS IoT library.

We ported part of the code to a CC3220, and there we don't have the oversize problem (it uses FreeRTOS+TCP instead of LwIP).

Since there are so many errors, we don't know where to start. Can you please guide us on where to start debugging?

Do you think it could be a bug in the LwIP or mbedTLS porting?

What can be the root cause of these problems?

Best Regards,

Sarawin

2 REPLIES 2
JGENT
Associate II

Thanks for your detailed explanation, and sorry for not answering sooner. The problem looks like a threading issue, but I don't see why you would have one. Can you first check that your lwipopts.h, FreeRTOSConfig.h and mbedtls_config files are the same as the ones in the AWS IoT expansion package?

Then I suspect a memory issue. Can you check that you don't have any memory issues with FreeRTOS? We use the defines below in FreeRTOSConfig.h to track such issues:

#define configCHECK_FOR_STACK_OVERFLOW         1

#define configUSE_MALLOC_FAILED_HOOK           1
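With those two defines set, FreeRTOS expects the application to provide vApplicationStackOverflowHook and vApplicationMallocFailedHook. Below is a minimal sketch of what they could look like. The fault record, its type names and record_fault are hypothetical helpers for illustration, and the TaskHandle_t typedef is only a stand-in so the sketch compiles on its own; on the target, include FreeRTOS.h and task.h instead. The exact parameter types of the overflow hook vary between FreeRTOS versions, so check task.h in your project.

```c
#include <stddef.h>

/* Stand-in so the sketch compiles on a host; on the target remove this
 * and include "FreeRTOS.h" and "task.h" instead. */
typedef void *TaskHandle_t;

/* Hypothetical fault record, inspected from the debugger or printed later. */
typedef enum { FAULT_NONE = 0, FAULT_STACK_OVERFLOW, FAULT_MALLOC_FAILED } fault_kind_t;

typedef struct {
    fault_kind_t kind;
    char task_name[16];
} fault_record_t;

volatile fault_record_t g_last_fault;

/* Copy at most 15 characters of the task name and store the fault kind. */
static void record_fault(fault_kind_t kind, const char *name)
{
    size_t i = 0;
    if (name != NULL) {
        for (; i < sizeof g_last_fault.task_name - 1 && name[i] != '\0'; ++i) {
            g_last_fault.task_name[i] = name[i];
        }
    }
    g_last_fault.task_name[i] = '\0';
    g_last_fault.kind = kind;
}

/* Called by the kernel when configCHECK_FOR_STACK_OVERFLOW is non-zero
 * and a task stack overflow is detected. */
void vApplicationStackOverflowHook(TaskHandle_t xTask, char *pcTaskName)
{
    (void)xTask;
    record_fault(FAULT_STACK_OVERFLOW, pcTaskName);
    /* On the target, halt here (disable interrupts and loop) so the
     * debugger stops at the fault instead of running on corrupted memory. */
}

/* Called when configUSE_MALLOC_FAILED_HOOK is 1 and pvPortMalloc fails. */
void vApplicationMallocFailedHook(void)
{
    record_fault(FAULT_MALLOC_FAILED, NULL);
    /* On the target, halt here as well. */
}
```

Putting a breakpoint in each hook is usually enough to tell whether one of the tasks or the heap is too small.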

The AWS C SDK was not designed for RTOS-based platforms. In some of our projects we call the aws_iot functions from a single thread to avoid any issues. Your second example seems to be in line with this approach, but it may also be a direction to investigate.

Best Regards,

 Jean-Marc

Piranha
Chief II

Since there are so many errors, we don't know where to start. Can you please guide us on where to start debugging?

In files ethernetif.c and stm32f7xx_hal_eth.c.

Do you think it could be a bug in the LwIP or mbedTLS porting?

What can be the root cause of these problems?

I don't know about mbed TLS, but ST's lwIP implementation is definitely faulty and is causing problems in higher software layers. There are so many problems that it's useless to document them all, though you can find some of them described separately in this forum and in other sources. Basically it needs a total rewrite.

We tried our code on an STM32F7 Nucleo board to check that it is not a hardware-related problem.

You can also try my demonstration firmware:

https://community.st.com/s/question/0D50X0000AhNBoWSQW/actually-working-stm32-ethernet-and-lwip-demonstration-firmware