Posted on June 29, 2015 at 15:22
Hello,
We have a custom board with an STM32F429ZG MCU and libraries generated by CubeMX 4.7 (FW Lib 1.5). The board is a redesign of an older board that used an STM32F103ZG MCU with a Wiznet W5300 Ethernet coprocessor.
The old FW is based on the non-HAL/CubeMX libraries V3.5.0 (SPL?).
Used Libraries (new board):
FreeRTOS 8.1.2
FatFS R0.10b
LwIP: Self-modified to handle data transfers with 64k per tcp_write (exchanged several u16 with u32 for snd_buf and like variables)
Zero Copy ethernetif + stm32f4xx_hal_eth driver in send and receive direction
sdio.c implementation uses two independent DMAs for send and receive (as found in this forum)
With the new board we are facing some strange data corruption:
Reading larger binary files (2MB+) in chunks of the SD-Card page size (64k, but it also occurs with smaller chunks):
- Requesting a buffer of 64k from a custom LWIP mempool (memp_malloc())
- Read from SD Card to SDRAM 1 via DMA2_Stream3 Channel 4.
- Posting a pbuf to a FreeRTOS queue which is read by LWIP tcp_poll and eventually processed by tcp_write
- LwIP splits up the large 64k buffer into slices of 1460 bytes and passes a pbuf chain to the driver
- The low level ethernetif sets the ETHIF_DMATxDescriptors to point directly to the payload of the pbuf without copy: heth->TxDesc->Buffer1Addr = (uint32_t)p->payload;
- In the LwIP tcp_sent() callback the pbuf is freed once the receiver has acknowledged the data
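To correlate a corrupt byte offset in the received file with the steps above, a small host-side helper along these lines could be used. It is only a sketch: `stream_pos_t`, `segments_per_chunk()` and `locate()` are hypothetical names, and it assumes each 64k chunk is sliced into 1460-byte segments starting at the chunk boundary, which LwIP does not guarantee if it coalesces data across tcp_write calls.

```c
#include <stdint.h>

#define CHUNK_SIZE 65536u  /* one SD read / one tcp_write, as described above */
#define TCP_MSS    1460u   /* slice size LwIP uses, as described above */

/* Hypothetical helper type: where a byte offset of the stream lands. */
typedef struct {
    uint32_t chunk;       /* index of the 64k chunk */
    uint32_t segment;     /* index of the 1460-byte slice inside that chunk */
    uint32_t seg_offset;  /* byte offset inside that slice */
} stream_pos_t;

/* Number of slices one chunk is split into (the last one is shorter). */
uint32_t segments_per_chunk(void)
{
    return (CHUNK_SIZE + TCP_MSS - 1u) / TCP_MSS;
}

/* Map a byte offset in the transferred file to chunk / slice / offset.
 * Assumption: slicing restarts at every chunk boundary. */
stream_pos_t locate(uint32_t stream_offset)
{
    stream_pos_t pos;
    uint32_t in_chunk = stream_offset % CHUNK_SIZE;

    pos.chunk = stream_offset / CHUNK_SIZE;
    pos.segment = in_chunk / TCP_MSS;
    pos.seg_offset = in_chunk % TCP_MSS;
    return pos;
}
```

Under that assumption a 64k chunk becomes 44 full slices plus one short slice of 1296 bytes, so a corrupt region can be checked against both the chunk border and the slice borders.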
The entire memory allocation is handled by the LWIP mempool. The allocation/deallocation is guarded by FreeRTOS portSET_INTERRUPT_MASK_FROM_ISR() / portCLEAR_INTERRUPT_MASK_FROM_ISR(0);
We have also replaced it with the FreeRTOS heap_2 allocator, with its heap located in the SDRAM.
This works fine as long as the files are smaller than ~2MB. With larger files, however, we eventually end up with simultaneous SD card reads and ETH writes, because LwIP starts to process the pbuf queue while the SD card is still sending data.
On larger files I sometimes get corrupt data at the border of an SD-Card page (or at the beginning/end of a TCP transmission). Usually 564 bytes of data are corrupt. Within these corrupt bytes a constant value is added to or subtracted from the expected value, but only if Abs(constant value) < Abs(expected value).
Usually a data table like this is transferred (artificial values for testing):
Expected:
CH0 CH1 CH2 CH3 CH4 CH5 CH6 CH7 UINT16
20100 20200 20300 20400 20500 20600 20700 20800 5
20101 20201 20301 20401 20501 20601 20701 20801 5
20102 20202 20302 20402 20502 20602 20702 20802 5
...
30100 30200 30300 30400 30500 30600 30700 30800 5
Corrupt:
CH0 CH1 CH2 CH3 CH4 CH5 CH6 CH7 UINT16
20100 20200 20300 20400 20500 20600 20700 20800 5
13101 13131 13301 13401 13501 13601 13701 13801 5
13102 13132 13302 13402 13502 13602 13702 13802 5
...
30100 30200 30300 30400 30500 30600 30700 30800 5
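To measure the corrupt region on the PC side (first bad value, length, and the constant offset), a comparison helper like the following could be run over the expected and the received data. This is a minimal sketch: `corrupt_span_t` and `find_corrupt_span()` are hypothetical names, and it assumes the values are compared as 16-bit unsigned integers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type describing the corrupt region. */
typedef struct {
    int     found;  /* 1 if at least one value differs */
    size_t  first;  /* index of the first differing value */
    size_t  count;  /* values from first to last mismatch, inclusive */
    int32_t delta;  /* received - expected at the first mismatch */
} corrupt_span_t;

/* Scan both arrays and report where they differ.
 * Assumption: samples are stored as uint16_t. */
corrupt_span_t find_corrupt_span(const uint16_t *expected,
                                 const uint16_t *received,
                                 size_t n)
{
    corrupt_span_t span = {0, 0, 0, 0};
    size_t last = 0;

    for (size_t i = 0; i < n; i++) {
        if (expected[i] != received[i]) {
            if (!span.found) {
                span.found = 1;
                span.first = i;
                span.delta = (int32_t)received[i] - (int32_t)expected[i];
            }
            last = i;
        }
    }
    if (span.found) {
        span.count = last - span.first + 1;
    }
    return span;
}
```

Multiplying `first` and `count` by the value size gives byte offsets that can be compared with the SD-Card page border and the TCP segment borders.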
The oddest thing is, we found no invalid CRC reports. Nowhere.
The values actually stored on the SD card are correct. They are correct in the SDRAM as well. However, the data arrives corrupt on the PC side, without TCP/IP reporting an invalid CRC. I've tried both LwIP and HW CRC generation.
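One way to narrow this down could be an application-level check: compute a checksum over each 64k buffer right after the SD read, compute it again just before the buffer is freed, and compare both with the value computed on the PC. The sketch below uses the 16-bit one's-complement Internet checksum (RFC 1071), the same algorithm TCP uses; `buffer_checksum16()` is a hypothetical name and not part of LwIP or the HAL.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit one's-complement checksum as described in RFC 1071.
 * Bytes are combined as big-endian 16-bit words; an odd trailing
 * byte is padded with zero. */
uint16_t buffer_checksum16(const uint8_t *data, size_t len)
{
    uint64_t sum = 0;
    size_t i = 0;

    /* Sum all complete 16-bit words. */
    for (; i + 1 < len; i += 2) {
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    }
    /* Handle a trailing odd byte. */
    if (i < len) {
        sum += (uint32_t)data[i] << 8;
    }
    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16) {
        sum = (sum & 0xFFFFu) + (sum >> 16);
    }
    return (uint16_t)~sum;
}
```

If the value after the SD read and the value before the free differ, the buffer changed in memory; if they match but the PC computes something else, the change happened on the way out.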
The error doesn't appear if the SD card is not involved at all (writing the same structure directly into the SDRAM from an externally connected ADC via SPI/DMA and following the same procedure afterwards).
Does anyone have a hint where to look or how to isolate the root cause? To me it looks like some electrical behavior. It may be something in the board layout, but I wouldn't rule out the software either. I've ordered an STM32429I-EVAL to try to reproduce the problem, but it will take a few days until I receive it.
#cubemx #stm32f429 #eth #sd-card