2019-12-03 08:07 AM
I am trying to set up DHCP over Ethernet using LwIP, with the ultimate goal of using TCP/UDP for communication. I have configured everything through Cube and set up the MPU according to the FAQ guidelines for Ethernet on STM32H7, and I am finally receiving and sending packets.
Unfortunately, the packets I receive are just raw Ethernet frames with no recognizable protocol. I searched around and found only one other post about this; the fix there was to call LwIP_Process() in the main loop, which I am already doing.
Here is what Wireshark is capturing on the interface the STM32 is connected to:
Here is my MPU Configuration:
I have also attached my IOC file, any help will be greatly appreciated.
Edit: Providing more context:
Buffer definition:
ETH_DMADescTypeDef DMARxDscrTab[ETH_RX_DESC_CNT]
    __attribute__((section(".RxDecripSection"))); /* Ethernet Rx DMA Descriptors */
ETH_DMADescTypeDef DMATxDscrTab[ETH_TX_DESC_CNT]
    __attribute__((section(".TxDecripSection"))); /* Ethernet Tx DMA Descriptors */
uint8_t Rx_Buff[ETH_RX_DESC_CNT][ETH_RX_BUFFER_SIZE]
    __attribute__((section(".RxArraySection"))); /* Ethernet Receive Buffers */
Linker script:
/* Entry Point */
ENTRY(Reset_Handler)
/* Highest address of the user mode stack */
_estack = 0x24080000; /* end of RAM */
/* Generate a link error if heap and stack don't fit into RAM */
_Min_Heap_Size = 0x400; /* required amount of heap */
_Min_Stack_Size = 0x800; /* required amount of stack */
/* Specify the memory areas */
MEMORY
{
DTCMRAM (xrw) : ORIGIN = 0x20000000, LENGTH = 128K
RAM_D1 (xrw) : ORIGIN = 0x24000000, LENGTH = 512K
RAM_D2 (xrw) : ORIGIN = 0x30000000, LENGTH = 288K
RAM_D3 (xrw) : ORIGIN = 0x38000000, LENGTH = 64K
ITCMRAM (xrw) : ORIGIN = 0x00000000, LENGTH = 64K
FLASH (rx) : ORIGIN = 0x8000000, LENGTH = 2048K
}
/* Define output sections */
SECTIONS
{
/* The startup code goes first into FLASH */
.isr_vector :
{
. = ALIGN(4);
KEEP(*(.isr_vector)) /* Startup code */
. = ALIGN(4);
} >FLASH
/* The program code and other data goes into FLASH */
.text :
{
. = ALIGN(4);
*(.text) /* .text sections (code) */
*(.text*) /* .text* sections (code) */
*(.glue_7) /* glue arm to thumb code */
*(.glue_7t) /* glue thumb to arm code */
*(.eh_frame)
KEEP (*(.init))
KEEP (*(.fini))
. = ALIGN(4);
_etext = .; /* define a global symbols at end of code */
} >FLASH
/* Constant data goes into FLASH */
.rodata :
{
. = ALIGN(4);
*(.rodata) /* .rodata sections (constants, strings, etc.) */
*(.rodata*) /* .rodata* sections (constants, strings, etc.) */
. = ALIGN(4);
} >FLASH
.ARM.extab : { *(.ARM.extab* .gnu.linkonce.armextab.*) } >FLASH
.ARM : {
__exidx_start = .;
*(.ARM.exidx*)
__exidx_end = .;
} >FLASH
.preinit_array :
{
PROVIDE_HIDDEN (__preinit_array_start = .);
KEEP (*(.preinit_array*))
PROVIDE_HIDDEN (__preinit_array_end = .);
} >FLASH
.init_array :
{
PROVIDE_HIDDEN (__init_array_start = .);
KEEP (*(SORT(.init_array.*)))
KEEP (*(.init_array*))
PROVIDE_HIDDEN (__init_array_end = .);
} >FLASH
.fini_array :
{
PROVIDE_HIDDEN (__fini_array_start = .);
KEEP (*(SORT(.fini_array.*)))
KEEP (*(.fini_array*))
PROVIDE_HIDDEN (__fini_array_end = .);
} >FLASH
/* used by the startup to initialize data */
_sidata = LOADADDR(.data);
/* Initialized data sections goes into RAM, load LMA copy after code */
.data :
{
. = ALIGN(4);
_sdata = .; /* create a global symbol at data start */
*(.data) /* .data sections */
*(.data*) /* .data* sections */
. = ALIGN(4);
_edata = .; /* define a global symbol at data end */
} >RAM_D1 AT> FLASH
/* Uninitialized data section */
. = ALIGN(4);
.bss :
{
/* This is used by the startup in order to initialize the .bss section */
_sbss = .; /* define a global symbol at bss start */
__bss_start__ = _sbss;
*(.bss)
*(.bss*)
*(COMMON)
. = ALIGN(4);
_ebss = .; /* define a global symbol at bss end */
__bss_end__ = _ebss;
} >RAM_D1
/* User_heap_stack section, used to check that there is enough RAM left */
._user_heap_stack :
{
. = ALIGN(8);
PROVIDE ( end = . );
PROVIDE ( _end = . );
. = . + _Min_Heap_Size;
. = . + _Min_Stack_Size;
. = ALIGN(8);
} >RAM_D1
.lwip_sec (NOLOAD) : {
. = ABSOLUTE(0x30040000);
*(.RxDecripSection)
. = ABSOLUTE(0x30040060);
*(.TxDecripSection)
. = ABSOLUTE(0x30040200);
*(.RxArraySection)
} >RAM_D2 AT> FLASH
/* Remove information from the standard libraries */
/DISCARD/ :
{
libc.a ( * )
libm.a ( * )
libgcc.a ( * )
}
.ARM.attributes 0 : { *(.ARM.attributes) }
}
2019-12-03 10:42 AM
And what address is it *actually* using for the buffers, not just the descriptors?
Perhaps make the memory area larger, and make sure you commit any data with a fencing instruction or an explicit flush, i.e. __DSB() and SCB_CleanDCache_by_Addr().
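To make the explicit flush concrete: the Cortex-M7 D-cache works in 32-byte lines, so the range passed to a by-address clean is usually rounded out to whole lines. Below is a minimal sketch of that rounding. The names cache_range_t and cache_align_range are made up for illustration and are not part of CMSIS or the HAL.

```c
#include <stdint.h>
#include <stddef.h>

#define DCACHE_LINE_SIZE 32u /* Cortex-M7 D-cache line size in bytes */

/* Hypothetical helper type: a buffer range widened to whole cache lines. */
typedef struct {
    uintptr_t addr; /* start, rounded down to a line boundary */
    size_t    size; /* length, rounded up to whole lines */
} cache_range_t;

/* Widen [buf, buf + len) so it starts and ends on cache-line boundaries. */
static cache_range_t cache_align_range(const void *buf, size_t len)
{
    const uintptr_t mask = ~(uintptr_t)(DCACHE_LINE_SIZE - 1u);
    uintptr_t start = (uintptr_t)buf & mask;
    uintptr_t end   = ((uintptr_t)buf + len + DCACHE_LINE_SIZE - 1u) & mask;
    cache_range_t r = { start, (size_t)(end - start) };
    return r;
}

/*
 * On the target (assumption: CMSIS-Core for Cortex-M7 is available), the
 * result would be used right before handing the buffer to the Ethernet DMA:
 *
 *   cache_range_t r = cache_align_range(tx_data, tx_len);
 *   SCB_CleanDCache_by_Addr((uint32_t *)r.addr, (int32_t)r.size);
 */
```

A clean only writes dirty lines back to RAM, so widening the range to neighbouring data in the same lines is harmless for transmit. The same widening would not be safe for an invalidate on receive, which is why Rx buffers are normally line-aligned in the first place.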
2019-12-03 12:16 PM
Oh, this was it: the Txbuffer in low_level_output was on the stack and was being cached. Adding SCB_CleanInvalidateDCache() before calling HAL_ETH_Transmit(...) in ethernetif.c fixed the issue for me.
Just wondering, is there a better way to do this without touching generated files? (Unfortunately there is no /* USER CODE */ section in low_level_output.)
2019-12-03 01:12 PM
NVM, got it: I just used the MPU to set up the stack/heap area as Write Through RW Allocate, and it works.
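For anyone wondering what a write-through region looks like at register level, here is a rough sketch of the ARMv7-M MPU_RASR value for such a region. It assumes the stack and heap sit in RAM_D1 (0x24000000, 512K) as in the linker script above, and full read/write access. The RASR_* macros and function names are invented for this example; the settings in the post were made through Cube, not with this code.

```c
#include <stdint.h>

/* Field positions in the ARMv7-M MPU_RASR register. */
#define RASR_ENABLE     (1u << 0)
#define RASR_SIZE_SHIFT 1u        /* SIZE field, bits [5:1] */
#define RASR_C          (1u << 17) /* cacheable */
#define RASR_TEX_SHIFT  19u       /* TEX field, bits [21:19] */
#define RASR_AP_SHIFT   24u       /* AP field, bits [26:24] */
#define RASR_AP_FULL    3u        /* read/write for all privilege levels */

/* SIZE encodes a region of 2^(SIZE+1) bytes. Returns 0 for an invalid size. */
static uint32_t rasr_size_field(uint32_t bytes)
{
    uint32_t log2 = 0;
    if (bytes < 32u || (bytes & (bytes - 1u)) != 0u) {
        return 0u; /* must be a power of two, at least 32 bytes */
    }
    while ((1u << log2) < bytes) {
        log2++;
    }
    return log2 - 1u;
}

/* RASR value for a write-through region: TEX=0, C=1, B=0 (S left at 0). */
static uint32_t rasr_write_through(uint32_t bytes)
{
    uint32_t size = rasr_size_field(bytes);
    if (size == 0u) {
        return 0u; /* invalid size: leave the region disabled */
    }
    return (RASR_AP_FULL << RASR_AP_SHIFT)
         | (0u << RASR_TEX_SHIFT)
         | RASR_C
         | (size << RASR_SIZE_SHIFT)
         | RASR_ENABLE;
}
```

With write-through, every CPU store also reaches RAM, so a DMA read of a stack buffer sees current data without an explicit clean. The trade-off is more bus traffic on writes for the whole region.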
2019-12-04 04:06 AM
For these needs, __DMB() is enough. Unlike __DSB(), it lets the CPU keep executing non-memory-access instructions while it waits for a memory access to complete.
2019-12-04 04:08 AM
As Clive (Avogadro) already wrote, for a transmit, invalidation is not necessary; a clean is enough.
2019-12-04 04:13 AM
Be aware that output frames can also come from lwIP memory pools and even flash memory. D-cache management with SCB_CleanDCache_by_Addr() is therefore the more universal approach, and it performs better because the memory can stay cached while it is being processed.
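A sketch of what the per-address approach could look like for a chained frame: walk every segment and clean its payload range, wherever it lives. The struct frame_seg below is a simplified stand-in for lwIP's struct pbuf (only next, payload, and len), and the clean callback is an assumption so the logic can be tried on a host. On the target, the callback would wrap SCB_CleanDCache_by_Addr().

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-in for lwIP's struct pbuf; only the fields used here. */
struct frame_seg {
    struct frame_seg *next;    /* next segment of the same frame, or NULL */
    void             *payload; /* start of this segment's data */
    uint16_t          len;     /* length of this segment in bytes */
};

/* Callback that performs the actual D-cache clean for one range. */
typedef void (*clean_fn)(void *addr, int32_t size);

/* Clean every segment of a frame, widened to 32-byte cache lines.
 * Returns the number of segments visited. */
static unsigned clean_frame(const struct frame_seg *seg, clean_fn clean)
{
    const uintptr_t mask = ~(uintptr_t)31u;
    unsigned count = 0;

    for (; seg != NULL; seg = seg->next) {
        uintptr_t start = (uintptr_t)seg->payload & mask;
        uintptr_t end   = ((uintptr_t)seg->payload + seg->len + 31u) & mask;
        clean((void *)start, (int32_t)(end - start));
        count++;
    }
    return count;
}

/* Host-side stub: just totals the bytes it was asked to clean. */
static int32_t bytes_cleaned;
static void record_clean(void *addr, int32_t size)
{
    (void)addr;
    bytes_cleaned += size;
}
```

Cleaning a range that happens to be in flash or in uncached memory does no harm, so the loop does not need to know where each segment came from.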
2019-12-04 12:32 PM
That makes sense, but unfortunately Cube doesn't have /* USER CODE */ sections in that function, and we are a pretty big team, so it would get messy if I modified the generated code. Is there a way around this?