FTP Server Unaligned Memory failure

GreenGuy · ‎2024-04-20

STM32H743

STM32CubeIDE - Version: 1.14.1 Build: 20064_20240111_1413 (UTC)

STM32CubeMX - Version: 6.10.0-RC9 Build: 20231120-2037 (UTC)

Software Pack (AzureRTOS) 3.2.0

STM32Cube_FW_H7_V1.11.1

OS: Linux Mint 21

I am using the FTP server which works when compiled with Optimization None(-O0).

However, if I compile with Optimization Debug (-Og), the application ends up in HardFault_Handler().

The fault analyzer shows "Bus, memory management or usage fault (FORCED)", with Usage Fault details of "Attempt to perform an unaligned access (UNALIGNED)".

With Optimization None (-O0), I placed a breakpoint just before the fault, at this code:

                            if (attributes & FX_READ_ONLY)
                            {
                                memcpy(&buffer_ptr[1], "r--r--r--", 9); /* Use case of memcpy is verified. */
                            }
                            else
                            {
                                memcpy(&buffer_ptr[1], "rw-rw-rw-", 9); /* Use case of memcpy is verified. */
                            }

at line 3480 (line 1 above) of the _nx_ftp_server_command_process() function in the nxd_ftp_server.c file.

The fault occurs at line 3486 (line 7 above) when Optimization is anything other than None.

With Optimization None, line 3486 executes OK and the target memory looks as it should. With Optimization set to anything other than None, the fault occurs and the target memory is left untouched.

A workaround might be to just use Optimization None, but that causes problems elsewhere in the code and runs very slowly compared to an optimized build.

Difficult to believe this is a code bug in the FTP server. More likely a problem with the tool.

Pavel A. · ‎2024-04-20

IIRC unaligned fault can occur on unusual memory (device, strongly ordered attributes), even if the offending instruction normally allows unaligned access.

/* This, it seems, is why authors of ST examples advise configuring the ETH buffer memory as cacheable and doing cache management.

Many users (myself included) do not fully understand the difference between "normal non-cached", "shareable" and "device" attributes in Cortex-M7.

From the appnote AN4838, table 4, "normal non-cached" is TEX=001, C=0, B=0. If you make a subtle mistake and forget TEX=1, you get strongly ordered (TEX=0 C=0 B=0) or shared device (TEX=0 C=0 B=1), both of which require aligned access.

In the text after table 3: "The S field is equivalent to non-cacheable memory. .... The TEX, C and B bits are used to define cache properties for the region, and to some extent, its shareability".

Hmm? What will be TEX=001 C=0 B=0 S=1?

More to that... from the same table 4, there are three kinds of normal cacheable memory, all optionally shareable:

TEX=0 C=1 B=0, TEX=0 C=1 B=1 and TEX=1 C=1 B=1.

They differ in subtle behavior: cache allocate by writes or not, write back or not - all these differences can result in weird dependency on timing and order of accessing various addresses. Scary!

*/
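
To make the TEX/C/B combinations above easier to check, here is a minimal sketch that decodes them into a memory type. It follows the standard ARMv7-M encoding as I understand it; the names `mem_type_t`, `mpu_mem_type` and `mpu_allows_unaligned` are made up for illustration and are not part of any ST or ARM library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Memory types selected by the TEX/C/B bits of an ARMv7-M MPU region. */
typedef enum {
    MEM_STRONGLY_ORDERED,
    MEM_DEVICE,
    MEM_NORMAL,
    MEM_RESERVED            /* reserved or implementation defined */
} mem_type_t;

/* Decode TEX (3 bits), C and B (1 bit each) into a memory type. */
mem_type_t mpu_mem_type(uint8_t tex, uint8_t c, uint8_t b)
{
    assert(tex < 8 && c < 2 && b < 2);
    if (tex & 0x4) {
        return MEM_NORMAL;                  /* TEX=1BB: cached, inner/outer policies */
    }
    switch ((tex << 2) | (c << 1) | b) {
    case 0x0: return MEM_STRONGLY_ORDERED;  /* TEX=000 C=0 B=0 */
    case 0x1: return MEM_DEVICE;            /* TEX=000 C=0 B=1, shareable device */
    case 0x2:                               /* TEX=000 C=1 B=0, write-through */
    case 0x3:                               /* TEX=000 C=1 B=1, write-back */
    case 0x4:                               /* TEX=001 C=0 B=0, non-cacheable */
    case 0x7: return MEM_NORMAL;            /* TEX=001 C=1 B=1, write-back, write-allocate */
    case 0x8: return MEM_DEVICE;            /* TEX=010 C=0 B=0, non-shareable device */
    default:  return MEM_RESERVED;
    }
}

/* Only Normal memory tolerates unaligned accesses. */
bool mpu_allows_unaligned(uint8_t tex, uint8_t c, uint8_t b)
{
    return mpu_mem_type(tex, c, b) == MEM_NORMAL;
}
```

With this, forgetting TEX=1 on a buffer region (TEX=0 C=0 B=0) decodes to strongly ordered, where `mpu_allows_unaligned` returns false.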

GreenGuy · ‎2024-04-20

The MPU region where NetX uses memory is RAM_D2 (start 0x30000000), after the .Rx and .Tx descriptors. I changed the two sections there from TEX=1 to TEX=0. This makes no difference to how the webserver behaves, and the FTP server still ends up in HardFault_Handler() at the same point with the same faults. The settings in CubeMX are:

Pavel A. · ‎2024-04-20

So what is the faulting instruction and the data address shown in the fault analyzer: in descriptors or buffers?

The 2nd MPU area, 256KB, is invalid, because the base address is not aligned on the size. No idea why Cube allows this.
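
The alignment rule can be checked with a few lines of C. This is only a sketch: `mpu_region_is_valid` is a hypothetical helper, and the addresses in the usage note are example values, not the ones from the screenshot.

```c
#include <stdbool.h>
#include <stdint.h>

/* An ARMv7-M MPU region must have a power-of-two size of at least
 * 32 bytes, and its base address must be a multiple of that size.
 * Sizes are limited to what fits in 32 bits here (up to 2 GB). */
bool mpu_region_is_valid(uint32_t base, uint32_t size)
{
    bool power_of_two = (size != 0u) && ((size & (size - 1u)) == 0u);
    if (!power_of_two || size < 32u) {
        return false;
    }
    return (base & (size - 1u)) == 0u;
}
```

For example, a 256 KB region at 0x30000000 passes, while a 256 KB region at 0x30020000 fails because 0x30020000 is not a multiple of 0x40000.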

GreenGuy · ‎2024-04-20

Agreed on the 2nd MPU area. Not sure why I did not pick up on that. I reworked the linker script to segment memory according to the documentation in RM0433 Rev 8. So instead of having to remember that D1 is AXI SRAM and so on, I renamed everything according to the doc, like so:

MEMORY
{
  FLASH (rx)      : ORIGIN = 0x08000000, LENGTH = 2048K
  DTCMRAM (xrw)   : ORIGIN = 0x20000000, LENGTH = 128K
  AXI_SRAM (xrw)  : ORIGIN = 0x24000000, LENGTH = 512K
/*  RAM_D1 (xrw)  : ORIGIN = 0x24000000, LENGTH = 512K */
/*  RAM_D2 (xrw)  : ORIGIN = 0x30000000, LENGTH = 288K */
  SRAM1 (xrw)     : ORIGIN = 0x30000000, LENGTH = 128K
/*  RAM_D3 (xrw)  : ORIGIN = 0x38000000, LENGTH = 64K */
  SRAM2 (xrw)     : ORIGIN = 0x30020000, LENGTH = 128K
  SRAM3 (xrw)     : ORIGIN = 0x30040000, LENGTH = 32K
  SRAM4 (xrw)     : ORIGIN = 0x38000000, LENGTH = 64K
  SDRAM1 (xrw)    : ORIGIN = 0xC0000000, LENGTH = 32768K
  SDRAM2 (xrw)    : ORIGIN = 0xD0000000, LENGTH = 32768K
  ITCMRAM (xrw)   : ORIGIN = 0x00000000, LENGTH = 64K
}

Now everything is called out for its correct size and I don't have to keep mentally shifting gears.

I set up the MPU like this:

which should align ok with my linker directives:

 /* Networking resources */
  .tcp_sec (NOLOAD) : 
  {
   . = ABSOLUTE(0x30000000);
    *(.RxDecripSection)

   . = ABSOLUTE(0x30000060);
    *(.TxDecripSection)
  } >SRAM1 AT> FLASH /*RAM_D2 AT> FLASH*/


  .nx_data (NOLOAD) :
  {
   . = ALIGN(32);
   . = ABSOLUTE(0x30000200);
   *(.NxServerPoolSection)
  
   . = ALIGN(32);
   . = ABSOLUTE(0x30004200);
   *(.NetXPoolSection) 
  } >SRAM1 AT> FLASH /*RAM_D2 AT> FLASH*/

This runs OK as far as networking in general and the webserver are concerned.

However, the FTP server still ends up in the fault handler at the same place I pointed out in the first post, line 3486 in nxd_ftp_server.c:

memcpy(&buffer_ptr[1], "rw-rw-rw-", 9); /* Use case of memcpy is verified. */

The pointer &buffer_ptr[1] points to address 0x30000c4d.

The Fault Analyzer looks like this:

Pavel A. · ‎2024-04-21

So 0x30000c4d is in your 2nd MPU region, with attributes "strongly ordered" (TEX=000 C=0 B=0).

The memcpy function does not expect this (it was designed for normal memory); it probably tries to move data in 32-bit units and crashes on the unaligned address.
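
To illustrate the point, here is a small sketch of a copy routine that only ever issues single-byte accesses, which cannot be unaligned. It is not what NetX does and is not a substitute for fixing the region attributes; `copy_bytes` and `is_aligned` are hypothetical names used only for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if addr is a multiple of alignment (alignment must be a power of two). */
bool is_aligned(uintptr_t addr, size_t alignment)
{
    return (addr & (uintptr_t)(alignment - 1u)) == 0u;
}

/* Copy n bytes one at a time. The volatile qualifiers keep the compiler
 * from merging the byte accesses into wider loads or stores. */
void copy_bytes(volatile void *dst, const volatile void *src, size_t n)
{
    volatile uint8_t *d = dst;
    const volatile uint8_t *s = src;
    while (n--) {
        *d++ = *s++;
    }
}
```

The target address 0x30000c4d is not 4-byte aligned (`is_aligned(0x30000c4d, 4)` is false), so any 32-bit store starting there is an unaligned access.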

Btw, the 1st MPU region (256B) has no effect because the 2nd region overlaps it. The MPU is too hard to use in ARMv7. In ARMv8 it is improved and the attributes are more palatable.
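
The overlap rule on ARMv7-M is that the highest-numbered enabled region containing an address wins. A minimal sketch of that lookup follows; the type `mpu_region_t`, the function `mpu_effective_region`, and the 128 KB size in the usage note are assumptions for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t base;     /* region base address */
    uint32_t size;     /* region size in bytes, power of two */
    int      enabled;  /* non-zero if the region is enabled */
} mpu_region_t;

/* Return the index of the region whose attributes apply to addr,
 * or -1 if no enabled region covers it. Regions are searched from
 * the highest number down, because the highest-numbered match wins. */
int mpu_effective_region(const mpu_region_t *regions, size_t count, uint32_t addr)
{
    for (size_t i = count; i-- > 0; ) {
        const mpu_region_t *r = &regions[i];
        /* Unsigned subtraction wraps for addr < base, so one compare suffices. */
        if (r->enabled && (uint32_t)(addr - r->base) < r->size) {
            return (int)i;
        }
    }
    return -1;
}
```

With region 0 = 256 B at 0x30000000 and region 1 = 128 KB at 0x30000000, every address in the small region resolves to region 1, so region 0 never takes effect. Swapping the two numbers lets the small region override the large one for the descriptors.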

GreenGuy · ‎2024-04-21

I was backwards on my thinking regarding strongly ordered and TEX level. And agreed, I had the regions out of order.

Setting it up like this solves the problem:

Pavel A. · ‎2024-04-22

No more crash in memcpy?

GreenGuy · ‎2024-04-22

No more HardFaults.
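
For reference, the "normal non-cached" attribute combination discussed in this thread (TEX=001 C=0 B=0) can be written out as a raw MPU_RASR value. This is a sketch under assumptions: the 128 KB size, full-access permission and execute-never setting are example values and not necessarily the settings used above, and `mpu_rasr` and `log2_u32` are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions in the ARMv7-M MPU_RASR register. */
#define RASR_ENABLE      (1u << 0)
#define RASR_SIZE_SHIFT  1u
#define RASR_B_SHIFT     16u
#define RASR_C_SHIFT     17u
#define RASR_S_SHIFT     18u
#define RASR_TEX_SHIFT   19u
#define RASR_AP_SHIFT    24u
#define RASR_XN_SHIFT    28u

/* Integer log2 of a power-of-two value. */
unsigned log2_u32(uint32_t v)
{
    unsigned n = 0;
    while (v >>= 1) {
        n++;
    }
    return n;
}

/* Build an enabled RASR value. The SIZE field encodes 2^(SIZE+1) bytes. */
uint32_t mpu_rasr(uint32_t size_bytes, uint32_t tex, uint32_t c, uint32_t b,
                  uint32_t s, uint32_t ap, uint32_t xn)
{
    assert(size_bytes >= 32u && (size_bytes & (size_bytes - 1u)) == 0u);
    uint32_t size_field = log2_u32(size_bytes) - 1u;
    return (xn << RASR_XN_SHIFT) | (ap << RASR_AP_SHIFT) |
           (tex << RASR_TEX_SHIFT) | (s << RASR_S_SHIFT) |
           (c << RASR_C_SHIFT) | (b << RASR_B_SHIFT) |
           (size_field << RASR_SIZE_SHIFT) | RASR_ENABLE;
}
```

Under these assumptions, `mpu_rasr(128 * 1024, 1, 0, 0, 0, 3, 1)` gives 0x13080021, while the same call with TEX=0 gives 0x13000021, which is the strongly ordered encoding that faulted.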