
LDREX instruction on the STM32H755 when MPU activated causes hard fault

Associate III

Hello guys,

I recently had a problem executing the LDREX instruction on the STM32H755 when the MPU was enabled for the memory region containing the address accessed by LDREX. Immediately after executing LDREX I got a hard fault with the BFARVALID and PRECISERR bits set in BFSR, and the reported fault address was the address accessed by the LDREX instruction. With the MPU disabled there was no problem at all.

Before I provide the MPU configuration, I would like to ask a few questions:

1. Does the STM32H755 implement a global monitor for exclusive accesses?

2. When an ISR runs on the CPU that claimed an exclusive access, is the reservation also cleared in the global monitor, or only in the local one?

The MPU configuration is as follows:

  MPU_InitStruct.Enable = MPU_REGION_ENABLE;
  MPU_InitStruct.Number = MPU_REGION_NUMBER0;
  MPU_InitStruct.BaseAddress = someAddress;
  MPU_InitStruct.Size = 2;
  MPU_InitStruct.SubRegionDisable = 0x00;
  MPU_InitStruct.TypeExtField = MPU_TEX_LEVEL1;
  MPU_InitStruct.AccessPermission = MPU_REGION_PRIV_RW_URO;
  MPU_InitStruct.IsShareable = MPU_ACCESS_SHAREABLE;
  MPU_InitStruct.IsCacheable = MPU_ACCESS_CACHEABLE;
  MPU_InitStruct.IsBufferable = MPU_ACCESS_BUFFERABLE;


Accepted Solutions
Associate III

Actually, I got this information from ST directly. I created a support ticket, and here is ST's answer:

We don't implement global monitor on STM32H7. We recommend to use the HW semaphore for synchronization.

Best regards
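
For anyone landing here later, a minimal sketch of what "use the HW semaphore" could look like with the STM32 HAL HSEM driver. The semaphore ID, the counter variable, and the function name are hypothetical. The host stand-ins in the #else branch exist only so the sketch compiles off-target and mimic nothing but the return convention of HAL_HSEM_FastTake and HAL_HSEM_Release. On real hardware the HSEM clock must be enabled first, and on the M7 the shared data would probably also need to be non-cacheable or have explicit cache maintenance.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(USE_HAL_DRIVER)
#include "stm32h7xx_hal.h" /* real HAL on target */
#else
/* Host stand-ins (assumption, not ST code): one bit per semaphore. */
typedef enum { HAL_OK = 0, HAL_ERROR = 1 } HAL_StatusTypeDef;
static uint32_t fake_hsem_taken;

static HAL_StatusTypeDef HAL_HSEM_FastTake(uint32_t SemID)
{
    if (fake_hsem_taken & (1u << SemID)) {
        return HAL_ERROR; /* already held by someone else */
    }
    fake_hsem_taken |= 1u << SemID;
    return HAL_OK;
}

static void HAL_HSEM_Release(uint32_t SemID, uint32_t ProcessID)
{
    (void)ProcessID;
    fake_hsem_taken &= ~(1u << SemID);
}
#endif

#define SHARED_COUNTER_SEM_ID 0u /* hypothetical semaphore index */

/* Hypothetical variable; on target it would live in SRAM visible to both cores. */
static volatile uint32_t shared_counter;

/* Try once to increment the shared counter under the hardware semaphore.
 * Returns false if the other core currently holds the semaphore. */
bool shared_counter_try_increment(void)
{
    if (HAL_HSEM_FastTake(SHARED_COUNTER_SEM_ID) != HAL_OK) {
        return false;
    }
    shared_counter++; /* read-modify-write is now protected by the semaphore */
    HAL_HSEM_Release(SHARED_COUNTER_SEM_ID, 0);
    return true;
}
```

A caller would typically retry in a loop (or back off) when the function returns false, which replaces the LDREX/STREX retry loop with a semaphore-based one.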


Pavel A.
Evangelist III

Your LDREX is 16-bit, of course (LDREXH)?

I also tried the LDREXH instruction specifically (still the same problem). The address accessed by the instruction is word-aligned, so that should not be the problem.

It really looks like a problem between the MPU and the exclusive-access instructions; when the MPU is not active, it works without any problems.

Associate II

I have just stumbled upon this problem on the STM32F746. As far as I can tell, the HardFault only happens if the memory is specified in the MPU as shareable; if it is specified as non-shareable, the HardFault does not occur.

The fault that escalates to HardFault is actually a Bus Fault raised when LDREX tries to access shareable memory.

This seems like an unexpected limitation on this device, but it is not mentioned in the errata sheet.
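
To make the difference between the two configurations concrete, here is a small host-side sketch that packs the region attributes into the ARMv7-M MPU_RASR layout. The field positions are the architectural ones; the 32-byte region size and the function name are assumptions chosen only for illustration. The only bit that differs between the faulting and the working setup is S (bit 18).

```c
#include <stdbool.h>
#include <stdint.h>

/* ARMv7-M MPU_RASR field positions. */
#define RASR_ENABLE  (1u << 0)
#define RASR_SIZE(n) (((uint32_t)(n) & 0x1Fu) << 1) /* region size = 2^(n+1) bytes */
#define RASR_B       (1u << 16)                     /* bufferable */
#define RASR_C       (1u << 17)                     /* cacheable */
#define RASR_S       (1u << 18)                     /* shareable */
#define RASR_TEX(t)  (((uint32_t)(t) & 0x7u) << 19)
#define RASR_AP(a)   (((uint32_t)(a) & 0x7u) << 24)

/* Build a RASR value equivalent to TEX level 1, cacheable, bufferable,
 * privileged RW / unprivileged RO, with the shareable bit selectable.
 * The region size (SIZE=4, i.e. 32 bytes) is a hypothetical choice. */
uint32_t rasr_for_region(bool shareable)
{
    uint32_t rasr = RASR_ENABLE
                  | RASR_SIZE(4)
                  | RASR_TEX(1) | RASR_C | RASR_B
                  | RASR_AP(2);
    if (shareable) {
        rasr |= RASR_S;
    }
    return rasr;
}
```

In HAL terms the non-shareable variant corresponds to setting IsShareable to MPU_ACCESS_NOT_SHAREABLE instead of MPU_ACCESS_SHAREABLE.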

Associate II

That cannot be the reason, since in this case I set up the memory accessed by LDREX as Device memory. According to ST's application note AN4838, memory should be set up as Strongly-Ordered or Device to prevent speculative prefetch. I tried both, but whenever the Shareable attribute is set (which it should be for both of these types), LDREX produces a hard fault (actually a bus fault), as in the picture.


I don't think LDREX/STREX to device (or strongly-ordered) memory is supported.

From ARMv7-M ARM A3.4.5 Load-Exclusive and Store-Exclusive usage restrictions:

"LDREX and STREX operations must be performed only on memory with the Normal memory attribute."

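As a quick way to check a given MPU setting against that restriction, here is a sketch that classifies a TEX/C/B combination according to the ARMv7-M attribute encoding table. The type and function names are my own; reserved and implementation-defined encodings are lumped together.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    MEM_STRONGLY_ORDERED,
    MEM_DEVICE,
    MEM_NORMAL,
    MEM_RESERVED /* reserved or implementation-defined encoding */
} mem_type_t;

/* Classify the memory type selected by the TEX, C and B fields of MPU_RASR. */
mem_type_t mpu_memory_type(uint8_t tex, bool c, bool b)
{
    tex &= 0x7u;
    if (tex & 0x4u) {
        return MEM_NORMAL; /* TEX=1BB: cached Normal memory, separate outer/inner policy */
    }
    switch (tex) {
    case 0: /* C=0: Strongly-ordered (B=0) or shareable Device (B=1); C=1: Normal */
        if (!c) {
            return b ? MEM_DEVICE : MEM_STRONGLY_ORDERED;
        }
        return MEM_NORMAL;
    case 1: /* 00: Normal non-cacheable; 11: Normal write-back, write/read allocate */
        if (c == b) {
            return MEM_NORMAL;
        }
        return MEM_RESERVED;
    case 2: /* only C=0, B=0 is defined: non-shareable Device */
        return (!c && !b) ? MEM_DEVICE : MEM_RESERVED;
    default: /* TEX=3 is reserved */
        return MEM_RESERVED;
    }
}

/* LDREX/STREX are architecturally permitted only on Normal memory. */
bool exclusive_access_allowed(uint8_t tex, bool c, bool b)
{
    return mpu_memory_type(tex, c, b) == MEM_NORMAL;
}
```

With this, the configuration from the first post (TEX level 1, cacheable, bufferable) comes out as Normal memory, while Device and Strongly-ordered setups are rejected.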
Associate II

I've also run into this problem on the STM32H755. I suspect that the STM32H7 AXI bus matrix does not implement a global exclusive monitor, given that it doesn't support hardware coherency. From (emphasis mine):

"The S field is for a shareable memory region: the memory system provides data synchronization between bus masters in a system with multiple bus masters, for example, a processor with a DMA controller. A strongly-ordered memory is always shareable. If multiple bus masters can access a non-shareable memory region, the software must ensure the data coherency between the bus masters. The STM32F7 Series and STM32H7 Series do not support hardware coherency."

I scoured the reference manual, datasheet, and several application notes[1] and found no mention of a missing global exclusive monitor, which is a glaring omission.

In testing, I found that stores issued on the M4 core would not clear exclusive locks acquired by the M7 core, and vice versa. In other words, I couldn't get std::atomic<uint32_t>::compare_exchange_weak() on one core to fail when the other core was simultaneously storing to the same location. I've taken that to mean that the C++ std::atomic synchronization primitives are unusable on this platform, and instead one must explore alternative algorithms or consider using the hardware semaphore (HSEM) peripheral.
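
To illustrate what I believe is happening, here is a deliberately simplified host-side model of exclusive monitors. It is not real LDREX/STREX code and ignores many architectural details (reservation granule, context switches, and so on); all names are hypothetical. With has_global_monitor set to false, a store from the other core does not clear the reservation, so the exclusive store succeeds and silently overwrites the other core's write.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_CORES 2

typedef struct {
    bool has_global_monitor;             /* false models the behaviour I observed */
    const uint32_t *reserved[NUM_CORES]; /* address tagged by each core, or NULL */
} excl_model_t;

/* Model of LDREX: tag the address for this core and return its value. */
uint32_t model_ldrex(excl_model_t *m, int core, const uint32_t *addr)
{
    m->reserved[core] = addr;
    return *addr;
}

/* Model of a store: only a global monitor lets it clear other cores' tags. */
void model_store(excl_model_t *m, int core, uint32_t *addr, uint32_t value)
{
    *addr = value;
    if (m->has_global_monitor) {
        for (int i = 0; i < NUM_CORES; i++) {
            if (i != core && m->reserved[i] == addr) {
                m->reserved[i] = NULL;
            }
        }
    }
}

/* Model of STREX: returns 0 on success and 1 on failure, like the instruction. */
int model_strex(excl_model_t *m, int core, uint32_t *addr, uint32_t value)
{
    if (m->reserved[core] != addr) {
        return 1;
    }
    m->reserved[core] = NULL;
    model_store(m, core, addr, value);
    return 0;
}

/* Core 0 starts an increment, core 1 stores in between, core 0 finishes.
 * Returns the final value of the word; *strex_status gets core 0's STREX result. */
uint32_t lost_update_scenario(bool has_global_monitor, int *strex_status)
{
    excl_model_t m = { .has_global_monitor = has_global_monitor,
                       .reserved = { NULL, NULL } };
    uint32_t word = 5;
    uint32_t seen = model_ldrex(&m, 0, &word);
    model_store(&m, 1, &word, 100);
    *strex_status = model_strex(&m, 0, &word, seen + 1);
    return word;
}
```

In the model without a global monitor the final value is 6 and core 1's store of 100 is lost; with a global monitor the exclusive store fails and core 0 would have to retry, which is what compare_exchange_weak() relies on.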

If ST is monitoring this post, would you please confirm that the STM32H7 does not implement a global exclusive monitor?


Thanks for that, it does completely explain my findings.