Is word (uint32_t) access atomic in shared memory on the dual-core STM32H7?

yhplx · ‎2024-08-27

Hi Friends,

On the STM32H7, both the M7 and the M4 can access shared memory such as D3 SRAM4. I am going to use a D3 SRAM4 address to share some status information between the M7 and the M4. If a 32-bit word is defined as shown below, is access from both sides (M7 and M4) guaranteed to be atomic? Does it need HSEM?

-----

Code example:

volatile uint32_t *status_ptr = (volatile uint32_t *)0x38000000;

On the M4 side: *status_ptr = 0x1234abcd;

On the M7 side: uint32_t status = *status_ptr;

I know a 32-bit word access is atomic on a single-core STM32 MCU.

Thank you in advance.

TDK · ‎2024-08-27

Yes, it's atomic. It won't read intermediate values.

If you feel a post has answered your question, please click "Accept as Solution".

Tesla DeLorean · ‎2024-08-27

It's dual-ported; writes are buffered and potentially cached. Assume RMW actions will NOT be atomic.

>>I know a 32-bits word access is atomic in single core STM32 MCU.

That's a bit sweeping. An aligned write can be a single bus transaction. Peripherals should generally be treated as operating independently and concurrently.
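As a rough illustration of the "aligned write" condition, here is a minimal host-buildable sketch. The names `shared_status`, `is_word_aligned`, `status_write` and `status_read` are hypothetical, and the static variable only stands in for a word that would really live at a fixed D3 SRAM4 address on target. It shows the alignment check and a plain 32-bit store and load, nothing more; it does not address write buffering or caching.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical stand-in for one word in D3 SRAM4. On target this would be
   placed at a fixed address (for example through a linker section). */
static alignas(4) volatile uint32_t shared_status;

static_assert(sizeof(uint32_t) == 4, "status must be exactly one 32-bit word");

/* Returns 1 if the pointer is naturally aligned for a 32-bit access. */
static int is_word_aligned(const volatile void *p)
{
    return ((uintptr_t)p % 4u) == 0u;
}

/* One plain 32-bit store; no read-modify-write is involved. */
static void status_write(uint32_t value)
{
    shared_status = value;
}

/* One plain 32-bit load. */
static uint32_t status_read(void)
{
    return shared_status;
}
```

An unaligned or packed placement of the word would not satisfy `is_word_aligned()` and may be split into more than one bus access.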

Tips, Buy me a coffee, or three.. PayPal Venmo
Up vote any posts that you find helpful, it shows what's working..

SofLit · ‎2024-08-27

Hello,

If by "atomic" you mean protecting a variable from RW operations by the other core while the first core performs an RW operation, then yes, you need to use HSEM to lock/unlock RW operations. You also need to define that memory region as a Shareable region using the MPU, to prevent data incoherency between CM7 and CM4 when the cache is used on CM7.
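A minimal sketch of the lock/unlock pattern, written so it builds on a host. The helpers `hsem_try_take` and `hsem_release` and the variable `shared_flags` are hypothetical; here they are backed by a C11 `atomic_flag`, whereas on an STM32H7 target they would presumably wrap the HSEM peripheral (for example `HAL_HSEM_FastTake()` and `HAL_HSEM_Release()` from the ST HAL). The MPU Shareable configuration is not shown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Host-side stand-in for one hardware semaphore (assumption: on target
   this would be an HSEM channel shared by CM7 and CM4). */
static atomic_flag fake_hsem = ATOMIC_FLAG_INIT;

/* Try to take the lock without blocking; returns 1 on success. */
static int hsem_try_take(void)
{
    return !atomic_flag_test_and_set(&fake_hsem);
}

/* Release the lock. */
static void hsem_release(void)
{
    atomic_flag_clear(&fake_hsem);
}

/* Stand-in for a status word located in shared SRAM. */
static volatile uint32_t shared_flags;

static_assert(sizeof(shared_flags) == 4, "shared word must be 32 bits");

/* Set bits in the shared word. Returns 1 on success, 0 if the lock is
   currently held by the other side (the caller decides whether to retry). */
static int shared_flags_set(uint32_t mask)
{
    if (!hsem_try_take()) {
        return 0;
    }
    shared_flags |= mask;   /* read-modify-write, done while holding the lock */
    hsem_release();
    return 1;
}
```

The lock matters for the read-modify-write in `shared_flags_set()`; a single plain store or load of the word does not go through this path.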

To give better visibility on the answered topics, please click on "Accept as Solution" on the reply which solved your issue or answered your question.
PS: Be polite in your reply. Otherwise, it will be reported as inappropriate and you will be permanently blacklisted from my help/support.

yhplx · ‎2024-08-27

Thank you, SofLit.

What if it is only written (W) on the M4 side and only read (R) on the M7 side? Is that safe without HSEM lock/unlock?

SofLit · ‎2024-08-27

It can't be safe if one core reads and the other writes, as the operations are asynchronous. How could you guarantee the consistency of the data between the cores? Think about race conditions. If the race condition doesn't matter in your application, you can do the operations without a semaphore.

SofLit · ‎2024-08-29

Hello @yhplx ,

Has your question been answered? If yes, please mark the comment that answered your original question with "Accept as Solution".

Thank you.

Tesla DeLorean · ‎2024-08-29

Carefully considered, it can be relatively safe.

The hazards mostly relate to cache coherency and write buffering. Operations would need to be fenced to ensure the content has reached the memory cells before proceeding.

Just making something "volatile" is not sufficient.
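A hedged sketch of what the fencing could look like for the single-writer, single-reader case. The shims `mem_barrier` and `dcache_invalidate` are hypothetical names; on target they would presumably map to the CMSIS `__DSB()` intrinsic and to `SCB_InvalidateDCache_by_Addr()` on the CM7, while on a host build they reduce to a fence and a no-op. `status_word` stands in for the word at 0x38000000 from the question.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical portability shims (assumptions, see the text above):
   mem_barrier()       -> __DSB() on target
   dcache_invalidate() -> SCB_InvalidateDCache_by_Addr() on the CM7 */
static void mem_barrier(void)
{
    atomic_thread_fence(memory_order_seq_cst);
}

static void dcache_invalidate(volatile void *addr, size_t len)
{
    (void)addr;   /* no data cache to manage in the host build */
    (void)len;
}

/* Stand-in for the shared status word in D3 SRAM4. */
static volatile uint32_t status_word;

/* Writer side (CM4 in the question): store, then fence so the write is
   intended to have completed before the writer carries on. */
static void status_publish(uint32_t value)
{
    status_word = value;
    mem_barrier();
}

/* Reader side (CM7): discard any stale cached copy, then load. */
static uint32_t status_fetch(void)
{
    dcache_invalidate(&status_word, sizeof status_word);
    mem_barrier();
    return status_word;
}
```

Cache invalidation works on whole 32-byte lines on the Cortex-M7, so the word would need its own cache line, or the region would need to be configured as non-cacheable through the MPU, in which case the invalidate step is unnecessary.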

I think one of the efficiencies would be to manage a larger buffer, or units of data, and then signal validity/availability via the HSEM/EXTI mechanics.

Tesla DeLorean · ‎2024-08-29

It becomes more of a completion issue, so think coherency, write buffering, and fencing.

There really isn't a clean atomic RMW at the MCU level, and if you do such things across multiple cores you tend to drag performance down to the lowest common denominator.

Make large buffers, using queuing methods.

Try to keep as much as possible on a single core, especially if you have data volume, and processing it would make it smaller, etc.
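One possible shape for the "queuing methods" mentioned above is a single-producer, single-consumer ring in which each index is written by only one core, so no read-modify-write is shared. This is a sketch under assumptions, not a tested target implementation: `spsc_queue_t`, `QUEUE_SLOTS`, `shared_queue`, `queue_push`, `queue_pop` and `queue_fence` are hypothetical names, the fence would presumably be a `__DMB()` or `__DSB()` on target, and the memory is assumed to be non-cacheable or otherwise kept coherent.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define QUEUE_SLOTS 8u   /* hypothetical depth; must be a power of two */
static_assert((QUEUE_SLOTS & (QUEUE_SLOTS - 1u)) == 0u,
              "queue depth must be a power of two");

typedef struct {
    volatile uint32_t head;               /* written only by the producer */
    volatile uint32_t tail;               /* written only by the consumer */
    volatile uint32_t slot[QUEUE_SLOTS];  /* payload words */
} spsc_queue_t;

/* Stand-in for a queue placed in shared SRAM; zero-initialised. */
static spsc_queue_t shared_queue;

/* Host stand-in for a hardware memory barrier. */
static void queue_fence(void)
{
    atomic_thread_fence(memory_order_seq_cst);
}

/* Producer side: returns 1 if the value was queued, 0 if the queue is full. */
static int queue_push(spsc_queue_t *q, uint32_t value)
{
    uint32_t head = q->head;
    if (head - q->tail == QUEUE_SLOTS) {
        return 0;
    }
    q->slot[head % QUEUE_SLOTS] = value;
    queue_fence();              /* payload must be visible before the index */
    q->head = head + 1u;
    return 1;
}

/* Consumer side: returns 1 and fills *out, or 0 if the queue is empty. */
static int queue_pop(spsc_queue_t *q, uint32_t *out)
{
    uint32_t tail = q->tail;
    if (q->head == tail) {
        return 0;
    }
    queue_fence();              /* read the index before the payload */
    *out = q->slot[tail % QUEUE_SLOTS];
    queue_fence();              /* finish reading before freeing the slot */
    q->tail = tail + 1u;
    return 1;
}
```

The indices run freely and rely on unsigned wraparound, which is why the depth is restricted to a power of two. Signalling "data available" through HSEM/EXTI, as suggested earlier in the thread, could replace polling on `head`.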

SofLit · ‎2024-08-30

@Tesla DeLorean wrote:

>>Carefully considered, it can be relatively safe.

>>The hazards mostly being related to cache coherency and write buffering, operations would need to be fenced to ensure the content reached the memory cells before proceeding.

This is not as obvious as you might imagine. Even with a Strongly-ordered region you cannot guarantee that, because the two cores access the memory asynchronously and you cannot know which one will operate on the memory at a given time.

>>Just making something "volatile" is not sufficient.

Indeed, it has no impact here.