Memory/Instruction barriers before writing to the backup SRAM

lutztonineubert
Associate III

The following code didn't work because of a missing data barrier before writing to the backup SRAM:

HAL_PWR_EnableBkUpAccess();
std::copy(buffer, buffer + num_bytes, BaseAddress + address);
HAL_PWR_DisableBkUpAccess();

Adding a DSB solved the issue for now.

HAL_PWR_EnableBkUpAccess();
__DSB();
std::copy(buffer, buffer + num_bytes, BaseAddress + address);
HAL_PWR_DisableBkUpAccess();

It is understandable that enabling backup access (which sets a single bit) needs to be fully completed before writing to the actual memory addresses.

But it is not clear to me whether a DMB would be enough (it also works), and whether I need an additional ISB so that the enable and the disable cannot both happen before the actual copy, like so:

HAL_PWR_EnableBkUpAccess();
__DSB();
std::copy(buffer, buffer + num_bytes, BaseAddress + address);
__ISB();
HAL_PWR_DisableBkUpAccess();

Can someone help me out here, what the correct way would be?

1 ACCEPTED SOLUTION

> how can I be sure the effect of enable is effective?

It may not be sufficient, if APB1 is slow, or if there are busmaster (DMA) conflicts on APB1. See below.

The 'H7 is very, VERY different in this: backup SRAM is there by default in the Normal area, which can reorder writes even if the area is not cached. The bus structure is different, too. Barriers may be necessary and also not sufficient; I am not interested enough in the 'H7 to pay more than casual attention.

In the 'F4, writes never get reordered, not even in the Normal area. The issue here is caused by the relatively slow APB1 bus on which PWR sits, versus the relatively fast AHB1 bus on which BKPSRAM (and RCC with BDCR) sits. The same applies to all backup domain items.

I talked about it here a couple of years ago; ST "discovered" it only recently (see the 'F407 erratum "Possible delay in backup domain protection disabling/enabling after programming the DBP bit"). Use the recommended workaround from there. For your case, only the readback is applicable (C, I don't ++):

PWR->CR |= PWR_CR_DBP;
// Read back to ensure the bit is set before commencing the SRAM/RTC access:
// PWR is on APB1, whereas RTC and SRAM are on AHB1.
(void)PWR->CR;

There is no such issue the other way round, i.e. after writing to BKPSRAM, there is no need for any delay before writing to PWR_CR.DBP.

You may want to make sure the compiler won't reorder accesses, though (see volatile and sequence points; again, I don't ++).
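A minimal C sketch of how the readback and the volatile advice might fit together. `bkp_write_readback` and its parameters are hypothetical names, not HAL or CMSIS functions; the control register and its enable mask are passed in as plain arguments so that the same code also builds off-target. On the 'F4 one would presumably pass `&PWR->CR`, `PWR_CR_DBP` and a pointer into BKPSRAM.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of HAL or CMSIS.
 * ctrl        : register holding the write-enable bit (assumed: &PWR->CR)
 * enable_mask : the enable bit itself (assumed: PWR_CR_DBP)
 * dst         : destination inside the backup SRAM
 */
static void bkp_write_readback(volatile uint32_t *ctrl, uint32_t enable_mask,
                               volatile uint8_t *dst, const uint8_t *src,
                               size_t len)
{
    *ctrl |= enable_mask;   /* request write access to the backup domain */
    (void)*ctrl;            /* readback: a volatile read the compiler must keep,
                               used here as the erratum-style wait */

    /* dst is volatile, so the compiler may not move these stores ahead of
       the volatile register accesses above or behind the one below. */
    for (size_t i = 0; i < len; ++i) {
        dst[i] = src[i];
    }

    *ctrl &= ~enable_mask;  /* re-protect; per the post, no delay is needed
                               in this direction */
}
```

Making the destination pointer volatile is one way to keep the compiler from reordering the copy around the register writes; it says nothing about what the bus does, which is what the readback is for.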

JW

26 REPLIES

Which STM32?

This has nothing to do with barriers as such; the DSB there acts as a simple delay.

JW

Pavel A.
Evangelist III

Seen this on STM32H743 (Nucleo) and 753.

After writing to the PWR registers and the backup RAM, __DSB is needed (but not ISB).

(Maybe an MPU region can be set up to make it work without the flush; I have not tried.)

-- pa

STM32F446

If it has nothing to do with barriers, how can I be sure the enable has taken effect?

Without the barrier, and with optimization enabled (-Os), the write always fails for the first byte.

Thank you for your answer. So this way?

HAL_PWR_EnableBkUpAccess();
__DSB();
std::copy(buffer, buffer + num_bytes, BaseAddress + address);
__DSB();
HAL_PWR_DisableBkUpAccess();
__DSB();

Piranha
Chief II

ISB is not a memory barrier and is totally unrelated. DSB can be used but is unnecessarily restrictive. DMB is sufficient and optimal.

@Community member, by default SRAM is of the normal memory type while peripheral registers are of the device memory type, and the CPU is allowed to reorder accesses to normal memory. Before Cortex-M7 it wasn't a thing, but Cortex-M7 is capable of it and actually does it. Therefore a memory barrier is required.

@Pavel A., you don't believe in memory barriers, do you? Only some hackers believe in those... 😉

But seriously, AN4838 section 3.1:

Normal memory: allows the load and store of bytes, half-words and words to be arranged by the CPU in an efficient manner (the compiler is not aware of memory region types). For the normal memory region the load / store is not necessarily performed by the CPU in the order listed in the program.

Device memory: within the device region, the loads and stores are done strictly in order. This is to ensure the registers are set in the proper order.

And my topic on this:

https://community.st.com/s/question/0D50X0000C4Nk4GSQS/bug-missing-compiler-and-cpu-memory-barriers

Replace them with DMB and remove the last one. 🙂
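Spelled out, that suggestion might look like the C sketch below. `bkp_write_dmb` and `BKP_BARRIER` are hypothetical names, not HAL or CMSIS ones. So that the sketch builds off-target, the barrier is stood in for by a C11 `atomic_thread_fence`; on the real part the assumption is that `BKP_BARRIER()` would be defined as CMSIS `__DMB()`. The caveat raised elsewhere in this thread, that barriers may not be sufficient on every family, still applies.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Off-target stand-in. With CMSIS headers available, the assumption is that
   this would be defined as __DMB() instead. */
#ifndef BKP_BARRIER
#define BKP_BARRIER() atomic_thread_fence(memory_order_seq_cst)
#endif

/* Hypothetical helper, not part of HAL or CMSIS.
 * ctrl / enable_mask : register and bit that unlock the backup domain
 * dst                : destination inside the backup SRAM
 */
static void bkp_write_dmb(volatile uint32_t *ctrl, uint32_t enable_mask,
                          volatile uint8_t *dst, const uint8_t *src,
                          size_t len)
{
    *ctrl |= enable_mask;   /* enable write access */
    BKP_BARRIER();          /* order the enable before the first SRAM store */

    for (size_t i = 0; i < len; ++i) {
        dst[i] = src[i];
    }

    BKP_BARRIER();          /* order all SRAM stores before re-protecting */
    *ctrl &= ~enable_mask;  /* disable write access */
    /* No third barrier after the disable, as suggested above. */
}
```

The fence also acts as a compiler barrier, so it covers the compiler-reordering concern for the copy as well.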

Thank you very much for your detailed explanation. 🙂

@Piranha​ ,

OP uses a 'F446. See my post below.

JW