2020-08-02 03:22 PM
Hi
I am using the Nucleo STM32F746 and have a problem with in-circuit flash programming. The flash drivers have been used on various other STM32 M3, M4 and M7 parts without any issues (since about 2010), but with the F746 I see that not all written data is programmed.
For example, I write new data via FTP (using Ethernet and a TCP/IP stack) and find that if I write a short test file of just a few bytes the following occurs at the low level (addresses and write ordering controlled by the file system level):
If I add a 10us delay to the code after each write operation there are no issues and I can write many files with small or large amounts of data without any errors.
I use data cache workarounds in the F4 case but I find no errata for the F746 - also the F4 data cache workaround doesn't change anything.
Is any problem known? Although the 10us delay makes it reliable it would be best to know what is different with this part.
Note that the core is running at 168MHz in the tested case (APB1 42MHz, 84MHz AHB2)
Thanks
Regards
Mark
2020-08-02 07:02 PM
It sounds like you may not be waiting for the BSY flag to be de-asserted before writing to the next address. You may also need to __DSB() after writing but before polling for BSY to ensure operations are done in the right order.
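As a minimal sketch of that ordering (not the original poster's driver): the store to flash is followed by a barrier, and only then is BSY polled. It assumes the flash is already unlocked and programming mode (PG) is already enabled. The name `flash_write_and_wait` is hypothetical, and the `#ifndef FLASH` stand-ins exist only so the sketch compiles off-target; on a real build `FLASH`, `FLASH_SR_BSY` and `__DSB()` come from the CMSIS device header.

```c
#include <stdint.h>

/* Host-side stand-ins (assumptions for illustration only). On target, drop
   this section and include the CMSIS device header instead. */
#ifndef FLASH
typedef struct {
    volatile uint32_t SR;            /* status register */
} fake_flash_t;
static fake_flash_t fake_flash;      /* BSY always reads as 0 in this stand-in */
#define FLASH         (&fake_flash)
#define FLASH_SR_BSY  (1UL << 16)
#define __DSB()       ((void)0)
#endif

/* Write one byte, then wait for the programming operation to finish. */
static void flash_write_and_wait(volatile uint8_t *dst, uint8_t value)
{
    *dst = value;                            /* start the programming operation */
    __DSB();                                 /* make sure the write is issued before SR is read */
    while (FLASH->SR & FLASH_SR_BSY) { }     /* poll until the flash is no longer busy */
}
```

Without the barrier, the BSY read could in principle be observed before the flash write has actually started, so the loop would exit immediately.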
2020-08-02 07:56 PM
Hi TDK
The BSY flag was being checked but without __DSB().
I just tried __DSB() between writing the new value and checking the BSY and it looks now to be OK without the delay.
Therefore it very probably was due to a synchronisation requirement which was not needed in the other parts that I had used with the same code.
Many thanks!
Regards
Mark
2020-08-03 12:22 AM
__DSB() is unnecessarily restrictive, __DMB() is what you actually need. Is it so hard to read a few lines in the documentation to understand how those two differ? Anyway, more details on this:
https://community.st.com/s/question/0D50X0000C4Nk4GSQS/bug-missing-compiler-and-cpu-memory-barriers
Also check out network related issues:
2020-08-03 08:52 AM
>>__DSB() is unnecessarily restrictive
Since the flash write operation takes several microseconds to complete, the exact barrier instruction used is not of any real consequence. The most restrictive one is fine.
>> Also check out network related issues:
The TCP/IP stack and Ethernet drivers used are from the uTasker project [https://github.com/uTasker/uTasker-Kinetis] and not lwIP. There may be similar issues to those in the thread but there are presently no reports of related issues.
Regards
Mark
2020-08-03 09:45 AM
Personally I would buffer the data so that it can be written to the memory as aligned WORDs or DOUBLE WORDs, and avoid pitfalls with ECC.
Also check the MPU buffering and caching settings.
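The buffering idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `word_writer_t` type and the `ww_*` functions are hypothetical names, the destination is assumed to be word-aligned, and the partial last word is padded with 0xFF (the erased state). On target, each word store would additionally need the barrier and BSY poll discussed in this thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical accumulator that turns a byte stream into aligned word writes. */
typedef struct {
    volatile uint32_t *dst;   /* next word-aligned destination */
    uint8_t pending[4];       /* bytes collected for the current word */
    size_t  fill;             /* number of valid bytes in pending[] */
} word_writer_t;

static void ww_init(word_writer_t *w, volatile uint32_t *aligned_dst)
{
    w->dst = aligned_dst;
    memset(w->pending, 0xFF, sizeof w->pending);  /* pad value = erased flash */
    w->fill = 0;
}

static void ww_emit(word_writer_t *w)
{
    uint32_t word;
    memcpy(&word, w->pending, sizeof word);  /* keeps the in-memory byte order */
    *w->dst++ = word;                        /* one aligned 32-bit write */
    memset(w->pending, 0xFF, sizeof w->pending);
    w->fill = 0;
}

/* Collect bytes; a word is written only when four bytes are available. */
static void ww_write(word_writer_t *w, const uint8_t *data, size_t len)
{
    while (len-- > 0) {
        w->pending[w->fill++] = *data++;
        if (w->fill == sizeof w->pending) {
            ww_emit(w);
        }
    }
}

/* Write out a partially filled last word, padded with 0xFF. */
static void ww_flush(word_writer_t *w)
{
    if (w->fill > 0) {
        ww_emit(w);
    }
}
```

The trade-off is that data stays in RAM until a full word is collected or `ww_flush()` is called, so the caller has to flush before relying on the content being in flash.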
2020-08-03 10:39 AM
The flash driver writes data depending on the alignment (single bytes, half-words or full words) of the addresses and size of data - so is flexible.
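To illustrate what alignment-dependent writing can look like, here is a minimal sketch. It is not the actual uTasker driver code; `flash_write_width` and `flash_count_writes` are hypothetical helpers that only decide the access size, and the real write (with PSIZE selection, barrier and BSY poll) is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Pick the widest access (4, 2 or 1 bytes) that the current address
   alignment and the remaining length allow. Returns 0 when nothing is left. */
static size_t flash_write_width(uintptr_t addr, size_t remaining)
{
    if (remaining >= 4 && (addr & 3u) == 0) {
        return 4;                 /* word-aligned with a full word left */
    }
    if (remaining >= 2 && (addr & 1u) == 0) {
        return 2;                 /* half-word aligned */
    }
    return remaining ? 1 : 0;     /* odd address or single trailing byte */
}

/* Walk a range and count how many individual write operations it needs. */
static size_t flash_count_writes(uintptr_t addr, size_t len)
{
    size_t ops = 0;
    while (len > 0) {
        size_t width = flash_write_width(addr, len);
        addr += width;
        len  -= width;
        ++ops;
    }
    return ops;
}
```

For example, 9 bytes starting at an odd address are split into a byte, a half-word, a word and a final half-word, so four write operations instead of nine.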
The application can ensure that everything is aligned to be a bit more efficient if it wants, but the file system being used reserves 5 bytes which are written after the data content has been written. For processors/memory that can't handle flexible writes (e.g. flash that must be written as lines of 128 bits with ECC, as on LPC chips) there is optional buffering that can be enabled. However, apart from the BSY issue due to data synchronisation, there have been no issues with using it without the buffering enabled on STM32 parts, so I am happy to keep it like this for the moment. If an issue ever arose that needed it I would of course enable it.
Regards
Mark
2020-08-04 12:50 PM
Actually this is the full correct implementation:
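FLASH->CR = FLASH_CR_PG;                  // enable programming mode
__DMB();                                  // CR store must complete before the first flash write
for (; nbData > 0; --nbData) {
    *(volatile uint8_t*)pbMem++ = *(const uint8_t*)pbData++;
    __DMB();                              // flash write must be issued before BSY is read
    while (FLASH->SR & FLASH_SR_BSY);     // wait for the programming operation to finish
}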
A DMB is necessary not only after writing the memory data, but also after configuring the FLASH peripheral and before the first memory write. This ensures that the CPU doesn't start the memory write before the store to the CR register is complete, which is critical for the memory write to succeed. Such reordering is allowed because flash memory is not device-type memory like the peripheral registers are.
2020-08-04 01:00 PM
Still, it's a completely unnecessary delay for the CPU, as DMB blocks only load/store instructions while DSB blocks all instructions. In the case of flash writing the difference is negligible, but in tighter and more performance-critical operations DSB can lose half of the Cortex-M7's dual-issue execution ability and speed. :)