[SOLVED] Flash Write/Erase mechanisms behaviors, Cube examples reliability
Hi,
We are currently working on a USB composite library, but some old flash ghosts have reappeared.
When sending data through CDC, some flash erase/program operations started to fail.
We decided to upgrade our Wireless stack to the latest version (1.10.0) from our current 1.6.0 (FUS 1.0.2).
From there, we discovered that AN5289 got some big updates for clocks and FLASH sharing (good news!).
Now, 4 big questions/remarks remain:
- Two questions concerning our code:
1) The first device we updated (FUS 1.0.2 -> 1.1.0; WS 1.6 -> 1.10) never gets the CPU2 notification for WS ready, and all flash operations fail. Is it bricked, or is CPU2 faulty?
2) On some flash operations, the FLASH_SR register is not all zeros and the FLASH_CR_LOCK bit is already set to 1 without our code having done anything. Writing 1 to it again hard-faults the device, so our current workaround is to call HAL_FLASH_Lock() only when FLASH_CR_LOCK is 0.
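To make the workaround in 2) concrete, here is a minimal host-side sketch of the guard. It is not our production code: `fake_flash_cr`, `SKETCH_FLASH_CR_LOCK`, `lock_writes` and `flash_lock_if_unlocked()` are names made up for illustration, and the register is simulated by a plain variable. On target, the check would read FLASH->CR and the write would be done by HAL_FLASH_Lock().

```c
#include <stdbool.h>
#include <stdint.h>

/* Host-side stand-in for the flash control register (FLASH->CR on target). */
static volatile uint32_t fake_flash_cr;

/* LOCK is bit 31 of FLASH_CR (FLASH_CR_LOCK in the CMSIS device header). */
#define SKETCH_FLASH_CR_LOCK (UINT32_C(1) << 31)

/* Counts register writes so the behaviour can be checked on a host. */
static unsigned lock_writes;

/* Hypothetical helper: set LOCK only when the controller is not already locked.
 * Returns true if a write was performed, false if it was skipped. */
static bool flash_lock_if_unlocked(void)
{
    if ((fake_flash_cr & SKETCH_FLASH_CR_LOCK) != 0U) {
        return false; /* already locked: skip the write that hard-faults for us */
    }
    fake_flash_cr |= SKETCH_FLASH_CR_LOCK; /* HAL_FLASH_Lock() on target */
    lock_writes++;
    return true;
}
```

This only hides the symptom. It does not explain why LOCK is already set or why FLASH_SR is non-zero, which is the actual question.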
- Two others concerning BLE_RfWithFlash/flash_driver.c:
3) When single_flash_operation_status reports a failure, some actions do not follow AN5289 p. 36, Figure 10:
if(single_flash_operation_status != SINGLE_FLASH_OPERATION_DONE)
{
  return_value = NbrOfSectors - loop_flash + 1;
}
else
{
  /**
   * Notify the CPU2 there will be no request anymore to erase the flash
   * On reception of this command, the CPU2 disables the BLE timing protection versus flash erase processing
   */
  SHCI_C2_FLASH_EraseActivity(ERASE_ACTIVITY_OFF);
  HAL_FLASH_Lock();
  /**
   * Release the ownership of the Flash IP
   */
  LL_HSEM_ReleaseLock(HSEM, CFG_HW_FLASH_SEMID, 0);
  return_value = 0;
}
In the failure branch, the flash is not re-locked (which makes sense given question 2). More strangely, ERASE_ACTIVITY_OFF is never sent to CPU2 and semaphore 2 (CFG_HW_FLASH_SEMID in the code above) is never released!
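For reference, this is roughly what we would expect instead, assuming our reading of Figure 10 is correct: the notification to CPU2 and the semaphore release happen whether or not the operation succeeded. The sketch below runs on a host. The two `stub_*` functions and `finish_erase()` are hypothetical names that only count calls; on target they would be SHCI_C2_FLASH_EraseActivity(ERASE_ACTIVITY_OFF) and LL_HSEM_ReleaseLock(HSEM, CFG_HW_FLASH_SEMID, 0). Re-locking the flash is left out on purpose because of question 2.

```c
#include <stdint.h>

/* Call counters for the host-side stubs below. */
static int erase_activity_off_calls;
static int flash_sem_release_calls;

/* Stub for SHCI_C2_FLASH_EraseActivity(ERASE_ACTIVITY_OFF). */
static void stub_erase_activity_off(void) { erase_activity_off_calls++; }

/* Stub for LL_HSEM_ReleaseLock(HSEM, CFG_HW_FLASH_SEMID, 0). */
static void stub_release_flash_sem(void) { flash_sem_release_calls++; }

/* Hypothetical tail of the erase routine.
 * operation_done is non-zero when the last single operation succeeded.
 * Returns 0 on success, otherwise the number of sectors left to erase. */
static uint32_t finish_erase(int operation_done,
                             uint32_t nbr_of_sectors,
                             uint32_t loop_flash)
{
    uint32_t return_value =
        operation_done ? 0U : (nbr_of_sectors - loop_flash + 1U);

    /* Cleanup runs on both paths, so CPU2 and the semaphore are never left hanging. */
    stub_erase_activity_off();
    stub_release_flash_sem();

    return return_value;
}
```

The return value is computed exactly as in the original code; only the placement of the cleanup differs.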
4) Deadlock:
UTILS_ENTER_CRITICAL_SECTION();
/**
 * Depending on the application implementation, in case a multitasking is possible with an OS,
 * it should be checked here if another task in the application disallowed flash processing to protect
 * some latency in critical code execution
 * When flash processing is ongoing, the CPU cannot access the flash anymore.
 * Trying to access the flash during that time stalls the CPU.
 * The only way for CPU1 to disallow flash processing is to take CFG_HW_BLOCK_FLASH_REQ_BY_CPU1_SEMID.
 */
cpu1_sem_status = (SemStatus_t)LL_HSEM_GetStatus(HSEM, CFG_HW_BLOCK_FLASH_REQ_BY_CPU1_SEMID);
if(cpu1_sem_status == SEM_LOCK_SUCCESSFUL)
{
  /**
   * Check now if the CPU2 disallows flash processing to protect its timing.
   * If the semaphore is locked, the CPU2 does not allow flash processing
   *
   * Note: By default, the CPU2 uses the PESD mechanism to protect its timing,
   * therefore, it is useless to get/release the semaphore.
   *
   * However, keeping that code make it compatible with the two mechanisms.
   * The protection by semaphore is enabled on CPU2 side with the command SHCI_C2_SetFlashActivityControl()
   *
   */
  cpu2_sem_status = (SemStatus_t)LL_HSEM_1StepLock(HSEM, CFG_HW_BLOCK_FLASH_REQ_BY_CPU2_SEMID);
  if(cpu2_sem_status == SEM_LOCK_SUCCESSFUL)
  {
    /**
     * When CFG_HW_BLOCK_FLASH_REQ_BY_CPU2_SEMID is taken, it is allowed to only erase one sector or
     * write one single 64bits data
     * When either several sectors need to be erased or several 64bits data need to be written,
     * the application shall first exit from the critical section and try again.
     */
    if(FlashOperationType == FLASH_ERASE)
    {
      HAL_FLASHEx_Erase(&p_erase_init, &page_error);
    }
    else
    {
      HAL_FLASH_Program(FLASH_TYPEPROGRAM_DOUBLEWORD, SectorNumberOrDestAddress, Data);
    }
    /**
     * Release the semaphore to give the opportunity to CPU2 to protect its timing versus the next flash operation
     * by taking this semaphore.
     * Note that the CPU2 is polling on this semaphore so CPU1 shall release it as fast as possible.
     * This is why this code is protected by a critical section.
     */
    LL_HSEM_ReleaseLock(HSEM, CFG_HW_BLOCK_FLASH_REQ_BY_CPU2_SEMID, 0);
  }
}
UTILS_EXIT_CRITICAL_SECTION();
Here, the critical section is defined as follows:
#define UTILS_ENTER_CRITICAL_SECTION( ) uint32_t primask_bit = __get_PRIMASK( );\
  __disable_irq( )
Looking into HAL_FLASHEx_Erase or HAL_FLASH_Program reveals that both functions call FLASH_WaitForLastOperation, which itself calls HAL_GetTick() for timeout protection.
But in most examples, HAL_GetTick returns a variable updated by the SysTick interrupt (in our case, HAL ticks are incremented by TIM1).
Since interrupts are disabled inside the critical section, the tick never increments. We have already ended up stuck in an endless timeout loop this way.
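One way around this, sketched below on a host, is to bound the busy-wait by a poll count instead of HAL_GetTick(), so it still terminates while interrupts are masked. This is only an illustration, not what the HAL or the Cube example does. `fake_flash_is_busy()`, `fake_busy_polls_left` and `wait_flash_idle()` are hypothetical names, and the busy flag is simulated; on target the poll would read the BSY flag in FLASH_SR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated hardware: the flash reports busy for this many more polls. */
static uint32_t fake_busy_polls_left;

/* Host-side stand-in for reading the flash busy flag. */
static bool fake_flash_is_busy(void)
{
    if (fake_busy_polls_left == 0U) {
        return false;
    }
    fake_busy_polls_left--;
    return true;
}

/* Hypothetical wait routine bounded by a poll budget rather than a tick.
 * It does not depend on any interrupt, so it cannot spin forever inside
 * a critical section.
 * Returns true once the flash is idle, false if the budget ran out first. */
static bool wait_flash_idle(uint32_t max_polls)
{
    while (fake_flash_is_busy()) {
        if (max_polls == 0U) {
            return false; /* gave up: caller must handle the timeout */
        }
        max_polls--;
    }
    return true;
}
```

A poll budget is not a real time measurement, so the value would have to be tuned against the worst-case erase time and the core clock. A free-running hardware counter read by polling would give a proper timeout without relying on interrupts.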
Sorry for the long post, feel free to answer any question, just specify which.
NOTE: It's a detail, but our LSE is generated by an external oscillator and configured in bypass mode.