2020-02-05 11:44 PM
The STM32 HAL driver library is full of flawed and sub-optimal constructs. The most common one, which affects almost all drivers, is the lock mechanism. It is a bad and limiting design, and getting rid of it requires a major rewrite, but worst of all it is not even interrupt-safe and therefore does not provide the locking for which it was introduced. The current __HAL_LOCK() code (reformatted) looks like this:
#define __HAL_LOCK(__HANDLE__) \
  do{ \
    if((__HANDLE__)->Lock == HAL_LOCKED) \
    { \
      return HAL_BUSY; \
    } \
    else \
    { \
      (__HANDLE__)->Lock = HAL_LOCKED; \
    } \
  }while (0U)
Between testing and setting ->Lock, an interrupt can occur and also test and set ->Lock. Both the main thread and the interrupt then continue execution as if the object were unlocked, and the interrupt unlocks it before the main thread has completed its "locked" part, which leaves it even more exposed to subsequent interrupt calls. The same happens when a higher-priority interrupt preempts a lower-priority one. Additionally, the ->Lock variable is not marked volatile and is therefore prone to reordering by compiler optimization.
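The interleaving described above can be replayed deterministically on a host machine. The following is only a minimal sketch: the handle type, the hook, and every Demo_ name are hypothetical, and the "interrupt" is an ordinary function call injected between the test and the set.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { HAL_OK = 0x00U, HAL_BUSY = 0x02U } HAL_StatusTypeDef;
typedef enum { HAL_UNLOCKED = 0x00U, HAL_LOCKED = 0x01U } HAL_LockTypeDef;

/* Hypothetical handle with just the lock field */
typedef struct { HAL_LockTypeDef Lock; } Demo_HandleTypeDef;

/* Hypothetical hook standing in for an interrupt handler */
typedef void (*Demo_IrqHook)(Demo_HandleTypeDef *h);

/* Number of contexts that passed the lock during one run */
static unsigned demo_owners;

/* Same test-then-set sequence as __HAL_LOCK(), written as a function so the
   simulated interrupt can be injected at the vulnerable point. */
static HAL_StatusTypeDef Demo_RacyLock(Demo_HandleTypeDef *h, Demo_IrqHook irq)
{
    if (h->Lock == HAL_LOCKED) {
        return HAL_BUSY;
    }
    if (irq != NULL) {
        irq(h); /* preemption: the test has passed, the set has not happened yet */
    }
    h->Lock = HAL_LOCKED;
    return HAL_OK;
}

/* Simulated interrupt: locks, does its work, unlocks */
static void Demo_Isr(Demo_HandleTypeDef *h)
{
    if (Demo_RacyLock(h, NULL) == HAL_OK) {
        demo_owners++;
        h->Lock = HAL_UNLOCKED;
    }
}

/* Main context takes the lock and is "interrupted" between test and set */
static unsigned Demo_RunRace(void)
{
    Demo_HandleTypeDef h = { HAL_UNLOCKED };
    demo_owners = 0U;
    if (Demo_RacyLock(&h, Demo_Isr) == HAL_OK) {
        demo_owners++;
        h.Lock = HAL_UNLOCKED;
    }
    return demo_owners;
}
```

In this sketch Demo_RunRace() reports two owners: neither context ever sees HAL_BUSY, although the interrupt ran entirely inside the main context's lock attempt.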
The proposed fix is simple and requires adding only a few lines of code to the stm32XXxx_hal_def.h files for all STM32 series:
#define __HAL_LOCK(__HANDLE__) \
  do { \
    uint32_t rPriMask = __get_PRIMASK(); \
    __disable_irq(); \
    if ((__HANDLE__)->Lock == HAL_UNLOCKED) { \
      (__HANDLE__)->Lock = HAL_LOCKED; \
      __set_PRIMASK(rPriMask); \
    } else { \
      __set_PRIMASK(rPriMask); \
      return HAL_BUSY; \
    } \
  } while (0)

typedef volatile enum {
  HAL_UNLOCKED = 0x00U,
  HAL_LOCKED = 0x01U
} HAL_LockTypeDef;
This makes that bad construct at least interrupt-safe, so that it actually provides the locking that was intended.
Note that the __HAL_UNLOCK() code needs no modification, as it is already atomic.
2020-02-20 04:03 PM
Using a #define to adjust the behavior of the __HAL_LOCK/__HAL_UNLOCK mechanism is a very good idea.
Several levels would be conceivable, from aborting (the current behavior) through waiting/blocking to disabling the lock completely.
It is up to the developer to implement a correct application structure and to avoid concurrent access to the same hardware.
There are several synchronisation mechanisms (e.g. flags and semaphores) to serialize hardware access in one task.
__HAL_LOCK/__HAL_UNLOCK should help to find the problem in application structure but should not try to solve it automatically.
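A compile-time switch of this kind might look roughly like the sketch below. Nothing here is an existing HAL option: DEMO_LOCK_MODE, the DEMO_ macros, and the Demo_ functions are hypothetical names, and only two of the conceivable levels (abort and disabled) are shown. The test-and-set in the abort branch is the plain one from the current HAL, so it would still need to be made atomic.

```c
#include <stdint.h>

typedef enum { HAL_OK = 0x00U, HAL_BUSY = 0x02U } HAL_StatusTypeDef;
typedef enum { HAL_UNLOCKED = 0x00U, HAL_LOCKED = 0x01U } HAL_LockTypeDef;

/* Hypothetical handle with just the lock field */
typedef struct { volatile HAL_LockTypeDef Lock; } Demo_HandleTypeDef;

/* Hypothetical configuration levels, selected at build time */
#define DEMO_LOCK_MODE_ABORT    0 /* current behaviour: return HAL_BUSY */
#define DEMO_LOCK_MODE_DISABLED 1 /* locks compiled out entirely */

#ifndef DEMO_LOCK_MODE
#define DEMO_LOCK_MODE DEMO_LOCK_MODE_ABORT
#endif

#if (DEMO_LOCK_MODE == DEMO_LOCK_MODE_DISABLED)
#define DEMO_LOCK(h)   do { (void)(h); } while (0)
#define DEMO_UNLOCK(h) do { (void)(h); } while (0)
#else
#define DEMO_LOCK(h) \
    do { \
        if ((h)->Lock == HAL_LOCKED) { \
            return HAL_BUSY; \
        } \
        (h)->Lock = HAL_LOCKED; \
    } while (0)
#define DEMO_UNLOCK(h) do { (h)->Lock = HAL_UNLOCKED; } while (0)
#endif

/* Starts a process and leaves the handle locked while it is in flight */
static HAL_StatusTypeDef Demo_Process(Demo_HandleTypeDef *h)
{
    DEMO_LOCK(h);
    /* ... start the peripheral operation here ... */
    return HAL_OK;
}

/* Called when the process has finished */
static void Demo_Complete(Demo_HandleTypeDef *h)
{
    DEMO_UNLOCK(h);
}
```

With the default mode, a second Demo_Process() call before Demo_Complete() returns HAL_BUSY, which is the "help to find the problem" behaviour; building with DEMO_LOCK_MODE set to DEMO_LOCK_MODE_DISABLED removes the check for applications that already serialize access themselves.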
2020-03-18 07:58 AM
Hello,
Here is an overview of the lock mechanism coming soon in the STM32Cube Package:
1- Critical section: The critical section mechanism saves PRIMASK on the stack on entry and restores it on exit, instead of unconditionally enabling IRQs in the Exit CS phase.
Typical use of this method is illustrated in the pseudocode below:
HAL_StatusTypeDef HAL_PPP_Process (PPP_HandleTypeDef *hppp, __PARAMS__)
{
  __HAL_ENTER_CRITICAL_SECTION();
  /* Protected resources */
  __HAL_EXIT_CRITICAL_SECTION();
}
The Enter/Exit CS functions are implemented as macros in the stm32ynxx_hal_def.h file as follows, for both the bare-metal and RTOS cases:
#if (USE_RTOS == 1)
#define __HAL_ENTER_CRITICAL_SECTION() OsEnterCriticalSection()
#define __HAL_EXIT_CRITICAL_SECTION() OsExitCriticalSection()
#else
#define __HAL_ENTER_CRITICAL_SECTION() \
  uint32_t PriMsk; \
  PriMsk = __get_PRIMASK(); \
  __set_PRIMASK(1);
#define __HAL_EXIT_CRITICAL_SECTION() \
  __set_PRIMASK(PriMsk);
#endif
2- Lock mechanism: The lock object is an entity allocated in the peripheral driver handles and defined for each standalone process. For full-duplex processes with simultaneous transfers, two lock objects shall be used. For peripherals with sub-instances (channels, endpoints, etc.), a lock object per sub-instance shall be defined.
A lock macro is used before starting any process, as follows:
HAL_StatusTypeDef HAL_PPP_Process (PPP_HandleTypeDef *hppp, __PARAMS__, uint32_t Timeout)
{
  uint32_t tickstart = HAL_GetTick();
  if(__ARGS__ == WRONG_PARAMS)
  {
    hppp->ErrorCode = HAL_PPP_ERROR_PARAM;
    return HAL_ERROR;
  }
  if(HAL_Lock (hppp->iLock) == HAL_LOCKED)
  {
    return HAL_BUSY;
  }
  (...)
Lock methods for ARMv7/ARMv8
/**
  * @brief Attempts to acquire the lock.
  * @param lock Pointer to variable used for the lock.
  * @details This is an interrupt safe function that can be used as a mutex.
             The lock variable shall remain in scope until the lock is released.
             Will not block if another thread has acquired the lock.
  * @returns HAL_LOCKED if everything successful, HAL_UNLOCK if lock is taken.
  */
__STATIC_INLINE HAL_LockStateTypeDef HAL_Lock(__IO uint32_t *lock)
{
  do {
    /* Return if the lock is taken by a different thread */
    if(__LDREXW(lock) != HAL_UNLOCKED) {
      return HAL_LOCKED;
    }
    /* Attempt to take the lock */
  } while(__STREXW(HAL_LOCKED, lock) != 0);
  /* Do not start any other memory access until memory barrier is complete */
  __DMB();
  return HAL_UNLOCKED;
}
/**
  * @brief Free the given lock.
  * @param lock Pointer to variable used for the lock.
  */
__STATIC_INLINE void HAL_UnLock(uint32_t *lock)
{
  /* Ensure memory operations complete before releasing lock */
  __DMB();
  *lock = HAL_UNLOCKED;
}
Lock methods for ARMv6
/**
  * @brief Attempts to acquire the lock.
  * @param lock Pointer to variable used for the lock.
  * @details This is an interrupt safe function that can be used as a mutex.
             The lock variable shall remain in scope until the lock is released.
             Will not block if another thread has acquired the lock.
  * @returns HAL_LOCKED if everything successful, HAL_UNLOCK if lock is taken.
  */
__STATIC_INLINE HAL_LockStateTypeDef HAL_Lock(__IO uint32_t *lock)
{
  uint32_t oldvalue;
  __HAL_SAVE_PRIMASK();
  __HAL_ENTER_CRITICAL_SECTION();
  oldvalue = *lock;
  if(*lock == HAL_UNLOCKED)
  {
    *lock = HAL_LOCKED;
  }
  __HAL_EXIT_CRITICAL_SECTION();
  return (oldvalue);
}
/**
  * @brief Free the given lock.
  * @param lock Pointer to variable used for the lock.
  */
__STATIC_INLINE void HAL_UnLock(__IO uint32_t *lock)
{
  *lock = HAL_UNLOCKED;
}
The above implementations are used in a non-RTOS environment. When an RTOS is used (USE_RTOS), the lock is simply a semaphore take (unlock = semaphore release).
This way, when a process is locked in RTOS env. the current process is pended till the semaphore is freed, then the process resumes once the semaphore is released.
Rds
2020-03-18 08:41 PM
Hi @MMAST.1
Thanks for posting the update and for the opportunity to review it here.
There are some areas that need more work. Please take my comments as constructive...
This is a summary of the LDREX and STREX instructions:
For the MCUs equipped with the LDREX and STREX instructions, this is what your HAL_Lock function does:
These are its outcomes:
PROBLEM #1. The HAL_Lock function’s detail description “The lock variable shall remain in scope until the lock is released” is incorrect. It either obtains the lock for its thread or it detects another thread has it.
PROBLEM #2. The HAL_Lock function’s returns description is incorrect/inaccurate, and the HAL_UNLOCKED and HAL_LOCKED returns do not describe the function’s operation well, so a casual reader might incorrectly assume it is only reading the lock. It would read better if it returned HAL_LOCKED if it obtained the lock and HAL_BUSY otherwise.
PROBLEM #3. For the MCUs without LDREX and STREX instructions, the HAL_Lock function would execute faster and the code would be smaller if “if(*lock == HAL_UNLOCKED)” were replaced with “if(oldvalue == HAL_UNLOCKED)”. Remember *lock is volatile and the compiler has to load it from memory again. But you have already read it to oldvalue which could be a register, and would still be smaller code if it were local, and you have already entered a critical section.
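As a rough illustration of PROBLEM #2 and PROBLEM #3 together, the critical-section variant could be reshaped as below. This is a host-side sketch, not ST's code: sim_primask stands in for the PRIMASK register, and the Sim_ helpers, Demo_ functions, and result enum are hypothetical names chosen so that the return value says whether the caller obtained the lock.

```c
#include <stdint.h>

#define HAL_UNLOCKED 0U
#define HAL_LOCKED   1U

/* Hypothetical result type: names state what happened to the caller */
typedef enum { DEMO_LOCK_ACQUIRED = 0, DEMO_LOCK_BUSY = 1 } Demo_LockResult;

/* Host-side stand-in for PRIMASK (1 = interrupts masked) */
static uint32_t sim_primask;

/* On target this would read PRIMASK and then mask interrupts */
static uint32_t Sim_EnterCritical(void)
{
    uint32_t saved = sim_primask;
    sim_primask = 1U;
    return saved;
}

/* On target this would write the saved value back to PRIMASK */
static void Sim_ExitCritical(uint32_t saved)
{
    sim_primask = saved;
}

static Demo_LockResult Demo_Lock(volatile uint32_t *lock)
{
    uint32_t saved = Sim_EnterCritical();
    uint32_t oldvalue = *lock; /* the only volatile read */

    if (oldvalue == HAL_UNLOCKED) { /* reuses the local copy, no reload */
        *lock = HAL_LOCKED;
    }
    Sim_ExitCritical(saved);

    return (oldvalue == HAL_UNLOCKED) ? DEMO_LOCK_ACQUIRED : DEMO_LOCK_BUSY;
}

static void Demo_UnLock(volatile uint32_t *lock)
{
    *lock = HAL_UNLOCKED;
}
```

Keeping the saved mask in a local returned by the enter helper also means the function can enter and leave the critical section more than once without redeclaring a variable.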
REQUEST #1. Please add a method to turn off HAL’s locks.
I layer my apps so that the calls to each peripheral driver are single-threaded, or each direction is single-threaded if the peripheral supports simultaneous receive and transmit, and so my apps never see a busy error. If my app needs to output from more than one task, those tasks send to a task dedicated to the peripheral (or to its output channel if it is duplex), where the output is queued and started as soon as the last output finishes, or immediately if no output is in progress. Receive is similar: if my app needs to send received data to different tasks, a task dedicated to the peripheral interrogates the data or checks an application mode (with suitable protection) to determine where to forward it.
In summary, I design my apps to always work correctly.
Further, my company chooses the smallest/cheapest part to do a job. So I want to save easy cycles.
Your method to disable HAL locks might be like this:
REQUEST #2. Please add a method to turn off HAL’s parameter checking. I’ve debugged. I’m accepting that the MCU may be struck by sub-atomic particles. I accept the risks. Please make it possible to turn the checks off the same way as the HAL’s locks.
I do not have a good grasp of why HAL locks are necessary. But clearly they are, else other developers would be asking for ways to turn them off too.
THOUGHT #1. What does a task dedicated to a peripheral or one of its channels look like? As an example, this is one of my go-to methods for a dedicated task to handle a peripheral’s transmit channel:
THOUGHT #2. Does HAL have locks only because we can’t engineer our apps properly?
If you develop apps with more than one thread accessing a peripheral or one of its channels, I’ll poke with some tongue-in-cheek questions…
Post your thoughts.
2020-03-19 12:42 AM
Thanks a lot Alister, your feedback is more than appreciated. As I mentioned, the listing I posted is an overview of the update. We will take your feedback into account to improve the mechanism. Thanks again.
2020-04-01 08:22 AM
In addition to what Alister said...
#define __HAL_ENTER_CRITICAL_SECTION() \
uint32_t PriMsk; \
This will get you into trouble if the lock/unlock is needed multiple times in a single function/block. It is also not clear what __HAL_SAVE_PRIMASK(); does if __HAL_ENTER_CRITICAL_SECTION(); does the same thing. Probably something like this should be introduced and put at the top of the function:
#define __HAL_DECLARE_CRITICAL_SECTION() uint32_t PriMsk
Or another simpler solution is to implement one global critical section nesting counter, as it's done in FreeRTOS, for example.
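The nesting-counter alternative can be sketched on a host as follows. It follows the general counter pattern used by FreeRTOS ports but is not copied from them; sim_primask stands in for the PRIMASK register and all Demo_ names are hypothetical. Like that pattern, it assumes interrupts were enabled before the outermost enter.

```c
#include <stdint.h>

/* Host-side stand-in for PRIMASK (1 = interrupts masked) */
static uint32_t sim_primask;

/* One global nesting depth shared by all critical sections */
static uint32_t cs_nesting;

static void Demo_EnterCritical(void)
{
    sim_primask = 1U; /* on target: __disable_irq() */
    cs_nesting++;     /* safe to touch, interrupts are masked here */
}

static void Demo_ExitCritical(void)
{
    if (cs_nesting > 0U) {
        cs_nesting--;
        if (cs_nesting == 0U) {
            sim_primask = 0U; /* on target: __enable_irq(), outermost exit only */
        }
    }
}
```

Because no local variable is declared by the enter step, the pair can be used any number of times in one function, and nested sections only unmask interrupts when the outermost one exits.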
> when a process is locked in RTOS env. the current process is pended till the semaphore is freed
So in a non-RTOS environment HAL_Lock() would be non-blocking and return BUSY, but with an RTOS it would block and never fail. That is inconsistent, and it leads to confusion and significantly different usage in each case. I mostly agree with Alister that HAL_Lock() is unnecessary and damaging. And yes, there is really no sane scenario for what to do when HAL_Lock() returns BUSY anyway. Managing access to a peripheral is a task for higher-level platform code, not the driver.
2020-10-10 06:11 AM
The APIs that use the lock can't be called from interrupt context, as the lock is implemented with an RTOS semaphore.
A semaphore can be released from an interrupt, but an interrupt can't wait to take one (an interrupt can't be delayed).
The HAL_UART_DMAStop source code explains this clearly:
/* The Lock is not implemented on this API to allow the user application
to call the HAL UART API under callbacks HAL_UART_TxCpltCallback() / HAL_UART_RxCpltCallback():
when calling HAL_DMA_Abort() API the DMA TX/RX Transfer complete interrupt is generated
and the correspond call back is executed HAL_UART_TxCpltCallback() / HAL_UART_RxCpltCallback()
*/
For example, in the UART HAL the following functions use the lock:
HAL_UART_RegisterCallback
HAL_UART_UnRegisterCallback
HAL_UART_Transmit_IT
HAL_UART_Receive_IT
HAL_UART_Transmit_DMA
HAL_UART_Receive_DMA
HAL_UART_DMAPause
HAL_UART_DMAResume
HAL_LIN_SendBreak
HAL_MultiProcessor_EnterMuteMode
HAL_MultiProcessor_ExitMuteMode
HAL_HalfDuplex_EnableTransmitter
HAL_HalfDuplex_EnableReceiver
So most of the API is not usable from an interrupt.
Perhaps the HAL API should report an error if such an API is called from an interrupt context.
Some RTOSes raise an assert when a forbidden API is used in interrupt context; the HAL could build on that.
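One way such a check could look is sketched below. On Cortex-M the IPSR register reads as zero in thread mode and as the active exception number inside a handler, and CMSIS exposes it through __get_IPSR(). Here sim_ipsr is a host-side stand-in for that register, and the Demo_ functions are hypothetical, not part of the HAL.

```c
#include <stdint.h>

typedef enum { HAL_OK = 0x00U, HAL_ERROR = 0x01U } HAL_StatusTypeDef;

/* Host-side stand-in for IPSR (0 = thread mode, else exception number) */
static uint32_t sim_ipsr;

/* On target this would simply return __get_IPSR() from CMSIS */
static uint32_t Demo_GetIPSR(void)
{
    return sim_ipsr;
}

static int Demo_InInterruptContext(void)
{
    return Demo_GetIPSR() != 0U;
}

/* A lock-taking API that refuses to run inside an interrupt handler */
static HAL_StatusTypeDef Demo_Process(void)
{
    if (Demo_InInterruptContext()) {
        return HAL_ERROR; /* an assert hook could be called here instead */
    }
    /* ... take the lock (possibly blocking on a semaphore) and proceed ... */
    return HAL_OK;
}
```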
2022-11-11 07:17 AM
Dear All,
Based on the feedback gathered from this forum and from other sources, and on a full analysis of the different calls to HAL_Lock() and of the issues raised on this topic, it turns out that __HAL_LOCK() and __HAL_UNLOCK() are not always used as a "standard" lock mechanism for critical sections, but rather as a special state machine that rejects launching the same HAL process from several statements in the current HAL. The following updates have therefore been introduced in the HAL to fix the issues related to this topic.
Fix
Example:
HAL_StatusTypeDef HAL_UART_Transmit_IT(UART_HandleTypeDef *huart, const uint8_t *pData, uint16_t Size)
{
  if (huart->gState == HAL_UART_STATE_READY)
  { ...}
}
2- Protect changing the common state machine for several processes (Already implemented on uart and full deployment ongoing):
Fix: have a state machine per independent process
Example:
typedef struct __UART_HandleTypeDef
{
  (…)
  HAL_LockTypeDef Lock;                /*!< Locking object */
  __IO HAL_UART_StateTypeDef gState;   /*!< UART state information related to global Handle management
                                            and also related to Tx operations. This parameter
                                            can be a value of @ref HAL_UART_StateTypeDef */
  __IO HAL_UART_StateTypeDef RxState;  /*!< UART state information related to Rx operations. This
                                            parameter can be a value of @ref HAL_UART_StateTypeDef */
} UART_HandleTypeDef;
3 - Protect checking and modifying the state machine by wrapping the check-and-set statement in a lock mechanism based on __LDREXH / __STREXH (series based on the CM0 core instead come with an implementation built around enabling/disabling IRQs). This will be deployed in the next HAL releases:
#define HAL_CHECK_AND_SET_STATE(__HANDLE__, __PPP_STATE_FIELD__, __PPP_CONDITIONAL_STATE__, __PPP_NEW_STATE__) \
  do { \
    do{ \
      /* Return HAL_BUSY if the status is not ready */ \
      if (__LDREXW((__IO uint32_t *)&(__HANDLE__)->__PPP_STATE_FIELD__) != (uint32_t)(__PPP_CONDITIONAL_STATE__)) \
      { \
        return HAL_BUSY; \
      } \
      /* if state is ready then attempt to change the state to the new one */ \
    } while(__STREXW((uint32_t)(__PPP_NEW_STATE__), (__IO uint32_t *)&((__HANDLE__)->__PPP_STATE_FIELD__)) != 0); \
    /* Do not start any other memory access until memory barrier is complete */ \
    __DMB(); \
  }while(0)
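To see how a check-and-set of the state field would behave from the caller's side, here is a host-side sketch that uses C11 atomics in place of the exclusive-access instructions. It is an analogy, not the HAL implementation: the handle, the state constants, and all Demo_ names are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

typedef enum { HAL_OK = 0x00U, HAL_BUSY = 0x02U } HAL_StatusTypeDef;

/* Hypothetical state values for the demo handle */
enum { DEMO_STATE_READY = 0x20U, DEMO_STATE_BUSY_TX = 0x21U };

/* Hypothetical handle holding only the Tx state machine */
typedef struct { _Atomic uint32_t gState; } Demo_UartHandle;

/* Moves *state from 'expected' to 'next' in one indivisible step,
   or reports HAL_BUSY when the state was something else. */
static HAL_StatusTypeDef Demo_CheckAndSetState(_Atomic uint32_t *state,
                                               uint32_t expected,
                                               uint32_t next)
{
    return atomic_compare_exchange_strong(state, &expected, next)
               ? HAL_OK
               : HAL_BUSY;
}

static HAL_StatusTypeDef Demo_Transmit_IT(Demo_UartHandle *h)
{
    if (Demo_CheckAndSetState(&h->gState, DEMO_STATE_READY,
                              DEMO_STATE_BUSY_TX) != HAL_OK) {
        return HAL_BUSY; /* a transmit is already in flight */
    }
    /* ... program the peripheral and enable its interrupt here ... */
    return HAL_OK;
}

/* Would be called from the transfer-complete interrupt */
static void Demo_TxCpltCallback(Demo_UartHandle *h)
{
    atomic_store(&h->gState, DEMO_STATE_READY);
}
```

A second Demo_Transmit_IT() call is rejected until the completion callback puts the state back to ready, and no separate lock object is involved.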
4 - Protect common processes register update (Already implemented):
Fix:
Add new macros to the CMSIS device files for atomic bit and register modifications, based on __LDREXH / __STREXH (series based on the CM0 core instead come with an implementation built around enabling/disabling IRQs):
- ATOMIC_SET_BIT(REG, BIT)
- ATOMIC_MODIFY_REG(REG, CLEARMSK, SETMASK)
- ATOMIC_SETH_BIT(REG, BIT)
- ATOMIC_CLEARH_BIT(REG, BIT)
- ATOMIC_MODIFYH_REG(REG, CLEARMSK, SETMASK)
Example:
HAL_StatusTypeDef HAL_UART_Receive_DMA(UART_HandleTypeDef *huart, uint8_t *pData, uint16_t Size)
{
  (…)
  /* Enable the UART Receiver Timeout Interrupt */
  ATOMIC_SET_BIT(huart->Instance->CR1, USART_CR1_RTOIE); /* ATOMIC access for the USART_CR1_RTOIE bit */
  (…)
}
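The retry idea behind such atomic register macros can be illustrated on a host with C11 atomics. This is not the CMSIS device-file implementation, only a sketch of the same read, modify, and conditionally store loop; the Demo_ function names are hypothetical and an ordinary atomic variable stands in for a peripheral register.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Sets 'bit' in *reg; retries if *reg changed between the read and the store */
static void Demo_AtomicSetBit(_Atomic uint32_t *reg, uint32_t bit)
{
    uint32_t old = atomic_load(reg);
    /* On failure 'old' is refreshed with the current contents, so the
       new value is recomputed from up-to-date bits on every attempt. */
    while (!atomic_compare_exchange_weak(reg, &old, old | bit)) {
    }
}

/* Clears 'clearmsk' and sets 'setmask' in *reg as one indivisible update */
static void Demo_AtomicModifyReg(_Atomic uint32_t *reg, uint32_t clearmsk,
                                 uint32_t setmask)
{
    uint32_t old = atomic_load(reg);
    while (!atomic_compare_exchange_weak(reg, &old,
                                         (old & ~clearmsk) | setmask)) {
    }
}
```

The point of the loop is that a bit changed by an interrupt between the read and the write is never overwritten with a stale copy, which is what a plain REG |= BIT cannot guarantee.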
Thanks and Regards
Maher
2023-12-18 08:13 AM
Hello @MMAST.1,
Can you share the yearly update on this topic?
I'm still struggling with a very rare bug in a module <-> MCU communication based on.
I'm using raw C, no OS, no multi-thread.
TX (in Cube MX: Preemption Priority 2 UART and DMA):
HAL_UART_Transmit_DMA() wait for a flag set by
HAL_UART_TxCpltCallback()
RX: (in Cube MX: Preemption Priority 2 UART and DMA, DMA Circular, Overrun: Disable, DMA on RX Error: Enable)
HAL_UARTEx_ReceiveToIdle_DMA()
#define __HAL_LOCK(__HANDLE__) \
  do{ \
    if((__HANDLE__)->Lock == HAL_LOCKED) \
    { \
      return HAL_BUSY; \
    } \
    else \
    { \
      (__HANDLE__)->Lock = HAL_LOCKED; \
    } \
  }while (0)
#define __HAL_UNLOCK(__HANDLE__) \
  do{ \
    (__HANDLE__)->Lock = HAL_UNLOCKED; \
  }while (0)
More details about my issue here
Thank you for any help