Questions surrounding __HAL_LOCK

matthew2399 · ‎2016-04-25

Posted on April 26, 2016 at 00:12

I’m an engineer at Fluke, and we’re using an STM32F4xx seriesmicrocontroller (together with its HAL drivers) as the basis for a newproduct. The HAL contains a “locking�? mechanism, where eachsubsystem—I²C, USB, UART, and so on—has a “locking object�?. My team hasbeen working on the assumption that this mechanism’s goal is to providesome mutual exclusion for memory-mapped hardware registers and relatedstate so that, for example, an interrupt doesn’t modify the same hardware state being manipulated in the main “thread�? of exectution.

What’s confused us, however, is how this is done. The driver code found in stm32f4xx_hal_def.h defines the locking mechanism as follows:[1]

typedef
enum
{
HAL_UNLOCKED = 0x00U,
HAL_LOCKED = 0x01U
} HAL_LockTypeDef;
#if (USE_RTOS == 1)
/* Reserved for future use */
#error ''USE_RTOS should be 0 in the current HAL release''
#else
#define __HAL_LOCK(__HANDLE__) \
do
{ \
if
((__HANDLE__)->Lock == HAL_LOCKED) \
{ \
return
HAL_BUSY; \
} \
else
\
{ \
(__HANDLE__)->Lock = HAL_LOCKED; \
} \
}
while
(0)
#define __HAL_UNLOCK(__HANDLE__) \
do
{ \
(__HANDLE__)->Lock = HAL_UNLOCKED; \
}
while
(0)
#endif /* USE_RTOS */

In summary, “locking�? consists of setting a flag, or returning an error(indicating the system is busy) when the flag is already set.“Unlocking�? consists of unconditionally clearing the flag. What throws us is the immediate return of HAL_BUSY in the case ofcontention. If a HAL lock is providing mutual exclusion between the mainthread and an interrupt handler, what should be done if the interrupt isthe second caller of __HAL_LOCK and gets HAL_BUSY? An interrupt cannotpause its execution and start again when the lock is no longercontended, like a userspace thread could in a multitasking OS. (The bestone can do is have the interrupt schedule the work for some later time,but this is not always feasible.) Additionally, many read-modify-write sequences (such as ORing bits) are performed on hardware registers without any additional protection besides these ''locks''. What is to keep an interrupt (or, if one is running an RTOS, another task) from executing in the middle of one of these sequences? Based on these observations, we tried redefining __HAL_LOCK and__HAL_UNLOCK to be a global “interrupt lock�?, i.e., the former disablesinterrupts and the latter re-enables them. For cases where several callsto these functions nest (whether intentionally or unintentionally), wekeep track of a nesting count. Unfortunately this doesn’t work well withthe I²C driver, since it contains several code paths where, on atimeout, __HAL_UNLOCK is called twice for a single call to __HAL_LOCK,unbalancing our nesting count to disastrous effect. How is the __HAL_LOCK/__HAL_UNLOCK system meant to work? It would seemwe’re not using it according to its design goals. [1] All code is taken from STM32CubeF4, version 1.0 #hal #stm32f4

Willem La Grange · ‎2017-08-10

Posted on August 10, 2017 at 19:23

I think you all have the correct idea - Not to call certain functions inside an interrupt routine, and I agree on that.

Firstly, the HAL drivers can not know what is your interrupt priorities settings are, You might try to call a function that depends on a interrupt that has a lower priority than the function you are calling from. Your code might then be stuck in an endless loop.

Secondly, it is not good coding practice to handle messages in interrupt routines. Your messages might take too long to be decoded and/or serviced and then you have the next interrupt pending of lower priority in worst case, whilst still servicing the previous message.

So, the best practice would be to set a flag (declared as volatile) in the interrupt and handle the execution of the interrupt in the main loop. I know that is not always desirable, but it is definitely the safest. I also want to say that in some instances, if the process could be done very swiftly, you can handle the interrupt servicing inside the interrupt. For instance setting up the next buffer to be handled by the DMA on the I2S audio interface.

The questions to answer are thus:

1. How are the interrupt priorities structured

2. Can the interrupt servicing be moved to the main loop or not.

3. How many cycles are required in the servicing on the interrupt if it is to be handled within an interrupt.

Hope it helps..

Selso LIBERADO · ‎2017-11-28

Posted on November 28, 2017 at 17:17

Hello !

I've been experiencing issue on I2C because I'm having concurrent access from interrupt routine and main loop.

I use STMCUBE 1.

I've made a unit test to check the HAL_LOCK function :

using default implementation
replacing it by IT activation/désactivation (__disable_irq(), __enable_irq() ).
using solution ldrex/strex from
‌

On that test I implement a IT routine (on TIMER2 expiration) and execute a function in main loop.

Each have separate counter when lock succeded (supposed) and access a global counter as the critical section (and comparaison).

Result :

On the default implementation I get different values on counter : 1/1000 relative error
The IT activation/desactivation result in no error as expected.
the ldrex/strex implémentation does not give expected result : 1/1000 relative error.

About the intrinsic __ldrex/__strex Keil

http://www.keil.com/support/man/docs/armcc/armcc_chr1359124997htm

: .

Because memory accesses can clear the exclusive monitor, code using the

__ldrex

and

__strex

intrinsics can have unexpected behavior. Where

LDREX

and

STREX

instructions are needed, ARM recommends using embedded assembly.

So this explanation is sufficient to me to explain why the last solution did not work.

I also read on some

https://community.st.com/thread/infocenter.arm.com/help/topic/com.arm.doc.dht0008a/DHT0008A_arm_synchronization_primitives.pdf

that we must perform a DMB operation to ensure a consistent memory view.

I worked on a assembly code that uses ldrex/strex operation comptatible with GCC. This is my first assembly code I coded it basing on the default implementation listing and ARM documentation :

 uint32_t hallock(volatile uint8_t * pLock){uint32_t llock;uint32_t lBusy;uint32_t lFailed;__ASM volatile( 'mov%[rBusy], #1\t''ldrexb %[rlock], %[rplock]\t''cbnz%[rlock], busy\t''strexb %[rFailed], %[rBusy], %[rplock]\t''cbnz%[rFailed], busy\t''dmb\t' /* ensures that all subsequent accesses are observed after the gaining of the lock is observed */'mov%[rBusy], %[rlock]\t' /* lock contenait unlock=0 */'bxlr\t''busy:\t''mov%[rBusy], #2\t' /* renvoyer HAL_BUSY */'bxlr\t': [rplock] '+Q' (*pLock), [rFailed] '=&r' (lFailed), [rlock] '=&r' (llock),[rBusy] '=&r' (lBusy):);// Never reach this point : avoir a warning 'no return'.return lBusy;}void halunlock(volatile uint8_t * pLock){uint8_t llock;__ASM volatile('mov %[rlock], #0\t''dmb\t' /* ensures that all subsequent accesses are observed after the gaining of the lock is observed */'str %[rlock], %[rplock]\t': [rplock] '=Q' (*pLock), [rlock] '=&r' (llock):);}#undef __HAL_LOCK#define __HAL_LOCK(__HANDLE__) do { if( hallock( &(__HANDLE__)->Lock) != HAL_OK) return HAL_BUSY ; } while(0);#undef __HAL_UNLOCK#define __HAL_UNLOCK(__HANDLE__) do { halunlock( &(__HANDLE__)->Lock ) ; } while(0) ;#endif�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?

taraben · ‎2017-11-28

Posted on November 28, 2017 at 17:55

Hello Selso,

I have not really understood, how you use the exclusive memory access (

ldrex/strex)

for ressource allocation.

Basically those are meant for use of concurrent memory access and generating lockfree code.

Did you thought about raising the CPU priority in order to prevent certain interrupts from being activated?

This has several advantages over disabling the interrupt.

see this sample:

ATOMIC_BLOCK(NVIC_GetPriority(TIM5_IRQn)) {

time_us = rx_timestamp.utc_us[0] + rx_timestamp.sub_sec; if((rx_timestamp.locked != false) && (rx_timestamp.utc_us[1] != 0)) { rx_timestamp.utc_us[0] = rx_timestamp.utc_us[1]; rx_timestamp.utc_us[1] = 0; } rx_timestamp.locked = false; }

You need to assign the CPU priorities more carefully do.

hope this helps,

Adib.

--

and this is how it is implemented

/**

@file atomic_stmh @brief implemented atomic function for STM32 controller similar as avrlib for atmel controller

@see discission on

https://community.st.com/external-link.jspa?url=http%3A%2F%2Fwww.mikrocontroller.net%2Ftopic%2F312246%23new

*/

#ifndef INC_ATOMIC_STM32_H

#define INC_ATOMIC_STM32_H

/**

block all ISR with same or higher priority number, @param prio system priority as used in NVIC_GetPriority() call @note the internal CMSIS command __set_BASEPRI_MAX can not use prio directly but must shift by the not implemented prio levels */ #define ATOMIC_BLOCK(prio) for( uint32_t __cond = 1, __prio = __get_BASEPRI(); \ __cond != 0 ? (__set_BASEPRI_MAX(prio << (8U - __NVIC_PRIO_BITS)), 0) : 0, __cond != 0; \ __set_BASEPRI(__prio), __cond = 0 )

/*

static uint32_t BEGIN_ATOMIC_BLOCK(uint32_t prio) { uint32_t __prim = __get_BASEPRI(); uint32_t prioritygroup = NVIC_GetPriorityGrouping(); uint32_t encodedprim = NVIC_EncodePriority(prioritygroup, prio, 0); uint32_t newbasepri = (uint8_t)((encodedprim << (8U - __NVIC_PRIO_BITS)) & (uint32_t)0xFFUL); __set_BASEPRI_MAX(newbasepri); return __prim; }

static void END_ATOMIC_BLOCK(uint32_t prio) { __set_BASEPRI(prio); }

*/

#endif // INC_ATOMIC_STM32_H

Selso LIBERADO · ‎2017-11-29

Posted on November 29, 2017 at 18:32

This is the aim of the assembly code :

load and check lock;
try to set it
if ok : call memory barrier operation
otherwise return 'busy'

Blocking IT is not a solution because it's not generic implémentation.

Using C code with intrinsic is not reliable because the compiler does not garantee that the exclusive monitor is not cleard, So one shall code assembly.

The example I gave do not work when the code is inline because it ships with the 'bx lr' instructions, I did a 'inline code' that performs well' :

__attribute__((always_inline)) __STATIC_INLINE uint32_t hallock(volatile uint8_t * pLock)
//uint32_t hallock(volatile uint8_t * pLock)
{
uint32_t llock = 1;
uint32_t lBusy = 1;
uint32_t lFailed = 1;
__ASM volatile (
'ldrexb %[rlock], %[rplock] 
\t'
'cmp%[rlock], #0 
\t'
'it eq 
\t'
'strexeq %[rFailed], %[rBusy], %[rplock] 
\t'
'cmp%[rFailed], #0
\t'
'it eq 
\t'
'dmbeq 
\t' /* dmb : ensures that all subsequent accesses are observed after the gaining of the lock is observed */
: [rplock] '+Q' (*pLock), [rFailed] '+r' (lFailed), [rlock] '+r' (llock),[rBusy] '+r' (lBusy)
: /* no input */
: 'memory'
);
return (lFailed == 0) ? HAL_OK : HAL_BUSY;
}
__attribute__((always_inline)) __STATIC_INLINE void halunlock(volatile uint8_t * pLock)
//void halunlock(volatile uint8_t * pLock)
{
uint8_t llock = 0;
__ASM volatile(
'dmb
\t' /* ensures that all subsequent accesses are observed after the gaining of the lock is observed */
'str %[rlock], %[rplock]
\t'
: [rplock] '=Q' (*pLock)
: [rlock] 'r' (llock)
: 'memory'
);
}
#define __HAL_LOCK(__HANDLE__) do { if( hallock( &(__HANDLE__)->Lock) != HAL_OK) return HAL_BUSY ; } while(0);
#define __HAL_UNLOCK(__HANDLE__) do { halunlock( &(__HANDLE__)->Lock ) ; } while(0) ;
�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?

Selso LIBERADO · ‎2017-11-30

Posted on November 30, 2017 at 12:17

Il would just add that str and strex shalle respectively modified to strb and strexb for storing a byte and not a word.

Vito Marolda · ‎2018-01-13

Posted on January 13, 2018 at 09:36

Hello I tried to compile the __HAL_LOCK replacement following your instructions (including strb and strexb), but I get the following errors (STM32F042):

ccvoDXXA.s:473: Error: selected processor does not support `ldrexb r1,[r2]' in Thumb mode

ccvoDXXA.s:475: Error: selected processor does not support `it eq' in Thumb mode ccvoDXXA.s:476: Error: selected processor does not support `strexbeq r3,r0,[r2]' in Thumb mode ccvoDXXA.s:478: Error: selected processor does not support `it eq' in Thumb mode ccvoDXXA.s:479: Error: Thumb does not support conditional execution

here is what I wrote:

__attribute__((always_inline)) __STATIC_INLINE uint32_t hallock(volatile uint8_t * pLock)
//uint32_t hallock(volatile uint8_t * pLock)
{
 uint32_t llock = 1;
 uint32_t lBusy = 1;
 uint32_t lFailed = 1;
 __ASM volatile (
 'ldrexb %[rlock], %[rplock] 
\t'
 'cmp %[rlock], #0 
\t'
 'it eq 
\t'
 'strexbeq %[rFailed], %[rBusy], %[rplock] 
\t'
 'cmp %[rFailed], #0 
\t'
 'it eq 
\t'
 'dmbeq 
\t' /* dmb : ensures that all subsequent accesses are observed after the gaining of the lock is observed */
 : [rplock] '+Q' (*pLock), [rFailed] '+r' (lFailed), [rlock] '+r' (llock),[rBusy] '+r' (lBusy)
 : /* no input */
 : 'memory'
 );
 return (lFailed == 0) ? HAL_OK : HAL_BUSY;
}
__attribute__((always_inline)) __STATIC_INLINE void halunlock(volatile uint8_t * pLock)
//void halunlock(volatile uint8_t * pLock)
{
 uint8_t llock = 0;
 __ASM volatile(
 'dmb 
\t' /* ensures that all subsequent accesses are observed after the gaining of the lock is observed */
 'strb %[rlock], %[rplock] 
\t'
 : [rplock] '=Q' (*pLock)
 : [rlock] 'r' (llock)
 : 'memory'
 );
}
#define __HAL_LOCK(__HANDLE__) do { if( hallock( &(__HANDLE__)->Lock) != HAL_OK) return HAL_BUSY ; } while(0);
#define __HAL_UNLOCK(__HANDLE__) do { halunlock( &(__HANDLE__)->Lock ) ; } while(0) ;
�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?

Sorry I have very limited experience with ARM assembly.

Can you please help?

Thank you.

Selso LIBERADO · ‎2018-01-15

Posted on January 15, 2018 at 15:25

Hello !

According to you MCU model you are compiling the code for a cortex-M0, that do not s

https://developer.arm.com/docs/dui0497/latest/the-cortex-m0-instruction-set/instruction-set-summary

https://developer.arm.com/docs/dui0553/latest/2-the-cortex-m4-processor/22-memory-model/227-synchronization-primitives

I'm sorry that implementation cannot fit your need, I'm also learning so I missed that point.

Another lock implementation could be made with

https://developer.arm.com/products/architecture/a-profile/docs/dht0008/latest/a-swp-and-swpb/a1-legacy-synchronization-instructions/a11-swp-and-swpb

, but it is also not supported by these new family of Cortex.

The code shall also be changed because of the unsupported 'it' instruction.

For your case I don't see anything but masking IT.

Selso LIBERADO · ‎2018-01-15

Posted on January 15, 2018 at 16:33

So, generally speaking, fixing __HAL_LOCK with a more robust code would not solve the weaknesses of HAL drivers.

You're wrong it actually does ! The assembly code is somewhat a mutex implementation. It's not because you could not test it that is does not work.

I can send my sources that do proves it works.

Things could be different with an RTOS (e.g. FreeRTOS), but then you have RTOS' locks, so __HAL_LOCK/UNLOCK could be rewritten using RTOS locks instead of assembly code.

And what do you think RTOS locks implementation actually are ?

You can't just say 'disabling IT' resolve issues, as it may creates other pb, think of that :

func1()

disable IT

call func2 (fun2 also disables and enables IT).

/** YOU are doomed here !!!*/

enable IT

Vito Marolda · ‎2018-01-15

Posted on January 15, 2018 at 15:52

Hello thank you very much for your kind reply!

I did mask the interrupts and everything works fine now.

I also understood that

fixing __HAL_LOCK would not solve main program / irq synchronization for HAL drivers. Only IRQ disable during lock and re-enable in unlock would solve.

In fact, my problem was with CAN rx interrupts never re-enabled after a while, and I discovered it was due toCAN irq occurring during the call to HAL_CAN_Transmit_IT.

In this case, masking IT is the only solution:HAL_CAN_IRQHandler andCAN_Receive_IT functions don't care __HAL_LOCK - of course - and mess with state in a totally unguarded way.

I have solved it masking IT before calling HAL_CAN_Transmit_IT and reenabling them:

static void canEnterCritical(void){
NVIC_DisableIRQ(CEC_CAN_IRQn);
__asm volatile( 'dsb' ); // Synchronization of the pipeline to guarantee
__asm volatile( 'isb' ); // that the protection against interrupts is effective
}
static void canExitCrititcal(void){
NVIC_EnableIRQ(CEC_CAN_IRQn);
}
�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?

So, generally speaking, fixing __HAL_LOCK with a more robust code would not solve the weaknesses of HAL drivers.

Things could be different with an RTOS (e.g. FreeRTOS), but then you have RTOS' locks, so __HAL_LOCK/UNLOCK could be rewritten using RTOS locks instead of assembly code.

--Vito.

Vito Marolda · ‎2018-01-16

Posted on January 16, 2018 at 11:08

,

Hello I agree disabling interrupts is bad practice.

But in my particular case (CAN Tx / CAN Irq issue), after inspecting the

CAN HAL drivers source code, I found buggy read-modify-writes on status

data in interrupt code, that are shared with the APIs. Moreover, interrupt

code does not call __HAL_LOCK/UNLOCK at all.

So, modifying __HAL_LOCK/UNLOCK to become a real and strong mutex (as your

fix, for example) would not help in this case.

Regards,

--Vito.

2018-01-15 16:34 GMT+01:00 Selso LIBERADO <,st-microelectronics@jiveon.com>,:

STMicroelectronics Community

<,https://community.st.com/?et=watches.email.thread>,

Re: Questions surrounding __HAL_LOCK

reply from Selso LIBERADO

<,

in STM32 MCUs Forum - View the full discussion

<,https://community.st.com/message/181294-re-questions-surrounding-hallock?commentID=181294&,et=watches.email.thread ♯ comment-181294>,