cancel
Showing results for 
Search instead for 
Did you mean: 

Can __HAL_LOCK in stm32 code cause race condition issues

Kefei Yao
Associate III
Posted on March 19, 2018 at 10:24

The __HAL_LOCK macro is defined as in stm32f1xx_hal_def.h

#define __HAL_LOCK(__HANDLE__) \

do{ \

if((__HANDLE__)->Lock == HAL_LOCKED) \

{ \

return HAL_BUSY; \

} \

else \

{ \

(__HANDLE__)->Lock = HAL_LOCKED; \

} \

}while (0)

So the Lock is read first and modified if needed. Would it cause race condition while the task calling __HAL_LOCK is preempted by a higher priority task which call __HAL_LOCK subsequently.  

For example, task1 calls __HAL_LOCK and find the lock is open. task 1 is preempted by task2 whose priority is higher than task1 right before task1 write the Lock as HAL_LOCKED. task2 finds the lock is open and set the Lock as HAL_LOCKED. The task2 goes to block state to wait for some resource. Then task1 runs again and continue to get the lock. This will potentially corrupt the resources protect by the __HAL_LOCK for task2.

I would be happy if someone could correct me if I'm wrong since I could learn a lesson from it

18 REPLIES 18
Posted on May 15, 2018 at 09:41

Dude, this is not related to multi-tasking, but to concurrency. Every interrupt causes concurrency. And as the HAL uses interrupts, it's broken. The only option is not to use the HAL.

Posted on May 15, 2018 at 15:06

... The only option is not to use the HAL.

The HAL can still be used if task-level 'managers' are carefully implemented for the needed modules.  You're still subject to the HAL's races, MIA errata workarounds, and lingering bugs but the managing tasks can make it bearable.

sudo112358
Associate II

Hi there,

correct, concurrency is the keyword. We just had the problem that while starting to send a message over CAN bus, a receive interrupt can always occur....finally, we found out about this problem by hard search and just after a module stopped receiving any messages due to this race condition...the receive interrupt was forever disabled....of course in a (prototype) system running in a customer's application.

So, like in the Linux kernel: "If it's not broken, don't fix it." (mind the inverse logic :)

....does anyone have an unbroken link to this solution?

That was a link to another discussion. Your only option is to fix the __HAL_LOCK-Makro to something working. The STM actually has the necessary instructions to do so. We solved our issues by porting to the LL-Lib. Smaller code, better performance, fewer bugs....

Hi,

Can you provide your code?

Darius

Hi,

I can't because it's part of our product. But the problem appears when we use the HAL library for CAN communication, transmit and receive in interrupt mode. When our module is just entering the transmit interrupt routine and simultaneously receives a message, the lock fails and as far as we could debug, the receive routine has already disabled the receive interrupt and will somehow not run through the receive callback which would enable the interrupt again. I can check in the source code tomorrow for details, but the principal behaviour is like this. We had the system running for weeks in an application at a customer's site and suddenly it did not "receive" any messages - in other words, the interrupt didn't fire anymore. In our lab, we built the same setup with a higher rate of sent and received messages and it just takes minutes, sometimes even some seconds until it fails. Manually re-setting the interrupt flag for reception in debug solves the problem until it crashes again. Our current workaround is quite ugly, but as one of the next steps, we will switch to the LL-lib or finally implement it ourselves..the HAL has several problems and a strange code quality imho...

Greetings,

Chris

Have you nested interrupts? Check interrupts priority. Be carefull with nested interrupts.

HAL systick must have higher priority.

BR.

Darius

T J
Lead

I use the Canbus without issue, but I dont use the CanTx interrupts, they are not enabled.

I use a CanTx message table and a CanRx message table both 64 frames long.

During the CanRx interrupt, I pull the new frame and save it in the table.

In the primary foreground process,

If there is a CanTx message frame ready to go, then it is sent without enabling any interrupt.

I only look at MsgIDs in the ranges 0x400 - 0x5FF   and  0x600-0x6FF  
 
This is my Startup init code:
 
char canRxString            [1024];
uint16_t canRxMsgID         [64];
uint8_t  canRxMsgLength     [64];
uint32_t canRxMsgBottomWord [64];
uint32_t canRxMsgTopWord    [64];
char 	 canRxHaveSegmentID [64];							
 
char canTxString[1024];
uint16_t canTxMsgID         [64];
uint8_t  canTxMsgLength     [64];
uint32_t canTxMsgBottomWord [64];
uint32_t canTxMsgTopWord    [64];
 
 
in HAL void CAN_Config() {
:
.
.
.
 
    //##-2- Configure the CAN Filter ###########################################
    sFilterConfig.FilterNumber 					= 0;
    sFilterConfig.FilterMode   					= CAN_FILTERMODE_IDMASK;
    sFilterConfig.FilterScale  					= CAN_FILTERSCALE_32BIT;
    sFilterConfig.FilterIdHigh 					= 0x400 << 5;  //11-bit ID in top bits
    sFilterConfig.FilterIdLow  					= 0x0000;
    sFilterConfig.FilterMaskIdHigh	  	        = 0x600 << 5; 	// resolves as 0x0400 - 0x05FF
    sFilterConfig.FilterMaskIdLow				= 0x0000;
    sFilterConfig.FilterFIFOAssignment	        = 0;
    sFilterConfig.FilterActivation	            = ENABLE;
    sFilterConfig.BankNumber 	    			= 0;
 
    if (HAL_CAN_ConfigFilter(&hcan, &sFilterConfig) != HAL_OK)
    {
        // Filter configuration Error 
        _Error_Handler(__FILE__, __LINE__);
        char string[38];
 
        sprintf( string, "Can Filter configuration Error\n\r");
        puts1( string);
    }
	
 
    //##-2- Configure the 2nd CAN Filter ###########################################
    sFilterConfig.FilterNumber 			= 1;
    sFilterConfig.FilterMode   			= CAN_FILTERMODE_IDMASK;
    sFilterConfig.FilterScale  			= CAN_FILTERSCALE_32BIT;
    sFilterConfig.FilterIdHigh 			= 0x600 << 5;  //11-bit ID in top bits		// command channel
    sFilterConfig.FilterIdLow  			= 0x0000;
    sFilterConfig.FilterMaskIdHigh		= 0x700 << 5; 	// resolves as 0x0600 - 0x06FF
    sFilterConfig.FilterMaskIdLow		= 0x0000;
    sFilterConfig.FilterFIFOAssignment	= 1;
    sFilterConfig.FilterActivation		= ENABLE;
    sFilterConfig.BankNumber 			= 1;
 
    if (HAL_CAN_ConfigFilter(&hcan, &sFilterConfig) != HAL_OK)
    {
        // Filter configuration Error 
        _Error_Handler(__FILE__, __LINE__);
        char string[38];
        sprintf( string, "Can Filter configuration Error\n\r");
       puts1( string);
    }
 
 
 
after HAL Can init code:
 
void initCairoCan(void) {
    CAN_Config();
    canRxpointerIN = 0;
    canRxMsgIN  = 0;
    canRxMsgOUT = 0;
    canRxMsgTableEMPTY = true;
    canRxMsgTableFULL = false;            
 
    for (int i = 0; i < 16; i++)
        IOCanMsgFlag[i] = false;
 
			
	
    __HAL_CAN_ENABLE_IT(&hcan, CAN_IT_FMP1);
    __HAL_CAN_ENABLE_IT(&hcan, CAN_IT_FMP0);
		
    canTxMsgIN  = 0;
    canTxMsgOUT = 0;
    canTxMsgTableEMPTY 		= true;
    canTxMsgTableFULL 		= false;
    canTxMsgTableOverflow = false;
    canTxMsgOverrun = false;
    blockCanTx = false;
 
}
 
I edited the interrupt code :
void CEC_CAN_IRQHandler(void) {
    /* USER CODE BEGIN CEC_CAN_IRQn 0 */
    CAN_IRQFlag = true;
    checkCanRxFifos();
    /* USER CODE END CEC_CAN_IRQn 0 */
    HAL_CAN_IRQHandler(&hcan);
    /* USER CODE BEGIN CEC_CAN_IRQn 1 */
 
    /* USER CODE END CEC_CAN_IRQn 1 */
}
 
 
void checkCanRxFifos(void) {
	
    int readCanBytecount;
	
    char canFifo1FullFlag = CAN->RF1R & CAN_RF1R_FMP1;
    if (canFifo1FullFlag) {
        {
            readCanBytecount = (CAN->sFIFOMailBox[1].RDTR & 0x0f);	
 
            canRxMsgBottomWord[canRxMsgIN]	= CAN->sFIFOMailBox[1].RDLR;
            canRxMsgTopWord[canRxMsgIN]	= CAN->sFIFOMailBox[1].RDHR;
 
            canRxMsgID[canRxMsgIN] = CAN->sFIFOMailBox[1].RIR >> 21;
            CAN->RF1R |= CAN_RF1R_RFOM1;   						// release FIFO 
 
            canRxMsgLength[canRxMsgIN] =	readCanBytecount;  
 
            canRxMsgIN++;
            canRxMsgIN &= 0x3F;   	// 64 entries only
            canRxMsgTableEMPTY = false;
            if (canRxMsgIN == canRxMsgOUT) canRxMsgTableFULL = true;
        }
        CAN->IER |= CAN_IER_FMPIE1;  		  				  // (11)		Set FIFO1 message pending IT enable 
    }
                     
    char canFifo0FullFlag = CAN->RF0R & CAN_RF0R_FMP0;
    if (canFifo0FullFlag) {
        {
            readCanBytecount 		= (CAN->sFIFOMailBox[0].RDTR & 0x0f);	
 
            canRxMsgBottomWord[canRxMsgIN]	= CAN->sFIFOMailBox[0].RDLR;
            canRxMsgTopWord[canRxMsgIN]	= CAN->sFIFOMailBox[0].RDHR;
 
            uint32_t canRxmsgID =  CAN->sFIFOMailBox[0].RIR >> 21;
            CAN->RF0R |= CAN_RF0R_RFOM0;  						// release FIFO 
 
            if(canRxMsgTableFULL) {	
                canRxMsgTableOverflow = true;   // now dump new frame...
            }else {
                canRxMsgID[canRxMsgIN] = canRxmsgID;
                canRxMsgLength[canRxMsgIN] = readCanBytecount;  	
                canRxMsgIN++;
                canRxMsgIN &= 0x3F;  	// 64 entries only
                canRxMsgTableEMPTY = false;
                if (canRxMsgIN == canRxMsgOUT)
                    canRxMsgTableFULL = true;
            }
			
            //length = sprintf(string + length,"%08X, %08X :W",canFifoBuf.d32[0],canFifoBuf.d32[1]);
            //			for( int i = 0; i < readCanBytecount; i++){			
            //				canRxBuffer[canRxpointerIN++] =  canFifoBuf.d8[i];			
            //				canRxpointerIN &= 0xFF;
            //				if (canRxpointerIN == canRxpointerOUT )		CanRxbufferOverrun = true;
            //				//length += sprintf(string + length,"%02X, ",canFifoBuf.d8[i]);
            //			}
            //sprintf(string + length -2,"\n\r");		// remove 2 bytes, the last comma and space
            
        }
        CAN->IER |= CAN_IER_FMPIE0;  		// (11)		Set FIFO1 message pending IT enable 
    }
    //			if (length >0)  puts(string);
 
}
inside my foreground process I call DoCan, which only checks all new RXframes before it leaves.
 
 
void DoCan(void) {
    char string[1024];
    int length = 0;
    if (canRxMsgTableOverflow) {
        canRxMsgTableOverflow = false;
        sprintf(string, "CanFlags Table has Overflowed, new frames are being dumped\n\r");   // now dump new frame...
        puts1(string);
    }
    if (canRxMsgTableFULL) {
        canRxMsgTableFULL = false;
        sprintf(string, "CanFlags Table is Full, probably missing frames now\n\r");
        puts1(string);
    }
			
    if (checkCanRxSensorFreshDataCompleted())   
        showCanSensorADCArray();
    if (haveCanString)
        showCanString();
 
           			
    while (!canRxMsgTableEMPTY) {
        int canRxMsgBytecount = canRxMsgLength[canRxMsgOUT];
					
        int16_t dataChannel 	= canRxMsgID[canRxMsgOUT];
        canRxCmdFifoBuf.d32[0] 	= canRxMsgBottomWord[canRxMsgOUT];
        canRxCmdFifoBuf.d32[1] 	= canRxMsgTopWord[canRxMsgOUT];
 
        // zero data within RxCanFrameBuffer to make sure we only use valid data, but not necessary
        canRxMsgID[canRxMsgOUT] = 0; 
        canRxMsgBottomWord[canRxMsgOUT] = 0;
        canRxMsgTopWord[canRxMsgOUT] = 0;
        canRxMsgLength[canRxMsgOUT] = 0;				
 
        /*	if (dataChannel == 0x418) 
        		while(1);
        	if (dataChannel == 0x408) 
        		while(1);
        */
        canRxMsgTableFULL = false;
        canRxMsgOUT++;
        canRxMsgOUT &= 0x3f;  		// only 16 messages
        if(canRxMsgOUT == canRxMsgIN) canRxMsgTableEMPTY = true;
			.
			.
			.
			.
		process frame here
 
 
 
 
the transmit function
if ((CAN->TSR & CAN_TSR_TME0) == CAN_TSR_TME0) // (1) 
{
 
    CAN->sTxMailBox[0].TDTR = canTxMsgLength[canTxMsgOUT];      // (2) length
    CAN->sTxMailBox[0].TDLR = canTxMsgBottomWord[canTxMsgOUT]; // (3) 4bytes
    CAN->sTxMailBox[0].TDHR = canTxMsgTopWord[canTxMsgOUT];   // (3) 4bytes
    CAN->sTxMailBox[0].TIR  = ((uint32_t)canTxMsgID[canTxMsgOUT] << 21 | CAN_TI0R_TXRQ); // (4)  // send it now if the line is idle
 
 
}
 
if ((CAN->TSR & CAN_TSR_TME1) == CAN_TSR_TME1) // (1) 
     {
if ((CAN->TSR & CAN_TSR_TME2) == CAN_TSR_TME2) // (1) 
     {
 
 
 
void CAN1_RX0_IRQHandler(void)
{
  /* USER CODE BEGIN CAN1_RX0_IRQn 0 */
    //CAN_Rx0_IRQFlag = true;
    //CAN_IRQFlag = true;
    checkCanRxFifos();
 
  /* USER CODE END CAN1_RX0_IRQn 0 */
  HAL_CAN_IRQHandler(&hcan1);
  /* USER CODE BEGIN CAN1_RX0_IRQn 1 */
 
  /* USER CODE END CAN1_RX0_IRQn 1 */
}
 
/**
* @brief This function handles CAN1 RX1 interrupt.
*/
void CAN1_RX1_IRQHandler(void)
{
  /* USER CODE BEGIN CAN1_RX1_IRQn 0 */
//    CAN_Rx1_IRQFlag = true;
//    CAN_IRQFlag = true;
    checkCanRxFifos();
 
  /* USER CODE END CAN1_RX1_IRQn 0 */
  HAL_CAN_IRQHandler(&hcan1);
  /* USER CODE BEGIN CAN1_RX1_IRQn 1 */
 
  /* USER CODE END CAN1_RX1_IRQn 1 */
}