Pipeline delay in IRQ flag clearing before returning from ISR?

anya1 · ‎2024-11-01

Hello,

I have an ISR for TIM4 running on an STM32F2 microcontroller. The ISR triggers on a rising edge of the timer input channel, and its purpose is to record the microsecond timestamp of the last two rising edges in global variables for other parts of the firmware to use.

void TIM4_IRQHandler(void)
{
  second_last_re_us = last_re_us;
  last_re_us = time_us();
  TIM4->SR = 0;
}

I notice that the ISR as written triggers twice on each rising edge. This is evident by last_re_us and second_last_re_us reading as equal, even when the PWM signal being input is only in the hundreds of hertz.

If I move the IRQ flag clearing to the beginning of the ISR, or if I add barriers before returning from the ISR, the issue goes away, and the ISR gets called only once. This is evident by last_re_us minus second_last_re_us giving the expected period of the PWM signal I am inputting.

void TIM4_IRQHandler(void)
{
  TIM4->SR = 0;
  second_last_re_us = last_re_us;
  last_re_us = time_us();
}

void TIM4_IRQHandler(void)
{
  second_last_re_us = last_re_us;
  last_re_us = time_us();
  TIM4->SR = 0;
  __DSB();
  __ISB();
}

Am I correct in understanding that this occurs because of the processor's pipeline delay causing the actual hardware writing of TIM4->SR = 0; to happen only after the ISR actually returns? What is the best practice for making sure that IRQ flag clearing actually completes before an ISR returns? I don't see use of barriers anywhere in example code for ISRs in general.

STOne-32 · ‎2024-11-01

Dear all,

You are completely right. This is not new: it is a known effect of the propagation delay between the write done by the CPU inside the ISR and the moment it becomes effective at peripheral level. The higher the AHB/APB prescaler, the more cycles are required, so the clear should not be done at the end of the ISR. One alternative is to read the register back just after the write (if it is not Read/Clear); the other, as described above, is to clear the flag as soon as we enter the ISR. We had an FAQ/Knowledge Article on this in the past, but it seems it was not ported during the platform migration.
I found this very similar thread from 10 years ago:

https://community.st.com/t5/stm32-mcus-products/clear-of-exti-pr-flag-at-the-the-end-of-isr-doesnâ-t-work/td-p/456382
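
To make the read-back idea concrete, here is a minimal host-side sketch. It is a toy model, not real hardware timing: model_tim_t, the one-entry write buffer and all function names are assumptions made up for illustration. A write to SR is "posted" and only takes effect once something drains it, and a read of the same register cannot complete before the earlier write, so it acts as the drain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FLAG 0x1u /* assumed: the one status bit this handler services */

/* Toy model (hypothetical): SR as seen by the NVIC, plus one posted write. */
typedef struct {
    uint32_t sr;            /* flag bits currently driving the IRQ line */
    uint32_t pending_val;   /* value of the write still in flight */
    bool     write_pending; /* true while the write has not reached SR */
} model_tim_t;

/* The CPU issues the write, but it does not reach the peripheral yet. */
static void sr_write_posted(model_tim_t *t, uint32_t v)
{
    assert(!t->write_pending); /* the model only has a one-entry buffer */
    t->pending_val = v;
    t->write_pending = true;
}

/* The posted write lands; rc_w0 semantics: writing 0 clears, 1 is ignored. */
static void drain(model_tim_t *t)
{
    if (t->write_pending) {
        t->sr &= t->pending_val;
        t->write_pending = false;
    }
}

/* A read of the register completes only after the earlier write has. */
static uint32_t sr_read(model_tim_t *t)
{
    drain(t);
    return t->sr;
}

/* Clear as the very last action. Returns true if the flag is still
 * asserted at exception return, i.e. the handler would be re-entered. */
static bool isr_clear_last(model_tim_t *t)
{
    sr_write_posted(t, ~FLAG);
    return (t->sr & FLAG) != 0;
}

/* Clear, then read back before returning. */
static bool isr_clear_then_readback(model_tim_t *t)
{
    sr_write_posted(t, ~FLAG);
    (void)sr_read(t);
    return (t->sr & FLAG) != 0;
}
```

In this model the first variant reports a re-entry and the second does not. On the target, the second variant would correspond to following the clear with something like (void)TIM4->SR; before the handler returns.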

We will propose a New Article on the topic and recommendations. Thank you a lot for this contribution.

Ciao

STOne-32

Pavel A. · ‎2024-11-01

http://efton.sk/STM32/gotcha/g7.html

Tesla DeLorean · ‎2024-11-01

This is a well-known hazard in the Cortex-M (CM) architecture.

It relates to pipelining, write buffers, the NVIC and tail-chaining.

Blindly writing zero is also going to clear things you don't want cleared. You should write the inverse mask of the one thing you want to clear, as the TIM operates autonomously of the MCU, and often at a different clock rate. So don't do something that clears everything when you expect UP, CC1, CC2, etc. to be triggering.

It's also why you don't use RMW with an AND: between the read and the write another bit may go high, and that bit will then be written back as a zero, which clears it.
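
The lost-flag hazard can be shown with a small host-side model. This is only a sketch: fake_tim_t and the helper names are invented for illustration, and the hw_sets_between argument stands in for a flag the timer raises on its own between the CPU's read and its write.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an rc_w0 status register:
 * writing 0 to a bit clears it, writing 1 leaves it unchanged. */
typedef struct { uint32_t sr; } fake_tim_t;

static void sr_write(fake_tim_t *t, uint32_t value)
{
    t->sr &= value;
}

/* Hazardous: read-modify-write with an AND. */
static uint32_t clear_rmw(fake_tim_t *t, uint32_t flag, uint32_t hw_sets_between)
{
    assert(flag != 0u);
    uint32_t tmp = t->sr;     /* read */
    t->sr |= hw_sets_between; /* timer raises another flag meanwhile */
    tmp &= ~flag;             /* modify: tmp still has the new flag as 0 */
    sr_write(t, tmp);         /* write: the new flag is cleared and lost */
    return t->sr;
}

/* Safe: write the inverse mask, 0 only in the bit to be cleared. */
static uint32_t clear_mask(fake_tim_t *t, uint32_t flag, uint32_t hw_sets_between)
{
    assert(flag != 0u);
    t->sr |= hw_sets_between; /* timer raises another flag meanwhile */
    sr_write(t, ~flag);       /* all other bits are written as 1: untouched */
    return t->sr;
}
```

Starting from SR = 0x1 with bit 1 raised in between, the RMW version ends with SR = 0 (the second event is gone), while the inverse-mask version ends with SR = 0x2 (the second event is still pending).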

If you clear it early, it's not a problem, as other reads will force in-order execution.

It's also why you should QUALIFY the source, so even if the handler double-enters via tail-chaining, you don't break stuff.

void TIM4_IRQHandler(void) // This is immune to the issue at TWO levels
{
  if (TIM4->SR & 1) // Qualify the source of interrupt (bit 0 is UIF; use the CCxIF bit for an input-capture channel)
  {
    TIM4->SR = ~1; // Clear the specific source
    ...
    second_last_re_us = last_re_us;
    last_re_us = time_us();
  }
}

Tips, Buy me a coffee, or three.. PayPal Venmo
Up vote any posts that you find helpful, it shows what's working..

Tesla DeLorean · ‎2024-11-01

Fencing alone might not help, because the buses are relatively slow; figure on at least 4 cycles for a load or store.

The change of the register also needs to clock through to the output side of the interrupt signal to the NVIC, which is the thing making the decision on what tail-chains next, and that can be the SAME handler if the signal is still lingering.

Reading back the peripheral register might be more robust, so a WHILE(x) rather than an IF(x) would anticipate whatever the NVIC might reasonably do.
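
A rough sketch of the WHILE(x) form, written against a plain pointer so it also compiles on a host. FLAG_MASK, handle_event() and irq_body() are hypothetical names, and on a host the register is just a variable, so the write stores the value instead of having rc_w0 behaviour.

```c
#include <assert.h>
#include <stdint.h>

#define FLAG_MASK 0x1u /* assumed: the one status bit this handler services */

static uint32_t events_handled; /* only here so the effect can be observed */

/* Hypothetical hook for the real work (timestamps etc.). */
static void handle_event(void)
{
    events_handled++;
}

/* Body of the handler; on the target, sr would point at the timer's SR. */
void irq_body(volatile uint32_t *sr)
{
    assert(sr != 0);
    /* Each pass re-reads SR. That read acts as the read-back of the
     * previous clear, and it also catches an event that arrived while
     * the previous one was being handled. */
    while (*sr & FLAG_MASK) {
        *sr = ~FLAG_MASK; /* clear only this flag, and do it first */
        handle_event();
    }
}
```

If the handler is entered a second time with the flag already clear, the loop body never runs, so a spurious tail-chained entry does nothing.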

Tesla DeLorean · ‎2024-11-01

Yes, definitely a topic we've covered here multiple times, probably back into 2007/2008 time frame, but probably a couple of forum iterations beyond where we can reach back today.

And the design team was certainly cognisant of the potential race condition with TIM->SR based on how it was designed to obviate the use of the RMW at the MCU level.

And definitely an issue with bit-banding too, as discussed on Jan's pages.
