Why is there so much latency in my interrupt?

arnold_w
Senior

I am working with the STM32 Nucleo-64 development board, which has an STM32L476R microcontroller. My SystemCoreClock is 16 MHz and TIM17 is clocked at 4 MHz. To my surprise, the code below only works well (that is, the timer doesn't miss the next compare match and have to wrap all the way around) if I increment CCR1 by at least 23:

#pragma GCC push_options
#pragma GCC optimize ("O3")
 
void TIM1_TRG_COM_TIM17_IRQHandler(void) {
    GPIOC->BSRR = 0x00000400;  // Set test pin PC10 high
    GPIOC->BSRR = 0x04000000;  // Set test pin PC10 low
    TIM17->SR = 0;             // Clear interrupt flags
    TIM17->CCR1 += 23;         // OK
//    TIM17->CCR1 += 22;         // Not ok
}
 
#pragma GCC pop_options

Now, 23 timer ticks correspond to 23 x 4 = 92 CPU clock cycles, and it seems unlikely that the 4 lines of code would take 92 cycles. When I store the TIM17->CNT value in a global variable as the first thing in the interrupt routine above, I can see that TIM17->CNT is 8 (!) more than TIM17->CCR1, meaning it took roughly 8 x 4 = 32 CPU clock cycles just to enter the interrupt routine! I tried to put the interrupt vector in RAM, but that made it worse! What am I doing wrong? Why is there so much latency in my interrupt?
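
The tick-to-cycle arithmetic in the question can be checked with a small host-side sketch. The helper name ticks_to_cpu_cycles is hypothetical and not part of any ST library; it assumes the core clock is an integer multiple of the timer clock, as with the 16 MHz and 4 MHz figures quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a number of timer ticks into CPU clock cycles.
 * Assumes core_hz is an exact integer multiple of timer_hz. */
static inline uint32_t ticks_to_cpu_cycles(uint32_t ticks,
                                           uint32_t core_hz,
                                           uint32_t timer_hz)
{
    assert(timer_hz != 0u && core_hz % timer_hz == 0u);
    return ticks * (core_hz / timer_hz);
}
```

With the numbers from the question, ticks_to_cpu_cycles(23, 16000000, 4000000) gives 92 and ticks_to_cpu_cycles(8, 16000000, 4000000) gives 32.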

22 REPLIES

Please see my correction above; I apologize for misleading you.

> if I modify my code and store TIM17->CNT into a global variable (not a stack variable)

> first thing in the interrupt routine, then it is 8 more than TIM17->CCR1,

> meaning it takes 8 x 4 = 32 CPU clock cycles until the first line of code is executed.

Always post disasm; "first line" of C is irrelevant in context of cycle counting.

Also, read out and post the contents of the relevant RCC and FLASH registers, to have some firm base for further discussion. From what I gather from

>>> SystemCoreClock is 16 MHz and TIM17 is clocked at 4 MHz.

 >>Are you sure?

> Yes

it means that APB on which the timer sits is clocked at 2MHz, i.e. 8 AHB cycles per one APB cycle, correct?

Then read the section of my post above containing "... accessing registers in timer (i.e. through the /4 APB bus) impose delays ..." and the post I've linked there.

JW
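
The clock reasoning in this reply can be sketched as follows. It relies on the STM32 clock-tree rule that a timer runs at the APB clock when the APB prescaler is 1 and at twice the APB clock otherwise. The function names are hypothetical, and the prescaler value of 8 is inferred from the 16 MHz and 4 MHz figures in the thread, not read from the actual RCC registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: APB bus clock (PCLK) for a given AHB clock (HCLK)
 * and APB prescaler. Valid prescalers are 1, 2, 4, 8 and 16. */
static inline uint32_t apb_clock_hz(uint32_t hclk_hz, uint32_t apb_div)
{
    assert(apb_div == 1u || apb_div == 2u || apb_div == 4u ||
           apb_div == 8u || apb_div == 16u);
    return hclk_hz / apb_div;
}

/* Hypothetical helper: clock fed to a timer on that APB bus.
 * Equal to PCLK when the prescaler is 1, otherwise 2 x PCLK. */
static inline uint32_t apb_timer_clock_hz(uint32_t hclk_hz, uint32_t apb_div)
{
    uint32_t pclk_hz = apb_clock_hz(hclk_hz, apb_div);
    return (apb_div == 1u) ? pclk_hz : 2u * pclk_hz;
}
```

With HCLK at 16 MHz and an APB prescaler of 8, apb_clock_hz returns 2 MHz and apb_timer_clock_hz returns 4 MHz, which is the combination assumed in the reply: 8 AHB cycles per APB cycle.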

These are the clock settings I'm using:

(Attached image: 0693W00000JQ4iQQAT.jpg)

This is my code when I sample TIM17->CNT as early as possible in the interrupt routine:

static int timerCntValue;
void TIM1_TRG_COM_TIM17_IRQHandler(void) {
    timerCntValue = TIM17->CNT;
 802eec0:	4b0d      	ldr	r3, [pc, #52]	; (802eef8 <TIM1_TRG_COM_TIM17_IRQHandler+0x38>)
 802eec2:	490e      	ldr	r1, [pc, #56]	; (802eefc <TIM1_TRG_COM_TIM17_IRQHandler+0x3c>)
 802eec4:	6a58      	ldr	r0, [r3, #36]	; 0x24
    TOGGLE_TEST_PIN_1();
 802eec6:	4a0e      	ldr	r2, [pc, #56]	; (802ef00 <TIM1_TRG_COM_TIM17_IRQHandler+0x40>)
void TIM1_TRG_COM_TIM17_IRQHandler(void) {
 802eec8:	b410      	push	{r4}
    timerCntValue = TIM17->CNT;
 802eeca:	6008      	str	r0, [r1, #0]
    TOGGLE_TEST_PIN_1();
 802eecc:	f44f 6480 	mov.w	r4, #1024	; 0x400
 802eed0:	f04f 6080 	mov.w	r0, #67108864	; 0x4000000
    TIM17->SR = 0;                       // Clear interrupt flags

If I don't make a function call inside the interrupt routine, timerCntValue will be assigned the threshold TIM17->CCR1 plus 5 timer ticks. If I call a function (e.g. send the timerCntValue value to a UART), then timerCntValue will be TIM17->CCR1 plus 8 timer ticks.

> I don't make a function call inside the interrupt routine, timerCntValue will be assigned the threshold

> TIM17->CCR1 plus 5 timer ticks.

So 20 system clocks. That's consistent with 12 system clocks of ISR latency + 2 system clocks to load r3 and r1 from FLASH + (0..7) + 8 system (AHB) clocks to get through the AHB/APB bridge with APB running at AHB/8 in order to load r0 from TIM_CNT. Give or take a couple of clocks, for I don't know exactly how things are synchronized.

JW
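
The cycle budget in the last reply can be tallied as below. This is only a sketch: the struct and function names are hypothetical; the 12-cycle ISR entry latency, the 2 cycles for the literal loads, and the 8-cycle bridge crossing are the figures quoted in the reply; and treating the (0..7) term as a wait of up to one APB period minus one AHB cycle is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the best-case and worst-case cycle counts. */
typedef struct {
    uint32_t min_cycles;
    uint32_t max_cycles;
} latency_budget_t;

/* Hypothetical helper: estimate the system clocks elapsed between the compare
 * event and the moment TIM17->CNT has been read inside the handler.
 *   isr_entry     - exception entry latency in system clocks
 *   literal_loads - clocks spent loading addresses from flash
 *   ahb_per_apb   - AHB cycles per APB cycle (bridge crossing cost)
 * Assumption: synchronization adds between 0 and ahb_per_apb - 1 clocks. */
static inline latency_budget_t cnt_read_budget(uint32_t isr_entry,
                                               uint32_t literal_loads,
                                               uint32_t ahb_per_apb)
{
    latency_budget_t budget;
    uint32_t fixed_cycles = isr_entry + literal_loads + ahb_per_apb;

    assert(ahb_per_apb >= 1u);
    budget.min_cycles = fixed_cycles;
    budget.max_cycles = fixed_cycles + (ahb_per_apb - 1u);
    return budget;
}
```

For cnt_read_budget(12, 2, 8) this gives 22 to 29 system clocks, or 5 to 7 whole timer ticks at 4 system clocks per tick. The lower end matches the 5 ticks observed without a function call, give or take the couple of clocks mentioned in the reply.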