Why is setting up SPI to re-transmit by DMA taking too long, 1 µs?

victagayun
Senior III

First, I would like to know whether this code is correct for setting up SPI to transmit by DMA again.

GPIOA->BSRR = (1<<5); // start

DMA1->IFCR = 0x0FFFFFFF; // reset status

DMA1_Channel3->CCR &= ~( DMA_CCR_EN );

LL_DMA_SetDataLength(DMA1, LL_DMA_CHANNEL_3, ubNbDataToTransmit);

DMA1_Channel3->CCR |= ( DMA_CCR_EN );

GPIOA->BSRR = (1<<21); // reset/end +16

When I measure the duration of the whole sequence, it takes around 1 µs, as shown below.

CH1 (yellow) = GPIOA5

CH2 (blue) = SPI clk

[Oscilloscope screenshot: 0693W000001t8DHQAY.jpg]

Why does it take around 1 µs to execute this code?

13 REPLIES
KnarfB
Principal III

What's your MCU clock? What are your compiler settings?

You could use your GPIOA5 instrumentation for a code bisection.
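
A bisection with the probe pin could be sketched roughly like this. It is a host-side mock, not the real peripheral: `probe_bsrr` stands in for `GPIOA->BSRR`, and the helper names are hypothetical. The idea is to move the high/low pair around ever smaller parts of the sequence and see which part the pulse width follows.

```c
#include <stdint.h>

/* Hypothetical stand-in for GPIOA->BSRR so the sketch builds off-target. */
static volatile uint32_t probe_bsrr;

/* In BSRR, bits 0..15 set a pin and bits 16..31 reset the same pin. */
static inline uint32_t bsrr_set_mask(unsigned pin)   { return 1u << pin; }
static inline uint32_t bsrr_reset_mask(unsigned pin) { return 1u << (pin + 16u); }

#define PROBE_PIN 5u /* PA5, as in the original post */

/* Place these two calls around the code section under test. */
static inline void probe_high(void) { probe_bsrr = bsrr_set_mask(PROBE_PIN); }
static inline void probe_low(void)  { probe_bsrr = bsrr_reset_mask(PROBE_PIN); }
```

For pin 5 the reset mask is bit 21, which is the `(1<<21)` used in the original code.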

TDK
Guru

Always include your chip part number in the post.

Also what are your optimization settings?

Any interrupts?

JW

KnarfB
Principal III

Okay, I dug out my Nucleo STM32G431RB and ran some tests with gcc at a 170 MHz core clock. In Debug mode the sequence takes 194 clock cycles; in Release with -Ofast and minimal debug info (-g1) it takes 47 clock cycles. The first figure roughly matches your observed 1 µs.
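
To compare cycle counts with scope readings, the conversion can be sketched as below. `cycles_to_ns` is a hypothetical helper name, not part of any ST library.

```c
#include <stdint.h>

/* Convert a cycle count to nanoseconds for a given core clock in Hz. */
static inline uint32_t cycles_to_ns(uint32_t cycles, uint32_t core_hz)
{
    /* The 64-bit intermediate avoids overflow; adding core_hz / 2 rounds
     * the result to the nearest nanosecond. */
    return (uint32_t)(((uint64_t)cycles * 1000000000ull + core_hz / 2u) / core_hz);
}
```

At 170 MHz, 194 cycles comes out at about 1141 ns and 47 cycles at about 276 ns.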

I used the cycle counter for the measurement:

ITM->LAR = 0xC5ACCE55; // unlock write access to the ITM registers
CoreDebug->DEMCR |= CoreDebug_DEMCR_TRCENA_Msk; // enable the trace blocks (DWT/ITM)
DWT->CTRL |= DWT_CTRL_CYCCNTENA_Msk; // start the cycle counter
 
LD2_GPIO_Port->BSRR = LD2_Pin; // probe pin high
uint32_t tick = DWT->CYCCNT;
DMA1->IFCR = 0x0FFFFFFF; // reset status
DMA1_Channel3->CCR &= ~( DMA_CCR_EN );
LL_DMA_SetDataLength(DMA1, LL_DMA_CHANNEL_3, 64 );
DMA1_Channel3->CCR |= ( DMA_CCR_EN );
uint32_t tock = DWT->CYCCNT; // elapsed cycles = tock - tick
LD2_GPIO_Port->BSRR = LD2_Pin<<16; // probe pin low

In Debug mode, LL_DMA_SetDataLength is compiled as a real function call; in Release mode it is inlined.
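
GCC generally does not inline at -O0, even for `static inline` functions, unless they carry `__attribute__((always_inline))`. Here is a minimal sketch of a helper that should stay inlined in a Debug build, assuming a GCC-compatible compiler. `mock_dma_channel_t`, `dma_set_length` and `dma_restart` are hypothetical names standing in for the channel registers and the LL call, so that the sketch builds off-target.

```c
#include <stdint.h>

/* Hypothetical mock of the two DMA channel registers used here. */
typedef struct {
    volatile uint32_t CCR;   /* configuration register, bit 0 = EN */
    volatile uint32_t CNDTR; /* number of data items to transfer */
} mock_dma_channel_t;

#define MOCK_DMA_CCR_EN (1u << 0)

/* always_inline asks GCC to inline even when optimization is off. */
static inline __attribute__((always_inline))
void dma_set_length(mock_dma_channel_t *ch, uint32_t count)
{
    ch->CNDTR = count & 0xFFFFu; /* count field treated as 16 bits in this mock */
}

/* Same sequence as in the measurement: disable, reload, enable. */
static inline __attribute__((always_inline))
void dma_restart(mock_dma_channel_t *ch, uint32_t count)
{
    ch->CCR &= ~MOCK_DMA_CCR_EN;
    dma_set_length(ch, count);
    ch->CCR |= MOCK_DMA_CCR_EN;
}
```

Whether this actually closes the gap between Debug and Release would need to be checked with the cycle counter.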

hth

KnarfB

Sorry, I did not mention the board: it is a NUCLEO-G474RE, running at 170 MHz.

It is compiled as "Debug", which is the option checked when I press the hammer button.

Oh OK, I made a mistake and have reviewed the code.

Yes, this code was in a GPIO interrupt.

When I checked the code again, I found that I had forgotten to remove the "first" GPIO set-high, so the measurement also included the handling of the GPIO interrupt flags.

void EXTI15_10_IRQHandler(void)
{
  /* USER CODE BEGIN EXTI15_10_IRQn 0 */
 
   GPIOA->BSRR = (1<<5); // start (the leftover "first" set-high that should have been removed)
 
  /* USER CODE END EXTI15_10_IRQn 0 */
  if (LL_EXTI_IsActiveFlag_0_31(LL_EXTI_LINE_13) != RESET)
  {
    LL_EXTI_ClearFlag_0_31(LL_EXTI_LINE_13);
    /* USER CODE BEGIN LL_EXTI_LINE_13 */
    
    /* USER CODE END LL_EXTI_LINE_13 */
  }
  /* USER CODE BEGIN EXTI15_10_IRQn 1 */
 
  GPIOA->BSRR = (1<<5); // start
 
  DMA1->IFCR = 0x0FFFFFFF; // reset status
 
  DMA1_Channel3->CCR &= ~( DMA_CCR_EN );
 
  LL_DMA_SetDataLength(DMA1, LL_DMA_CHANNEL_3, ubNbDataToTransmit);
 
  DMA1_Channel3->CCR |= ( DMA_CCR_EN );
 
  GPIOA->BSRR = (1<<21); // reset/end +16
 
  /* USER CODE END EXTI15_10_IRQn 1 */
}

After removing it, the time went down to 600 ns, which I think is still quite long.

[Oscilloscope screenshot: 0693W000001t8xUQAQ.jpg]

Hi, I am using a NUCLEO-G474RE at 170 MHz and found an error in the code that made the measurement include the clearing of flags in the GPIO interrupt (shown above).

After fixing it, the time went down from 1 µs to 600 ns, which I think is still quite long.

I also did a Release build (selecting Release under the hammer button), but I still get 600 ns. There is not much difference between the Release and Debug builds.

I am quite new to these optimization settings; can you show me where I can modify them?

S.Ma
Principal

If you want to get nearly maximum performance, beyond what is good enough for a typical application, try the following:

- use the IAR compiler

- compile with optimization for speed

- put your ISR code and interrupt vectors in RAM

- merge the LL C code with your code so the compiler will inline all LL function calls

- put all const values in RAM with default initializers.
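
As a rough illustration of the RAM placement points with a GCC toolchain (not IAR), here is a sketch. It assumes the linker script provides a `.RamFunc` section that the startup code copies to RAM; `ISR_IN_RAM`, `transfer_length` and `hot_path_length` are hypothetical names. On a non-ARM host the macro expands to nothing so the sketch still compiles.

```c
#include <stdint.h>

/* On an ARM GCC target, place the function in a RAM-resident section.
 * ".RamFunc" is an assumed section name; it has to exist in the linker
 * script and be copied to RAM at startup. */
#if defined(__GNUC__) && defined(__arm__)
#define ISR_IN_RAM __attribute__((section(".RamFunc"), noinline))
#else
#define ISR_IN_RAM
#endif

/* Hypothetical value kept in a plain initialized (non-const) variable,
 * so it is placed in .data (RAM) rather than in flash. */
static uint32_t transfer_length = 64u;

/* Hypothetical hot-path routine; a real ISR would touch the DMA registers. */
ISR_IN_RAM uint32_t hot_path_length(void)
{
    return transfer_length;
}
```

Moving the vector table to RAM is a separate step and is not shown here.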

S.Ma
Principal

And of course, prepare your next DMA transfer at init and again at the end of each interrupt; this simulates half-buffer transfer double buffering as well.
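
The split between "prepare ahead" and "fire in the interrupt" could look roughly like this. It is a host-side mock with hypothetical names (`sim_dma_t`, `dma_prepare_next`, `dma_fire`). Where exactly the preparation step runs is an assumption of this sketch: the previous transfer has to be finished before the channel is disabled and reloaded.

```c
#include <stdint.h>

/* Hypothetical mock of the registers involved, so the sketch builds off-target. */
typedef struct {
    volatile uint32_t IFCR;  /* flag-clear register */
    volatile uint32_t CCR;   /* channel configuration, bit 0 = EN */
    volatile uint32_t CNDTR; /* remaining transfer count */
} sim_dma_t;

#define SIM_DMA_CCR_EN (1u << 0)

/* Slow part, done ahead of time: disable, clear flags, reload the count. */
static inline void dma_prepare_next(sim_dma_t *dma, uint32_t count)
{
    dma->CCR &= ~SIM_DMA_CCR_EN;
    dma->IFCR = 0x0FFFFFFFu;
    dma->CNDTR = count;
}

/* Fast part, done in the time-critical interrupt: a single enable. */
static inline void dma_fire(sim_dma_t *dma)
{
    dma->CCR |= SIM_DMA_CCR_EN;
}
```

With this split, the time-critical path is reduced to one read-modify-write of CCR.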