
Question regarding execution speed and interrupts optimization

megahercas6
Senior
Posted on August 12, 2015 at 09:46

Hello

At the moment I am facing the problem that my interrupt routine takes too long for the camera, and capture fails. The idea is that the HSYNC interrupt should change the DMA address, but even with a high degree of optimization and overclocking it takes too long, and HSYNC rises before the interrupt is complete. Does anyone know how to speed up my interrupt code? For example, I would like to keep a global variable in a CPU register (Rx) to save a few cycles, but I can't declare a global as register. Is there any workaround? I am also modifying the library functions (DMA init, DMA deinit, and so on) so that they no longer check whether the passed parameters are valid and only handle this one case. Below is the HSYNC interrupt. I made it a simple GPIO (EXTI) interrupt instead of the DCMI line interrupt, since it is 50 ns faster to execute.

void DCMI_IRQHandler(void)
{
    if (DCMI_GetFlagStatus(DCMI_FLAG_LINERI) == SET)
    {
        GPIOB->BSRRL = 2;
        //register uint32_t address ;
        //DMA_DeInit_DCMI_SHORT(DMA2_Stream1);
        //address = LINE*binning_addr;
        //DMA_InitStructure.DMA_Memory0BaseAddr =(uint32_t)((SRAM_BANK_ADDR+address));
        //DMA_Init_DCMI_SHORT(DMA2_Stream1, &DMA_InitStructure);
        //DMA2_Stream1->CR |= (uint32_t)DMA_SxCR_EN;
        LINE++;
        DCMI->ICR = DCMI_FLAG_LINERI;
        GPIOB->BSRRH = 2;
    }
}

void EXTI4_IRQHandler(void)
{
    if ((EXTI->PR & EXTI_Line4) != (uint32_t)RESET)
    {
        //if(EXTI_GetITStatus(EXTI_Line4) != RESET)
        //{
        GPIOB->BSRRL = 2;
        LINE++;
        DCMI->ICR = DCMI_FLAG_LINERI;
        GPIOB->BSRRH = 2;
        //EXTI_ClearITPendingBit(EXTI_Line4);
        EXTI->PR = EXTI_Line4;
    }
}

This should work much faster on the STM32F7 that I soldered onto the same hardware, but it is such a shame that I can't get even simple programs working on it. Does anyone want to help me out with SPI running in circular mode and DCMI transferring data to SRAM? It should be a 10-minute job for any good programmer, but not for me: after more than a month, only basic functionality has been ported from the STM32F429 to the STM32F746.
6 REPLIES 6
RomainR.
ST Employee
Posted on August 12, 2015 at 10:02

Do you mean that PB2 is rising before the end of the IRQHandler?


megahercas6
Senior
Posted on August 18, 2015 at 10:46

No, I mean HSYNC is falling to indicate blanking, which is currently 300 ns long. In that window I must disable the DMA, reconfigure it with the new address, and re-enable it, so that it writes the next line to SRAM as it should.

Right now my interrupt takes 200 ns, but it starts 200 ns late, because it takes time for the CPU to enter the interrupt. Going by the code, I am only doing a few register writes, so it should be shorter. Can anyone help me write this in ASM so that it is as fast as possible? So far, it looks like this:

void EXTI4_IRQHandler(void)
{
    if ((EXTI->PR & EXTI_Line4) != (uint32_t)RESET)
    {
        DMA2_Stream1->CR &= ~DMA_SxCR_EN;
        DMA2_Stream1->CR = 0;
        DMA2->LIFCR = 0x00000F40;
        DMA2_Stream1->CR = 0x02031400;
        DMA2_Stream1->FCR = 0x21;
        DMA2_Stream1->M0AR = ((SRAM_BANK_ADDR+(LINE*binning_addr)));
        DMA2_Stream1->CR |= DMA_SxCR_EN;
        LINE++;
        EXTI->PR = EXTI_Line4;
    }
}

The original library code had a lot of overhead from parameter checking, stream selection, and so on. This is the minimum needed to make the DMA reconfiguration work. It only writes a few constants directly to registers, so it should be faster than 200 ns. Can anyone show how to write directly to registers in inline asm?
stm322399
Senior
Posted on August 18, 2015 at 11:35

If I were you, I would try another way. Those modern full-featured MCUs are not designed for that kind of fast interrupt response.

Writing in assembly language will not always bring you better performance. The C compiler generates good enough code, especially when you only write constant values to constant locations. Have you looked at the generated code?

What are you trying to do? Apparently you want to program a new DMA destination for every DCMI line, right?

What is the goal? Does your camera image not fit into the SRAM? What kind of processing do you want to do with the incoming data?

I see two solutions for you, depending on your needs:

* set up the DMA to fill a continuous circular buffer, and process the data on the fly

* set up the DMA to use a flip-flop (double) buffer, so you will have approximately one line duration to reprogram the DMA for the next line. (I don't know whether your MCU supports double buffering.)
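
For the second option, here is a minimal sketch of the idea, under stated assumptions. The STM32F4 DMA streams do have a double-buffer mode (the DBM bit in DMA_SxCR, with DMA_SxM0AR and DMA_SxM1AR as the two targets), and the address register that is not currently in use may be rewritten while the stream is running. The struct below is only a stand-in for the stream registers so the logic can be tried on a PC. The names mock_dma_stream_t, dbm_start and dbm_on_transfer_complete are hypothetical, not library functions, and nothing here has been tried against the camera in this thread.

```c
#include <stdint.h>

/* Stand-in for one DMA stream's registers (assumption: on the target you
   would use DMA2_Stream1 from the device header instead of this struct). */
typedef struct {
    volatile uint32_t CR;
    volatile uint32_t M0AR;
    volatile uint32_t M1AR;
} mock_dma_stream_t;

#define CR_EN   (1u << 0)   /* stream enable */
#define CR_DBM  (1u << 18)  /* double-buffer mode */
#define CR_CT   (1u << 19)  /* current target: 0 = memory 0, 1 = memory 1 */

/* Point the two targets at line 0 and line 1, then start the stream.
   base and stride (bytes between lines) are illustrative parameters. */
void dbm_start(mock_dma_stream_t *s, uint32_t base, uint32_t stride)
{
    s->M0AR = base;
    s->M1AR = base + stride;
    s->CR |= CR_DBM;
    s->CR |= CR_EN;
}

/* Call after the hardware has switched targets at the end of a line.
   next_line is the index of the line that follows the one now being
   written; its address goes into whichever register is currently idle. */
void dbm_on_transfer_complete(mock_dma_stream_t *s, uint32_t base,
                              uint32_t stride, uint32_t next_line)
{
    uint32_t addr = base + next_line * stride;

    if (s->CR & CR_CT) {
        s->M0AR = addr;     /* memory 1 is active, memory 0 is idle */
    } else {
        s->M1AR = addr;     /* memory 0 is active, memory 1 is idle */
    }
}
```

On the target, dbm_on_transfer_complete would run in the stream's transfer-complete interrupt, which leaves roughly one line period to write the new address instead of the HSYNC blanking window.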

knielsen
Associate II
Posted on August 18, 2015 at 12:47

> Right now my interrupt takes 200ns, but it is shifted by 200ns,
> because it takes time for CPU to go into interrupt

It should not require 200 ns on a fast STM32F4 to enter the interrupt. What is your system clock speed? I will assume 168 / 180 MHz from the part number you mentioned. Cortex-M4 interrupt latency is 12 cycles, so that would be about 72 ns at 168 MHz. Be sure to configure your interrupt at the highest priority, so it does not get delayed by another interrupt.

I think several parts of your interrupt routine are unnecessary:
  • The if() statement
  • Clearing and setting DMA2_Stream1->CR
  • Setting DMA2_Stream1->FCR

In fact, if you are careful to ensure that your DMA completes before this interrupt triggers (eg. only enable it after DMA completion), you could disable the DMA when it completes and prepare it in advance, so you only need to activate it in your interrupt:

void EXTI4_IRQHandler(void)
{
    DMA2_Stream1->CR |= DMA_SxCR_EN;
    EXTI->PR = EXTI_Line4;
}

I think that should be able to complete in something like 100 ns. With GCC, you can declare global register variables and do inline assembler with __asm__, but I doubt it will be necessary or useful here. Completing the interrupt in 300 ns seems quite possible, I think, with careful C coding. Using double-buffer mode for DMA is generally the best way to ensure continuous transfer, but it sounds like in your case the timing of when the DMA channel is started is critical?
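
A minimal sketch of that split, assuming the preparation step can run when the previous line's DMA transfer completes. The structs are stand-ins for the DMA registers so the logic can be exercised on a PC; mock_stream_t, mock_dma_t, line_ctx_t and the function names are hypothetical. The flag mask 0x00000F40 is the value the handler earlier in this thread writes to DMA2->LIFCR, and a running address replaces the LINE*binning_addr multiply. This is not verified on hardware; in particular, whether the stream needs anything else rewritten between lines is not checked here.

```c
#include <stdint.h>

/* Stand-ins for the stream and controller registers (assumption: on the
   target these would be DMA2_Stream1 and DMA2 from the device header). */
typedef struct {
    volatile uint32_t CR;
    volatile uint32_t M0AR;
} mock_stream_t;

typedef struct {
    volatile uint32_t LIFCR;
} mock_dma_t;

#define CR_EN          (1u << 0)
#define STREAM1_FLAGS  0x00000F40u  /* value the thread's handler writes to LIFCR */

/* Running destination address, advanced by one line stride per line. */
typedef struct {
    uint32_t next_addr;
    uint32_t stride;
} line_ctx_t;

/* Call once per frame (for example on VSYNC) to rewind to the first line. */
void line_ctx_reset(line_ctx_t *c, uint32_t base, uint32_t stride)
{
    c->next_addr = base;
    c->stride = stride;
}

/* Slow part, done ahead of time: clear stale flags and load the address. */
void prepare_next_line(mock_stream_t *s, mock_dma_t *d, line_ctx_t *c)
{
    d->LIFCR = STREAM1_FLAGS;
    s->M0AR = c->next_addr;
    c->next_addr += c->stride;
}

/* Fast part, all that is left for the HSYNC handler: start the stream. */
void hsync_handler_body(mock_stream_t *s)
{
    s->CR |= CR_EN;
}
```

The point of the design is that the HSYNC handler shrinks to a single read-modify-write, while the address arithmetic and flag clearing happen outside the blanking window.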
megahercas6
Senior
Posted on August 18, 2015 at 13:09

This is the smallest amount of code that works; if I change anything, it stops getting images from the camera. It would be nice to use double-buffer DMA mode, but I don't know how.

I did RTFM, and it shows how to do it at the register level, but it's still hard for me to reproduce in C. I am not a very good programmer...

void DCMI_DMA_Int()
{
    DCMI_InitTypeDef DCMI_InitStructure;
    RCC_AHB2PeriphClockCmd(RCC_AHB2Periph_DCMI, ENABLE);
    DCMI_InitStructure.DCMI_CaptureMode = DCMI_CaptureMode_Continuous;
    DCMI_InitStructure.DCMI_SynchroMode = DCMI_SynchroMode_Hardware;
    DCMI_InitStructure.DCMI_PCKPolarity = DCMI_PCKPolarity_Falling;
    DCMI_InitStructure.DCMI_VSPolarity = DCMI_VSPolarity_Low;
    DCMI_InitStructure.DCMI_HSPolarity = DCMI_HSPolarity_Low;
    DCMI_InitStructure.DCMI_CaptureRate = DCMI_CaptureRate_All_Frame;
    DCMI_InitStructure.DCMI_ExtendedDataMode = DCMI_ExtendedDataMode_8b;
    RCC_AHB1PeriphClockCmd(RCC_AHB1Periph_DMA2, ENABLE);
    DMA_DeInit(DMA2_Stream7);
    DMA_InitStructure.DMA_Channel = DMA_Channel_1;
    DMA_InitStructure.DMA_PeripheralBaseAddr = DCMI_DR_ADDRESS;
    DMA_InitStructure.DMA_Memory0BaseAddr = (uint32_t)SRAM_BANK_ADDR;
    DMA_InitStructure.DMA_DIR = DMA_DIR_PeripheralToMemory;
    DMA_InitStructure.DMA_BufferSize = 1280;
    DMA_InitStructure.DMA_PeripheralInc = DMA_PeripheralInc_Disable;
    DMA_InitStructure.DMA_MemoryInc = DMA_MemoryInc_Enable;
    DMA_InitStructure.DMA_PeripheralDataSize = DMA_PeripheralDataSize_Word;
    DMA_InitStructure.DMA_MemoryDataSize = DMA_MemoryDataSize_Byte;
    DMA_InitStructure.DMA_Mode = DMA_Mode_Circular;
    DMA_InitStructure.DMA_Priority = DMA_Priority_VeryHigh;
    DMA_InitStructure.DMA_FIFOMode = DMA_FIFOMode_Enable;
    DMA_InitStructure.DMA_FIFOThreshold = DMA_FIFOThreshold_Full;
    DMA_InitStructure.DMA_MemoryBurst = DMA_MemoryBurst_Single;
    DMA_InitStructure.DMA_PeripheralBurst = DMA_PeripheralBurst_Single;
    DCMI_DeInit();
    DCMI_Init(&DCMI_InitStructure);
    DMA_Init(DMA2_Stream7, &DMA_InitStructure);
    DMA_Cmd(DMA2_Stream7, ENABLE);
    DCMI_Cmd(ENABLE);
    DMA_InitStructure.DMA_Memory0BaseAddr = (uint32_t)(0x6C000000);
    DMA_Init(DMA2_Stream7, &DMA_InitStructure);
    DMA_Cmd(DMA2_Stream7, ENABLE);
}

void VSYNC_Reset(void)
{
    VSYNC++;
    LINE = 0;
    DMA_DeInit(DMA2_Stream1);
    DMA_InitStructure.DMA_Memory0BaseAddr = (uint32_t)(SRAM_BANK_ADDR);
    DMA_Init(DMA2_Stream1, &DMA_InitStructure);
    DMA_Cmd(DMA2_Stream1, ENABLE);
}

uint32_t tmpreg = 0;

void DCMI_IRQHandler(void)
{
    if (DCMI_GetFlagStatus(DCMI_FLAG_LINERI) == SET)
    {
        register uint32_t address;
        DMA_DeInit_DCMI_SHORT(DMA2_Stream1);
        address = LINE*binning_addr;
        DMA_InitStructure.DMA_Memory0BaseAddr = (uint32_t)((SRAM_BANK_ADDR+address));
        DMA_Init_DCMI_SHORT(DMA2_Stream1, &DMA_InitStructure);
        DMA2_Stream1->CR |= (uint32_t)DMA_SxCR_EN;
        LINE++;
        DCMI->ICR = DCMI_FLAG_LINERI;
    }
}

megahercas6
Senior
Posted on August 19, 2015 at 08:15

With only the interrupt code running, I have around 100 ns of free time before HSYNC goes high. Only a few numbers need to be written to the DMA registers; surely that could be done in less than 100 ns?

And I am running at a 218 MHz clock speed, because any lower and I get massive errors in the DMA buffer. PCLK is 53.3333 MHz.

I wish I could use the STM32F7, but no one can help me with the code, and I have spent so much time on it that the thought of going back to the STM32F7 makes me sick (only the DCMI and the SPI DMA in circular mode are not working; so close, and yet so far).
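
To put the numbers in this thread on one scale, the small helper below converts between CPU cycles and nanoseconds. It is only arithmetic; cycles_to_ns and ns_to_cycles are hypothetical names, and the 12-cycle entry latency is the figure quoted earlier in the thread, which real flash wait states or bus contention may increase. At 218 MHz, a 100 ns window is about 21 CPU cycles, the 300 ns blanking interval is about 65 cycles, and 12 cycles of interrupt entry cost about 55 ns.

```c
#include <stdint.h>

/* Duration of a number of CPU clock cycles, in nanoseconds. */
double cycles_to_ns(uint32_t cycles, double clock_mhz)
{
    return (double)cycles * 1000.0 / clock_mhz;
}

/* Whole CPU cycles that fit into a time window given in nanoseconds. */
uint32_t ns_to_cycles(double window_ns, double clock_mhz)
{
    return (uint32_t)(window_ns * clock_mhz / 1000.0);
}
```

Counting the instructions in the compiler's generated listing against such a cycle budget is a quick way to see whether a handler can fit before reaching for assembly.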