2025-10-14 6:39 AM
Hello,
I am using an STM32H757 to convert 12 analog channels using ADC2 and ADC3 (code running on the Cortex-M7).
- Both ADCs are configured with a 42 MHz synchronous clock and are triggered by a timer.
- ADC2 uses DMA1 Stream 0 to transfer data to AXI SRAM.
- ADC3 uses DMA1 Stream 1 to transfer data to AXI SRAM.
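For context, the per-stream setup described in the list above boils down to a handful of register writes. Here is a minimal, host-compilable sketch using a mocked register layout modeled on the H7's DMA_SxCR/NDTR/PAR/M0AR registers; the names `adc2_dma`, `axi_buf`, and `dma_setup` are illustrative, not from the original post:

```c
#include <stdint.h>

/* Mocked DMA stream registers so the sketch compiles on a host.
   On target these would be the real DMA1_Stream0 registers;
   addresses are kept as uintptr_t here to stay portable. */
typedef struct {
    volatile uintptr_t PAR;  /* peripheral address (e.g. ADC2->DR) */
    volatile uintptr_t M0AR; /* memory 0 address (AXI SRAM buffer) */
    volatile uint32_t  NDTR; /* number of data items per transfer  */
    volatile uint32_t  CR;   /* stream control register            */
} DmaStreamMock;

/* Relevant DMA_SxCR bit positions (per the H7 reference manual) */
#define DMA_CR_EN       (1u << 0)   /* stream enable                 */
#define DMA_CR_TCIE     (1u << 4)   /* transfer-complete interrupt   */
#define DMA_CR_CIRC     (1u << 8)   /* circular mode                 */
#define DMA_CR_MINC     (1u << 10)  /* increment memory address      */
#define DMA_CR_PSIZE_16 (1u << 11)  /* 16-bit peripheral data size   */
#define DMA_CR_MSIZE_16 (1u << 13)  /* 16-bit memory data size       */

static uint16_t axi_buf[12];        /* stands in for the AXI SRAM buffer */
static DmaStreamMock adc2_dma;      /* stands in for DMA1_Stream0        */

static void dma_setup(DmaStreamMock *s, uintptr_t periph,
                      uint16_t *mem, uint32_t n)
{
    s->PAR  = periph;               /* source: ADC data register   */
    s->M0AR = (uintptr_t)mem;       /* destination: AXI SRAM       */
    s->NDTR = n;                    /* 12 conversions per scan     */
    s->CR   = DMA_CR_MINC | DMA_CR_CIRC | DMA_CR_TCIE
            | DMA_CR_PSIZE_16 | DMA_CR_MSIZE_16 | DMA_CR_EN;
}
```

On the real part the DMA request routing additionally goes through DMAMUX1, which the mock above deliberately ignores.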
Using a scope, I can see a small dip on the analog lines revealing each sampling point. The timing of the 12 sampling points fits the expected behavior perfectly (relative to the timer pulse), so the ADC clock, sample time, and conversion time are verified and correct.
CPU clock = 336 MHz
AXI/AHB clock = 84 MHz
The background code, IRQ routines, and vector table have all been moved into ITCM RAM. All data resides in DTCM RAM. Nothing runs from flash and there is no data in SRAM.
The code is compiled in release mode, optimized for speed.
I'm using the HAL only for initialization; all run-time code uses direct register access.
I'm using a GPIO configured as EVENTOUT together with asm("sev") to instrument the code.
void DMA1_Stream0_IRQHandler(void)
{
    asm volatile("sev");
    asm volatile("" ::: "memory"); // Prevent the optimizer from reordering around the asm.
    dmaADC2regs->LIFCR = 0x20; // Clear the interrupt flag (CTCIF0, bit 5, for stream 0)
    asm volatile("" ::: "memory");
    // Copy 12 uint16_t words from AXI SRAM to DTCM RAM (64-bit data bus: 3 transfers)
    uint64_t* dst = (uint64_t*)adc_dtcm;
    uint64_t const* src = (uint64_t const*)adc_axi;
    for (int i = 0; i < 3; i++)
        dst[i] = src[i];
    asm volatile("" ::: "memory");
    asm volatile("sev");
}

From ADC end of last sample conversion to GPIO/EVENTOUT edge: 556 ns -0/+24 ns (187 -0/+8 cycles).
Interrupt execution time is ~370 ns.
The execution time is fine, but the IRQ servicing time is definitely too long for my application. I don't see what I can do to be faster; 187 cycles looks like too much, so there must be something wrong somewhere.
Any suggestion will be well appreciated :)
Thanks
2025-10-14 7:00 AM
How about telling the DMA to fill a circular buffer twice the size you need?
When the first set of ADC readings come in, you'll get a half-transfer interrupt. Use that interrupt to copy over the first half of the buffer. While that's going on, the second set of readings are coming into the second half of the buffer.
Then you'll get a transfer-complete interrupt. On that, copy over the second half of the buffer.
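The ping-pong scheme described above can be sketched on the host. This is a minimal sketch assuming 12 channels per scan as in the original post; the buffer names and handler names are illustrative:

```c
#include <stdint.h>
#include <string.h>

#define N_CH 12
/* The DMA fills this circular buffer: scan k lands in the first half,
   scan k+1 in the second half, then the DMA wraps around. */
static uint16_t dma_buf[2 * N_CH];  /* stands in for the AXI SRAM buffer   */
static uint16_t work[N_CH];         /* stands in for the DTCM working copy */

/* Half-transfer interrupt: the first half is stable while the DMA
   is writing the second half, so copy it out now. */
static void on_half_transfer(void)
{
    memcpy(work, &dma_buf[0], sizeof work);
}

/* Transfer-complete interrupt: the second half is stable while the
   DMA wraps around and overwrites the first half. */
static void on_transfer_complete(void)
{
    memcpy(work, &dma_buf[N_CH], sizeof work);
}
```

The key property is that each copy always races against the DMA writing the *other* half, so as long as the copy finishes within one scan period, no sample is lost.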
2025-10-14 7:08 AM
Also just be aware that the consequences from an SEV event (GPIO change) may not happen exactly when the SEV instruction occurs. Same thing if you were to set a GPIO pin. Might be worth toggling the pin rather than using SEV to see if you get different results.
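If you do try a plain GPIO write instead of SEV/EVENTOUT, BSRR is the usual choice: a single write sets or resets a pin atomically, with no read-modify-write of ODR, so it is safe even if another interrupt touches the same port. A host-compilable sketch with a mocked port (the mock only records the last BSRR write; the name `gpiob` is illustrative):

```c
#include <stdint.h>

/* Mocked GPIO port: on target this would be e.g. GPIOB. */
typedef struct {
    volatile uint32_t BSRR; /* writing 1 to bits [15:0] sets the pin,
                               to bits [31:16] resets it */
} GpioMock;

static GpioMock gpiob;

/* Each call is a single store: atomic, interrupt-safe. */
static inline void pin_set(GpioMock *g, unsigned pin)
{
    g->BSRR = 1u << pin;
}

static inline void pin_reset(GpioMock *g, unsigned pin)
{
    g->BSRR = 1u << (pin + 16);
}
```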
2025-10-15 8:27 AM
I measured the {IRQ service time + SEV-event GPIO change} time using a square signal on an EXTI pin, with a sev instruction plus a GPIO write in the IRQ callback (not using the HAL functions). The EVENTOUT pin is way faster than the GPIO pin set via the BSRR register: 66 ns for sev/EVENTOUT versus 190 ns for the GPIO (both configured with speed = very high).
Watching my scope traces carefully, I had the feeling that the IRQ service time was not the main problem, but that the DMA transfer time could be.
Using two separate DMA controllers (instead of two streams of the same DMA) improved things greatly: the time from the ADC's end of last conversion to the SEV/GPIO change went from 600 ns to 345 ns (200 down to 115 CPU cycles). It could probably be better, but this is fast enough for my application.
As an alternative, I could also have boosted the CPU and bus speeds, but the ADC is tightly coupled to the various clocks of the whole MCU. Using these parameters makes the analog sampling points perfectly deterministic and stable. Not absolutely necessary, but better. Running at 336 MHz also lowers the power consumption compared to 480 MHz.
Anyway, if someone has tricks to improve DMA transfer speed or IRQ latency, I'll take them.
2025-10-15 1:48 PM
Hi @tarzan2
This post has been escalated to the ST Online Support Team for additional assistance. We'll contact you directly.
Regards,
Billy