I have enabled DMA1 on an STM32F103 to copy two 32-bit words (one that sets PF.15 and one that clears it) from flash to GPIOF->BSRR at 25 kHz. I then run this code with all interrupts disabled:
GPIOF->BSRR = 1 << 14;
GPIOF->BRR = 1 << 14;
The high pulse on PF.14 is around 175 ns long when DMA is not active. When the DMA fires and outputs a high pulse on PF.15, the PF.14 pulse width sometimes drops to around 120 ns. I understand that the bus matrix uses round-robin scheduling and that the DMA or the CPU will sometimes have to stall, but I would expect a stall to lengthen the pulse, not shorten it. What is happening here? Does this have anything to do with out-of-order execution, and do I need to use memory barrier instructions somewhere?
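
To put those two pulse widths in clock terms, here is a small host-side sketch. It assumes a 72 MHz core clock, which is the F103 maximum but is not something measured or stated above, so substitute the real SYSCLK. `ASSUMED_SYSCLK_HZ` and `ns_to_cycles` are names made up for this illustration.

```c
#include <stdint.h>

/* ASSUMPTION: 72 MHz core clock (F103 maximum). The actual SYSCLK may differ. */
#define ASSUMED_SYSCLK_HZ 72000000u

/* Convert a pulse width in nanoseconds to core clock cycles,
 * rounded to the nearest whole cycle. */
uint32_t ns_to_cycles(uint32_t ns)
{
    /* Use 64-bit arithmetic so ns * Hz cannot overflow. */
    uint64_t scaled = (uint64_t)ns * ASSUMED_SYSCLK_HZ;
    return (uint32_t)((scaled + 500000000u) / 1000000000u);
}
```

Under that assumed clock, `ns_to_cycles(175)` gives 13 and `ns_to_cycles(120)` gives 9, so the pulse is shortened by roughly four core cycles.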