2024-01-26 09:30 AM
I need the fastest MCU that can drive GPIO pins directly from code, without timers or DMA. I need to take in fast data from an FPGA. One of my selection criteria is the maximum pin-toggling frequency.
I tried an STM32H562: the maximum toggle frequency is around 31 MHz with the code below and a core frequency of 250 MHz:
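asm volatile
(
" LDR R0,=0x42020414 ;\n" // R0 = address of GPIOB->ODR
" LDR R1,[R0] ;\n" // R1 = current ODR value
"loop: EOR R1,#0x00000001 ;\n" // flip bit 0 (pin 0 of the port)
" STR R1,[R0] ;\n" // write the new value back to ODR
" B loop ;\n" // repeat forever
:
:
:"r0","r1","r2","r3","r4","r5","r6","r7","r8","r9","r10"
);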
I also know the STM32H723 is not the best choice: its GPIOs are slower than those of the STM32F427.
Can someone suggest an MCU with an ARM core and the fastest possible GPIOs?
2024-01-26 09:36 AM
We tried this on a few different chips before and nothing came close to 31 MHz, so the H5 is possibly the fastest.
Toggling a single pin as fast as possible is not a useful functionality. There are better and faster ways to do this.
2024-01-26 09:51 AM
Hello,
Unfortunately, you need at least DMA + TIM for external synchronization to do such a transfer.
Simply bit-banging GPIOs is not the best method for fast external data transfers to or from an FPGA or a similar device.
You can also use the FMC for such data transfers.
2024-01-26 10:28 AM
2024-01-26 10:30 AM
This methodology generally isn't desirable as it saturates the CPU with useless work. Sort of thing you do with GATES.
Writing two different register patterns to BSRR is preferable.
So
STR R1,[R0]
STR R2,[R0]
STR R1,[R0]
STR R2,[R0]
...
The H7 is one of the slowest: the GPIO is off the M4 core, so it is a couple of bus transactions away from the M7.
The F1/F2/F4 were pretty fast as I recall, but I'd drive patterns via a DMA transaction
The FMC would be a way to do 10's of MHz. Parallel ingress perhaps DCMI
The OCTO / QUAD SPI can be used / disabused in some situations
Some of the newer STM32 have a parallel port PSSI
Perhaps make your FPGA interface actually work in a manner compatible with an available interface method rather than hammer the square peg into the triangular hole?
2024-01-27 06:15 AM
Thanks a lot, everyone. As I see it, the F7 is the best choice.
With the H5 (M33) core I have some "dark" problems. I don't know why the code above gives only 31 MHz when the core runs at 250 MHz. First, I can't find instruction timings (clock counts) for the M33. If I assume one clock per instruction, then the loop repeats every 4 instructions, so the output pin frequency should be 250/4 = 62.5 MHz; or if the 'B' instruction takes 2 clocks, it should be 250/5 = 50 MHz. So I will go back to the F7 MCU.
Thank you.
2024-01-27 07:58 AM
Besides your "dark" problems with the H5xx,
as I wrote:
In my speed test on an H563 at 250 MHz core clock, I got
4 ns per BSRR write, +12 ns (while loop 8 + 4 ns) => 16 ns total, so about 62 MHz output at a pin, including the while loop.
But only when compiling with the optimizer at -O2 (or -Ofast). These CPUs are only fast in cooperation with optimized code. So set the optimizer...
2024-01-27 09:08 AM
To Ascha.3: As you know, I could not route the PLL to the MCO1 pin (it works when connected to HSE or HSI), but I started the MCO2 pin without problems and checked the frequency: it is 250 MHz.
For the code above, compiler optimization doesn't matter: everything inside asm volatile() executes unchanged. Maybe the H563 is a little different from the H562, I don't know. With an external crystal and the PLL at 250 MHz, this is the maximum toggle rate: 31 MHz. I use the same HAL and maybe the same compiler as you. Please note: I am not calculating or simulating the code, I simply measure the frequency on the pin.
2024-01-27 09:57 AM
So, extra for you: :)
Simple hi-lo-hi-lo loop, 250 MHz core, optimization -O2 (all IDE settings "normal", just at max speed):
/* USER CODE BEGIN WHILE */
while (1)
{
    GPIOG->BSRR = GPIO_PIN_4 << 16;
    GPIOG->BSRR = GPIO_PIN_4;
    GPIOG->BSRR = GPIO_PIN_4 << 16;
    GPIOG->BSRR = GPIO_PIN_4;
    GPIOG->BSRR = GPIO_PIN_4 << 16;
    GPIOG->BSRR = GPIO_PIN_4;
    /* USER CODE END WHILE */
    /* USER CODE BEGIN 3 */
}
You can see: 4 ns pin access, 16 ns with the while loop (bad picture, taken with a cheap 60 MHz probe).
82 MHz, not bad, I would say...
2024-01-27 11:04 AM - edited 2024-01-27 11:09 AM
It's not the instruction speed that's relevant here; it's the write buffers into the bus interface(s) that are going to dictate the achievable throughput.
A GPIO bank attached to the primary AHB is going to be the fastest, unless you have a design bolting them to the TCM.
But again you're approaching the problem incorrectly, you've got an FPGA, make it emulate a high speed bus/protocol that's actually supported by buffering and DMA, and that you can pipeline or FIFO on your side. Be a 32-bit wide FMC bused external memory. Pick a package which permits the widest data bus.
The SDMMC can go 8-bit wide and clock on both edges. Pretty sure it's rated upwards of 100-150 MHz
DSI is running at 250 MHz, two-lanes, DDR, got a bandwidth approaching 1 Gbps
Hard to know where you'd source/sink that much in a continuous mode