Which is best STM32 MCU for fast driving of GPIO

Brussl
Associate II

I need the fastest MCU for driving GPIO pins directly from code, not with timers or DMA. I need to take fast data from a FPGA device. One of my criteria is the maximum pin-toggle frequency:

I tried toggling a pin on the STM32H562 with the code below at a core frequency of 250 MHz; max frequency is around 31 mhz:

asm volatile
(
" LDR R0,=0x42020414 ;\n" //GPIOB->ODR
" LDR R1,[R0] ;\n"
"loop: EOR R1,#0x00000001 ;\n"
" STR R1,[R0] ;\n"
" B loop ;\n"
:
:
:"r0","r1","r2","r3","r4","r5","r6","r7","r8","r9","r10"
);

I also know the STM32H723 is not the best choice: its GPIOs are slower than those of the STM32F427.

Can someone suggest an MCU with an Arm core and the fastest GPIOs? As fast as possible.

10 REPLIES 10
TDK
Guru

We tried this on a few different chips before, and nothing came close to 31 MHz, so possibly the H5 is the fastest.

Toggling a single pin as fast as possible is not a useful functionality. There are better and faster ways to do this.

If you feel a post has answered your question, please click "Accept as Solution".
SofLit
ST Employee

Hello,

Unfortunately, you need at least DMA + TIM for external synchronization to do such a transfer.

Looking for a simple way that uses only GPIOs is not the best method for fast external data transfers, from an FPGA or anything else.

You can also use the FMC for such data transfers.

To give better visibility on the answered topics, please click on "Accept as Solution" on the reply which solved your issue or answered your question.
PS: This is NOT an online support (https://ols.st.com) but a collaborative space. So please be polite in your reply. Otherwise, it will be reported as inappropriate and you will be permanently blacklisted from my help/support.

@Brussl wrote:

 I need to take fast data from a FPGA device. 


So, as @SofLit suggested, why not just have the FPGA put out the data in a more useful/accessible format?

 


max frequency is around 31 mhz :

milli hertz - that's not very fast at all!

:thinking_face:

This methodology generally isn't desirable, as it saturates the CPU with useless work. It's the sort of thing you do with GATES.

Writing two different register patterns to BSRR is preferable.

So

STR R1,[R0]
STR R2,[R0]
STR R1,[R0]
STR R2,[R0]
...
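
A minimal C sketch of the two BSRR patterns, assuming the usual STM32 BSRR layout (bits 0-15 set a pin, bits 16-31 reset it). The helper names `bsrr_set_mask`, `bsrr_reset_mask` and `pulse_pin` are made up for illustration and are not part of the ST HAL; in the STR sequence above, R1 and R2 would hold these two values and R0 the BSRR address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration, not part of the ST HAL. */

/* Value that drives pin `pin` (0..15) high when written to GPIOx->BSRR. */
static inline uint32_t bsrr_set_mask(unsigned pin)
{
    assert(pin < 16u);
    return (uint32_t)1u << pin;
}

/* Value that drives pin `pin` (0..15) low when written to GPIOx->BSRR. */
static inline uint32_t bsrr_reset_mask(unsigned pin)
{
    assert(pin < 16u);
    return (uint32_t)1u << (pin + 16u);
}

/* Write `pairs` high/low pulses to a BSRR-style register.
   `bsrr` is a pointer so the sketch can run against a plain
   variable as well as a real register. */
static inline void pulse_pin(volatile uint32_t *bsrr, unsigned pin, unsigned pairs)
{
    const uint32_t hi = bsrr_set_mask(pin);
    const uint32_t lo = bsrr_reset_mask(pin);
    while (pairs--) {
        *bsrr = hi;   /* plain store, no read-modify-write */
        *bsrr = lo;
    }
}
```

On a real part the call might look like `pulse_pin(&GPIOB->BSRR, 0, 4)`; a hand-unrolled sequence of stores avoids the loop branch between pulses.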

The H7 is one of the slowest: the GPIO hangs off the M4 core, so it is a couple of bus transactions away from the M7.

The F1/F2/F4 were pretty fast as I recall, but I'd drive patterns via a DMA transaction.

The FMC would be a way to do tens of MHz. For parallel ingress, perhaps DCMI.

The OCTO / QUAD SPI can be used / disabused in some situations

Some of the newer STM32 have a parallel port PSSI

Perhaps make your FPGA interface actually work in a manner compatible with an available interface method rather than hammer the square peg into the triangular hole?

Tips, Buy me a coffee, or three.. PayPal Venmo
Up vote any posts that you find helpful, it shows what's working..

Thanks a lot, everyone. As I see it, the F7 is the best choice.

With the H5 (M33) core I have some "dark" problems. I don't know why the code above gives 31 MHz when the core runs at 250 MHz. First, I can't find the instruction durations (in clocks) for the M33. If I assume one clock per instruction, the loop repeats every 4 instructions, so the output pin frequency should be 250/4 = 62.5 MHz; or, if the 'B' instruction takes 2 clocks, 250/5 = 50 MHz. So I will return to the F7 MCU.

Thank you.
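
One thing the estimate above may be missing: each pass through the asm loop toggles the pin only once, and a full output period needs two toggles. A small sketch of that arithmetic follows. The function name is invented here, and the 4-cycle loop time is an assumption (for example 1 + 1 + 2 cycles for EOR, STR and a taken branch), not a documented Cortex-M33 figure.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: output frequency of a pin that is toggled once per
   loop iteration. Two toggles make one full period, hence the factor 2. */
static inline uint32_t toggle_loop_freq_hz(uint32_t core_hz,
                                           uint32_t cycles_per_iteration)
{
    assert(cycles_per_iteration > 0u);
    return core_hz / (2u * cycles_per_iteration);
}
```

Under these assumptions, `toggle_loop_freq_hz(250000000u, 4u)` gives 31250000, i.e. 31.25 MHz, which is close to the measured 31 MHz. Bus write latency may account for part of the loop time as well.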

 

Besides your "dark" problems with the H5xx,

as I wrote:

In my speed test on an H563 with a 250 MHz core, I got:

BSRR write 4 ns, plus 12 ns (while loop 8 + 4 ns) => 16 ns total, so about 62 MHz output at a pin, including the while loop.

But only when compiling with the optimizer at -O2 (or -Ofast); these CPUs are made to be fast only in cooperation with optimized code. So set the optimizer...

----

 


To Ascha.3: as you know, I could not switch the MCO1 pin to the PLL (it works when connected to HSE or HSI), but I started the MCO2 pin without problems and checked the frequency: it is 250 MHz.

In the code above, compiler optimization does not matter: everything inside asm volatile () executes unchanged. Maybe the H563 is a little different from the H562, I don't know. If I use an external quartz crystal and the PLL at 250 MHz, this is the maximum toggle rate: 31 MHz. I use the same HAL and maybe the same compiler as you. Please note: I don't calculate or simulate the code, I just measure the frequency on the pin.

So extra for you:   :)

A simple hi-lo-hi-lo loop, 250 MHz core, optimization -O2 (all settings in the IDE "normal", just at max speed):

  /* USER CODE BEGIN WHILE */
  while (1)
  {
	  GPIOG->BSRR = GPIO_PIN_4<<16;   /* PG4 low  (upper 16 bits of BSRR reset the pin) */
	  GPIOG->BSRR = GPIO_PIN_4;       /* PG4 high (lower 16 bits of BSRR set the pin) */
	  GPIOG->BSRR = GPIO_PIN_4<<16;
	  GPIOG->BSRR = GPIO_PIN_4;
	  GPIOG->BSRR = GPIO_PIN_4<<16;
	  GPIOG->BSRR = GPIO_PIN_4;
    /* USER CODE END WHILE */

    /* USER CODE BEGIN 3 */
  }

You see: 4 ns pin access, 16 ns with the while loop (bad picture, taken with a cheap 60 MHz probe):

[Image: AScha3_0-1706377849537.png]

82 MHz - not bad, I would say...


It's not the instruction speed that's relevant here; it's the write buffers into the bus interface(s) that are going to dictate the achievable throughput.

A GPIO bank attached to the primary AHB is going to be the fastest, unless you have a design bolting them to the TCM.

But again, you're approaching the problem incorrectly. You've got an FPGA: make it emulate a high-speed bus/protocol that's actually supported by buffering and DMA, and that you can pipeline or FIFO on your side. Be a 32-bit wide FMC-bused external memory. Pick a package which permits the widest data bus.
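
As a rough sketch of what the MCU side could look like if the FPGA is made to behave like external memory: the FPGA then appears at a memory-mapped address and can be read with plain loads (or by DMA). The base address 0x60000000 is the FMC NOR/SRAM bank 1 window on many STM32 parts, but it should be checked against the reference manual of the chosen device. The macro name, the function name and the 16-bit word width are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed FMC NOR/SRAM bank 1 window; verify for the chosen part. */
#define FPGA_FMC_BASE ((uintptr_t)0x60000000u)

/* Hypothetical helper: copy `count` 16-bit words from a memory-mapped
   source into a RAM buffer. `src` is volatile so that every word is
   actually read from the bus instead of being optimized away. */
static inline void fpga_read_words(const volatile uint16_t *src,
                                   uint16_t *dst, size_t count)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < count; ++i) {
        dst[i] = src[i];
    }
}
```

On hardware, `src` would be `(const volatile uint16_t *)FPGA_FMC_BASE` once the FMC has been configured. Whether the FPGA presents successive words at increasing addresses or at one fixed address depends on the FPGA design.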

The SDMMC can go 8-bit wide and clock on both edges. Pretty sure it's rated upwards of 100-150 MHz

DSI is running at 250 MHz, two-lanes, DDR, got a bandwidth approaching 1 Gbps

Hard to know where you'd source/sink that much in a continuous mode
