2024-01-22 10:49 PM
Hi team,
I have an STM32MP157 board and am using its Cortex-M4 core. My task is simple: toggle a GPIO at maximum speed and with consistent timing, so that the pulse width does not vary.
We need approximately 15 ns between the rising and falling edges. Using the Keil IDE, I can achieve this on the Nucleo F746ZG board, which runs at about the same clock speed as the Cortex-M4 in the STM32MP157.
I generated code in Cube IDE, but the toggle on the STM32MP157 Cortex-M4 is slower, at around 60 ns. The code that toggles the GPIO in a while loop is as follows:
GPIOF->BSRR = LED_Pin << 16;
GPIOF->BSRR = LED_Pin;
I appreciate any assistance in improving the toggle speed for the STM32MP157 Cortex-M4.
Thank you.
Kundan Jha
2024-01-22 11:40 PM
Hi @kundanJha ,
The STM32F7 uses a Cortex-M7 with L1 cache, which is much more powerful than the Cortex-M4 present in the STM32MP15.
On the STM32MP15, you might get slightly better performance by putting code in SRAM1 (@0x10000000) and data in RETRAM (@0x00000000) or SRAM2_SBUS (@0x30020000), but I fear 15 ns toggling is not achievable in software.
What is the purpose of using 100% of the Cortex-M4 CPU to toggle a simple GPIO?
It might be better to use a timer (TIM), which is designed for that and leaves the Cortex-M4 free for other tasks.
Regards
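To put the 15 ns target in perspective, it helps to count CPU cycles. The sketch below assumes the Cortex-M4 in the STM32MP15 runs at its maximum of 209 MHz (a figure not stated in this thread); `cycles_in_ns` is a hypothetical helper for illustration, not part of any ST library.

```c
#include <stdint.h>

/* Hypothetical helper: whole CPU cycles that fit in `ns` nanoseconds
 * at a core clock of `hz`, rounded down. */
uint32_t cycles_in_ns(uint32_t hz, uint32_t ns)
{
    /* 64-bit intermediate so hz * ns cannot overflow. */
    return (uint32_t)(((uint64_t)hz * ns) / 1000000000ull);
}
```

Under that assumption, 15 ns is about 3 cycles, which leaves almost no room for two stores plus the loop branch, while the measured 60 ns corresponds to about 12 cycles.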
2024-01-22 11:42 PM
I don't know if it helps, as the code is probably already well optimized by the compiler, but you could try this:
GPIOF->BRR = LED_Pin;
GPIOF->BSRR = LED_Pin;
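Both variants do the same amount of work, so a measurable difference would be surprising. Below is a small host-side model (not target code) of how a write to BSRR or BRR updates the output data register, following the register description in ST's reference manuals. The names `apply_bsrr` and `apply_brr` are hypothetical and only serve the illustration.

```c
#include <stdint.h>

/* Model of a BSRR write: bits 15:0 set pins, bits 31:16 reset pins.
 * If both are requested for the same pin, set has priority. */
uint16_t apply_bsrr(uint16_t odr, uint32_t value)
{
    uint16_t set = (uint16_t)(value & 0xFFFFu);
    uint16_t reset = (uint16_t)(value >> 16);
    return (uint16_t)((odr & (uint16_t)~reset) | set);
}

/* Model of a BRR write: bits 15:0 reset pins. */
uint16_t apply_brr(uint16_t odr, uint32_t value)
{
    return (uint16_t)(odr & (uint16_t)~(value & 0xFFFFu));
}
```

Assuming LED_Pin is a constant macro, as in CubeMX-generated code, `LED_Pin << 16` is computed at compile time, so either form should end up as a single 32-bit store of a constant.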
2024-01-22 11:46 PM
Hi, I have tried this, but nothing has changed. The output remains the same.
2024-01-22 11:50 PM
Yes, I agree. As mentioned earlier, on the STM32F7 I used Keil to achieve ~15 ns. However, when using the same controller in Cube IDE, the output differs, ranging from ~45 to 60 ns.
2024-01-23 01:39 AM
2024-01-23 01:56 AM
Apart from code optimization (though the Cortex-M4 will definitely not match a Cortex-M7 with caches), for GPIO bit-banging (quite unusual), instead of doing it with the CPU only, you might use DMA to write a prepared data table from SRAMx to GPIOF->BSRR. That could run fast, but it needs a little preparation time for the CPU to build the table and program the DMA.
Regards.
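As a rough illustration of the "prepared data table" idea, the sketch below builds an array of BSRR words from a bit pattern. It covers only the CPU-side preparation; the DMA and any timer pacing are device specific and left out. The helper names `bsrr_word` and `build_bsrr_table` are hypothetical, not part of the HAL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one output level for the pins in `pin_mask` as a BSRR word:
 * bits 15:0 set pins, bits 31:16 reset them. */
uint32_t bsrr_word(uint16_t pin_mask, int level)
{
    return level ? (uint32_t)pin_mask : ((uint32_t)pin_mask << 16);
}

/* Fill `table` with one BSRR word per bit of `pattern`, LSB first.
 * At most 32 entries, since `pattern` holds 32 bits. */
void build_bsrr_table(uint32_t *table, size_t n,
                      uint16_t pin_mask, uint32_t pattern)
{
    assert(table != NULL && n <= 32);
    for (size_t i = 0; i < n; i++) {
        table[i] = bsrr_word(pin_mask, (int)((pattern >> i) & 1u));
    }
}
```

A DMA stream would then copy `table` word by word to the address of GPIOF->BSRR, so the edge timing is set by the DMA pacing rather than by CPU instruction timing.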
2024-01-28 11:03 PM
Hi @PatrickF ,
I'm looking for a board with a Cortex-M7 processor and multi-core support. The signal requirements are provided in the attached file ("printer_signal.png").
Thanks and regards
Kundan Jha
2024-01-28 11:42 PM
Hi @kundanJha
There is no Cortex-A + Cortex-M7 product, but you could have a look at:
- STM32H7 series : Cortex-M7+Cortex-M4 https://www.st.com/en/microcontrollers-microprocessors/stm32h7-series.html
- STM32MP25 series: Cortex-A35(Linux)+Cortex-M33 (sampling now, available in second half of 2024): STM32MP2 MPU series 64-bit microprocessors with neural processing unit
The Cortex-M33 is not as powerful as the Cortex-M7, but compared to the Cortex-M4 in the STM32MP15, it runs at twice the frequency in the STM32MP25 and has instruction and data caches (so it could probably achieve the same or faster GPIO toggling than the STM32F7).
Anyway, I guess that the GPIO sequence you mention could certainly be achieved with DMA on the existing STM32MP15 Cortex-M4 (but you would probably need to rework your SW concept).
Regards
2024-01-29 12:16 AM - edited 2024-01-29 12:20 AM
Hi,
Just a question: have you enabled the optimizer? In my simple speed tests on an F4 core (F411 at 100 MHz), I got a 60 ns pin toggle with -O0, but 10 ns with -O2.
And on the H563 (M33 core at 250 MHz), 4 ns. :)
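For anyone repeating this kind of test, here is a minimal sketch of a toggle loop. The register address is passed in as a pointer so the function can also be tried on a host; on the target one would pass the address of the port's BSRR register. `toggle_n` is a hypothetical name, and this is not the exact code from my tests.

```c
#include <stdint.h>

/* Write reset then set to a BSRR-style register `n` times.
 * `volatile` keeps the compiler from merging or dropping the stores. */
void toggle_n(volatile uint32_t *bsrr, uint16_t pin_mask, uint32_t n)
{
    const uint32_t set = pin_mask;
    const uint32_t reset = (uint32_t)pin_mask << 16;

    for (uint32_t i = 0; i < n; i++) {
        *bsrr = reset;  /* pin low  */
        *bsrr = set;    /* pin high */
    }
}
```

With -O2 the compiler can keep both constants and the register address in CPU registers, so the loop body reduces to two stores and a branch; with -O0 it typically reloads them from the stack on every pass, which would account for a large part of the difference.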
But the "better way" to get a pattern onto the port pins is to use DMA, as @PatrickF mentioned.