2022-04-07 05:31 AM
I tested GPIO performance on both cores (CM7 and CM4) using the following loop:
/* Infinite loop */
/* USER CODE BEGIN WHILE */
while (1)
{
  /* USER CODE END WHILE */
  /* USER CODE BEGIN 3 */
  LL_GPIO_SetOutputPin(DBG_D1_GPIO_Port, DBG_D1_Pin); // PC8
  LL_GPIO_SetOutputPin(DBG_D2_GPIO_Port, DBG_D2_Pin); // PC9
  LL_GPIO_ResetOutputPin(DBG_D1_GPIO_Port, DBG_D1_Pin);
  LL_GPIO_ResetOutputPin(DBG_D2_GPIO_Port, DBG_D2_Pin);
  LL_GPIO_SetOutputPin(DBG_D1_GPIO_Port, DBG_D1_Pin);
  LL_GPIO_SetOutputPin(DBG_D2_GPIO_Port, DBG_D2_Pin);
  LL_GPIO_ResetOutputPin(DBG_D1_GPIO_Port, DBG_D1_Pin);
  LL_GPIO_ResetOutputPin(DBG_D2_GPIO_Port, DBG_D2_Pin);
  HAL_GPIO_TogglePin(LD1_GPIO_Port, LD1_Pin);
  HAL_Delay(1);
}
/* USER CODE END 3 */
On an oscilloscope I checked the outputs PC8 and PC9:
CM7: [oscilloscope screenshot]
CM4: [oscilloscope screenshot]
A single output operation takes about 25 ns on the CM7 but only 4.2 ns on the CM4. Why is the CM7 slower than the CM4?
Used tools and configurations:
Test board: NUCLEO-H755ZI-Q
Initial config: STM32CubeMX v6.4.0
Toolchain: MDK-ARM v5.36.0.0
C/C++ Optimization: Level 3 (-O3)
CM7 config:
- Clock frequency: 480MHz (VOS0)
- CPU ICache and DCache enabled
- assigned memory: DTCM RAM (0x2000 0000 - 0x2001 FFFF)
- MPU not used
CM4 config:
- Clock frequency: 240MHz (VOS0)
- assigned memory: SRAM1, SRAM2, SRAM3 (0x1000 0000 - 0x1004 7FFF)
- MPU not used
Additionally, I checked the generated .axf files (see below). Both cores use the same assembler instructions for the GPIO access.
fromelf H755ZI_CM7_TIM1_Perf_CM7.axf --disassemble --interleave=source --text -c --output=CM7_out.lst
--- CM7_out.lst ---
;;; ../CM7/Core/Src/main.c (151)
0x080036ee: 0075 u. LSLS r5,r6,#1
;;; ../Drivers/STM32H7xx_HAL_Driver/Inc/stm32h7xx_ll_gpio.h (915)
0x080036f0: 042f /. LSLS r7,r5,#16
;;; ../Drivers/STM32H7xx_HAL_Driver/Inc/stm32h7xx_ll_gpio.h (886)
0x080036f2: 61a6 .a STR r6,[r4,#0x18]
0x080036f4: 61a5 .a STR r5,[r4,#0x18]
;;; ../Drivers/STM32H7xx_HAL_Driver/Inc/stm32h7xx_ll_gpio.h (915)
0x080036f6: f8c48018 .... STR r8,[r4,#0x18]
0x080036fa: 61a7 .a STR r7,[r4,#0x18]
;;; ../Drivers/STM32H7xx_HAL_Driver/Inc/stm32h7xx_ll_gpio.h (886)
0x080036fc: 61a6 .a STR r6,[r4,#0x18]
0x080036fe: 61a5 .a STR r5,[r4,#0x18]
;;; ../Drivers/STM32H7xx_HAL_Driver/Inc/stm32h7xx_ll_gpio.h (915)
0x08003700: f8c48018 .... STR r8,[r4,#0x18]
0x08003704: 61a7 .a STR r7,[r4,#0x18]
;;; ../CM7/Core/Src/main.c (158)
0x08003706: 2101 .! MOVS r1,#1
0x08003708: 4648 HF MOV r0,r9
0x0800370a: f7fdf83d ..=. BL HAL_GPIO_TogglePin ; 0x8000788
;;; ../CM7/Core/Src/main.c (159)
0x0800370e: 2001 . MOVS r0,#1
0x08003710: f7fcff12 .... BL HAL_Delay ; 0x8000538
0x08003714: e7ed .. B 0x80036f2 ; main + 282
fromelf H755ZI_CM4_TIM1_Perf_CM4.axf --disassemble --interleave=source --text -c --output=CM4_out.lst
--- CM4_out.lst ---
;;; ../CM4/Core/Src/main.c (140)
0x08102d2c: f8df9044 ..D. LDR r9,[pc,#68] ; [0x8102d74] = 0x58020400
;;; ../Drivers/STM32H7xx_HAL_Driver/Inc/stm32h7xx_ll_gpio.h (915)
0x08102d30: f04f7880 O..x MOV r8,#0x1000000
0x08102d34: 046f o. LSLS r7,r5,#17
;;; ../Drivers/STM32H7xx_HAL_Driver/Inc/stm32h7xx_ll_gpio.h (886)
0x08102d36: 61a5 .a STR r5,[r4,#0x18]
0x08102d38: 61a6 .a STR r6,[r4,#0x18]
;;; ../Drivers/STM32H7xx_HAL_Driver/Inc/stm32h7xx_ll_gpio.h (915)
0x08102d3a: f8c48018 .... STR r8,[r4,#0x18]
0x08102d3e: 61a7 .a STR r7,[r4,#0x18]
;;; ../Drivers/STM32H7xx_HAL_Driver/Inc/stm32h7xx_ll_gpio.h (886)
0x08102d40: 61a5 .a STR r5,[r4,#0x18]
0x08102d42: 61a6 .a STR r6,[r4,#0x18]
;;; ../Drivers/STM32H7xx_HAL_Driver/Inc/stm32h7xx_ll_gpio.h (915)
0x08102d44: f8c48018 .... STR r8,[r4,#0x18]
0x08102d48: 61a7 .a STR r7,[r4,#0x18]
0x08102d4a: 2101 .! MOVS r1,#1
0x08102d4c: 4648 HF MOV r0,r9
0x08102d4e: f7fdfd2f ../. BL HAL_GPIO_TogglePin ; 0x81007b0
;;; ../CM4/Core/Src/main.c (141)
0x08102d52: 2001 . MOVS r0,#1
0x08102d54: f7fdfbf0 .... BL HAL_Delay ; 0x8100538
0x08102d58: e7ed .. B 0x8102d36 ; main + 142
Is it possible to improve the GPIO performance on the CM7 core?
Kind regards
Michael
2022-04-07 06:05 AM
Hello,
This is known behavior in the H7 family. Please check out these discussions in the community:
https://community.st.com/s/global-search/stm32h7-gpio-togle-max-frequency
2022-04-07 07:12 AM
The GPIOs are parked way off on AHB4/APB4, and the M4 is closer. Look at the organizational diagram.
>>is it possible to improve the GPIO performance on the core CM7?
Perhaps write the state of both pins together in a single call? That pretty much doubles the toggle rate right there.
Most people don't use a 400 MHz MCU to do this. Couldn't you drive the pins with a TIM, or use a CPLD if you want wire-speed operation?
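As a rough illustration of the "write both pins together" idea: the disassembly earlier in the thread shows every set and reset compiling to a single STR to offset 0x18 (the BSRR register), where the low 16 bits set pins and the high 16 bits reset them. A minimal sketch of building one combined BSRR word is below. The helper name bsrr_word is hypothetical and not part of the HAL/LL drivers, and it assumes both pins sit on the same port (PC8 and PC9 here).

```c
#include <stdint.h>

/*
 * Build one BSRR word from a set mask and a reset mask.
 * Bits 0..15 of BSRR set the matching pins, bits 16..31 reset them.
 * bsrr_word is a hypothetical helper, not an ST driver function.
 */
static inline uint32_t bsrr_word(uint16_t set_mask, uint16_t reset_mask)
{
    return (uint32_t)set_mask | ((uint32_t)reset_mask << 16);
}
```

On the target this would be used as something like WRITE_REG(DBG_D1_GPIO_Port->BSRR, bsrr_word(DBG_D1_Pin | DBG_D2_Pin, 0)), so both pins change with one bus access instead of two. Since the LL functions take a pin mask, passing DBG_D1_Pin | DBG_D2_Pin to LL_GPIO_SetOutputPin should have the same effect. Each individual write still pays the same bus latency on the CM7; only the number of writes goes down.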
2022-04-13 01:22 AM
Thank you for your feedback.
I performed some further measurements using the DWT and an oscilloscope. It seems I have to accept the longer GPIO times of the CM7 core.
Summary of my measurements:
Core CM7, 480 MHz
-----------------
LL_GPIO_SetOutputPin(DBG_D1_GPIO_Port, DBG_D1_Pin); // 12 cycles, 25 ns
LL_GPIO_ResetOutputPin(DBG_D1_GPIO_Port, DBG_D1_Pin); // 12 cycles, 25 ns
WRITE_REG(TIM1->CCMR1, 0x00000050); // TIM1 CH1 FORCED_ACTIVE; 8 cycles, 16.7 ns
WRITE_REG(TIM1->CCMR1, 0x00000040); // TIM1 CH1 FORCED_INACTIVE; 8 cycles, 16.7 ns
Core CM4, 240 MHz
-----------------
LL_GPIO_SetOutputPin(DBG_D1_GPIO_Port, DBG_D1_Pin); // 1 cycle, 4.2 ns
LL_GPIO_ResetOutputPin(DBG_D1_GPIO_Port, DBG_D1_Pin); // 1 cycle, 4.2 ns
WRITE_REG(TIM1->CCMR1, 0x00000050); // TIM1 CH1 FORCED_ACTIVE; 4 cycles, 16.7 ns
WRITE_REG(TIM1->CCMR1, 0x00000040); // TIM1 CH1 FORCED_INACTIVE; 4 cycles, 16.7 ns
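For reference, the times in the summary follow directly from the cycle counts and the core clocks (time = cycles / frequency). A small sketch of that conversion is below. The function name cycles_to_ns is made up for illustration, and the clock values are the ones from the configuration listed in the first post.

```c
#include <stdint.h>

/*
 * Convert a cycle count (e.g. a difference of two DWT cycle counter
 * readings) into nanoseconds for a given core clock in Hz.
 * cycles_to_ns is a hypothetical helper name.
 */
static double cycles_to_ns(uint32_t cycles, uint32_t core_hz)
{
    return (double)cycles * 1e9 / (double)core_hz;
}
```

With this, 12 cycles at 480 MHz gives 25 ns and 1 cycle at 240 MHz gives about 4.2 ns, while the TIM1 write comes out at about 16.7 ns on both cores (8 cycles at 480 MHz, 4 cycles at 240 MHz).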
Kind regards
Michael