2024-11-20 07:09 AM
Hello,
I managed to run all demo programs of the H7 performance suite (STM32CubeExpansion_Performance_H7_V1.0.0) on the STM32H753I_EVAL2 except the QSPI single mode demo (8 - D1_QuadSPI_Single - D1_DTCM). The dual mode demo (9 - D1_QuadSPI_Dual - D1_DTCM) runs successfully.
As far as I can see, both demos use the same source; the only difference is in the preprocessor settings, where the symbol "QSPI_DUAL_FLASH" is defined in the dual demo's toolset settings. If this symbol is removed, the same error as shown below appears there too.
It seems that the call to the FFT function crashes when run in single QSPI mode. I can set a breakpoint on the function, but as soon as I try to step over it, the debugger jumps to a line in the HAL_RCC_MCOConfig() function, which doesn't make much sense to me.
(The screenshots of the source line and of the debug window are not preserved.)
According to the linker script, the FFT functions are located in QSPI flash. The same code runs in dual mode but fails in single mode. This leads me to assume that something is wrong in the QSPI interface configuration for single mode.
BTW: I can successfully run the memory mapped demo (as supplied by CubeMx for H743 evaluation board) in single and dual mode, also the XIP demo, which is also in single mode.
Any idea why the program above has stopped here? What can I do to gather more information?
Thanks
2024-11-20 09:03 AM
Hello,
I see you are using CubeIDE to build the examples.
Originally, the projects were provided for EWARM, MDK-ARM and System Workbench (AC6).
So I suggest you try the deprecated System Workbench tool, downloadable from here.
It could be that something was missed when you imported the project from System Workbench to CubeIDE.
2024-11-29 02:59 AM
Hello @SofLit
I guess that the demos will work using one of those IDEs.
But on the other hand, it would be helpful for the community to get the stuff running in the latest officially supported tools promoted by STM32. Correct me if I am wrong.
I'd like to learn all about the GNU linker stuff, how to place code/data to memory locations and this is a good chance to do so.
What I have found out so far, given my limited time, is that according to the demo readme the code is intended to run from the QSPI flash but is apparently running from the internal flash.
According to the .map file, code and FFT constant data are generated for QSPI:
.QSPISection 0x90000000 0x2ea4
0x90000000 . = ALIGN (0x4)
*main.o(.text .text*)
*arm_bitreversal2.o(.text .text*)
*arm_cfft_f32.o(.text .text*)
After running the demo in the CubeIDE, I can see in the CubeProgrammer that the code & data has been programmed into QSPI flash (I've erased the entire flash before):
But according to the IDE debugger, the program code is executed from the internal flash:
I guess that the linker script here is not suitable to locate the FFT function code into the QSPI flash:
.QSPISection : {
. = ALIGN(4);
*main.o (.text .text*)
*arm_bitreversal2.o (.text .text*)
*arm_cfft_f32.o (.text .text*)
*arm_cfft_radix8_f32.o (.text .text*)
*arm_cmplx_mag_f32.o (.text .text*)
*arm_max_f32.o (.text .text*)
*main.o (.rodata .rodata*)
*arm_common_tables.o (.rodata .rodata*)
*arm_const_structs.o (.rodata .rodata*)
. = ALIGN(4);
} >QSPI
I checked the linker script of the QSPI_ExecuteInPlace demo as provided by CubeMx:
.qspi :
{
. = ALIGN(4);
_qspi_start = .; /* create a global symbol at qspi start */
*(.qspi) /* .qspi sections */
*(.qspi*) /* .qspi* sections */
. = ALIGN(4);
_qspi_end = .; /* define a global symbols at end of qspi */
} >QSPI AT> FLASH
Here the function code of GpioToggle() is placed and run from QSPI flash.
In the source code the function is placed to QSPI section like this:
#if defined(__CC_ARM)
#pragma arm section code = ".qspi"
#pragma no_inline
static void GpioToggle(void)
#elif defined(__ICCARM__)
static void GpioToggle(void) @ ".qspi"
#elif defined(__GNUC__)
static void __attribute__((section(".qspi"), noinline)) GpioToggle(void)
#endif
{
I guess that this mechanism is missing in the FFT demo if using the ST Cube tools.
Unfortunately, it seems that the FFT demo does not provide the source code of the FFT functions, so I don't know how to implement this mechanism.
Any suggestions?
2024-11-29 05:06 AM - edited 2024-11-29 07:04 AM
@regjoe wrote:
I guess that this mechanism is missing in the FFT demo if using the ST Cube tools.
As I said previously, this application note and its software were released before CubeIDE was even available. Implementing this mechanism was neither the intention nor the focus of the AN; the memory regions were mainly managed in the linker files.
Meanwhile, for QSPI, you can take inspiration from this example in the STM32H7 Cube package, which executes from QSPI:
2024-11-29 11:36 AM
Hello @SofLit
This is the example I mentioned in my previous post. Its linker mechanism differs from that of the FFT demo.
In the FFT demo, the relocatable object files contain the function code to be placed into QSPI memory, whereas the QSPI XIP demo uses a section attribute in the source code (see previous post). I guess all I need to know is how to configure the linker script to place relocatable object code into QSPI memory.
2024-11-29 02:19 PM
Perhaps you can use objdump/objcopy to inspect the content of the object and/or library files, see the section/symbol naming, and thus work out the filter requirements/criteria for the linker script.
2024-12-02 02:39 AM
Hello @Tesla DeLorean , hello @SofLit
The objdump command lists the FFT object files in question (e.g. arm_cfft_f32.o):
C:\Users\StJo\Documents\Data\Projekte\stm32h7\H7_Performance\Drivers\CMSIS\Lib\GCC>arm-none-eabi-objdump.exe -a libarm_cortexM7lfdp_math.a | grep -i arm_cfft_f32.o
arm_cfft_f32.o: file format elf32-littlearm
rw-rw-rw- 0/0 20704 Oct 20 10:23 2015 arm_cfft_f32.o
The linker also confirms that the function is found:
attempt to open ../../../Drivers/CMSIS/Lib/GCC\libarm_cortexM7lfdp_math.a succeeded
../../../Drivers/CMSIS/Lib/GCC\libarm_cortexM7lfdp_math.a
(../../../Drivers/CMSIS/Lib/GCC\libarm_cortexM7lfdp_math.a)arm_const_structs.o
(../../../Drivers/CMSIS/Lib/GCC\libarm_cortexM7lfdp_math.a)arm_common_tables.o
(../../../Drivers/CMSIS/Lib/GCC\libarm_cortexM7lfdp_math.a)arm_max_f32.o
(../../../Drivers/CMSIS/Lib/GCC\libarm_cortexM7lfdp_math.a)arm_cfft_f32.o
(../../../Drivers/CMSIS/Lib/GCC\libarm_cortexM7lfdp_math.a)arm_cmplx_mag_f32.o
(../../../Drivers/CMSIS/Lib/GCC\libarm_cortexM7lfdp_math.a)arm_bitreversal2.o
(../../../Drivers/CMSIS/Lib/GCC\libarm_cortexM7lfdp_math.a)arm_cfft_radix8_f32.o
The .map file also indicates that the function has been placed in QSPI:
.QSPISection 0x90000000 0x2ea4
0x90000000 . = ALIGN (0x4)
*main.o(.text .text*)
*arm_bitreversal2.o(.text .text*)
*arm_cfft_f32.o(.text .text*)
*arm_cfft_radix8_f32.o(.text .text*)
*arm_cmplx_mag_f32.o(.text .text*)
*arm_max_f32.o(.text .text*)
*main.o(.rodata .rodata*)
.rodata.str1.4
Regarding the constant data, these are placed into QSPI memory (Cube Programmer QSPI read):
Nevertheless, the .list file claims that the function code has been placed to internal flash:
0800360c <arm_cfft_f32>:
800360c: 2a01 cmp r2, #1
800360e: e92d 41f0 stmdb sp!, {r4, r5, r6, r7, r8, lr}
8003612: 4606 mov r6, r0
I guess that there is a problem with the linker script. It seems to me that only the constants are placed in QSPI, while the code is placed in internal flash.
For example, I'm a bit confused regarding these lines:
/* The program code and other data goes into FLASH */
.text :
{
. = ALIGN(4);
*(.text) /* .text sections (code) */
*(.text*) /* .text* sections (code) */
*(.glue_7) /* glue arm to thumb code */
*(.glue_7t) /* glue thumb to arm code */
*(.eh_frame)
KEEP (*(.init))
KEEP (*(.fini))
. = ALIGN(4);
_etext = .; /* define a global symbols at end of code */
} >FLASH
.QSPISection : {
. = ALIGN(4);
*main.o (.text .text*)
*arm_bitreversal2.o (.text .text*)
*arm_cfft_f32.o (.text .text*)
*arm_cfft_radix8_f32.o (.text .text*)
*arm_cmplx_mag_f32.o (.text .text*)
*arm_max_f32.o (.text .text*)
*main.o (.rodata .rodata*)
*arm_common_tables.o (.rodata .rodata*)
*arm_const_structs.o (.rodata .rodata*)
. = ALIGN(4);
} >QSPI
As I understand it, the first output section already tells the linker to put all .text input sections into internal flash, so there is no text left for the second section to place into QSPI.
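GNU ld assigns each input section to the first output section statement whose pattern matches it, so the catch-all `*(.text)` / `*(.text*)` patterns in `.text` do claim those sections before `.QSPISection` is considered. One possible way to keep the original section order is to exclude the listed objects from the catch-all pattern with `EXCLUDE_FILE`. The fragment below is only a sketch under that assumption: it is not the script shipped with the demo and has not been tested on this project.

```ld
/* Sketch only: keep the selected objects out of internal flash so that the
   later .QSPISection statement can still claim their .text sections. */
.text :
{
  . = ALIGN(4);
  /* .text* also matches plain .text, so one pattern covers both. */
  *(EXCLUDE_FILE (*main.o
                  *arm_bitreversal2.o
                  *arm_cfft_f32.o
                  *arm_cfft_radix8_f32.o
                  *arm_cmplx_mag_f32.o
                  *arm_max_f32.o) .text*)
  *(.glue_7)         /* glue arm to thumb code */
  *(.glue_7t)        /* glue thumb to arm code */
  *(.eh_frame)
  KEEP (*(.init))
  KEEP (*(.fini))
  . = ALIGN(4);
  _etext = .;        /* define a global symbol at end of code */
} >FLASH
```

With this variant the `.QSPISection` statement can stay where it is; whether this or simply moving `.QSPISection` ahead of `.text` is preferable is a matter of taste.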
I've attached some files e.g. map, list, build log, linker script, makefile etc.
Any idea?
2024-12-02 03:00 AM
Hello,
The linker files related to the QSPI projects are available under
STM32H743I_EVAL\stm32h7x3_cpu_perf\SW4STM32\8 - D1_QuadSPI_Single - D1_DTCM
and
STM32H743I_EVAL\stm32h7x3_cpu_perf\SW4STM32\9 - D1_QuadSPI_Dual - D1_DTCM
You can take inspiration from them.
2024-12-02 03:35 AM - edited 2024-12-02 03:35 AM
Hello @SofLit , hello @Tesla DeLorean
yes, these are exactly the .ld files I am talking about :)
For a first test I simply put the QSPI section before the internal FLASH section:
.QSPISection : {
. = ALIGN(4);
*main.o (.text .text*)
*arm_bitreversal2.o (.text .text*)
*arm_cfft_f32.o (.text .text*)
*arm_cfft_radix8_f32.o (.text .text*)
*arm_cmplx_mag_f32.o (.text .text*)
*arm_max_f32.o (.text .text*)
*main.o (.rodata .rodata*)
*arm_common_tables.o (.rodata .rodata*)
*arm_const_structs.o (.rodata .rodata*)
. = ALIGN(4);
} >QSPI
/* The program code and other data goes into FLASH */
.text :
{
. = ALIGN(4);
*(.text) /* .text sections (code) */
*(.text*) /* .text* sections (code) */
*(.glue_7) /* glue arm to thumb code */
*(.glue_7t) /* glue thumb to arm code */
*(.eh_frame)
KEEP (*(.init))
KEEP (*(.fini))
. = ALIGN(4);
_etext = .; /* define a global symbols at end of code */
} >FLASH
Now the main and FFT functions are placed in QSPI flash. To me this looks correct: the execution time is now 436 ms, a factor of 1.5 compared to internal flash, as mentioned in the FFT benchmark documentation. I'll check this in detail later.
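To catch this kind of silent fallback into internal flash at link time, one option is to bracket the code patterns with two symbols and let the linker assert that the range is not empty. This is a sketch only: the symbol names `_qspi_code_start` and `_qspi_code_end` are made up for illustration and are not part of the demo's linker script.

```ld
.QSPISection :
{
  . = ALIGN(4);
  _qspi_code_start = .;                  /* hypothetical symbol */
  *main.o (.text .text*)
  *arm_bitreversal2.o (.text .text*)
  *arm_cfft_f32.o (.text .text*)
  *arm_cfft_radix8_f32.o (.text .text*)
  *arm_cmplx_mag_f32.o (.text .text*)
  *arm_max_f32.o (.text .text*)
  _qspi_code_end = .;                    /* hypothetical symbol */
  *main.o (.rodata .rodata*)
  *arm_common_tables.o (.rodata .rodata*)
  *arm_const_structs.o (.rodata .rodata*)
  . = ALIGN(4);
} >QSPI

/* Fails the link if none of the .text patterns above matched anything,
   e.g. because an earlier catch-all *(.text*) already claimed the code. */
ASSERT(_qspi_code_end > _qspi_code_start, "no code was placed in .QSPISection")
```

Checking the .map or .list file for the address of `arm_cfft_f32`, as done earlier in this thread, gives the same information after the fact; the assertion merely turns it into a build error.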
So do you think this is a bug in the linker script?