
H7 QSPI XIP single mode demo fails

regjoe
Associate III

Hello,

I have successfully run all demo programs of the H7 performance suite (STM32CubeExpansion_Performance_H7_V1.0.0) on the STM32H753I_EVAL2, except the QSPI single mode demo (8 - D1_QuadSPI_Single - D1_DTCM). The dual mode demo (9 - D1_QuadSPI_Dual - D1_DTCM) runs fine.

As far as I can see, both demos use the same source. The only difference is in the preprocessor settings: the symbol "QSPI_DUAL_FLASH" is defined in the toolset settings of the dual demo. If this symbol is removed, the dual demo fails with the same error shown below.

It seems that the call to the FFT function crashes when run in single QSPI mode. I can set a breakpoint on the function, but as soon as I try to step over it

function_call.png

the debugger jumps to this line here in HAL_RCC_MCOConfig() function

wrong_code.png

which doesn't make much sense to me.

The debug window shows 

signal_handler.png

According to the linker script, the FFT functions are located in QSPI flash. The same code runs in dual mode but fails in single mode. This leads me to assume that something is wrong in the QSPI interface configuration for single mode.

BTW: I can successfully run the memory mapped demo (as supplied by CubeMx for H743 evaluation board) in single and dual mode, also the XIP demo, which is also in single mode.

Any idea why the program above has stopped here? What can I do to gather more information?

Thanks

SofLit
ST Employee

Hello,

I see you are using CubeIDE to build the examples.

Originally, the projects were provided for EWARM, MDK-ARM and System Workbench (AC6).

So I suggest you try the deprecated tool System Workbench, downloadable from here.

It could be that something was missed when you imported the project from System Workbench to CubeIDE.

To give better visibility on the answered topics, please click on "Accept as Solution" on the reply which solved your issue or answered your question.

Hello @SofLit 

I guess that the demos will work using one of those IDEs.

But on the other hand, it would be helpful for the community to get the stuff running in the latest officially supported tools promoted by STM32. Correct me if I am wrong. 

I'd like to learn all about the GNU linker stuff, how to place code/data to memory locations and this is a good chance to do so.

 

What I have found so far, with little time to look into it, is that according to the demo readme the code is intended to run from QSPI flash, but it is evidently running from internal flash.

According to the .map file, the code and FFT constant data are generated for QSPI.

.QSPISection    0x90000000     0x2ea4
                0x90000000                        . = ALIGN (0x4)
 *main.o(.text .text*)
 *arm_bitreversal2.o(.text .text*)
 *arm_cfft_f32.o(.text .text*)

After running the demo in CubeIDE, I can see in CubeProgrammer that code and data have been programmed into QSPI flash (I erased the entire flash beforehand):

QSPI_flash_content.png

But according to the IDE debugger, the program code is executed from the internal flash:

QSPI_FFT_code.png

I guess that the linker script here is not suitable to locate the FFT function code into the QSPI flash:

 .QSPISection : {
 
  . = ALIGN(4);
   *main.o (.text .text*)
   *arm_bitreversal2.o (.text .text*)
   *arm_cfft_f32.o (.text .text*)
   *arm_cfft_radix8_f32.o (.text .text*) 
   *arm_cmplx_mag_f32.o (.text .text*) 
   *arm_max_f32.o (.text .text*) 
   *main.o (.rodata .rodata*) 
   *arm_common_tables.o (.rodata .rodata*) 
   *arm_const_structs.o (.rodata .rodata*) 
  . = ALIGN(4);
  
 } >QSPI 

 

I checked the linker script of the QSPI_ExecuteInPlace demo as provided by CubeMx:

   .qspi :
  {
    . = ALIGN(4);
    _qspi_start = .;        /* create a global symbol at qspi start */
    *(.qspi)         		/* .qspi sections */
    *(.qspi*)        		/* .qspi* sections */
    . = ALIGN(4);
    _qspi_end = .;         /* define a global symbols at end of qspi */
    
  } >QSPI AT> FLASH

Here the function code of GpioToggle() is placed and run from QSPI flash.

In the source code the function is placed to QSPI section like this:

#if defined(__CC_ARM)
#pragma arm section code = ".qspi"
#pragma no_inline
static void GpioToggle(void)
#elif defined(__ICCARM__)
static void GpioToggle(void) @ ".qspi"
#elif defined(__GNUC__)
static void __attribute__((section(".qspi"), noinline)) GpioToggle(void)
#endif
{

I guess that this mechanism is missing in the FFT demo if using the ST Cube tools.

Unfortunately, it seems that the FFT demo does not provide the source code and so I don't know how to implement this mechanism.

Any suggestions?

 


@regjoe wrote:

I guess that this mechanism is missing in the FFT demo if using the ST Cube tools.


As I said previously, this application note and its software were released before CubeIDE was available. Implementing this mechanism was neither the intention nor the focus of the AN; the memory regions were mainly managed in the linker files.

Meanwhile, for QSPI, you can take inspiration from this example in the STM32H7 Cube package, which executes from QSPI:

https://github.com/STMicroelectronics/STM32CubeH7/tree/master/Projects/STM32H743I-EVAL/Examples/QSPI/QSPI_ExecuteInPlace

regjoe
Associate III

Hello @SofLit 

This is the example I mentioned in my previous post. Its linker mechanism differs from that of the FFT demo.

In the FFT demo, relocatable object files contain the function code to be placed into QSPI memory. This differs from the QSPI XIP demo (see my previous post). I guess all I need to know is how to configure the linker script to place relocatable object code into QSPI memory.
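For reference, GNU ld input-section patterns can select sections by object file name, and can name a member of a static library with the `archive:member` form. The fragment below is a minimal sketch, not the demo's actual script: the region name `QSPI` and the archive name are taken from this thread, and the selection shown is only illustrative and untested. A pattern like this only takes effect if no earlier output section has already claimed the same input sections.

```ld
/* Sketch only: names come from this thread, the selection is illustrative. */
.QSPISection :
{
  . = ALIGN(4);
  /* Object file given directly on the link line, matched by file name */
  *main.o (.text .text*)
  /* Member of a static library, using the archive:member form */
  *libarm_cortexM7lfdp_math.a:arm_cfft_f32.o (.text .text*)
  . = ALIGN(4);
} >QSPI
```

Checking the resulting addresses in the .map or .list file shows whether the pattern actually matched.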

 

 

Perhaps you can use objdump/objcopy to inspect the content of the object and/or library files, to see the section and symbol naming and thus the filter criteria needed in the linker script.

Hello @Tesla DeLorean , hello @SofLit 

The objdump output lists the FFT object files in question (e.g. arm_cfft_f32.o):

 

C:\Users\StJo\Documents\Data\Projekte\stm32h7\H7_Performance\Drivers\CMSIS\Lib\GCC>arm-none-eabi-objdump.exe -a libarm_cortexM7lfdp_math.a | grep -i arm_cfft_f32.o
arm_cfft_f32.o:     file format elf32-littlearm
rw-rw-rw- 0/0  20704 Oct 20 10:23 2015 arm_cfft_f32.o

 

The linker also confirms that the function is found:

 

attempt to open ../../../Drivers/CMSIS/Lib/GCC\libarm_cortexM7lfdp_math.a succeeded
../../../Drivers/CMSIS/Lib/GCC\libarm_cortexM7lfdp_math.a
(../../../Drivers/CMSIS/Lib/GCC\libarm_cortexM7lfdp_math.a)arm_const_structs.o
(../../../Drivers/CMSIS/Lib/GCC\libarm_cortexM7lfdp_math.a)arm_common_tables.o
(../../../Drivers/CMSIS/Lib/GCC\libarm_cortexM7lfdp_math.a)arm_max_f32.o
(../../../Drivers/CMSIS/Lib/GCC\libarm_cortexM7lfdp_math.a)arm_cfft_f32.o
(../../../Drivers/CMSIS/Lib/GCC\libarm_cortexM7lfdp_math.a)arm_cmplx_mag_f32.o
(../../../Drivers/CMSIS/Lib/GCC\libarm_cortexM7lfdp_math.a)arm_bitreversal2.o
(../../../Drivers/CMSIS/Lib/GCC\libarm_cortexM7lfdp_math.a)arm_cfft_radix8_f32.o

 

The .map file also suggests that the function has been placed in QSPI:

 

.QSPISection    0x90000000     0x2ea4
                0x90000000                        . = ALIGN (0x4)
 *main.o(.text .text*)
 *arm_bitreversal2.o(.text .text*)
 *arm_cfft_f32.o(.text .text*)
 *arm_cfft_radix8_f32.o(.text .text*)
 *arm_cmplx_mag_f32.o(.text .text*)
 *arm_max_f32.o(.text .text*)
 *main.o(.rodata .rodata*)
 .rodata.str1.4

 

The constant data is indeed placed in QSPI memory (CubeProgrammer QSPI read):

QSPI_flash_content.png

Nevertheless, the .list file shows that the function code has been placed in internal flash:

 

0800360c <arm_cfft_f32>:
 800360c:	2a01      	cmp	r2, #1
 800360e:	e92d 41f0 	stmdb	sp!, {r4, r5, r6, r7, r8, lr}
 8003612:	4606      	mov	r6, r0

 

I guess that there is a problem with the linker script. It seems that only constants are placed in QSPI, while the code is placed in internal flash.

For example, I'm a bit confused regarding these lines:

  /* The program code and other data goes into FLASH */
  .text :
  {
    . = ALIGN(4);
    *(.text)           /* .text sections (code) */
    *(.text*)          /* .text* sections (code) */
    *(.glue_7)         /* glue arm to thumb code */
    *(.glue_7t)        /* glue thumb to arm code */
    *(.eh_frame)

    KEEP (*(.init))
    KEEP (*(.fini))

    . = ALIGN(4);
    _etext = .;        /* define a global symbols at end of code */
  } >FLASH

 .QSPISection : {
 
  . = ALIGN(4);
   *main.o (.text .text*)
   *arm_bitreversal2.o (.text .text*)
   *arm_cfft_f32.o (.text .text*)
   *arm_cfft_radix8_f32.o (.text .text*) 
   *arm_cmplx_mag_f32.o (.text .text*) 
   *arm_max_f32.o (.text .text*) 
   *main.o (.rodata .rodata*) 
   *arm_common_tables.o (.rodata .rodata*) 
   *arm_const_structs.o (.rodata .rodata*) 
  . = ALIGN(4);
  
 } >QSPI 

As I read it, the first section already tells the linker to put all .text into internal flash, so no .text is left for the second section to place into QSPI.
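One way to keep the catch-all rule from claiming those objects is GNU ld's `EXCLUDE_FILE`, which removes named files from an input-section pattern. The fragment below is a minimal sketch under that assumption: it is untested against this project, and the file list is simply copied from the .QSPISection rule above.

```ld
/* Sketch only: not verified against the demo's full linker script. */
.text :
{
  . = ALIGN(4);
  /* All code except the objects intended for QSPI.
     The single pattern .text* also matches plain .text. */
  *(EXCLUDE_FILE (*main.o
                  *arm_bitreversal2.o
                  *arm_cfft_f32.o
                  *arm_cfft_radix8_f32.o
                  *arm_cmplx_mag_f32.o
                  *arm_max_f32.o) .text*)
  *(.glue_7)         /* glue arm to thumb code */
  *(.glue_7t)        /* glue thumb to arm code */
  *(.eh_frame)

  KEEP (*(.init))
  KEEP (*(.fini))

  . = ALIGN(4);
  _etext = .;        /* end of code in internal flash */
} >FLASH
```

With the excluded objects left unclaimed here, the later .QSPISection patterns would be the first to match them. If a .rodata rule for FLASH precedes .QSPISection, it would need the same exclusion for the constant tables.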

I've attached some files e.g. map, list, build log, linker script, makefile etc.

Any idea?

Hello,

The linker files related to the QSPI projects are available under

STM32H743I_EVAL\stm32h7x3_cpu_perf\SW4STM32\8 - D1_QuadSPI_Single - D1_DTCM

and 

STM32H743I_EVAL\stm32h7x3_cpu_perf\SW4STM32\9 - D1_QuadSPI_Dual - D1_DTCM

You can take inspiration from them.

Hello @SofLit , hello @Tesla DeLorean 

Yes, these are exactly the .ld files I am talking about :)

As a first test, I simply moved the QSPI section before the internal FLASH section:

 

 

.QSPISection : {
 
  . = ALIGN(4);
   *main.o (.text .text*)
   *arm_bitreversal2.o (.text .text*)
   *arm_cfft_f32.o (.text .text*)
   *arm_cfft_radix8_f32.o (.text .text*) 
   *arm_cmplx_mag_f32.o (.text .text*) 
   *arm_max_f32.o (.text .text*) 
   *main.o (.rodata .rodata*) 
   *arm_common_tables.o (.rodata .rodata*) 
   *arm_const_structs.o (.rodata .rodata*) 
  . = ALIGN(4);
  
 } >QSPI 

 
  /* The program code and other data goes into FLASH */
  .text :
  {
    . = ALIGN(4);
    *(.text)           /* .text sections (code) */
    *(.text*)          /* .text* sections (code) */
    *(.glue_7)         /* glue arm to thumb code */
    *(.glue_7t)        /* glue thumb to arm code */
    *(.eh_frame)

    KEEP (*(.init))
    KEEP (*(.fini))

    . = ALIGN(4);
    _etext = .;        /* define a global symbols at end of code */
  } >FLASH

 

 

Now main and the FFT functions are placed in QSPI flash. This looks correct to me: the execution time is now 436 µs, a factor of 1.5 compared to internal flash, as mentioned in the FFT benchmark documentation. I'll check this in detail later.

So do you think this is a bug in the linker script?