2024-11-07 10:47 AM
Hello,
I want to execute code located in external 16-bit NOR flash, e.g. on an STM32H7x3 evaluation board.
I can run the NOR flash demo (erase, write, read flash) successfully, but I have no clue how to build code for it and flash it.
After two days of searching I found plenty of examples for xSPI, but none seems to suit this type of flash.
Can somebody point me in the right direction, e.g. how to configure the linker script and flash the code using the debugger or Cube Programmer? Any example code or application notes?
Thank you
2024-11-07 11:26 AM - edited 2024-11-07 11:28 AM
Hello,
There is no direct example that demonstrates execution from external parallel NOR flash.
As said in the previous thread, AN4891 "STM32H72x, STM32H73x, and single-core STM32H74x/75x system architecture and performance" with the X-CUBE-PERF-H7 package could help you do what you are looking for; you can take its execution-from-QSPI-flash configuration as a starting point.
What is important is to configure the NOR flash in system_stm32h7xx.c, as is done for the QSPI memory in X-CUBE-PERF-H7. You need to find the flash loader for that flash memory and add it to the IDE. Otherwise, you can use STM32CubeProgrammer: select that flash under the external flash loaders and upload the .hex file generated by the IDE.
For the linker file, refer to the QSPI example scatter file for KEIL: 8-M7-QSPI-Single_rwDTCM.sct under Projects\STM32H743I_EVAL\stm32h7x3_cpu_perf\MDK-ARM\scatter_files
For IAR: Projects\STM32H743I_EVAL\stm32h7x3_cpu_perf\EWARM\icf_files (inside the STM32CubeExpansion_Performance_H7_V1.0.0 package)
And for System Workbench: Projects\STM32H743I_EVAL\stm32h7x3_cpu_perf\SW4STM32\, where the .ld files are located in the individual project folders.
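On the C side, the linker or scatter file only affects code that ends up in the matching section. A minimal sketch, assuming a GCC-style toolchain and a hypothetical section name `.extnor_text` that the .ld file would map to the external NOR address range (the section name, the `IN_EXT_NOR` macro and the `checksum32` routine are illustrative and not taken from X-CUBE-PERF-H7):

```c
#include <stdint.h>

/* Hypothetical section name: it must match an output section that the
 * linker script places in the external NOR address range. */
#if defined(__arm__) && defined(__GNUC__)
#define IN_EXT_NOR __attribute__((section(".extnor_text"), noinline))
#else
#define IN_EXT_NOR /* host build: default placement, so the sketch can be tested */
#endif

/* Example routine intended to execute from external NOR. */
IN_EXT_NOR uint32_t checksum32(const uint8_t *data, uint32_t len)
{
    uint32_t sum = 0u;
    for (uint32_t i = 0u; i < len; ++i) {
        sum += data[i];
    }
    return sum;
}
```

If the whole application is linked to the external region, as the QSPI scatter files in the package appear to do, no per-function attribute is needed; the attribute only matters when code is split between internal and external flash.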
Hope it helps.
2024-11-07 11:55 PM
Hello SofLit,
The schematics of the STM32H7x3 eval board, rev B and rev E, show PC28F123M29EWLA and MT28EW128ABA1LPC-0SIT NOR flash, but Cube Programmer seems to support the M29W128GL only.
The M29W128GL seems to be out of production. I found an AN on writing external flash loaders, and sources for the M29W128GL loader can be found on GitHub, so it should be possible to use recent chips. I hope this task is not too complicated and time-consuming. I will run into the same problem for QSPI flash later.
BTW: Besides ST, there is another interesting project on GitHub with lots of ports to the latest Q/OSPI flashes, but most of them seem to be in pending status.
I'll dive into the code you mentioned. I will probably need additional information on how to flash external memory using Cube IDE and where to find suitable flash loaders.
Best regards
2024-11-08 12:12 AM
Hello,
If possible, give it a try by uploading the binary demo for that board using CubeProgrammer with the M29Wxxx flash loader: https://www.st.com/resource/en/compiled_demos/stm32h743i-eval_demo.zip
If it doesn't work, I will ask internally if there is a Flash loader for MT28EW128 available.
2024-11-08 12:30 AM - edited 2024-11-08 12:31 AM
Sorry, it seems that the demo doesn't use the external NOR flash on FMC. It only uses the QSPI NOR flash.
So I will ask internally whether an MT28EW128 flash loader is available.
2024-11-08 12:46 AM
I think the flash on my STM32H753_EVAL2 board rev E has a MICRON logo printed on top.
The Cube Programmer contains an external loader for the M29W128GL for the STM32H743 evaluation board.
So hopefully this might work. I'll check that as soon as I get the demo compiled.
Any additional information from your team is welcome.
2024-12-02 10:01 AM
Hello @SofLit
I modified the FFT benchmark in X-CUBE-PERF-H7 in order to execute code out of the parallel flash.
Now that the Dual QSPI is working (see https://community.st.com/t5/stm32cubeide-mcus/h7-qspi-xip-single-mode-demo-fails/td-p/745329 ), I copied this project and added the FMC NOR controller, NOR flash and MPU initialization taken from the NOR flash example (FMC_NOR CubeMX demo for H743).
What puzzles me now is that the FFT demo runs ca. 2x slower from the parallel flash than from the Dual QSPI flash. I expected at least the same performance from the parallel flash, because the QSPI fetches code 8 bits wide at 50 MHz DTR (~100 Mbyte/s) while the parallel flash is 16 bits wide at 70 ns (~133 Mbyte/s).
Here are the test results of the benchmark suite running on H753 at 400MHz CPU and 200MHz bus clock.
Test | DCache | ICache | Data | Code | Const | Time [us] |
7 | ON | ON | DTCM | IFLASH | IFLASH | 285 |
9 | ON | ON | DTCM | DQSPI | DQSPI | 445 |
9 | ON | OFF | DTCM | DQSPI | DQSPI | 5137 |
9 | OFF | ON | DTCM | DQSPI | DQSPI | 1809 |
9 | OFF | OFF | DTCM | DQSPI | DQSPI | 7408 |
11 | ON | ON | DTCM | PNOR | PNOR | 1036 |
11 | ON | OFF | DTCM | PNOR | PNOR | 16408 |
11 | OFF | ON | DTCM | PNOR | PNOR | 1029 |
11 | OFF | OFF | DTCM | PNOR | PNOR | 16408 |
Any idea why code execution from the parallel NOR flash is slower than from Dual QSPI?
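As a rough cross-check of the two bandwidth estimates above, here is a small sketch of the arithmetic. It assumes that every 16-bit read costs the full 70 ns access time (no page-mode or burst reads, no extra FMC setup or hold cycles) and that the Dual QSPI streams sequentially with no command or address overhead; the function names are made up for this example.

```c
#include <stdint.h>

/* Peak throughput in Mbyte/s for a bus that moves bus_bits every period_ns. */
static double peak_mbyte_per_s(uint32_t bus_bits, double period_ns)
{
    double bytes_per_transfer = (double)bus_bits / 8.0;
    return bytes_per_transfer * 1000.0 / period_ns; /* bytes/ns * 1000 = Mbyte/s */
}

/* Dual QSPI in DTR mode: 8 data bits on each clock edge, two edges per clock. */
static double dual_qspi_dtr_mbyte_per_s(double clock_mhz)
{
    double edge_period_ns = 1000.0 / (2.0 * clock_mhz);
    return peak_mbyte_per_s(8u, edge_period_ns);
}

/* 16-bit parallel NOR: one 16-bit word per access time. */
static double pnor16_mbyte_per_s(double access_ns)
{
    return peak_mbyte_per_s(16u, access_ns);
}
```

Under these assumptions `dual_qspi_dtr_mbyte_per_s(50.0)` gives 100 Mbyte/s, while `pnor16_mbyte_per_s(70.0)` gives roughly 28.6 Mbyte/s; reaching ~133 Mbyte/s would need about 15 ns per 16-bit access. Whether the device and the FMC timings allow faster page-mode reads is something to check in the flash datasheet.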
2024-12-02 10:18 AM - edited 2024-12-02 11:39 AM
>> but most of them seem to be in pending status.
Right, because I don't have an infinite amount of time or hardware to throw at the problem. I tend to build to specific hardware combinations that people actually have, and are willing to support work on.
Most of my stuff is using QSPI/OSPI hardware, because that's what everyone moved to, and away from high pin count, capacity limited Parallel NOR Flash.
There's flash loader source for KEIL in some of ST's most recent PACKs.
If the memory is mapped and readable, you should be able to execute code from it, and send the magic sequences to Erase / Write, from Application Space, if you're not rated to code External Loaders.
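For reference, a sketch of what those "magic sequences" typically look like for parts using the standard AMD/JEDEC command set in x16 mode. It assumes the device is memory-mapped and that `base` points at its first 16-bit word; the function names and the simplified read-back polling are assumptions for illustration, so check the exact sequences and status polling against the flash datasheet.

```c
#include <stdint.h>

/* AMD/JEDEC command set, x16 mode: unlock addresses are word offsets. */
#define NOR_UNLOCK_ADDR1     0x555u
#define NOR_UNLOCK_ADDR2     0x2AAu
#define NOR_CMD_UNLOCK1      0x00AAu
#define NOR_CMD_UNLOCK2      0x0055u
#define NOR_CMD_PROGRAM      0x00A0u
#define NOR_CMD_ERASE_SETUP  0x0080u
#define NOR_CMD_SECTOR_ERASE 0x0030u

/* Two unlock cycles that precede every program or erase command. */
static void nor_unlock(volatile uint16_t *base)
{
    base[NOR_UNLOCK_ADDR1] = NOR_CMD_UNLOCK1;
    base[NOR_UNLOCK_ADDR2] = NOR_CMD_UNLOCK2;
}

/* Program one 16-bit word. Returns 0 on success, -1 if the word did not
 * read back as expected within max_polls reads (simplified polling). */
static int nor_program_word(volatile uint16_t *base, uint32_t word_offset,
                            uint16_t data, uint32_t max_polls)
{
    nor_unlock(base);
    base[NOR_UNLOCK_ADDR1] = NOR_CMD_PROGRAM;
    base[word_offset] = data;
    while (max_polls-- > 0u) {
        if (base[word_offset] == data) {
            return 0;
        }
    }
    return -1;
}

/* Erase the sector containing sector_word_offset. Returns 0 once the first
 * word reads back as 0xFFFF, -1 on timeout (simplified polling). */
static int nor_erase_sector(volatile uint16_t *base,
                            uint32_t sector_word_offset, uint32_t max_polls)
{
    nor_unlock(base);
    base[NOR_UNLOCK_ADDR1] = NOR_CMD_ERASE_SETUP;
    nor_unlock(base);
    base[sector_word_offset] = NOR_CMD_SECTOR_ERASE;
    while (max_polls-- > 0u) {
        if (base[sector_word_offset] == 0xFFFFu) {
            return 0;
        }
    }
    return -1;
}
```

These routines should run from internal flash or RAM, not from the device being programmed, because the NOR does not return normal read data while a program or erase operation is in progress.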
2024-12-02 10:33 AM
Hello @Tesla DeLorean
I'd be glad if I could use (D)QSPI for code storage, but unfortunately this peripheral is used for data logging in our application. Therefore code/const must reside in internal flash and parallel NOR flash. I don't need high capacity in PNOR (~4 MB), but I do need ~2 Gbit of QSPI flash.
I've already solved the loader problem mentioned before and I can erase/write/read the parallel flash.
I just wonder whether the NOR flash timings in the NOR flash demo are suboptimal. I'll check that.
Thank you for your time.
2024-12-02 11:54 AM
It should also have one or two 16-SOICW parts for a total of 1Gbit (128MB) of Serial NOR Flash, that can map code at 0x90000000 in the address space.
You'd need to build code targeting that space, with code in Internal Flash to bring up the memory interface and pins, and transfer control of the running MCU to the code you've parked up there.
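The handover step can be sketched roughly as follows. This is a minimal sketch, assuming the external memory is already memory-mapped and the image starts with a standard Cortex-M vector table; `vector_head_t`, `image_looks_valid` and `jump_to_image` are made-up names, and the plausibility check is only a heuristic. The jump itself is compiled only for Arm targets.

```c
#include <stdbool.h>
#include <stdint.h>

/* First two words of a Cortex-M vector table. */
typedef struct {
    uint32_t initial_sp;    /* word 0: initial main stack pointer */
    uint32_t reset_handler; /* word 1: entry point, bit 0 set for Thumb */
} vector_head_t;

/* Heuristic check before jumping: the entry must be a Thumb address inside
 * the image, and the stack pointer must be 8-byte aligned and inside RAM. */
static bool image_looks_valid(const vector_head_t *v,
                              uint32_t image_base, uint32_t image_size,
                              uint32_t ram_base, uint32_t ram_size)
{
    uint32_t entry = v->reset_handler & ~1u;
    bool thumb     = (v->reset_handler & 1u) != 0u;
    bool in_image  = entry >= image_base && (entry - image_base) < image_size;
    bool sp_ok     = (v->initial_sp & 7u) == 0u &&
                     v->initial_sp > ram_base &&
                     (v->initial_sp - ram_base) <= ram_size;
    return thumb && in_image && sp_ok;
}

#if defined(__arm__)
/* Target only: hand control to the image. Assumes interrupts are disabled
 * and caches/MPU are already in the state the image expects. */
static void jump_to_image(uint32_t image_base)
{
    const vector_head_t *v = (const vector_head_t *)image_base;
    void (*entry)(void) = (void (*)(void))v->reset_handler;

    *(volatile uint32_t *)0xE000ED08u = image_base; /* SCB->VTOR */
    __asm volatile ("msr msp, %0" : : "r"(v->initial_sp) : "memory");
    entry();
}
#endif
```

For the serial NOR the image base would be 0x90000000 as mentioned above; for parallel NOR on FMC the base depends on the bank the device is wired to.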
The original H743/H753 EVAL boards were materially the same, the EVAL2 might have a handful of pins reassigned. Check the demo README files.
These should be the immediate proxies:
STM32CubeProgrammer_v2.15.0\bin\ExternalLoader\M29W128GL_STM32H743I-EVAL.stldr
STM32CubeProgrammer_v2.15.0\bin\ExternalLoader\MT25TL01G_STM32H743I-EVAL.stldr
Example for Parallel NOR on another platform
STM32CubeProgrammer_v2.15.0\bin\ExternalLoader\M29W128GL_STM3210E-EVAL