
Custom Loaders for the STM32L496 with MT25QL128 QuadSPI External Flash (ultimately for touchGFX)

BWKidd
Associate III

Hi All,

I've been trying to bring up a custom external loader for well over a week and I'm just not getting anywhere.

My processor is an STM32L496VGT6 and it's connected via quadSPI to an MT25QL128A 16MB external flash memory. I'm running the clock at 80MHz, and my QUADSPI is set up like this (both in my custom loader, and in the application code I'm developing):

BWKidd_3-1725980804997.png

BWKidd_4-1725980856276.png

Following mostly along with the "External QSPI loader how to" (from here External QSPI loader how to - 05 – QSPI driver coding - YouTube), and utilizing the files in the contrib branch of the stm external loader github repository (GitHub - STMicroelectronics/stm32-external-loader at contrib), I've put together what seems like a working driver for the MT25QL128. I got there by starting with the quadspi.h/c files in the stm32-custom-loader/QSPI_Drivers/MT25QL512/ folder, and then modifying them using the Micron datasheet for the part to swap in different commands where necessary. I pulled in test code from the main.c file in stm32-external-loader-contrib\Loader_Files\other devices\, and also used the Loader_Src.c, Dev_Inf.h/c, and linker.ld files from the same folder. The following test code runs with no issues when debugging:

 

/* USER CODE BEGIN 2 */
uint8_t buffer_test[MEMORY_SECTOR_SIZE];
uint32_t var = 0;

CSP_QUADSPI_Init();

/* Fill the test buffer with a repeating 0x00..0xFF pattern */
for (var = 0; var < MEMORY_SECTOR_SIZE; var++) {
    buffer_test[var] = (var & 0xff);
}

/* Erase and program every sector with the test pattern */
for (var = 0; var < SECTORS_COUNT; var++) {

    if (CSP_QSPI_EraseSector(var * MEMORY_SECTOR_SIZE, (var + 1) * MEMORY_SECTOR_SIZE - 1) != HAL_OK) {
        while (1)
            ; //breakpoint - error detected
    }

    if (CSP_QSPI_WriteMemory(buffer_test, var * MEMORY_SECTOR_SIZE, sizeof(buffer_test)) != HAL_OK) {
        while (1)
            ; //breakpoint - error detected
    }
}

if (CSP_QSPI_EnableMemoryMappedMode() != HAL_OK) {
    while (1)
        ; //breakpoint - error detected
}

/* Read every sector back through the memory-mapped window and compare */
for (var = 0; var < SECTORS_COUNT; var++) {
    /* memcmp returns an int (0 on match), not a HAL status */
    if (memcmp(buffer_test, (uint8_t*) (QSPI_BASE + var * MEMORY_SECTOR_SIZE), MEMORY_SECTOR_SIZE) != 0) {
        while (1)
            ; //breakpoint - error detected - otherwise QSPI works properly
    }
}
/* USER CODE END 2 */
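One thing about the read-back check above: memcmp only says that something differs, not where. When stepping with breakpoints it can help to know the first bad offset. Here is a minimal sketch of such a helper. The name `first_mismatch` is my own invention for illustration, not something from the loader template or the HAL.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the loader template or the HAL).
 * Returns the index of the first byte that differs between the two
 * buffers, or len if the buffers are identical over len bytes. */
static inline size_t first_mismatch(const uint8_t *expected,
                                    const uint8_t *actual,
                                    size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (expected[i] != actual[i]) {
            return i;   /* first differing offset */
        }
    }
    return len;         /* no difference found */
}
```

In the loop above, `actual` would be the memory-mapped address of the sector, and a return value smaller than MEMORY_SECTOR_SIZE would point at the first byte that did not program correctly.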

 

 

I then modified my project options to generate the stldr file per the youtube video, changed to the "linker.ld" file, and copied the stldr file into my STM32CubeProgrammer ExternalLoader folder. I used the STM32CubeProgrammer to read memory (starting at 0x90000000), and write to the same memory area using the "testbinary1M.bin" file from \stm32-external-loader-contrib\QSPI_testing\.

Everything up to this point seems to work without problems so far.

Where my problem begins is when I attempt to use the new external loader I generated in my debug sessions while trying to develop my main application. Here I've gone into the debug sessions, pointed to my generated loader file, and started to debug.

BWKidd_2-1725978700482.png

BWKidd_0-1725978665088.png

With the external loader enabled as above, the code stops on the first use of:

HAL_Delay(1);

If I drill down a little with the debugger, the code is stuck within HAL_Delay() in the while loop:

 

 

__weak void HAL_Delay(uint32_t Delay)
{
uint32_t tickstart = HAL_GetTick();

uint32_t wait = Delay;

/* Add a period to guaranty minimum wait */
if (wait < HAL_MAX_DELAY)
 {
 wait += (uint32_t)uwTickFreq;
 }

while ((HAL_GetTick() - tickstart) < wait) // <===== STUCK HERE!
 {
 }
}

Here HAL_GetTick() constantly returns 0, so the tick never advances and the loop never exits. That doesn't seem right.

 

 

When I disable the external loader, everything goes back to working fine. What do I check next? Is my external loader really just not working in some untested way? Or is there some initialization I need to do in my main() initialization when using an external loader? Or maybe some deinitialization I didn't do in my loader? 

17 REPLIES
BWKidd
Associate III

Are there any external loader examples that you can point me towards that shows how to do this?

And for anyone out there at ST, any chance the tutorials, examples, and such could be updated? There is a lot of missing information on how to construct these external loaders. 

BWKidd
Associate III

In my flailing about over the past couple of hours, I decided to try two things.

1. I replaced my Loader_Src.c file with the one from the Demo_Project in the contrib branch of the stm-external-loader repository (stm32-external-loader/Demo_Project at contrib · STMicroelectronics/stm32-external-loader · GitHub). The main difference is that this uses HAL_SuspendTick() instead of __set_PRIMASK(). What is the difference? I have no idea, but it seemed like a good idea at the time.

2. I decided to reset SCB->VTOR back to its previous value from the beginning of the external loader Init() function, like this:

 

int Init(void) {
// Code... 
    uint32_t Prev_VTOR_Val = SCB->VTOR;
    SCB->VTOR = 0x20000000 | 0x200;		// Change Vector Table Offset Register to point to the start of SRAM1

    HAL_Init();
// Code...
    SCB->VTOR = Prev_VTOR_Val;
    return LOADER_OK;
}

 

This might be somewhat what @Tesla DeLorean was talking about...maybe? Is this the right way to do it? Almost certainly not. Does it work? Well my application code now runs after the external loader...so...maybe?

My full Loader_Src.c file is posted below.

Update 9/12/2024: Never mind, it ran, but only without making the necessary updates to the linker file to place the touchgfx assets into the external memory. Once I put those changes in place and it actually tried to erase and program the QSPI flash, the STM32CubeIDE debugger started throwing error messages and I wasn't able to program anything. Back to the drawing board.

The hardware will remain in the state you leave it.

If you bring up a QSPI and change its modes, it will stay in the new mode. There's no async reset pin, just power cycling.

Like I said, you don't need to use interrupts in these things at all, the mechanics were built on not needing them, the SPL didn't need them.

The HAL only needs some method for ticks to advance so the delays and timeouts function in a semi desirable fashion.

HAL_GetTick() can read a HW counter, and a TIM doesn't need to interrupt in free-run / maximal mode, and neither does DWT->CYCCNT
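
To make the polled-counter idea concrete, here is a minimal sketch of the arithmetic involved. It is plain C that runs on a PC and only models the math. The helper names (`ms_to_cycles`, `cycles_elapsed`, `timeout_expired`) are hypothetical, and on the target the `start` and `now` values would be readings of a free-running counter such as DWT->CYCCNT (once the cycle counter has been enabled) rather than function arguments.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers, not part of the HAL or the loader template. */

/* Convert milliseconds to core clock cycles.
 * core_hz is an assumption supplied by the caller (80 MHz in this thread). */
static inline uint32_t ms_to_cycles(uint32_t ms, uint32_t core_hz)
{
    return (uint32_t)(((uint64_t)ms * core_hz) / 1000u);
}

/* Cycles between two readings of a free-running 32-bit counter.
 * Unsigned subtraction stays correct across a single counter wrap. */
static inline uint32_t cycles_elapsed(uint32_t start, uint32_t now)
{
    return now - start;
}

/* True once at least timeout_cycles have passed since start. */
static inline bool timeout_expired(uint32_t start, uint32_t now,
                                   uint32_t timeout_cycles)
{
    return cycles_elapsed(start, now) >= timeout_cycles;
}
```

No interrupt is involved anywhere: the caller just keeps re-reading the counter. At 80 MHz a 32-bit cycle counter wraps roughly every 53.7 seconds, so any single timeout measured this way has to be shorter than that.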

BWKidd
Associate III

Holy Carp! 

BWKidd_1-1726261865191.png

My external loader was more or less working all along. I just needed to not have "initialize" checked in the external loader debug settings.

An ST field applications engineer took a look at it and that was all it was. I'll try to post a follow up with a few more details on Monday, but that was pretty much it. Many thanks to those of you who offered advice, and a special thanks to @Tesla DeLorean for trying to help me work through it.

BWKidd_0-1726261321683.png

 

BWKidd
Associate III

As a follow up, I want to list in detail how I created a custom external loader using STM32CubeIDE 1.16.0. All of this information is out there in bits and pieces, but one of the really difficult things for me was pulling it all together. In particular, most of the external loader examples out there seem to be set up to be built in IAR, which wasn't helpful for me. Full disclaimer, I'm not putting this forward as the right way to do it, it's just the way I did it, and at the moment it seems to be working. I want to post this because it might save someone else some of the heartache I just went through...and there's a good chance I'll need it myself in about 6 months.

My Hardware Details

MCU: STM32L496VGT6

External Flash: MT25QL128A (16 MB Quad-SPI Flash Memory)

Connections: 

BWKidd_0-1726498532058.png

This is of course a custom board I've designed, and there are many more connections, but none of those should be necessary to consider for the purposes of constructing the external loader.

To construct the loader, I more or less followed the instructions in the "External QSPI loader how to" here: External QSPI loader how to - STMicroelectronics. This tutorial in turn references files in the "stm32-external-loader" github repository on the contrib branch (all the ones in the main branch seem to be set up to be built in IAR): GitHub - STMicroelectronics/stm32-external-loader at contrib

Setting up the External Loader Project in STM32CubeIDE

In STM32CubeIDE, I created a new "STM32 project" for the STM32L496VGT6 part, and then started editing the pinout and configuration using MX within STM32CubeIDE. Starting with the clock configuration, I set my HCLK to the maximum available frequency of 80MHz. I probably could have used the default settings, but this is the clock setting used for my application so I went with it.

BWKidd_1-1726499221950.png

Next, I changed the Project Manager / Code Generator settings to "Generate peripheral initialization as a pair of '.c/.h' files per peripheral". This makes the peripheral file layout match that of the previously mentioned template code in the stm32-external-loader contrib repository.

BWKidd_2-1726499511771.png

Next came the QUADSPI setup for the MT25QL128. I set the QuadSPI mode to 'Bank2 with Quad SPI Lines', and set the rest of the Parameter Settings as below:

BWKidd_3-1726499927323.png

Notes:

1. The Clock Pre-scaler - I'm still a little fuzzy on how to set this based on the datasheet values of the memory, but my clock was pretty close to the one in the tutorial, so I went with it and it worked.

2. Fifo Threshold - again, I went with the value recommended in the tutorial of '4'

3. The flash size is given in a table in the tutorial (excerpted here)

BWKidd_4-1726500053958.png

4. All the rest of the settings were left in their default.
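
For anyone else who is fuzzy on notes 1 and 3, the arithmetic can be sketched in a few lines of C. This is only a sketch under two assumptions: that the QUADSPI is clocked from the 80 MHz HCLK, and that the peripheral follows the usual STM32 relations (QUADSPI clock = kernel clock / (ClockPrescaler + 1), and a FlashSize field of N meaning 2^(N+1) bytes). The function names are hypothetical and the maximum-clock argument is whatever limit you take from your own memory datasheet.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of the HAL. */

/* Resulting QUADSPI clock for a given kernel clock and ClockPrescaler. */
static inline uint32_t qspi_clock_hz(uint32_t kernel_hz, uint32_t prescaler)
{
    return kernel_hz / (prescaler + 1u);
}

/* Smallest ClockPrescaler that keeps the QUADSPI clock at or below max_hz.
 * max_hz must be non-zero and comes from the memory datasheet. */
static inline uint32_t qspi_min_prescaler(uint32_t kernel_hz, uint32_t max_hz)
{
    /* ceil(kernel_hz / max_hz) - 1 */
    return (kernel_hz + max_hz - 1u) / max_hz - 1u;
}

/* FlashSize field for a memory of size_bytes.
 * Assumes a power-of-two size between 2 bytes and 2 GiB. */
static inline uint32_t qspi_flash_size_field(uint32_t size_bytes)
{
    uint32_t bits = 0u;
    while ((1u << bits) < size_bytes) {
        bits++;             /* smallest bits with 2^bits >= size_bytes */
    }
    return bits - 1u;       /* field value N, where size = 2^(N + 1) */
}
```

With an 80 MHz clock and a ClockPrescaler of 2, `qspi_clock_hz` gives roughly 26.7 MHz, and `qspi_flash_size_field` gives 23 for a 16 MB part, which agrees with the flash size table in the tutorial.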

Finally, I needed to change the QUADSPI_CLK line from its default pin to PB10 due to constraints in my design (the default pin for this signal was in use for other purposes in my design).

BWKidd_5-1726500585649.png

With that adjustment made I saved and generated code. Now it was time to bring in the code template files from the stm32-external-loader repository. 

I copied the following files from the stm32-external-loader-contrib repository files into the following directories in my external loader project:

Location in Repository          File Name       Copied to Location
\Loader_Files\other devices\    Dev_Inf.c       \Core\Src\
\Demo_Project\Core\Src\         Loader_Src.c    \Core\Src\
\Loader_Files\other devices\    linker.ld       root of project
\Loader_Files\other devices\    Dev_Inf.h       \Core\Inc\
\QSPI_Drivers\MT25QL512\        quadspi.h       \Core\Inc\
\QSPI_Drivers\MT25QL512\        quadspi.c       \Core\Src\

I also copied the code from \QSPI_testing\main_test.c into my main.c file in the appropriate places.

I then went back into MX and re-generated the code again to bring all of the generated code up to date.

Modifying the External Loader Code

The quadspi.h/c driver files had to be modified to change them from working with the MT25QL512 to working with the MT25QL128. This required some time with the datasheet to figure out the equivalent commands and such. I won't go into detail here since this is its own rabbit hole, but the External QSPI loader how to - STMicroelectronics tutorial did have some useful things to say about this. It's also worth noting that several of the loaders on the main branch of the github repository had some code for interfacing with the MT25QL128, along with several other flash memory chips, and those could probably be modified to work without too much effort as well.

Per this post https://community.st.com/t5/stm32-mcus/custom-external-loader-quot-failed-to-download-segment-0-quot/ta-p/49307 I added some override functions to simplify the HAL time-keeping functions at the top of the main.c file, and put prototypes for them in main.h to allow them to be accessed by the Loader_Src.c file.

 

 

// The following functions override the HAL systick related timekeeping functions
// as the custom loader does not need all the "extra" stuff that comes along with
// the normal implementation of these.
//
// More information can be found in this post:
// https://community.st.com/t5/stm32-mcus/custom-external-loader-quot-failed-to-download-segment-0-quot/ta-p/49307

HAL_StatusTypeDef HAL_InitTick(uint32_t TickPriority) {
  return HAL_OK;
}

uint32_t HAL_GetTick(void) {
  return 1;
}

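void HAL_Delay(uint32_t Delay) {
  (void)Delay;     // requested delay is ignored; this is a fixed short spin
  volatile int i;  // volatile keeps the compiler from optimizing the loop away
  for (i=0; i<0x1000; i++);
}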

 

 

Once the driver code modifications were complete, I tested them by simply plugging up the hardware with my stlink programmer and PCB and starting the debugger. Instrumenting the code with a UART would probably be a better way of testing, but I just used the debugger along with breakpoints to step through the code in my main() function. After a little more debugging, my code made it all the way to the while(1) loop, which meant that things seemed to be working with my quadspi.c/h driver code.

Final External Loader Project Settings (Generating the STLDR file)

Now I needed to change linker files to the linker.ld and generate the .stldr file. This was also covered pretty well in the tutorial, but all that has to be done is to modify the project's options as follows:

1. In C/C++ Build --> Settings --> Tool Settings --> MCU G++ Linker/General: I pointed to the linker.ld file and unchecked "Discard unused sections"

BWKidd_0-1726517764348.png

 

2. Under C/C++ Build --> Settings --> Build Steps, I added the following command to the "Post-build steps" Command field: cmd.exe /C copy /Y "${BuildArtifactFileBaseName}.elf" "..\custom_flashloader_name.stldr"

BWKidd_2-1726518571494.png

 

I then pressed the Apply and Close buttons and rebuilt the project. The .stldr file (AQM_REVB_MT25QL128.stldr in my case) appeared in the project's root folder. I copied this file to the location of the other external loaders for the STM32CubeProgrammer so that I could test the generated loader. The location on my computer for the external loaders was here:

C:\Program Files\STMicroelectronics\STM32Cube\STM32CubeProgrammer\bin\ExternalLoader

Testing the external Loader with STM32CubeProgrammer

Opening STM32CubeProgrammer, I clicked the External loader button, scrolled through the list, made sure all other external loaders were not selected, and then selected my custom loader.

BWKidd_3-1726518655831.png

Then I went to the Erasing and Programming tab, and attempted an erase on the external memory on one of the segments, and then the entire memory. Both seemed to complete successfully.

BWKidd_7-1726519081091.png

Next, I browsed for the testbinary1M.bin (\stm32-external-loader-contrib\QSPI_testing\), checked the "download file" box, changed the Start address to 0x90000000 (the memory mapped address for a quadspi flash), pressed "Connect" and "Start Programming".

BWKidd_6-1726519065591.png

With that working, the final test was to read the memory and see if it contained the expected values. On the Memory & File Editing tab, I changed the address to 0x90000000 and clicked "read", and confirmed everything was as expected (the TestBinary.hex file contains a sequence of 32-bit numbers 00000000, 11111111, 22222222, etc).

BWKidd_8-1726519256741.png

With that out of the way, I was pretty confident my external loader was functioning properly. Now it was time to start integrating the loader with my main application.

Setting up the Main Application to use External Flash

First, I set up the hardware for the QuadSPI in MX just like I did previously for the custom loader. In this particular case, I didn't check the "Generate peripheral initialization as a pair of '.c/.h' files per peripheral" option, so I copied all of the QuadSPI interface functions that were in quadspi.h/c into new MT25QL128.h/c files. I did have to delete the MX_QUADSPI_Init routine from them, as that was now defined in main.c. I then added the necessary initialization and memory mapping into this routine like so:

 

 

 

static void MX_QUADSPI_Init(void)
{

  /* USER CODE BEGIN QUADSPI_Init 0 */

  /* USER CODE END QUADSPI_Init 0 */

  /* USER CODE BEGIN QUADSPI_Init 1 */

  /* USER CODE END QUADSPI_Init 1 */
  /* QUADSPI parameter configuration*/
  hqspi.Instance = QUADSPI;
  hqspi.Init.ClockPrescaler = 2;
  hqspi.Init.FifoThreshold = 4;
  hqspi.Init.SampleShifting = QSPI_SAMPLE_SHIFTING_NONE;
  hqspi.Init.FlashSize = 23;
  hqspi.Init.ChipSelectHighTime = QSPI_CS_HIGH_TIME_1_CYCLE;
  hqspi.Init.ClockMode = QSPI_CLOCK_MODE_0;
  hqspi.Init.FlashID = QSPI_FLASH_ID_2;
  hqspi.Init.DualFlash = QSPI_DUALFLASH_DISABLE;
  if (HAL_QSPI_Init(&hqspi) != HAL_OK)
  {
    Error_Handler();
  }
  /* USER CODE BEGIN QUADSPI_Init 2 */

  // Initialize QUADSPI Flash Memory
  if (QSPI_ResetChip() != HAL_OK) {
    ; // TODO: Add Error Handling
  }

  HAL_Delay(1);

  if (QSPI_AutoPollingMemReady(HAL_QPSI_TIMEOUT_DEFAULT_VALUE) != HAL_OK) {
    ; // TODO: Add Error Handling
  }

  if (QSPI_WriteEnable() != HAL_OK) {
    ; // TODO: Add Error Handling
  }

  if (QSPI_Configuration() != HAL_OK) {
    ; // TODO: Add Error Handling
  }

  if (QSPI_AutoPollingMemReady(HAL_QPSI_TIMEOUT_DEFAULT_VALUE) != HAL_OK) {
    ; // TODO: Add Error Handling
  }

  // Setup memory mapping
  if (CSP_QSPI_EnableMemoryMappedMode() != HAL_OK) {
    ; // TODO: Add Error Handling
  }
  /* USER CODE END QUADSPI_Init 2 */

}

 

 

Modifying the Main Application's Linker file to use the External Flash

Next, the project's linker file needed to be modified (in my case named STM32L496VGTX_FLASH.ld). First, I added a definition for the QuadSPI flash to the memories definition:

 

 

/* Memories definition */
MEMORY
{
  RAM    (xrw)    : ORIGIN = 0x20000000,   LENGTH = 320K
  RAM2    (xrw)    : ORIGIN = 0x10000000,   LENGTH = 64K
  FLASH    (rx)    : ORIGIN = 0x08000000,   LENGTH = 1020K
  EEPROM_EMULATION    (rx)    : ORIGIN = 0x0807F000,   LENGTH = 4K
  QUADSPI (r) : ORIGIN = 0x90000000, LENGTH = 16M /* <== QUADSPI DEFINITION ADDED HERE */
}
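
As a sanity check when debugging, it can be handy to test whether an address actually falls inside the QUADSPI region defined above. The following is just a sketch of mine: the function name `in_quadspi` and the two macros are hypothetical, with the values taken from the ORIGIN and LENGTH in the linker file.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values mirror the QUADSPI line in the MEMORY definition above. */
#define QUADSPI_ORIGIN  0x90000000u
#define QUADSPI_LENGTH  (16u * 1024u * 1024u)

/* Hypothetical helper: true if the len-byte range starting at addr lies
 * entirely inside the memory-mapped QUADSPI window. Written so that the
 * comparison cannot overflow. */
static inline bool in_quadspi(uint32_t addr, uint32_t len)
{
    return addr >= QUADSPI_ORIGIN
        && len <= QUADSPI_LENGTH
        && (addr - QUADSPI_ORIGIN) <= (QUADSPI_LENGTH - len);
}
```

For example, the address of a bitmap array shown in the debugger should satisfy this check once the assets have been moved to external flash, while anything at 0x08000000 is still in internal flash.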

 

 

Then I added the following section definitions at the very end of the file, just before the last closing bracket:

 

 

 ...
  ExtFlashSection :
  {
  	*(ExtFlashSection ExtFlashSection.*)
  	*(.gnu.linkonce.r.*)
  	. = ALIGN(0x4);
  } >QUADSPI
  FontFlashSection :
  {
  	*(FontFlashSection FontFlashSection.*)
  	*(.gnu.linkonce.r.*)
  	. = ALIGN(0x4);
  } >QUADSPI
  TextFlashSection :
  {
  	*(TextFlashSection TextFlashSection.*)
  	*(.gnu.linkonce.r.*)
  	. = ALIGN(0x4);
  } >QUADSPI
} /* <== last closing bracket */

 

 

These tell the linker that all the graphics assets that TouchGFX will use (bitmaps, fonts and text) are located in the external QuadSPI flash memory.
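
For context, the way a C object ends up in one of these output sections is through a section attribute on its definition. The snippet below is a minimal GCC-style sketch, not TouchGFX's generated code: the array `demo_bitmap` and its accessors are made up for illustration, and only the section name ExtFlashSection comes from the linker file above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical asset for illustration. The section attribute asks the
 * linker to emit this array into the input section "ExtFlashSection",
 * which the linker script above routes to the QUADSPI region. */
__attribute__((section("ExtFlashSection"), used))
static const uint8_t demo_bitmap[4] = { 0xDE, 0xAD, 0xBE, 0xEF };

/* Simple accessors so the data can be inspected from other code. */
static inline size_t demo_bitmap_size(void)
{
    return sizeof demo_bitmap;
}

static inline uint8_t demo_bitmap_byte(size_t index)
{
    return demo_bitmap[index];
}
```

On the target, reading such an array only works after the QuadSPI has been put into memory-mapped mode, which is why the CSP_QSPI_EnableMemoryMappedMode() call in MX_QUADSPI_Init has to run before any graphics code touches the assets.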

After building the project, I was able to use the Build Analyzer (in STM32CubeIDE, Window --> Show View --> Build Analyzer) to verify that the linker was planning on placing all those assets in the QUADSPI section of memory:

BWKidd_2-1726586210383.png

Setting up the External Loader in the Main Application

Finally, the last step was to enable the external loader in the main application's debug settings. I did this by pushing the arrow next to the debug button and selecting "Debug Configurations...". With my project selected in the left-hand pane, I clicked on the debugger tab, scrolled down to "External Loaders" and clicked the "Add..." button.

BWKidd_0-1726585262969.png

In the "Add External Loader" I clicked the "workspace..." button and directly pointed it to my custom loader - no need to copy it to a special external loaders directory! I suppose that might be helpful to do later on if I were using it on multiple projects, but for now this works well and makes it so if I make any changes to my external loader, they'll be automatically incorporated in the next debug session for my main application. Also be sure to check the "Enabled" check box, and DO NOT CHECK the "Initialize" (the root of my problem).

BWKidd_1-1726585933466.png

Why not check the "Initialize" box, you ask? According to the STM32CubeIDE user manual (um2609-stm32cubeide-user-guide-stmicroelectronics.pdf):

"When the [Initialize] property of an external loader is enabled, the loaders Init() function is automatically called after reset operations. It can be used to configure the device for external memory access. Usually, the debugged application must perform the initialization."

I guess it's saying that it calls the loader's Init() function again after it finishes using the external loader, which would then mess with my main app's initialization?

 

And finally I had a working project with touchGFX and working external flash memory. And it only took me 3 weeks...assuming you don't count the months I spent just trying to get touchGFX to work. Easy!

 

 

BWKidd
Associate III

Custom Loader Dev_Inf.h/c and Loader_Src.c file

BWKidd
Associate III

MT25QL128 driver code (use at your own risk!)

BWKidd
Associate III

And finally main.h/c