External loader for a custom board issues

SakuGlumoff-bittium · ‎2024-08-23

Hello everyone!

I am trying to write a custom external loader for a custom board that has the STM32U5A9 microcontroller.

I tried following the guide from this forum post: How to implement and use your own external flash loader: An example using STM32U5A9J-DK and from the GitHub repository found here: https://github.com/STMicroelectronics/stm32-external-loader.

With those, I was able to build an external loader that is visible and usable in the STM32CubeProgrammer tool. However, every operation on the external memory failed with an error saying that the operation could not be done.

So I set out to create a new project from the ground up by generating an STM32CubeIDE project from an STM32CubeMX project file and altering it to fit my needs. Again, the build output is usable by the STM32CubeProgrammer tool, but this time I stubbed the memory access functions so that they always return LOADER_OK (0x1). Even though every operation is stubbed to succeed, STM32CubeProgrammer reports that they fail.

Here's what the external loader looks like in the list of external loaders in STM32CubeProgrammer:

For example, if I try to invoke MassErase using STM32CubeProgrammer by clicking on this button:

I get the following output (verbosity level set to 3):

My implementation of the MassErase function is as follows:

int MassErase ( void )
{
    return LOADER_OK;
}

And here's what the disassembly looks like for it:

20006800 <MassErase>:
 * outputs :
 * none
 * Note: Optional for all types of device
 */
int MassErase ( void )
{
20006800: b480       push {r7}
20006802: af00       add r7, sp, #0
    return LOADER_OK;
20006804: 2301       movs r3, #1
}
20006806: 4618       mov r0, r3
20006808: 46bd       mov sp, r7
2000680a: f85d 7b04  ldr.w r7, [sp], #4
2000680e: 4770       bx lr

Based on the output from STM32CubeProgrammer, I suspect that the R0 value should be either 1 (if OK) or 0 (if failed). Am I correct?

The disassembly for MassErase shows that the value 1 (OK) is written to R0 through R3 (or am I wrong?).

Also, it seems that the STM32CubeProgrammer sets the PC correctly to the MassErase function and that most likely means that the binary is built correctly. Am I correct?

Could you please clarify how I can implement the MassErase function (in this case) in such a way that it succeeds?

That way when I implement the actual memory erasure, I can be sure that the implementation works and that the STM32CubeProgrammer doesn't give me false negatives.

SakuGlumoff-bittium · ‎2024-09-04

Solved: I ended up implementing the flash loader from Segger. It worked right out of the gate.


Tesla DeLorean · ‎2024-08-23

The disassembly looks far more convoluted than necessary.

Static initialization is not done, as Reset_Handler and main() are not called in the load process. External loaders are more like a DLL.
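Because no startup code runs, a loader cannot rely on its static variables starting out zeroed or initialized. A minimal sketch of an Init() that sets its own state explicitly is shown here. The loader_state structure and the two accessor functions are hypothetical names used only for illustration, and the int Init(void) signature is assumed from ST's template sources.

```c
#include <stdint.h>

#define LOADER_OK 0x1

/* Hypothetical loader state. Since Reset_Handler never runs, do not
   assume this memory was zeroed before the first entry point is called. */
static struct {
    uint32_t initialized;
    uint32_t bytes_written;
} loader_state;

/* Entry point called first by the programmer (signature assumed from
   ST's template sources). Every field is assigned explicitly. */
int Init(void)
{
    loader_state.bytes_written = 0u;
    loader_state.initialized   = 1u;
    return LOADER_OK;
}

/* Hypothetical accessors, handy when testing the loader as a normal program. */
uint32_t loader_is_initialized(void) { return loader_state.initialized; }
uint32_t loader_bytes_written(void)  { return loader_state.bytes_written; }
```

The same idea applies to any peripheral handles or driver state the real loader keeps: set them up inside Init() rather than through C initializers.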

The U5 loaders are built for a RAM address of 0x20000004.

Perhaps inspect the .ELF (.STLDR) file with objdump or objcopy, including the disassembly if permitted.

What memory part and pins are you using?


SakuGlumoff-bittium · ‎2024-08-25

Thank you for replying!

I suspected as much from the programmer tool logs: this is not a regular program running from flash. Instead of starting from the reset handler, the programmer tool appears to call these functions by setting the PC register to the function's address and initializing the other registers. This is visible in the logs, where after "Init flashloader" the PC register is written with the address of the Init function found in the map/disassembly file.

When you suggest that I inspect the elf file, what do you expect me to find and/or not find?

The memory part used is the MX25L6433F from Macronix. However, as I said in the post, it's currently not used; instead, each function related to the flash loader tool just returns LOADER_OK (0x1) for now. I use the same drivers for the real application and they work fine. I also used another linker script that places the code into flash to verify that the loader functions work, and I could confirm that with a debugger and LEDs. Just to emphasize, I am currently not using the external flash chip (nor the peripheral connected to it), so that I can verify that the flash loader functions work together with the programmer tool.

Do you have an idea of how the programmer tool determines that a flash loader function (for example, MassErase) has run successfully? Based on the logs from the programmer tool, I suspect that it reads the R0 register value and compares it to a value that is supposed to be set in the success case. Am I wrong here? Or does it actually read through the whole external memory area and check that each byte is set to the erased value? Is it even possible to implement the external flash loader in this "step-by-step" way, where you first implement the flash loader with function stubs and only after verifying that it works implement the functions using the external memory?

That feels much more sane to me than implementing everything in one go. If you do that, you don't know where and why the flash loader fails.

Tesla DeLorean · ‎2024-08-25

When looking at the ELF you're trying to confirm the construction: the number of sections, that it will load to the correct addresses in RAM, that it will fit, that the entry points are correct, and that the code at the points of entry does the things it should.
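In practice objdump or readelf would report these details. As a rough sketch of what is being checked, the following C reads a few fields straight out of an ELF32 header held in a byte buffer. The function and type names are hypothetical; the field offsets follow the standard ELF32 header layout.

```c
#include <stddef.h>
#include <stdint.h>

#define ELF_MACHINE_ARM 40u   /* e_machine value for 32-bit Arm */

/* Read little-endian fields from a byte buffer. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((uint16_t)p[0] | ((uint16_t)p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical summary of the header fields of interest. */
typedef struct {
    uint32_t entry;  /* e_entry: entry address recorded by the linker */
    uint16_t phnum;  /* e_phnum: number of program headers            */
    uint16_t shnum;  /* e_shnum: number of section headers            */
} elf32_summary;

/* Returns 1 and fills *out for a little-endian ELF32 Arm image, else 0. */
int elf32_arm_summary(const uint8_t *buf, size_t len, elf32_summary *out)
{
    if (len < 52u) return 0;                  /* ELF32 header is 52 bytes  */
    if (buf[0] != 0x7Fu || buf[1] != 'E' ||
        buf[2] != 'L'   || buf[3] != 'F') return 0;
    if (buf[4] != 1u) return 0;               /* EI_CLASS: 32-bit          */
    if (buf[5] != 1u) return 0;               /* EI_DATA: little-endian    */
    if (rd16(buf + 18) != ELF_MACHINE_ARM) return 0;

    out->entry = rd32(buf + 24);
    out->phnum = rd16(buf + 44);
    out->shnum = rd16(buf + 48);
    return 1;
}
```

Checking that each segment lands inside the RAM window the programmer uses would additionally require walking the program headers, which is omitted here.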

Using your simple return method you should see some basic code loading R0 with a small constant and returning via LR.
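As a point of comparison, here is a minimal sketch of such a stub. LOADER_OK = 1 is taken from the thread; LOADER_FAIL = 0 is an assumption based on ST's template sources. The assembly in the comment is what an optimizing Thumb build would typically produce, not output captured from the project discussed here.

```c
/* Return codes: 1 means success (from the thread).
   LOADER_FAIL = 0 is assumed from ST's template sources. */
#define LOADER_OK   0x1
#define LOADER_FAIL 0x0

/* Stub that touches no hardware.
   With optimization enabled (for example -O1 or -Os), a Thumb build of
   this function would typically reduce to two instructions and use no
   stack at all:
       movs r0, #1
       bx   lr
   The unoptimized build shown earlier in the thread instead pushes r7
   and sets up a frame pointer, which is why it needs a valid stack. */
int MassErase(void)
{
    return LOADER_OK;
}
```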

You can build and debug this however you want.

I'd suggest getting all the BSP code tested and functional from regular application space. Make a framework that emulates the usage by the programmer, but in an environment you can debug and instrument more easily.
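One possible shape for such a framework is sketched here: a small harness that calls the loader entry points in sequence and checks each return value against LOADER_OK. The entry points are stubbed so the sketch stands alone. Their signatures are assumed from ST's template sources, and the call order, run_loader_sequence, step_failed, and EXT_FLASH_BASE are hypothetical choices, not the programmer's documented behavior.

```c
#include <stdint.h>
#include <stdio.h>

#define LOADER_OK      0x1
#define LOADER_FAIL    0x0
#define EXT_FLASH_BASE 0x90000000u  /* mapped base mentioned in the thread */

/* Stubbed entry points; replace with the real BSP-backed versions later. */
int Init(void)      { return LOADER_OK; }
int MassErase(void) { return LOADER_OK; }

int SectorErase(uint32_t start, uint32_t end)
{
    (void)start; (void)end;
    return LOADER_OK;
}

int Write(uint32_t address, uint32_t size, uint8_t *buffer)
{
    (void)address; (void)size; (void)buffer;
    return LOADER_OK;
}

/* Logs one step and returns 1 if it failed, 0 if it succeeded. */
static int step_failed(const char *name, int result)
{
    printf("%-12s -> %d (%s)\n", name, result,
           result == LOADER_OK ? "ok" : "FAILED");
    return result == LOADER_OK ? 0 : 1;
}

/* Hypothetical harness: exercises the entry points the way a programmer
   might, and returns the number of steps that failed. */
int run_loader_sequence(void)
{
    uint8_t page[256];
    int failures = 0;

    for (unsigned i = 0; i < sizeof page; i++) {
        page[i] = (uint8_t)i;
    }

    failures += step_failed("Init", Init());
    failures += step_failed("MassErase", MassErase());
    failures += step_failed("SectorErase",
                            SectorErase(EXT_FLASH_BASE, EXT_FLASH_BASE + 0xFFFu));
    failures += step_failed("Write",
                            Write(EXT_FLASH_BASE, (uint32_t)sizeof page, page));
    return failures;
}
```

Calling run_loader_sequence() from main() in a normal application build lets each step be traced with a debugger before the same functions are linked into the RAM-resident loader.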

As this is a NOR flash memory, the programmer is going to expect you to memory-map the device into the reported address space, i.e. Init() should map the device at 0x90000000 into a region the MCU can read via the debug interface if it succeeds, and it must not hard-fault the MCU.
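Once the device is mapped, the addresses passed to the loader functions are absolute, so they need to be converted to offsets inside the flash before being handed to the driver. A minimal sketch of that conversion with a range check is shown here, assuming the 0x90000000 base mentioned above and the 64 Mbit (8 MiB) capacity of the MX25L6433F. The function and macro names are hypothetical.

```c
#include <stdint.h>

#define EXT_FLASH_BASE 0x90000000u             /* mapped base, as above    */
#define EXT_FLASH_SIZE (8u * 1024u * 1024u)    /* 64 Mbit = 8 MiB          */

/* Hypothetical helper: converts an absolute address range into a flash
   offset. Returns 1 and writes *offset if [address, address + size) lies
   entirely inside the mapped window, otherwise returns 0. */
int flash_offset_from_address(uint32_t address, uint32_t size, uint32_t *offset)
{
    uint32_t off;

    if (address < EXT_FLASH_BASE) {
        return 0;
    }
    off = address - EXT_FLASH_BASE;

    /* Written to avoid overflow in off + size. */
    if (off >= EXT_FLASH_SIZE || size > EXT_FLASH_SIZE - off) {
        return 0;
    }

    *offset = off;
    return 1;
}
```

Rejecting out-of-range requests with a failure code, rather than passing them to the driver, keeps a bad address from turning into a fault inside the loader.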


SakuGlumoff-bittium · ‎2024-08-26

I debugged the external loader further and noticed that the Init function starts with an instruction that pushes R7 onto the stack. However, as you can see in the log output I provided, the programmer tool sets the stack pointer to 0, even though the linker script provides the _estack symbol. After setting the registers according to the programmer tool log output and stepping to the next instruction in a GDB session, a crash happens. If I instead set the MSP to point to the end of the stack before stepping, the instruction runs successfully (as expected) and it doesn't crash. I was able to step through the Init function all the way to the bx lr instruction at the end.

Using objdump -t I can see that the _estack symbol and its value are found in the ELF file:
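The arithmetic behind that crash can be sketched as follows. On a full-descending stack, a push subtracts 4 bytes per register from SP before storing, so starting from SP = 0 the store address wraps around to the top of the address space instead of landing in SRAM. The function name and the 0x20010000 stack top used in the comment are hypothetical.

```c
#include <stdint.h>

/* Models the SP value after pushing reg_count 32-bit registers on a
   full-descending stack. Unsigned arithmetic wraps modulo 2^32, which
   mirrors what happens to a 32-bit stack pointer. */
uint32_t sp_after_push(uint32_t sp, uint32_t reg_count)
{
    return sp - 4u * reg_count;
}

/* With SP = 0, "push {r7}" would store at 0xFFFFFFFC, which is not RAM.
   With a hypothetical stack top of 0x20010000, the same push would
   store at 0x2000FFFC, inside SRAM. */
```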

So how does the programmer tool choose what value to write to the MSP register? And why would it write the MSP to 0 for the Init function?
