How to share an API between a bootloader and an application

B.Montanari · 2024-10-29

Summary

This article provides a step-by-step guide on how to configure your STM32 with a very simple bootloader and an application, plus a third region in memory that holds a set of functions shared between the two. The basic usage and a simple example application are shown on the NUCLEO-G071RB board, but the approach can be easily tailored to any other STM32.

Introduction

Sharing functions between the bootloader and the application can be a useful way to avoid duplicating code and to save memory. Of course, there are several cases where this isn't recommended. However, the goal of this article is to demonstrate how to create this shared region between the two programs and the changes needed to accomplish that. The steps provided cover the simplest bootloader function: checking whether an application is available at a certain address and then jumping to it. The application in this example performs only a minor task (an LED toggle). Both programs use functions available in what is called the shared API section.

When it comes to memory placement, the three different programs are divided as follows:

The bootloader contains a simple check that ensures an application is present before jumping. If one is present, another condition, such as pressing the user button, allows the program to jump to the application address.

The application runs the main program, a simple routine that blinks the LED. The API section allows functions to be shared between the bootloader and the application.
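The layout used later in this article (application at 0x8008000 with 64 KB, API at 0x8018000 with 32 KB) can be written down as a small table and checked for overlaps on a PC before touching any linker script. The sketch below is only an illustration: the struct, the function names, and the 32 KB bootloader size (inferred from the application start address) are assumptions, not part of the original projects.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One flash region: start address and length in bytes. */
struct flash_region {
    const char *name;
    uint32_t    start;
    uint32_t    length;
};

/* Layout used in this article. The 32 KB bootloader size is an assumption
 * inferred from the application start address. */
const struct flash_region layout[] = {
    { "bootloader",  0x08000000u, 32u * 1024u },
    { "application", 0x08008000u, 64u * 1024u },
    { "api",         0x08018000u, 32u * 1024u },
};

#define LAYOUT_COUNT (sizeof layout / sizeof layout[0])

/* True when the two regions share at least one byte. */
bool regions_overlap(const struct flash_region *a, const struct flash_region *b)
{
    return a->start < b->start + b->length &&
           b->start < a->start + a->length;
}

/* True when no pair of regions in the table overlaps. */
bool layout_is_disjoint(const struct flash_region *regions, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        for (size_t j = i + 1; j < count; j++) {
            if (regions_overlap(&regions[i], &regions[j])) {
                return false;
            }
        }
    }
    return true;
}
```

With these values the three regions exactly fill the 128 KB of flash on the STM32G071RB, ending at 0x0801FFFF.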

There are two key elements that we need to be aware of when taking this approach.

The first key element is the flash page size. We need to make sure we aren’t in the middle of a page when allocating the initial position of the application or shared API. The page size will be different depending on the STM32 used. For the selected NUCLEO-G071RB, all pages have the same fixed size of 2 KB. For more information on any specific STM32, check the flash section in the reference manual.

The second key element is the interrupt vector table. The vector table needs to be remapped, as the bootloader and application will typically have independent interrupt schemes and needs.
The vector table base offset field must be a multiple of 0x200 and should be aligned at the beginning of a flash page. It’s worth noting that the Cortex®-M0, used in the STM32F0 series, doesn’t have this capability, so the developer should consider either sharing the vector table or moving it to RAM. For all other cores, such as M0+, M3, M4, M33, and M7, relocating the vector table is possible and straightforward.
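Both constraints above are plain alignment checks, so they can be expressed in a few lines and verified on a PC. This is a minimal sketch; the macro and function names are hypothetical, and the page size is the 2 KB value quoted for the NUCLEO-G071RB.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLASH_PAGE_SIZE_BYTES 2048u   /* 2 KB pages on the STM32G071RB */
#define VTOR_ALIGNMENT_BYTES  0x200u  /* vector table offset granularity quoted above */

/* True when addr is a multiple of alignment (alignment must be a power of two). */
bool is_aligned(uint32_t addr, uint32_t alignment)
{
    return (addr & (alignment - 1u)) == 0u;
}

/* True when addr can serve as the start of the application or API region:
 * it begins a flash page and satisfies the vector table alignment. */
bool is_valid_region_start(uint32_t addr)
{
    return is_aligned(addr, FLASH_PAGE_SIZE_BYTES) &&
           is_aligned(addr, VTOR_ALIGNMENT_BYTES);
}
```

For example, 0x08008000 and 0x08018000 pass both checks, while 0x08008200 satisfies the 0x200 rule but falls in the middle of a 2 KB page.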

1. Development

1.1. Project configuration

All the following instructions use an STM32G071RBT6 microcontroller, with STM32CubeIDE version 1.16.1 used for programming and debugging. Tera Term is used as the terminal software.

Start by creating a new project in STM32CubeIDE by clicking [File] → [New] → [STM32 Project]. Select the [Board Selector] and in the empty field of [Commercial Part Number] type [NUCLEO-G071RB].
Select the first option in [Boards List] and then click [Next]. Choose a name for the project and press [Finish]. Keep the default settings for everything.

1.2. Bootloader configuration

This simple bootloader is divided into two parts.

1.2.1. A check that the given address contains an executable application

There are several ways to accomplish this, but the selected method will be the same one used in the How to create a super simple bootloader YouTube video series. In this video series, a mask is used to see if there’s a stack pointer placed in the RAM region.

In this code example, the mask 0x2FFE0000 is chosen because it isolates the upper bits of the stack pointer, which shows whether the value falls within the valid SRAM address range. For the STM32G071RB, the RAM origin is 0x20000000 and its length is 36 KB, so the valid addresses range from 0x20000000 to 0x20008FFF.

It’s worth noting that the initial stack pointer is the first entry in the vector table, which is located at the beginning of the application in flash memory. That’s why the mask is applied to the word stored at FLASH_APP_ADDR (0x08008000). The code example provides a simple macro that adjusts the mask based on the RAM size.
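A bit mask can only approximate a RAM size that is not a power of two, such as the 36 KB used here. As a possible alternative, not the method from the video series, the same test can be sketched as an explicit range comparison. The names below are hypothetical, and the RAM origin and size are the STM32G071RB values given above.

```c
#include <stdbool.h>
#include <stdint.h>

#define SRAM_START 0x20000000u      /* RAM origin on the STM32G071RB */
#define SRAM_SIZE  (36u * 1024u)    /* 36 KB of RAM */

/* True when sp looks like an initial stack pointer placed in SRAM.
 * The stack grows downwards, so the initial value may equal the first
 * address past the end of RAM (0x20009000 here). */
bool is_plausible_initial_sp(uint32_t sp)
{
    bool in_range     = sp > SRAM_START && sp <= SRAM_START + SRAM_SIZE;
    bool word_aligned = (sp & 0x3u) == 0u;
    return in_range && word_aligned;
}
```

On the target, the value to test would be the first word of the application image, for example `*(const uint32_t *)0x08008000`. Erased flash reads as 0xFFFFFFFF and is rejected.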

1.2.2. The actual jump from the bootloader to the application

This can be done in a couple of different ways. In the mentioned video series, the address is assigned to a function pointer and the main stack pointer (MSP) is then manipulated. This article presents an alternative code example that aims to be less cryptic: it clearly shows that the Reset_Handler function from the application's code is the entry point when making the jump.

Add all the code snippets to the main.c file, using the USER CODE XYZ markers as a reference for the code location. As a reminder, the application starting address is 0x8008000, as defined in the introduction section. To make the code more flexible and portable, the number of NVIC IRQs is added as a define, so the user should adjust it according to the STM32 in use and disable all interrupts prior to making the jump to the application.
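The IRQ count matters because the NVIC clear-enable and clear-pending registers are organized as 32-bit words, one bit per interrupt line. The helper below is a hypothetical illustration of the `(MCU_IRQS + 31u) / 32` rounding used in the listing; it is not part of the original code.

```c
#include <stdint.h>

/* Number of 32-bit ICER/ICPR words needed to cover irq_count interrupt
 * lines, rounding up to a whole word. */
uint32_t nvic_word_count(uint32_t irq_count)
{
    return (irq_count + 31u) / 32u;
}
```

A Cortex-M0+ device such as the STM32G0 has at most 32 interrupt lines, so a single word is enough there. Larger cores with more interrupt lines need two or more words.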

Special thanks to @gbm for proposing this easier and more optimized code in the comments on this article: How to jump to system bootloader from application code on STM32 microcontrollers, which is reused and adjusted for this one.

/* USER CODE BEGIN Includes */
#include <stdio.h>
/* USER CODE END Includes */

/* USER CODE BEGIN PD */
#define FLASH_APP_ADDR    0x8008000    // base address of the application code
#define MCU_IRQS          32u          // no. of NVIC IRQ inputs (up to 32 on Cortex-M0+); adjust for your MCU

struct app_vectable_ {
    uint32_t Initial_SP;
    void (*Reset_Handler)(void);
};

#define APP_VTAB          ((struct app_vectable_ *)FLASH_APP_ADDR)
// Base address of SRAM
#define SRAM_BASE_ADDR 0x20000000
// Macro to calculate the mask based on the RAM length
#define SRAM_MASK(length_kb) (0x20000000 | (((length_kb) - 1) << 17))

// Example usage for the STM32G071RB:
// Define the RAM length in KB
#define RAM_LENGTH_KB 36

// Generate the mask based on the RAM length
#define GENERATED_MASK SRAM_MASK(RAM_LENGTH_KB)

// Example check using the generated mask
#define IS_VALID_STACK_POINTER(addr) (((*(uint32_t*)(addr)) & GENERATED_MASK) == SRAM_BASE_ADDR)

/* USER CODE END PD */

/* USER CODE BEGIN PFP */
// Route printf output through huart2
int __io_putchar(int ch)
{
  HAL_UART_Transmit(&huart2, (uint8_t *)&ch, 1, 0xFFFF);
  return ch;
}
/* USER CODE END PFP */

/* USER CODE BEGIN 0 */
void JumpToApp(void)
{
  printf("BOOTLOADER Start \r\n");
  // Tests if there is an application at FLASH_APP_ADDR
  if (IS_VALID_STACK_POINTER(FLASH_APP_ADDR))
  {
    printf("APP Start ...\r\n");
    /* Disable all interrupts */
    __disable_irq();
    /* Disable Systick timer */
    SysTick->CTRL = 0;
    /* Set the clock to the default state */
    HAL_RCC_DeInit();
    /* Clear Interrupt Enable Register & Interrupt Pending Register */
    for (uint8_t i = 0; i < (MCU_IRQS + 31u) / 32; i++)
    {
      NVIC->ICER[i] = 0xFFFFFFFF;
      NVIC->ICPR[i] = 0xFFFFFFFF;
    }
    /* Re-enable all interrupts */
    __enable_irq();
    // Set the MSP
    __set_MSP(APP_VTAB->Initial_SP);
    // Jump to APP firmware
    APP_VTAB->Reset_Handler();
  }
  else
  {
    printf("No APP found\r\n"); // No application installed
  }
}
/* USER CODE END 0 */

  /* USER CODE BEGIN WHILE */
  while (1)
  {
    /* USER CODE END WHILE */

    /* USER CODE BEGIN 3 */
    JumpToApp();
    HAL_Delay(1000);
  }
  /* USER CODE END 3 */

2. Application configuration

In this step, a simple Blink LED program is created to serve as the user application in execution. Follow the same steps to create a new project and make sure to edit the linker script accordingly.

In the introduction, we defined the application starting point in memory as 0x8008000 and our desired size as 64 KB. To apply this configuration to the application, we must edit the linker script file, which provides specific control over the linking process for the STM32. Consequently, this file can map addresses in memory, change memory sizes, or create new sections.

Open the linker script file [STM32G071RBTX_FLASH.ld] of this new project, change the memory length of the flash from 128K to 64K and the FLASH ORIGIN from 0x8000000 to 0x8008000.

MEMORY
{
  RAM    (xrw)    : ORIGIN = 0x20000000,   LENGTH = 36K
  FLASH    (rx)    : ORIGIN = 0x8008000,   LENGTH = 64K
}

To change the vector table’s location, go to the [Project Explorer] tab. Inside the [Core] folder, open the [Src] folder and locate [system_stm32g0xx.c]. Uncomment the #define USER_VECT_TAB_ADDRESS and change the #define VECT_TAB_OFFSET from 0x00000000U to 0x8000U.
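The effect of these two defines can be sketched as follows. The snippet mirrors the edited settings for illustration only, and assumes that the generated startup code combines the base address and the offset with a bitwise OR before writing the result to the vector table offset register. FLASH_BASE_ASSUMED and vector_table_address() are hypothetical names standing in for the device header value and the startup code.

```c
#include <stdint.h>

/* Stand-in for the flash base address from the device header (assumed 0x08000000). */
#define FLASH_BASE_ASSUMED     0x08000000u

/* The settings edited in system_stm32g0xx.c, mirrored here for illustration. */
#define USER_VECT_TAB_ADDRESS
#define VECT_TAB_BASE_ADDRESS  FLASH_BASE_ASSUMED
#define VECT_TAB_OFFSET        0x8000U

/* Address expected to end up in the vector table offset register (VTOR). */
uint32_t vector_table_address(void)
{
    return VECT_TAB_BASE_ADDRESS | VECT_TAB_OFFSET;
}
```

The resulting address, 0x08008000, should match the FLASH ORIGIN set in the application's linker script. If the two differ, the application runs with the wrong vector table.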

Just to test that both projects can coexist and that debugging is possible, let’s add a simple LED toggle to the application's main loop. Two functions are created for that, as they’ll be moved to the shared API region later:

/* USER CODE BEGIN 0 */
void TurnLedOn(void)
{
       HAL_GPIO_WritePin(LED_GREEN_GPIO_Port, LED_GREEN_Pin, GPIO_PIN_SET);
}

void TurnLedOff(void)
{
       HAL_GPIO_WritePin(LED_GREEN_GPIO_Port, LED_GREEN_Pin, GPIO_PIN_RESET);
}
/* USER CODE END 0 */

Build both projects, then let’s see how to enable debugging for both of them.

3. Debugging both projects

Finally, some configuration is necessary to download both projects to the STM32. First, access [Debug Configurations] in the bootloader project.

Inside [Debug Configurations], go to the [Startup] tab and click [Add]. Select the application project created earlier and click [OK], then apply the modifications.

With the configuration in place, every time the bootloader project is debugged, the application code is also downloaded to the STM32.

Now, when testing, you should be able to see in the terminal that the bootloader has jumped into the application, and the LED should be blinking. The screenshot displays the terminal message and a breakpoint placed in the application's main.c file:

4. Creating an API

An application programming interface (API) allows one or more applications to share functions, variables, data types, and more. In this example, we’ll share two simple functions that turn an LED on and off. These are the same functions created previously, but now they’ll be placed in a section visible to both the bootloader and the application.

To create the API region in the STM32 memory, following the size and memory origin from the introduction, we’ll have to repeat the same process in both existing projects (the application and the bootloader), unless stated otherwise.

Open the linker script file [STM32G071RBTX_FLASH.ld], locate the MEMORY section, and insert a new region called API with the attributes rx in parentheses. This means that the API memory is only read or executed while the application is running. Set the starting memory address to 0x8018000 and the length to 32K.

Bootloader:

MEMORY
{
  RAM    (xrw)    : ORIGIN = 0x20000000,   LENGTH = 36K
  FLASH    (rx)    : ORIGIN = 0x8000000,   LENGTH = 96K
  API (rx)   : ORIGIN = 0x8018000,     LENGTH = 32K
}

Application:

MEMORY
{
  RAM    (xrw)    : ORIGIN = 0x20000000,   LENGTH = 36K
  FLASH    (rx)    : ORIGIN = 0x8008000,   LENGTH = 64K
  API (rx)   : ORIGIN = 0x8018000,     LENGTH = 32K
}

Inside the SECTIONS portion, create two new memory sections. These can be added after the ".text" section:

  .SHARED_FUNC_PTR 0x8018000 (READONLY):
  {
    KEEP(*(.SHARED_FUNC_PTR))
  } > API

  .SHARED_FUNC :
  {
    . = ALIGN(4);
    __SHARED_FUNC_start__ = .;
    *(.SHARED_FUNC)
    __SHARED_FUNC_end__ = .;
  } > API

4.1. Application project editing

Inside the application project's [main.c], define a macro that references the memory section created, and add the declaration of the struct AppSharedAPI.

/* USER CODE BEGIN PD */
#define LOCATE_FUNC __attribute__((section(".SHARED_FUNC")))
/* USER CODE END PD */

/* USER CODE BEGIN PM */
struct AppSharedAPI
{
      void(*TurnOn)(void);
      void(*TurnOff)(void);
};
/* USER CODE END PM */

Whenever a function is marked with LOCATE_FUNC, such as the two functions shown below, it resides in the .SHARED_FUNC section inside the API memory region. Also, define an AppSharedAPI structure variable, which holds function pointers to the LED functions, and assign it to the .SHARED_FUNC_PTR memory section.

These functions make use of the HAL driver, which is part of both projects. This works in favor of this simple use case, as it keeps things really straightforward: all that's needed is to remap the functions placed in the API section. Be aware that if the functions used inside the API aren’t present in the project on either side, errors will occur. For a static library, refer to this video.

/* USER CODE BEGIN 0 */
void LOCATE_FUNC TurnOn(void)
{
      HAL_GPIO_WritePin(LED_GREEN_GPIO_Port, LED_GREEN_Pin, GPIO_PIN_SET);
}

void LOCATE_FUNC TurnOff(void)
{
      HAL_GPIO_WritePin(LED_GREEN_GPIO_Port, LED_GREEN_Pin, GPIO_PIN_RESET);
}

// Table of function pointers, placed at the start of the API region
struct AppSharedAPI api __attribute__((section(".SHARED_FUNC_PTR"))) = {
      TurnOn,
      TurnOff
};
/* USER CODE END 0 */

In the application's main loop, change the previous code to have a 50 ms toggle using the functions that are placed in the API section:

  /* USER CODE BEGIN WHILE */
  while (1)
  {
    /* USER CODE END WHILE */
    /* USER CODE BEGIN 3 */
         api.TurnOn();
         HAL_Delay(50);
         api.TurnOff();
         HAL_Delay(50);
  }
  /* USER CODE END 3 */

4.2. Bootloader project editing

Inside the bootloader project's [main.c], add the same struct layout declared in the application project (named BootSharedAPI here). Then point a struct pointer at the memory address that holds the data from the API memory section:

/* USER CODE BEGIN PV */
struct BootSharedAPI  {
       void(*TurnOn)(void);
       void(*TurnOff)(void);
};
struct BootSharedAPI *api = (struct BootSharedAPI *) 0x8018000;
/* USER CODE END PV */

And finally, add the function calls before the main loop, in the USER CODE 2 section, to make the LED blink a few times before jumping to the application:

  /* USER CODE BEGIN 2 */
  for (uint8_t i = 0; i < 4; i++)
  {
    api->TurnOn();
    HAL_Delay(200);
    api->TurnOff();
    HAL_Delay(200);
  }
  /* USER CODE END 2 */
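If the API region has not been programmed yet, the table at 0x8018000 reads as erased flash, and calling through it would fault. A possible guard, sketched below under the memory layout used in this article, checks that an entry looks like a valid code address before the bootloader calls it. The function and macro names are hypothetical and not part of the original projects.

```c
#include <stdbool.h>
#include <stdint.h>

#define API_REGION_START 0x08018000u     /* API origin from the linker script */
#define API_REGION_SIZE  (32u * 1024u)   /* API length: 32K */

/* True when a function-pointer value read from the API table looks callable. */
bool api_entry_is_plausible(uint32_t entry)
{
    if (entry == 0xFFFFFFFFu) {
        return false;                    /* erased flash */
    }
    if ((entry & 1u) == 0u) {
        return false;                    /* Cortex-M code pointers have bit 0 (Thumb) set */
    }
    uint32_t addr = entry & ~1u;         /* strip the Thumb bit to get the code address */
    return addr >= API_REGION_START &&
           addr < API_REGION_START + API_REGION_SIZE;
}
```

On the target, the bootloader could pass `(uint32_t)api->TurnOn` and `(uint32_t)api->TurnOff` to this check and skip the blink sequence when either entry fails.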

5. Results

Press [Ctrl + B] to build the projects. There should be no errors or warnings. You can click on the [Refresh] icon in the [Build Analyzer] tab and then expand the [Memory Details] to see the [API] section and usage. The memory used will be either 0 or a few bytes, depending on where the API is created from. In this example, since it was created in the application project, it's displayed there:

To debug, launch the session from the bootloader project, as done earlier. You should see the LED blink at a slower pace before the jump and then, after the jump to the application, blink at a higher frequency.

Conclusion

This article provides a quick overview on configuring an STM32 microcontroller to implement a simple bootloader, an application, and a shared API region for function sharing between the bootloader and application.

By following the steps, developers can avoid function duplication and save memory, while ensuring proper memory allocation and interrupt vector table remapping. The example uses the NUCLEO-G071RB board, but the principles can be applied to other STM32 devices.

The article also covers project configuration, bootloader and application setup, and API creation, ensuring a seamless integration and debugging process. This approach not only optimizes memory usage but also demonstrates a practical method for managing multiple code regions within an STM32 microcontroller.