on 2026-04-22 5:30 AM
This article explains how to execute selected STM32 application code from SRAM on the NUCLEO‑C562RE using STM32CubeMX2 and VS Code with a GCC toolchain. It shows how to configure the linker script, tag functions for placement in SRAM, and set up VS Code so you can run critical routines from SRAM to improve performance, flexibility, and debug efficiency.
On STM32 microcontrollers, application code typically runs directly from embedded Flash. With prefetch, cache and accelerators, this is usually fast enough.
However, there are situations where executing code from SRAM is beneficial or even required:
This article shows, step by step, how to:
The example uses a prime-number computation function moved to SRAM on a NUCLEO-C562RE board, but the method applies broadly to other STM32 boards (F0/F1/F4/L4/L5/U5/H5/C0, etc.) with only minor linker and device name adjustments.
Ensure that you have installed:
The hardware used in this tutorial is the NUCLEO-C562RE board.
There are four main scenarios where executing code from SRAM is attractive.
For performance-critical routines, you may want faster access and lower interrupt latency than flash can provide. This is particularly relevant for DSP kernels, motor control loops, real-time communication stacks, or any interrupt handler where every cycle counts. Executing such code from SRAM helps to tighten timing margins and reduce jitter.
In low-power applications, you might want to keep the CPU active while putting flash into a low-power state or powering down some flash banks. Running the active compute phase from SRAM lets you reduce energy consumption during short processing bursts between longer sleep periods, assuming the SRAM is retained.
In-application programming (IAP) code, bootloaders, and firmware update routines often need to erase or program flash while the device remains responsive. Code cannot safely execute from a flash sector that is being modified. The standard solution is to copy the flash programming routines into SRAM, jump there for the erase/program sequence, and return to the main application once the operation completes.
Finally, during early bring-up and experimentation, loading small test images into SRAM instead of reprogramming flash on every iteration can significantly speed up the debug cycle. You can quickly iterate on low-level code without wearing the flash or waiting for repeated erase/program operations.
Start STM32CubeMX2 and create a new project. In the Board Selector, choose: NUCLEO-C562RE.
STM32CubeMX2 creates a directory containing:
There are two typical generations that you may encounter:
The build systems are different, but from the linker’s point of view, they do the same thing.
STM32CubeMX – Makefile project
In a Makefile-based project (from CubeMX), the final link command typically passes the linker script like this:
# Somewhere in the Makefile
LDFLAGS += -Tstm32c562xe_flash.ld
When you run:
make
The build system eventually calls:
arm-none-eabi-gcc ... -Tstm32c562xe_flash.ld ...
STM32CubeMX2 – CMake project
With STM32CubeMX2, you get a CMakeLists.txt-based project. The linker script is passed via CMake's target_link_options:
target_link_options(${CMAKE_PROJECT_NAME} PUBLIC
-T${CMAKE_SOURCE_DIR}/user_modifiable/Device/STM32C562RET6/stm32c562xe_flash.ld
)
When you run (for example):
cmake -S . -B build
cmake --build build
CMake generates Ninja/Make files, and they call:
arm-none-eabi-gcc ... -T/path/to/stm32c562xe_flash.ld ...
__attribute__((section(".RamFunc")))
behaves the same in both cases, as long as .RamFunc is declared in the linker script with:
>RAM AT> FLASH /* or ROM */
Regardless of whether your project came from:
You only need to:
The result is identical: those functions are stored in flash, copied to SRAM at startup, and then executed from SRAM at runtime.
CubeMX generates linker scripts with names like:
Inside, you find a MEMORY definition similar to:
MEMORY
{
ROM (rx) : org = 0x8000000, len = 0x80000
RAM (xrw) : org = 0x20000000, len = 0x20000
}
On STM32 devices:
The crucial linker concept is this pattern:
>RAM AT>FLASH /* or ROM */
This means:
We use this mechanism to create or reuse a .RamFunc section:
Functions linked there run from SRAM, but the binary image still resides in flash.
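To make the load-address versus run-address idea concrete, here is a minimal host-side sketch of what the startup code does for a section mapped with >RAM AT> ROM. It is only a model: on the target this loop normally lives in the assembly startup file and uses linker symbols such as _sdata and _edata, plus a load-address symbol (commonly named _sidata in ST startup files, which is an assumption here because this article does not show it). The function and array names below are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Host-side model of the startup copy loop (illustrative names only).
 * load_image : where the section's bytes are stored (flash, the "AT>" address)
 * run_start  : where the section must live at runtime (SRAM, like _sdata)
 * run_end    : one past the last runtime word (like _edata)
 */
static void copy_load_to_run(const uint32_t *load_image,
                             uint32_t *run_start,
                             const uint32_t *run_end)
{
    const uint32_t *src = load_image;
    uint32_t *dst = run_start;

    /* Copy word by word until the runtime end address is reached */
    while (dst < run_end)
    {
        *dst++ = *src++;
    }
}

/* Small demo: a 4-word "flash" image copied into an 8-word "SRAM" buffer */
static const uint32_t demo_flash[4] =
{
    0x11111111U, 0x22222222U, 0x33333333U, 0x44444444U
};
static uint32_t demo_sram[8];

/* Runs the copy and returns the number of words copied */
static size_t demo_copy(void)
{
    copy_load_to_run(demo_flash, &demo_sram[0], &demo_sram[4]);
    return 4U;
}
```

Because .RamFunc is placed between _sdata and _edata, the same loop that initializes .data also brings the tagged functions into SRAM, so no extra copy code is needed in the application.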
For this example, the following is done:
Modify your CMakeLists.txt to specify explicitly the linker script to be used.
-T${CMAKE_SOURCE_DIR}/user_modifiable/Device/STM32C562RET6/stm32c562xe_flash.ld
In user_modifiable\Device\STM32C562RET6, open the file stm32c562xe_flash.ld and locate the .data section.
Many recent STM32Cube templates already include a .RamFunc output section, sometimes separate from .data, but verify or add it:
/* Initialized data sections into "RAM" Ram type memory */
.data :
{
_sdata = .; /* start of .data in RAM */
*(.data) /* .data sections */
*(.data*) /* .data* sections */
. = ALIGN(8);
/* Functions to be executed from SRAM */
*(.RamFunc)
*(.RamFunc*)
_edata = .; /* end of .data in RAM */
} > RAM AT> ROM
Key points:
Open user_modifiable/Application/STM32C562RET6/main.c.
Prototype and attribute
In the main.c file, add:
/* Private functions prototype -----------------------------------------------*/
/* Place function in .RamFunc section so it runs from SRAM */
static void __attribute__((section(".RamFunc"))) Prime_Calc_SRAM(void);
Alternatively, define a helper macro (if not already provided by HAL):
/* Includes ------------------------------------------------------------------*/
#include "main.h"
/* Private typedef -----------------------------------------------------------*/
/* Private define ------------------------------------------------------------*/
#define RAMFUNC __attribute__((section(".RamFunc")))
/* Private macro -------------------------------------------------------------*/
/* Private variables ---------------------------------------------------------*/
/* Private functions prototype -----------------------------------------------*/
/* Place function in .RamFunc section so it runs from SRAM */
static RAMFUNC void Prime_Calc_SRAM(void);
/* Some HAL packages already define __RAM_FUNC, for example as:
 * #define __RAM_FUNC hal_status_t __attribute__((section(".RamFunc")))
 * You can reuse it instead of defining your own macro, but note that this
 * form also fixes the return type to hal_status_t. */
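One possible refinement of the RAMFUNC helper is sketched below. It adds GCC's noinline attribute, because the compiler may otherwise inline a small static function into its caller, in which case the inlined copy runs from wherever the caller is located. It also falls back to an empty macro on non-Arm builds so the same source can be unit-tested on a host. Both additions, and the sum_u32() routine, are assumptions of this sketch and not part of the generated project.

```c
#include <stdint.h>

/*
 * RAMFUNC is the helper name used in this article. The noinline addition
 * and the host fallback are assumptions of this sketch.
 */
#if defined(__GNUC__) && defined(__arm__)
#define RAMFUNC __attribute__((section(".RamFunc"), noinline))
#else
#define RAMFUNC /* host build: no special placement */
#endif

/* Hypothetical example routine tagged for SRAM placement on the target */
static RAMFUNC uint32_t sum_u32(const uint32_t *values, uint32_t count)
{
    uint32_t total = 0U;

    for (uint32_t i = 0U; i < count; i++)
    {
        total += values[i];
    }
    return total;
}
```

On the target, the map file should then list sum_u32 under .RamFunc with an address in the RAM region.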
Data buffer
In the private defines and variables sections, add a small prime table buffer:
/* Private define ------------------------------------------------------------*/
#define PRIM_NUM 64U /* Size of the prime array */
/* Private macro -------------------------------------------------------------*/
/* Private variables ---------------------------------------------------------*/
/* Align on 32 bytes in case you later enable caches or DMA */
static __attribute__((aligned(32))) uint32_t primes_ram[PRIM_NUM];
Function implementation
Before main(), add the Prime_Calc_SRAM() function definition.
This function computes the first PRIM_NUM (64) prime numbers, from 2 to 311, and stores them in the primes_ram array.
After it runs, primes_ram holds these values: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311.
/**
* @brief Compute prime numbers and store them in primes_ram.
* Function body is placed in SRAM (.RamFunc).
*/
static void __attribute__((section(".RamFunc"))) Prime_Calc_SRAM(void)
{
/* First prime number */
primes_ram[0] = 2U;
uint32_t count = 1U; /* number of primes found so far */
uint32_t num = 3U; /* candidate number */
while (count < PRIM_NUM)
{
uint8_t is_prime = 1U;
for (uint32_t i = 0U; i < count; i++)
{
if ((num % primes_ram[i]) == 0U)
{
is_prime = 0U;
break;
}
if ((primes_ram[i] * primes_ram[i]) > num)
{
break;
}
}
if (is_prime != 0U)
{
primes_ram[count] = num;
count++;
}
/* Skip even numbers */
num += 2U;
}
}
You now have a function that is linked into .RamFunc and thus copied to SRAM at startup.
Call the function from main()
In main(), call the function after the initialization code:
int main(void)
{
/** System Init: this code placed in targets folder initializes your system.
* It calls the initialization (and sets the initial configuration) of the peripherals.
* You can use STM32CubeMX to generate and call this code or not in this project.
* It also contains the HAL initialization and the initial clock configuration.
*/
if (mx_system_init() != SYSTEM_OK)
{
return (-1);
}
else
{
/*
* You can start your application code here
*/
/* Example: run the RAM-resident computation once at startup */
Prime_Calc_SRAM();
while (1) {}
}
} /* end main */
After running, you can inspect primes_ram[] in the debugger to confirm it holds prime numbers.
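Instead of checking all 64 entries by eye, the table can also be checked programmatically. The sketch below is one way to do this, using plain trial division that is independent of how the table was filled. The helper names is_prime_u32() and primes_table_is_valid() are hypothetical and not part of the generated project.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Trial-division primality test, independent of how the table was built */
static bool is_prime_u32(uint32_t n)
{
    if (n < 2U)
    {
        return false;
    }
    /* d <= n / d is equivalent to d * d <= n without risking overflow */
    for (uint32_t d = 2U; d <= (n / d); d++)
    {
        if ((n % d) == 0U)
        {
            return false;
        }
    }
    return true;
}

/* Returns true if table[0..count-1] holds the first `count` primes in order */
static bool primes_table_is_valid(const uint32_t *table, size_t count)
{
    uint32_t candidate = 2U;

    for (size_t i = 0U; i < count; i++)
    {
        /* Advance to the next prime expected at this position */
        while (!is_prime_u32(candidate))
        {
            candidate++;
        }
        if (table[i] != candidate)
        {
            return false;
        }
        candidate++;
    }
    return true;
}
```

On the target, you could call primes_table_is_valid(primes_ram, PRIM_NUM) right after Prime_Calc_SRAM() and place a breakpoint on the failure path.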
Build the project:
Open the Command Palette (Ctrl+Shift+P / Cmd+Shift+P) and run [CMake: Build] or press F7.
The build output appears in the terminal panel, showing compilation progress and any errors or warnings.
Debug the application:
This generates a launch.json file configured for your STM32 project.
To confirm that Prime_Calc_SRAM() is executed from SRAM:
Optionally:
You should see identical code in both locations, with the PC pointing to the SRAM region during execution.
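The same check can be expressed in code. The following sketch tests whether an address falls inside the RAM region from the MEMORY block shown earlier (origin 0x20000000, length 0x20000). The macro and function names are hypothetical, and the bounds must be adapted if your linker script differs.

```c
#include <stdbool.h>
#include <stdint.h>

/* RAM region taken from the MEMORY block shown in this article */
#define SRAM_ORIGIN 0x20000000UL
#define SRAM_LENGTH 0x00020000UL

/* Returns true if the address lies inside [SRAM_ORIGIN, SRAM_ORIGIN + SRAM_LENGTH) */
static bool address_is_in_sram(uintptr_t address)
{
    /* Clear bit 0: Thumb function pointers have it set to mark Thumb state */
    uintptr_t a = address & ~(uintptr_t)1U;

    return (a >= SRAM_ORIGIN) && ((a - SRAM_ORIGIN) < SRAM_LENGTH);
}
```

On the target, address_is_in_sram((uintptr_t)&Prime_Calc_SRAM) is expected to return true when the function has been linked into .RamFunc, and false for a function left in flash.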
Verify the following:
Once the basic mechanism is understood, several extensions are natural.
To run specific IRQ handlers from SRAM, apply the same attribute to the interrupt function:
void __attribute__((section(".RamFunc"))) EXTI13_IRQHandler(void)
{
/* User code before HAL handler */
HAL_GPIO_EXTI_IRQHandler(USER_BUTTON_Pin);
}
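As long as .RamFunc is mapped to RAM AT> FLASH (ROM) and the handler name matches the entry in the startup file's vector table, the handler body executes from SRAM. Functions that it calls, such as HAL_GPIO_EXTI_IRQHandler() here, still execute from flash unless they are placed in .RamFunc as well.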
You can link most or all of your application to SRAM by:
/* The startup code into RAM */
.isr_vector :
{
KEEP(*(.isr_vector))
} >RAM
/* Code into RAM */
.text :
{
*(.text)
*(.text*)
/* ... other code sections */
} >RAM
For debugging, the debugger loads the ELF directly into SRAM; this works even if flash is blank.
For standalone boot from SRAM, you also need:
In many practical applications, a small bootloader still resides in flash and copies the main application image into SRAM at startup.
Running code from SRAM on STM32 is mainly a matter of configuring the linker and tagging the right functions, not of changing your whole toolchain. In this article, we started from an STM32CubeMX2-generated project for the NUCLEO-C562RE, integrated it with VS Code, and showed how to place a simple Prime_Calc_SRAM() function into a dedicated .RamFunc section.
By mapping .RamFunc as >RAM AT> FLASH in the linker script and adding __attribute__((section(".RamFunc"))) to selected functions, the startup code automatically copies those functions from flash to SRAM before main(), so they execute entirely from RAM. Using the STM32CubeIDE extension pack in VS Code, you can then confirm that the program counter is in the SRAM range while these routines run.
The same pattern scales beyond the prime example. You can move time-critical ISRs, flash programming routines, or even large parts of an application into SRAM, while keeping the overall STM32CubeMX2 + VS Code development flow unchanged.