How to implement custom data processing in STM32CubeMonitor using the function node

podhosim · 2025-05-26

Summary

This article aims to familiarize readers with STM32CubeMonitor by explaining its message structure and showing how to use the function node to create custom functionality. The knowledge gained is applied in an example that numerically integrates acquired variables.

Introduction

STM32CubeMonitor is a Node-RED based tool designed for monitoring variables on STM32 MCUs during runtime. It becomes particularly useful when conventional debugging is not feasible due to timing constraints. STMicroelectronics provides a library of nodes specific to STM32 MCUs implementing data acquisition, processing, and visualization. Although these nodes provide a wide range of functionalities sufficient for most applications, you might find yourself in need of an operation that is not available in the processing node. This article guides you in creating custom functionalities to bridge that gap. The example STM32CubeIDE project together with the example flow is attached at the end of this article.

Prerequisites

The version of STM32CubeMonitor used in this article is 1.10.0, but the minimum version needed is 1.9.0.

1. Understanding data flow and message structure

Before we dive into implementing custom functionalities in STM32CubeMonitor, we need to understand how the variables are acquired and how they are sent between nodes.

1.1 Data flow

The first link in the acquisition chain is the variables node. This node lets us choose the variables to acquire and the acquisition parameters, for example, the sampling frequency (in direct mode). The node then sends commands to the "acquisition out" node. Two versions exist: acq stlink out for STLINK and acq jlink out for J-Link. This article is written for STLINK, but it also applies to J-Link if you replace the acquisition nodes with their J-Link counterparts. The acq stlink out node connects to the STLINK probe attached to our STM32 MCU. Once this connection is set up, we can use the acq stlink in node to retrieve the acquired data. The output of the acq stlink in node can be connected to the processing node.

Fig. 1: Data flow in STM32CubeMonitor

1.2 Message structure

In STM32CubeMonitor, nodes communicate via messages (msg), which are JavaScript objects. For messages carrying variable data, the most interesting part is msg.payload. Such messages are produced by the acq stlink in and processing nodes. There is, however, a significant difference between the msg.payload contents output by these two nodes.

1.2.1 acq stlink in node

The acq stlink in node produces a msg.payload with three fields: data, first, and groupname. The groupname field is a string holding the name of the variable group defined in the variables node. The first field is a boolean that indicates whether the msg is the first one sent after the acquisition started. The data field is an array of arrays containing the variables' measurements as objects with fields x and y, where y is the value of the variable at time x. Time x is given in ms. The order of the variables in the data array follows the order shown in the variables node.
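
To make the structure concrete, here is a minimal sketch of what such a message could look like for a group of two variables, together with a small helper that reads the latest sample of each. The group name, the sample values, and the helper name latestValues are made up for illustration; only the field names data, first, groupname, x, and y come from the description above.

```javascript
// Illustrative message shaped like the output of the "acq stlink in" node.
// The group name and all values are invented for this example.
const msg = {
    payload: {
        groupname: "myVariables",
        first: false,
        data: [
            [{ x: 1250, y: 0.42 }],  // first variable listed in the variables node
            [{ x: 1250, y: -1.3 }]   // second variable listed in the variables node
        ]
    }
};

// Hypothetical helper: return the most recent sample of every variable.
function latestValues(payload) {
    return payload.data.map(function (samples) {
        return samples[samples.length - 1];
    });
}

const latest = latestValues(msg.payload);
console.log(latest[0].y, latest[1].y);
```

In a function node, the same indexing (msg.payload.data[i][0]) would be applied directly to the incoming msg.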

1.2.2 processing node

The processing node computes the postprocessing values and reorganizes the data structure. As described above, the acq stlink in node outputs one message per measurement, containing all the variables at once. The processing node, on the other hand, serializes the data: it produces a separate message for each variable, containing the measurements acquired over the last 50 ms. The per-variable messages appear in the same order as the variables in the variables node.

Here, the msg.payload object contains the fields groupname, variablename, and variabledata. The groupname field has the same contents as above. The variablename field contains a string with the name of the variable as defined in the variables node. The variabledata field contains an array of objects with the same x and y fields as above. The array's length depends on the sampling frequency.

As a rule of thumb, if the sampling frequency defined in the variables node is higher than 20 Hz (sampling period shorter than 50 ms), some arrays have a length greater than 1. If the sampling frequency is lower than 20 Hz, the arrays usually have a length of 1. If no measurement was acquired in the 50 ms interval, no message is sent.
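
As an illustration, the sketch below shows a message shaped like the processing node output for one variable and a helper that averages the samples it carries. The values and the helper name meanOfSamples are hypothetical; only the field names groupname, variablename, and variabledata are taken from the text above.

```javascript
// Illustrative message shaped like the output of the processing node.
// One message carries the samples of a single variable.
const msg = {
    payload: {
        groupname: "myVariables",
        variablename: "vx",
        variabledata: [
            { x: 1000, y: 1.0 },
            { x: 1010, y: 2.0 },
            { x: 1020, y: 3.0 }
        ]
    }
};

// Hypothetical helper: mean of all samples carried by one message.
function meanOfSamples(payload) {
    const samples = payload.variabledata;
    if (samples.length === 0) {
        return null; // nothing to average
    }
    let sum = 0;
    for (const sample of samples) {
        sum += sample.y;
    }
    return sum / samples.length;
}

const mean = meanOfSamples(msg.payload);
console.log(msg.payload.variablename, mean);
```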

Fig. 2: Data structure after acq stlink in and processing nodes

2. Understanding function node and context data storing

Now that we know how the measured variables are acquired and how we can access them, it is time to learn how to further process them using the function node.

2.1 function node

The function node allows you to write your own JavaScript function that accesses the message msg contents to further process and modify them. If you are not familiar with JavaScript, don't worry. Given that you have already most likely written some C/C++ code for STM32 MCUs, you should do quite alright with your knowledge.

When you open the node in STM32CubeMonitor, you can see that it has four tabs:

"Setup"
"On Start"
"On Message"
"On Stop"

In the "Setup" tab, you can set the number of outputs, which should equal the number of separate messages you want to send out. The behavior of the other tabs is straightforward. Code under the "On Start" tab runs once whenever the node is started. Likewise, code under the "On Stop" tab runs when the node is stopped. The "On Message" code runs whenever the function node receives a message.

Fig. 3: Function node tabs

When working with the msg object, it is best practice to only modify the specified message contents, if possible. It is more robust than creating a new message object as the old message retains fields that might be needed downstream.
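
The following sketch illustrates the in-place approach. It is written as a plain function so that it can be run outside Node-RED; in a function node, only the body would go into the "On Message" tab. The function name scaleInPlace, the scaling factor, and the sample message are assumptions made for the example.

```javascript
// Stand-in for an "On Message" body, wrapped in a function for testing.
function scaleInPlace(msg, factor) {
    // Only the sample values are touched. The topic, groupname, and any
    // other fields stay on the message for downstream nodes.
    msg.payload.variabledata.forEach(function (sample) {
        sample.y = sample.y * factor;
    });
    return msg;
}

// Invented example message.
const incoming = {
    topic: "data",
    payload: {
        groupname: "myVariables",
        variablename: "vx",
        variabledata: [{ x: 0, y: 2 }]
    }
};

const outgoing = scaleInPlace(incoming, 10);
console.log(outgoing.topic, outgoing.payload.variabledata[0].y);
```

Because the same object is returned, fields that the function never mentions are still present at the output.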

2.2 Context data storing

The main advantage of custom data processing in the function node over the processing node is the ability to store data and access it again between messages. This allows you to implement processing where such functionality is crucial, for example cumulative sums or integrals, and more involved statistical methods.

Node-RED defines three context scopes: node, flow, and global. The node context is visible only to the function node itself, while data stored in the flow context can be accessed by every node in the flow where the function node is located. In STM32CubeMonitor, each tab is a flow. Data stored in the global context is available to all nodes in all flows. The image below illustrates the scopes, and the code after it shows examples of storing and loading data in each of them.

Fig. 4: Visualization of different scopes

// in function node
// store and load of variable world with identifier "hello" in the node context
var world = 1;
context.set("hello", world);
world = context.get("hello");

// store and load of variable foo with identifier "bar" in the flow context
var foo = 3.14;
flow.set("bar", foo);
foo = flow.get("bar");

// load and store of variable loompa with identifier "oompa" in the global context
// the "or zero" supplies a default if nothing is stored under "oompa" yet
var loompa = global.get("oompa")||0;
global.set("oompa", loompa);
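
As a small worked example of why stored context matters, the sketch below implements a running (cumulative) sum. Since the context object only exists inside Node-RED, a minimal stand-in called createMockContext is defined here so that the code runs on its own; that stand-in and the handler name onMessage are assumptions for illustration, not part of STM32CubeMonitor.

```javascript
// Minimal stand-in for the node context so the sketch runs outside Node-RED.
// Inside a function node, "context" is provided by the runtime.
function createMockContext() {
    const store = new Map();
    return {
        get: function (key) { return store.get(key); },
        set: function (key, value) { store.set(key, value); }
    };
}

// Body of a hypothetical "On Message" handler: running sum of incoming values.
function onMessage(context, msg) {
    const previous = context.get("sum") || 0; // 0 until something is stored
    const sum = previous + msg.payload;
    context.set("sum", sum);
    msg.payload = sum;
    return msg;
}

const context = createMockContext();
const outputs = [1, 2, 3.5].map(function (value) {
    return onMessage(context, { payload: value }).payload;
});
console.log(outputs); // [ 1, 3, 6.5 ]
```

Each call sees the sum left behind by the previous one, which is exactly what a stateless processing step cannot do.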

3. Example of a pendulum simulation

Now that you understand how STM32CubeMonitor and the function node handle data, we can implement custom data processing. We explore an example: a simple pendulum simulated on an STM32 MCU using approximate closed-form equations. These return the x and y components of the pendulum's velocity. I am using a NUCLEO-F401RE, but you should be able to use the example on other MCUs as well. Assuming we cannot computationally afford the integration on the MCU itself, we implement an STM32CubeMonitor flow that computes the integral of these velocities and visualizes the resulting position of the simulated pendulum.

3.1 MCU code

Let us start with the code for the MCU. (I will be brief as it is not the main focus of this article.) Below is the C code that implements a pendulum simulation using closed-form equations. Rather than using the overly simplistic approximation of the period T₀ = 2π√(l/g), the approximation via elliptic integral computation is used. This method is quite accurate for an arbitrary initial condition θ₀[1]. After computing the period T, the velocities vx and vy are computed using the approximate closed-form equations inside the main while loop.

/* Private includes ----------------------------------------------------------*/
/* USER CODE BEGIN Includes */
#include <math.h>
/* USER CODE END Includes */

/* Private define ------------------------------------------------------------*/
/* USER CODE BEGIN PD */
#define L 0.7134f          // Length of the pendulum (in meters)
#define g 9.81f            // Acceleration due to gravity (in m/s^2)
#define THETA_0 (M_PI/2.0f) // Initial angle (in radians); parenthesized so the macro expands safely
#define TIMESCALE 1.0f     // Timescale compared to real-time for simulating pendulum
/* USER CODE END PD */

/* Private variables ---------------------------------------------------------*/
/* USER CODE BEGIN PV */
static volatile float vx, vy;
float theta, dtheta_dt;
float t; // Time in seconds
float t0;
static volatile uint8_t reset_flag = 0;
/* USER CODE END PV */

/* USER CODE BEGIN PFP */
float computeEllipticIntegral(float k);
float computeTheta(float t, float T);
/* USER CODE END PFP */

/* Private user code ---------------------------------------------------------*/
/* USER CODE BEGIN 0 */
/* Function to compute the complete elliptic integral of the first kind using the AGM method
 * */
float computeEllipticIntegral(float k) {
    float a = 1.0f - k;
    float b = 1.0f + k;
    float epsilon = 1e-7f; // Tolerance for convergence

    while (fabs(a - b) > epsilon) {
        float a_next = (a + b) / 2.0f;
        float b_next = sqrt(a * b);
        a = a_next;
        b = b_next;
    }

    return M_PI / (2.0f * a);
}

/* Function to compute theta(t) for a given period T (T is obtained from the elliptic integral)
 * */
float computeTheta(float t, float T) {
    return THETA_0 * sin(2.0f * M_PI * t / T);
}
/* USER CODE END 0 */

int main(void)
{

  /* ...initializations... */

  /* USER CODE BEGIN 2 */
  float k = sin(THETA_0 / 2.0f);
  float K = computeEllipticIntegral(k);
  float T = 4.0f * sqrt(L / g) * K;

  uint32_t time_ms = HAL_GetTick();
  t0 = time_ms/1000.0f;

  HAL_GPIO_WritePin(LD2_GPIO_Port, LD2_Pin, GPIO_PIN_SET);
  /* USER CODE END 2 */
  /* Infinite loop */
  /* USER CODE BEGIN WHILE */
  while (1)
  {
	  if (!reset_flag) {
		  time_ms = HAL_GetTick();
		  t = (time_ms/1000.0f - t0)*TIMESCALE;
		  // Compute angle theta and angular velocity dtheta_dt
		  theta = computeTheta(t, T);
		  dtheta_dt = (2.0f * M_PI / T) * THETA_0 * cos(2.0f * M_PI * t / T);
		  // Compute velocities vx and vy
		  vx = L * cos(theta) * dtheta_dt;
		  vy = L * sin(theta) * dtheta_dt;
	  } else {
		  time_ms = HAL_GetTick();
		  t0 = time_ms/1000.0f;
		  vx = 0.0f;
		  vy = 0.0f;
	  }
    /* USER CODE END WHILE */
    /* USER CODE BEGIN 3 */
  }
  /* USER CODE END 3 */
}
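
As a cross-check of the period computation in the listing above, the sketch below ports the AGM iteration to JavaScript (the language used later in the function node) and compares the small-angle period T₀ with the elliptic-integral period T for the constants L = 0.7134 m and θ₀ = π/2. The function name ellipticK and the tighter tolerance are choices made for this example; the double-precision result can differ slightly from the single-precision value on the MCU.

```javascript
// Constants copied from the MCU listing.
const L = 0.7134;            // pendulum length in meters
const g = 9.81;              // acceleration due to gravity in m/s^2
const THETA_0 = Math.PI / 2; // initial angle in radians

// Complete elliptic integral of the first kind via the AGM iteration,
// mirroring computeEllipticIntegral() in the C code.
function ellipticK(k) {
    let a = 1 - k;
    let b = 1 + k;
    while (Math.abs(a - b) > 1e-12) {
        const aNext = (a + b) / 2;
        const bNext = Math.sqrt(a * b);
        a = aNext;
        b = bNext;
    }
    return Math.PI / (2 * a);
}

// Small-angle approximation versus the elliptic-integral period.
const smallAnglePeriod = 2 * Math.PI * Math.sqrt(L / g);
const period = 4 * Math.sqrt(L / g) * ellipticK(Math.sin(THETA_0 / 2));

console.log(smallAnglePeriod.toFixed(3), period.toFixed(3));
```

For this large initial angle, the small-angle formula gives roughly 1.69 s while the elliptic-integral formula gives roughly 2.00 s, which shows why the simpler approximation is not used here.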

To synchronize the STM32CubeMonitor calculations with the MCU's calculations, an interrupt is configured on both the rising and falling edges of the Nucleo's user button B1. It sets and resets the reset_flag variable, see below.

void HAL_GPIO_EXTI_Callback(uint16_t GPIO_Pin){
	if (GPIO_Pin == B1_Pin) {
		if (HAL_GPIO_ReadPin(B1_GPIO_Port, B1_Pin) == GPIO_PIN_RESET){
			HAL_GPIO_WritePin(LD2_GPIO_Port, LD2_Pin, GPIO_PIN_RESET);
			reset_flag = 1;
		} else {
			HAL_GPIO_WritePin(LD2_GPIO_Port, LD2_Pin, GPIO_PIN_SET);
			reset_flag = 0;
		}
	}
}

3.2 Setting up STM32CubeMonitor

Now let us set up the STM32CubeMonitor flow. We start from scratch, so that everything can be explained. Alternatively, you can use the basic flow as a starting point and just modify it. To view the basic flow example, right-click in a flow and go to "Insert" → "Import example flow". In the dialog window that opens, go to "Local" and select STM32CubeMonitor_BasicFlow.json.

First, we place the variables node, where we set the executable to the .elf file built from the code above and configure it to capture the variables reset_flag, vx, and vy. Set the acquisition mode to direct and the sampling frequency to sequential loop. The alternative to direct mode is snapshot mode, which is not covered in this article and requires changes to the MCU code as well as to the flow.

Fig. 5: Variables node setup

Then we create two button nodes: one that sends a message with the topic set to the string "start" to start the acquisition, and one with the topic set to the string "stop" to stop it.

Fig. 6: Button node topic setup

Connect these to the input of the variables node. Then we place acq stlink out and acq stlink in nodes and configure them to the desired MCU. Connect the acq stlink out node to the output of the variables node. Now, we can finally implement our data processing on the output of the acq stlink in node.

3.3 Implementing the function node

Once you have placed the function node and connected it to the acq stlink in node, open it and go to the "On Message" tab. If you recall from section 1.2.1, the acq stlink in node sends one message per measurement, with one sample of each variable. So, on each message, we first check whether a payload is present. If there is no payload to process, we return without sending a message.

Then we get the timestamp of the received data and check if we have already initialized the stored timestamp. If not, we initialize it now and return a message with the current timestamp and zero values for the position. If the stored timestamp was already initialized, we check the reset_flag. If it is set, we zero our stored data and return a message with the current timestamp and zero values for the position. Otherwise, we perform the integration (using the trapezoidal approximation) and return a message with the current timestamp and position values. This is implemented in the code below.

// in case of other message types (no sample data), return without result
if (!msg.payload || !Array.isArray(msg.payload.data)){
    return;
}

// constant timescale - same as in the MCU code
const timescale = 1;

// load past values - x and y position, past measurements for trapezoidal integration
var x = flow.get("x")||0;
var y = flow.get("y")||0;
var vx1 = context.get("vx")||0;
var vy1 = context.get("vy")||0;
var t1 = context.get("t")||0;

// data array: [[reset_flag], [vx], [vy]]
// get current timestamp
var t2 = msg.payload.data[0][0].x;

// extract current velocity values
var vx2 = msg.payload.data[1][0].y;
var vy2 = msg.payload.data[2][0].y;

// store the current sample for next time
// (done before the checks below, so the first trapezoid starts from real velocities)
context.set("t", t2);
context.set("vx", vx2);
context.set("vy", vy2);

// check for last timestamp - on the very first message there is nothing to integrate yet
if (t1 === 0){
    var data = [{"x":t2, "y":0}, {"x":t2, "y":0}];

    msg.payload.data = data;
    return msg;
}

// check for reset_flag
if (msg.payload.data[0][0].y === 1){
    flow.set("x", 0);
    flow.set("y", 0);
    var data = [{"x":t2, "y":0}, {"x":t2, "y":0}];

    msg.payload.data = data;
    return msg;
}

// compute trapezoidal integration
var dt = (t2 - t1)/1000.0 * timescale;
x += ((vx1 + vx2)/2.0)*dt;
y += ((vy1 + vy2)/2.0)*dt;

// store result
flow.set("x", x);
flow.set("y", y);

// compile message and return
var data1 = [{"x":t2, "y":x}, {"x":t2, "y":y}];
msg.payload.data = data1;
return msg;
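
To see the integration step in isolation, the sketch below extracts it into a pure function and applies it to a short, made-up velocity ramp. The function name trapezoidStep and the sample data are assumptions for this example; the formula itself is the same trapezoidal update used in the listing above.

```javascript
// One trapezoidal step: previous integral plus the area between two samples.
// Timestamps are in milliseconds, as delivered by the acquisition node.
function trapezoidStep(integral, tPrevMs, vPrev, tNowMs, vNow, timescale) {
    const dt = (tNowMs - tPrevMs) / 1000 * timescale; // seconds
    return integral + (vPrev + vNow) / 2 * dt;
}

// Invented test signal: velocity ramps linearly from 0 to 2 m/s over 1 s,
// sampled every 250 ms. The exact integral of this ramp is 1 m.
const samples = [
    { x: 0, y: 0.0 },
    { x: 250, y: 0.5 },
    { x: 500, y: 1.0 },
    { x: 750, y: 1.5 },
    { x: 1000, y: 2.0 }
];

let position = 0;
for (let i = 1; i < samples.length; i++) {
    position = trapezoidStep(
        position,
        samples[i - 1].x, samples[i - 1].y,
        samples[i].x, samples[i].y,
        1
    );
}
console.log(position);
```

For a linear ramp, the trapezoidal rule is exact, so the result matches the analytic value. For the pendulum velocities, the result is only an approximation whose quality depends on the sampling rate.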

Right now, your flow should look similar to the image below.

Fig. 7: Intermediate state of flow after adding custom function node processing

3.4 Visualizing the results

For visualizing the results, we can use the chart node. Once it is placed, add another button node with the topic set to the string "clear", which is used to clear the graph. Connect the button node to the chart node. You should also set the "Group" parameter of all the buttons to the chart group. As the chart node expects to receive messages from the processing node, we need to emulate that message structure in another function node. The code below does just that: it splits the message from our processing function node into two messages. Note that there is also a third message, which we will use shortly.

For the first two messages, the groupname does not have to match the "Group" name in your variables node. You can set it to a string that better describes the group. The topic has to be set to "data" now. Set the number of outputs for this function node to 3 to allow separate message routing. Each output will then only serve the corresponding message.

var msg1 = {
    "payload": {
        "variablename": "x",
        "variabledata": [{
            "x": msg.payload.data[0].x,
            "y": msg.payload.data[0].y }],
        "groupname": "myIntegrals" },
    "topic": "data" };
var msg2 = {
    "payload": {
        "variablename": "y",
        "variabledata": [{
            "x": msg.payload.data[1].x,
            "y": msg.payload.data[1].y }],
        "groupname": "myIntegrals" },
    "topic": "data" };
var msg3 = {
    "payload": {
        "x": msg.payload.data[0].y,
        "y": msg.payload.data[1].y } };

return [msg1, msg2, msg3];

We want to emulate the processing node and reduce the computational load caused by sending each measurement to the chart node in a separate message. To do that, we route each of the first two outputs into a separate join node. Set each of these to manual mode, so that it combines each msg.payload.variabledata[0] to create an Array. Set the "Send the message: After a timeout following the first message" option to an interval in seconds. I used 0.05 s to mimic the processing node behavior.

Fig. 8: Join node setup

This process almost creates the msg structure that we want, although it wraps the variabledata array in another array. We need to extract it properly. We can use the change node with a rule that sets msg.payload.variabledata to msg.payload.variabledata[0].

Fig. 9: Change node setup

Alternatively, we can use another function node with the code below, which simply removes one layer of array from the variabledata. Connect the outputs of both join nodes to the input of the change node (or of the function node, if you use that instead) and connect the output of this node to the chart node.

msg.payload.variabledata = msg.payload.variabledata[0];

return msg;
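
The effect of this unwrapping can be illustrated on a made-up message. The message contents below are invented, and the exact shape produced by the join node may differ in detail from this sketch; the helper name unwrapVariabledata and its guard against already flat arrays are additions for the example.

```javascript
// Illustrative message as it might leave the join node: the batched samples
// sit one array level too deep for the chart node.
const joined = {
    topic: "data",
    payload: {
        groupname: "myIntegrals",
        variablename: "x",
        variabledata: [[{ x: 1000, y: 0.10 }, { x: 1010, y: 0.12 }]]
    }
};

// Same operation as the change node rule, guarded so that an already flat
// array passes through unchanged.
function unwrapVariabledata(msg) {
    const data = msg.payload.variabledata;
    if (Array.isArray(data) && Array.isArray(data[0])) {
        msg.payload.variabledata = data[0];
    }
    return msg;
}

const flat = unwrapVariabledata(joined);
console.log(flat.payload.variabledata.length);
```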

We use the third message to visualize the pendulum position more intuitively. We use the radar node, which can visualize a point given its x and y coordinates. As of now, the bottom position of the pendulum corresponds to the x and y position being zero. To center the motion around the center point [0, 0] in the default radar image, move the position y down by the length of the pendulum arm L. In the MCU C code example, it is set to 0.7134. This length corresponds to a 2-second period with θ₀ being 90°. We create another function node to provide this simple correction with the code below.

const L = 0.7134;

var y = msg.payload.y;

msg.payload.y = y - L;
return msg;

Now, just connect the third output of the emulating function node to the input of the centering function node. Connect its output to the radar node, while also setting its Group to the chart group. Additionally, you can put a delay node in between the emulating and centering function nodes to limit the dashboard message rate, to decrease the computational load even further.

Fig. 10: Delay node setup

And we are done. Your flow should now look something like this:

Fig. 11: Final flow

Finally, just deploy the flow, open the dashboard (upper right-hand corner), start the acquisition, press the physical button to zero and synchronize the integration, and enjoy the show.

Fig. 12: Example visualization

Conclusion

In this article, we learned about the basics of how STM32CubeMonitor works under the hood (Node-RED based), explored the data flow (variables -> acq stlink out -> MCU -> acq stlink in -> processing) and the data/msg.payload structure (acq stlink in = parallel variable values; processing = serialized variable values of 50 ms, each variable one message). Then we introduced the function node and context data storage and how they work. Finally, the gained knowledge was put to use by implementing an example application from the ground up.

Final thoughts

If you, like me, have been mesmerized by the periodic motion in the example visualization, you might have spotted that the position slightly shifts over time. That is to be expected, as the acquisition is not synchronized with the variable changes. We just hope to acquire as much data as possible to minimize the effect. Thus, we miss some velocity values, which results in the position shifting over time. This could be addressed by synchronizing the acquisition using the variables node and updated MCU code, but this is out of scope for this article.

To improve your understanding of STM32CubeMonitor, have a look at the STM32CubeMonitor wiki page and Node-RED documentation in the related links below. For most applications, you can rely on the built-in help in STM32CubeMonitor in the right-hand side dialog window. It is detailed and easily accessible.

Hope this article has helped you leverage the incredible potential of STM32CubeMonitor and improved your ST experience.

ALABSTM · 2025-06-02

Hi @podhosim,

Thank you for this tutorial. I really appreciated the pendulum example. Would it be possible to have another tutorial showing how to monitor the microcontroller's power consumption? It would be very useful, for instance, to compare the case where DMA controller is enabled and the case it is not.

Best regards,

podhosim · 2025-06-04

Hi @ALABSTM

I am glad that you enjoyed the article!

There is a tool for measuring the MCU's power consumption - STM32CubeMonPwr. This requires additional hardware. It acquires power measurements through the STLINK-V3PWR probe, the X-NUCLEO-LPM01A expansion board, or the energy meter of the STM32L562E-DK Discovery kit specialized intermediate board. The procedure to do so is well documented in the user manual.

Hope this helps.

BR,

podhosim