
Problem with the X-CUBE-EEPROM Library

BPrem.1
Associate II

My colleagues reported data loss when using the X-CUBE-EEPROM library on STM32G473RET6 processors. The data loss occurred after long periods of use or testing of the embedded systems, which were switched off and on quite frequently. I initially suspected that the recovery process after a power failure during a Flash page write or erase was inadequate, so I decided to investigate. Ultimately, I discovered that the primary error was on our end: most of the virtual addresses used in the firmware exceeded NB_OF_VARIABLES. While testing, reading the documentation, and reviewing the code, I also discovered a missing boundary check and a few other issues.

The inadequate boundary check is in the EEPROM library. The "eeprom_emul_conf.h" configuration file contains the line

#define NB_OF_VARIABLES xxx /*!< Number of variables to handle in eeprom */

My colleagues misunderstood the purpose of NB_OF_VARIABLES. They thought it only determined, indirectly, how many Flash pages are needed to store all their variables and how many values with different virtual addresses could be stored. AN4894 mentions twice that the virtual address 0xFFFF is forbidden, but it does not state directly that no virtual address may be larger than NB_OF_VARIABLES. It defines the compile-time parameter as "NB_OF_VARIABLES (default 1000, 100 for STM32C0 series(a)): Number of nonvolatile elements, each element value being 8-, 16-, 32- or 96-bit." without mentioning that virtual addresses above NB_OF_VARIABLES are not supported properly. The sentence that does cover this, "The driver requires the virtual address values to be between 0x0001 (0x0000 corresponds to an EEPROM element invalidated by the driver), and the maximum number of EEPROM variables required.", is easy to overlook or misunderstand.

The catch is that data with any virtual address up to and including 0xFFFE can be stored in the emulated EEPROM normally, provided there is enough space in the active set of pages used for the EEPROM emulation. Values at those addresses can also be read back normally. The problem arises when the PagesTransfer() function copies data to the second set of pages (after the first set is full). This function only copies data with virtual addresses up to and including NB_OF_VARIABLES to the second set of Flash pages, so all recorded values with larger virtual addresses are lost. For my colleagues this always happened after a power cycle (switch off/on), because the EE_Init() function performs a dummy write of '0' to eliminate potential instability of the 0xFFFFFFFF line value consecutive to a reset during a write operation. That's why I initially looked for the cause of the error in the wrong place (assuming that power failures and the recovery process were to blame).
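To make the failure mode concrete, here is a minimal host-side toy model in C. It is not the library's code: page_set_t, toy_write(), toy_read() and toy_transfer() are hypothetical names invented for this illustration, and NB_OF_VARIABLES is set to 4 only to keep the demo small. The model assumes the behaviour described above: writes reject only 0x0000 and 0xFFFF, while the transfer loops over addresses 1 to NB_OF_VARIABLES.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NB_OF_VARIABLES 4u   /* deliberately small for the demo */
#define LOG_CAPACITY    16u  /* hypothetical capacity of one page set */

typedef struct { uint16_t addr; uint32_t value; } record_t;

typedef struct {
    record_t rec[LOG_CAPACITY];
    size_t   count;
} page_set_t;

/* Loose check, as described in the post: only 0x0000 and 0xFFFF are rejected. */
static int toy_write(page_set_t *ps, uint16_t addr, uint32_t value)
{
    if (addr == 0x0000u || addr == 0xFFFFu) return -1;  /* invalid address */
    if (ps->count >= LOG_CAPACITY) return -2;           /* page set full */
    ps->rec[ps->count].addr  = addr;
    ps->rec[ps->count].value = value;
    ps->count++;
    return 0;
}

/* The newest record for an address wins, so scan from the end. */
static int toy_read(const page_set_t *ps, uint16_t addr, uint32_t *out)
{
    for (size_t i = ps->count; i > 0; i--) {
        if (ps->rec[i - 1].addr == addr) {
            *out = ps->rec[i - 1].value;
            return 0;
        }
    }
    return -1;  /* not found */
}

/* The transfer only visits addresses 1..NB_OF_VARIABLES,
 * so records with larger addresses are never copied. */
static void toy_transfer(const page_set_t *from, page_set_t *to)
{
    to->count = 0;
    for (uint16_t a = 1; a <= NB_OF_VARIABLES; a++) {
        uint32_t v;
        if (toy_read(from, a, &v) == 0) {
            int rc = toy_write(to, a, v);
            assert(rc == 0);  /* the target set is empty, so this cannot fail here */
            (void)rc;
        }
    }
}
```

In this model, a value written to address 5000 can be read back from the first page set, but after toy_transfer() a read of address 5000 from the second set fails, while a value stored at address 2 survives.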

To avoid this kind of problem in the future, I recommend that all functions (EEPROM read and write) perform stricter boundary checks on the virtual address. Rather than checking that the address is different from 0 and 0xFFFF, the functions should check that the address is different from 0 and less than or equal to NB_OF_VARIABLES. Then, when writing or reading a value with a virtual address that is too large, the EE_INVALID_VIRTUAL_ADDRESS error will be reported.
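A minimal sketch of the check I have in mind is shown here. The helper name virt_addr_is_valid() is hypothetical and not part of the library; in the real code, the functions would return EE_INVALID_VIRTUAL_ADDRESS whenever such a check fails. NB_OF_VARIABLES is set to the documented default of 1000 so that the snippet compiles on its own.

```c
#include <stdbool.h>
#include <stdint.h>

#define NB_OF_VARIABLES 1000u  /* default value according to AN4894 */

/* Hypothetical helper: a virtual address is valid only in 1..NB_OF_VARIABLES.
 * This also rejects 0xFFFF as long as NB_OF_VARIABLES is below 0xFFFF. */
static bool virt_addr_is_valid(uint16_t addr)
{
    return (addr != 0x0000u) && (addr <= NB_OF_VARIABLES);
}
```

With this check, addresses 1 and 1000 are accepted, while 0, 1001, 0xFFFE and 0xFFFF are rejected at the time of the call instead of being lost later during a page transfer.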

I discovered a few more possible improvements to X-CUBE-EEPROM, if anyone is interested. In my opinion, the current handling of Flash ECC errors after a power outage during a page erase is not optimal. The preventive erasing of Flash pages when executing EE_Format(EE_FORCED_ERASE) could also be avoided, because frequent power-ups of the embedded system then cause unnecessary cycling of Flash pages. However, if I were to include all the details, this report would become too long. I'm willing to contribute to the X-CUBE-EEPROM library, but I couldn't find an X-CUBE-EEPROM repository on https://github.com/STMicroelectronics.

1 ACCEPTED SOLUTION

Accepted Solutions
Saket_Om
ST Employee

Hello @BPrem.1 

It's explicitly described in AN 4894 that "The driver requires the virtual address values to be between 0x0001 (0x0000 corresponds to an EEPROM element invalidated by the driver), and the maximum number of EEPROM variables required"
The MW can be customized by customers to tailor it to their own use cases if they need to target specific values.

 


4 REPLIES
Saket_Om
ST Employee

Hello @BPrem.1 

Thank you for bringing this issue to our attention.

I reported this internally.

Internal ticket number: 212304 (This is an internal tracking number and is not accessible or usable by customers).


@Saket_Om wrote:

It's explicitly described in AN 4894 that "The driver requires the virtual address values to be between 0x0001 (0x0000 corresponds to an EEPROM element invalidated by the driver), and the maximum number of EEPROM variables required"


@BPrem.1 did note that, but suggested that it's easy to miss:

 


@BPrem.1 wrote:

The sentence "The driver requires the virtual address values to be between 0x0001 (0x0000 corresponds to an EEPROM element invalidated by the driver), and the maximum number of EEPROM variables required." you can easily overlook or misunderstand.


(my emphasis)


Hi Saket,

I agree that the information you quoted is included in AN4894. Unfortunately, two of my colleagues, working on two different projects, missed it. If this information were in the description of NB_OF_VARIABLES, it would be more obvious. Better still, if an error were signaled when trying to write data to a virtual address that is too large, the mistake would be detected immediately. Developers have to review huge amounts of documentation, so it is better for the solutions themselves to be robust.

The description "NB_OF_VARIABLES (default 1000, 100 for STM32C0 series(a)): Number of nonvolatile elements, each element value being 8-, 16-, 32- or 96-bit." does not say that virtual addresses from 1 to NB_OF_VARIABLES must be used. My colleagues concluded that only the number of values is limited to NB_OF_VARIABLES, not that they must use consecutive addresses from 1 to NB_OF_VARIABLES.
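Until the library checks this itself, an application can guard against the mistake at compile time. The following is only a sketch under my own assumptions: the enum app_vaddr and its member names are hypothetical, and NB_OF_VARIABLES is defined locally with the documented default instead of being taken from "eeprom_emul_conf.h". It requires a C11 compiler for _Static_assert.

```c
#include <stdint.h>

#define NB_OF_VARIABLES 1000u  /* normally comes from eeprom_emul_conf.h */

/* Hypothetical application variables, numbered consecutively from 1. */
enum app_vaddr {
    VADDR_SERIAL_NUMBER = 1,
    VADDR_CALIB_OFFSET,
    VADDR_CALIB_GAIN,
    VADDR_BOOT_COUNT,
    VADDR_END  /* one past the last used virtual address */
};

/* The build fails if any virtual address would exceed NB_OF_VARIABLES. */
_Static_assert((VADDR_END - 1) <= (int)NB_OF_VARIABLES,
               "virtual addresses must stay within 1..NB_OF_VARIABLES");
```

Keeping all virtual addresses in one enum means a new variable automatically gets the next consecutive address, and the assertion catches the case where the list outgrows NB_OF_VARIABLES.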

From my perspective, it is an error if the EE_WriteVariableXXbits() function accepts data with a virtual address that is too large and stores it in the emulated EEPROM, only for the PagesTransfer() function to discard it when copying the data to another set of Flash pages. For applications that do not write much to EEPROM, this can happen weeks or months later. Therefore, I suggest that you implement stricter boundary checks on the virtual address in your code. Rather than checking that the address is different from 0 and 0xFFFF, the functions should check that the address is different from 0 and less than or equal to NB_OF_VARIABLES. Then, when writing or reading a value with a virtual address that is too large, the EE_INVALID_VIRTUAL_ADDRESS error will be reported, and the problem in the customer code will be detected immediately.