STM32H563 bootloader command over SPI

Dimlite
Associate II

Based on the information in AN4286 rev 14, I am trying to command the bootloader to start executing the application.

The setup is an H573 and an H563 connected over SPI at 0.5 MHz. The H573 operates as SPI master and the H563 as slave. I know that the hardware is fine, as I have no problem with application-to-application communication when both are configured with BOOT0 = 0.

Then I set BOOT0 = 1 on the H563 and try the following code:

void BootloaderTest(void)
{
    clrEsbCsN;

    DelayUsBlocking(10);

    // Sync frame
    uart_printf("SEND SYNC 0x5A, RX: %.2X\r\n", SpiWriteEsb(0x5A));
    DelayUsBlocking(10);

    // Get ACK.
    if (GetAck())
    {
        // Send GO command.
        uart_printf("SEND SOF 0x5A, RX: %.2X\r\n", SpiWriteEsb(0x5A));
        DelayUsBlocking(10);

        uart_printf("SEND CMD GO 0x21, RX: %.2X\r\n", SpiWriteEsb(0x21));
        DelayUsBlocking(10);

        uart_printf("SEND CHKSUM, RX: %.2X\r\n", SpiWriteEsb(~0x21));   // single byte command, nothing to XOR.
        DelayUsBlocking(10);

        // Get ack of command.
        uart_printf("SEND 0x00, RX: %.2X\r\n", SpiWriteEsb(0x00));
        DelayUsBlocking(10);

        uart_printf("SEND 0x00, RX: %.2X\r\n", SpiWriteEsb(0x00));
        DelayUsBlocking(10);

        // TODO: check if we received 0x79.

        uart_printf("SEND 0x79, RX: %.2X\r\n", SpiWriteEsb(0x79));
        DelayUsBlocking(10);
    }


    DelayUsBlocking(10);

    setEsbCsN;
}

static bool GetAck(void)
{
    uart_printf("GET ACK\r\n");

    uart_printf("SEND 0x00, RX DUMMY: %.2X\r\n", SpiWriteEsb(0x00));
    DelayUsBlocking(10);

    uart_printf("WAIT FOR ACK OR NACK\r\n");

    do
    {
        uint8 ack = SpiWriteEsb(0xFF);

        uart_printf("SEND DUMMY, RX: %.2X\r\n", ack);
        DelayUsBlocking(10);

        if (ack == 0x79)
        {
            uart_printf("ACK RECEIVED!\r\n");
            uart_printf("SEND 0x79, RX DUMMY: %.2X\r\n", SpiWriteEsb(0x79));
            DelayUsBlocking(10);
            return true;
        }

        if (ack == 0x1F)
        {
            uart_printf("NACK RECEIVED!\r\n");
            return false;
        }

        uart_printf("Trying again.\r\n");

    } while (true);

    return false; // TODO: won't come here unless we limit the nbr of retries.
}

If we assume for now that the low level parts are fine (we can get back to this if needed), do you see any obvious flaws in the implementation with regard to how AN4286 describes it?

This is the output I get:

SEND SYNC 0x5A, RX: A5
GET ACK
SEND 0x00, RX DUMMY: A5
WAIT FOR ACK OR NACK
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: A5
Trying again.
SEND DUMMY, RX: 79
ACK RECEIVED!
SEND 0x79, RX DUMMY: A5
SEND SOF 0x5A, RX: A5
SEND CMD GO 0x21, RX: A5
SEND CHKSUM, RX: A5
SEND 0x00, RX: A5
SEND 0x00, RX: A5
SEND 0x79, RX: A5

The application is not started, and I know it is there because I flashed and tried it before setting BOOT0 = 1.

What is interesting is that next time I send the same command, this is what I get:

TCP: Get status
SEND SYNC 0x5A, RX: A5
GET ACK
SEND 0x00, RX DUMMY: A5
WAIT FOR ACK OR NACK
SEND DUMMY, RX: 1F
NACK RECEIVED!

I always get a NACK on every second attempt. If I were to try again, I would get the first sequence (many attempts for ack), and on the attempt after that I get a NACK etc.

Something is obviously affecting the internal state machine in the bootloader, but I am missing something here.

Thanks in advance.

 

Accepted Solutions
Dimlite
Associate II

I am usually very satisfied with ST's hardware and documentation, but getting the bootloader in the STM32H563 to work over SPI was quite a miserable experience.

Here I have summarized my experiences and attached working code to communicate with the bootloader from another MCU. Hopefully, this will help others in the same situation.

---------------------------------------------------------------------------------

The documents I had on hand were:

AN2606 rev 64 (STM32 microcontroller system memory boot mode)
This document provides a general description of the bootloader and a more detailed description of the various STM32 variants.

AN4286 rev 14 (SPI protocol used in the STM32 bootloader)
This document describes the SPI protocol and the various commands supported by the bootloader.

RM0481 rev 2
STM32H563 Reference manual.

DS14258 rev 1
STM32H563 Datasheet.

UM2448 rev 8
STLINK-V3SET user manual.

---------------------------------------------------------------------------------

Hardware:

The hardware consisted of two custom boards, one equipped with STM32H573 (SPI master in application mode) and the other with STM32H563 (SPI slave in bootloader mode). These two boards were connected with approximately 14 cm ribbon cable, with ground between each SPI signal.

In terms of hardware, it is sufficient to set pin BOOT0 = 1 to configure the MCU to start in bootloader mode.

One issue is that the bootloader supports many different interfaces (SPI, UART, I2C, USB, CAN, etc.). As soon as you want to use the bootloader, all these interfaces are activated, and therefore the corresponding IO pins must be treated with special care; for example, pull-up/down resistors are needed for unused interfaces, which can affect functionality when these are also used by your own application (especially ADC inputs, where you generally do not want pull-up/down).

---------------------------------------------------------------------------------

Read datasheet and write software.

When I started writing the software for this, my first problem was the lack of information about SPI configuration in AN2606. Other MCUs have information on speed, bit order, CPOL, CPHA, etc., but this is missing here. This has been reported to ST, and they have confirmed that the default values for the SPI registers should be used. Nothing was mentioned about the maximum speed, but I guess it is 8 MHz like the other MCUs. However, I ran significantly slower for safety, at most 750 kHz.

After following the instructions in AN4286 regarding SYNC and ACK of sync byte, it became apparent that many dummy bytes must be sent to get an ACK from the bootloader. Often, about 30 attempts are required, and no upper limit is stated in the datasheet. I can understand that it may take time when the bootloader is listening to all possible interfaces, but as soon as you have established a connection on one interface, the others are disabled. However, even after this, many attempts are still needed to receive an ACK, which suggests to me that the bootloader's software is quite inefficiently written.
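
Since no upper limit is documented, the polling loop is best bounded on the host side. The following is a minimal sketch of such a bounded wait, modelled on the GetAck() function in the question. The names `bl_wait_ack`, `spi_xfer_fn` and `bl_result_t` are hypothetical, as is the simulated slave, which exists only so the loop can be exercised without hardware. The 0xA5 idle byte is taken from the log above.

```c
#include <stdint.h>

#define BL_ACK   0x79u
#define BL_NACK  0x1Fu

/* Hypothetical hook: clocks one byte out, returns the byte clocked in. */
typedef uint8_t (*spi_xfer_fn)(uint8_t tx);

typedef enum { BL_RESULT_ACK, BL_RESULT_NACK, BL_RESULT_TIMEOUT } bl_result_t;

/* Poll for ACK/NACK, giving up after max_polls attempts. */
bl_result_t bl_wait_ack(spi_xfer_fn xfer, unsigned max_polls)
{
    unsigned i;

    (void)xfer(0x00);                   /* dummy byte before polling */

    for (i = 0; i < max_polls; i++) {
        uint8_t rx = xfer(0xFF);        /* value sent is irrelevant here */

        if (rx == BL_ACK) {
            (void)xfer(BL_ACK);         /* host acknowledges the ACK */
            return BL_RESULT_ACK;
        }
        if (rx == BL_NACK) {
            return BL_RESULT_NACK;
        }
    }
    return BL_RESULT_TIMEOUT;           /* slave never answered */
}

/* Simulated slave (assumption, for host-side testing only): answers
 * 0xA5 for sim_busy_polls polls, then answers ACK once. */
unsigned sim_busy_polls = 30;
unsigned sim_calls;

uint8_t sim_xfer(uint8_t tx)
{
    (void)tx;
    sim_calls++;
    /* Call 1 is the dummy byte, so the first poll is call 2. */
    return (sim_calls == sim_busy_polls + 2u) ? BL_ACK : 0xA5u;
}
```

Calling `bl_wait_ack(sim_xfer, 100)` against the simulated slave returns `BL_RESULT_ACK` after 30 unanswered polls, while a limit of 10 returns `BL_RESULT_TIMEOUT`. The limit itself is an arbitrary choice; about 30 polls was typical in my measurements, so anything well above that should do.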

At this point, I tried sending GO 0x08000000 (the MCU had previously been loaded with software flashed via STLINK-V3 and BOOT0 = 0, and I knew that program worked), but nothing started.
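
For reference, the GO sequence in AN4286 consists of two frames with an ACK wait after each: a command frame (SOF 0x5A, command 0x21, complement of the command) and an address frame (four address bytes, MSB first, followed by the XOR of those four bytes). The sketch below only builds the two frames; the function names are hypothetical and the actual SPI transfer is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define BL_SOF     0x5Au
#define BL_CMD_GO  0x21u

/* Command frame: SOF, command, bitwise complement of the command. */
size_t bl_build_cmd_frame(uint8_t cmd, uint8_t out[3])
{
    out[0] = BL_SOF;
    out[1] = cmd;
    out[2] = (uint8_t)~cmd;
    return 3;
}

/* Address frame: 4 address bytes, MSB first, then XOR of those bytes. */
size_t bl_build_addr_frame(uint32_t addr, uint8_t out[5])
{
    out[0] = (uint8_t)(addr >> 24);
    out[1] = (uint8_t)(addr >> 16);
    out[2] = (uint8_t)(addr >> 8);
    out[3] = (uint8_t)addr;
    out[4] = (uint8_t)(out[0] ^ out[1] ^ out[2] ^ out[3]);
    return 5;
}
```

For GO 0x08000000 this gives the command frame 5A 21 DE and the address frame 08 00 00 00 08.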

---------------------------------------------------------------------------------

Use STLINK-V3 and STM32 Cube Programmer CLI to learn more about the protocol.

After contacting ST support, I was advised to use STM32CubeProgrammer CLI together with an STLINK-V3 equipped with MB1440 to obtain SPI from the dongle and test if the MCU responded that way.

Information about this is available in UM2448 and at:

https://community.st.com/t5/stm32-mcus/how-to-use-stm32cubeprogrammer-and-the-stlink-v3set-to-access/ta-p/49607

The CLI command to read option bytes was as follows:

STM32_Programmer_CLI.exe -vb 3 -c port=SPI -OB displ

This worked well, except that I had to power cycle the board after each run of the command; otherwise, I would immediately receive a NACK on the next attempt.

After connecting the oscilloscope to NSS, MOSI, MISO, and SCK, it turned out that for each individual byte that the master sends (remember that reading also involves a writing operation in SPI), NSS is pulled low and released. This is a significant deviation from how NSS is illustrated in, for example, Figures 3, 4, and 5 in AN4286. See the attached screenshot from the oscilloscope, which shows SYNC and a large number of dummy bytes (0x00) sent before we receive ACK from the slave.
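
The per-byte NSS framing seen on the oscilloscope can be sketched as follows. This is only an illustration of the framing: the `bl_spi_port_t` hooks and the stub functions are hypothetical names, not part of any ST library, and the inter-byte gap is passed in as a parameter because I do not have a confirmed value for it. The stubs record a trace so that the order of operations can be checked without hardware.

```c
#include <stdint.h>

/* Hypothetical board hooks; none of these names come from ST. */
typedef struct {
    void    (*nss_low)(void);
    void    (*nss_high)(void);
    uint8_t (*xfer)(uint8_t tx);
    void    (*delay_us)(unsigned us);
} bl_spi_port_t;

/* Transfer one byte framed by its own NSS pulse. */
uint8_t bl_xfer_byte(const bl_spi_port_t *port, uint8_t tx, unsigned gap_us)
{
    uint8_t rx;

    port->nss_low();
    rx = port->xfer(tx);
    port->nss_high();
    port->delay_us(gap_us);         /* inter-byte gap, value is a guess */
    return rx;
}

/* Trace-recording stubs (assumption, for host-side testing only). */
char     stub_trace[64];
unsigned stub_trace_len;

static void stub_record(char c)
{
    if (stub_trace_len < sizeof stub_trace - 1u) {
        stub_trace[stub_trace_len++] = c;
        stub_trace[stub_trace_len] = '\0';
    }
}

void    stub_nss_low(void)        { stub_record('L'); }
void    stub_nss_high(void)       { stub_record('H'); }
uint8_t stub_xfer(uint8_t tx)     { (void)tx; stub_record('X'); return 0xA5u; }
void    stub_delay_us(unsigned u) { (void)u; stub_record('D'); }

const bl_spi_port_t stub_port = {
    stub_nss_low, stub_nss_high, stub_xfer, stub_delay_us
};
```

Sending two bytes through `bl_xfer_byte(&stub_port, ...)` leaves the trace `LXHDLXHD`: NSS low, transfer, NSS high and gap, repeated for every byte, rather than one NSS pulse around the whole frame.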

Since things weren't working, I changed my own code to toggle NSS for each byte in the same way. I cannot say for sure that this was the source of the error, but there is a remarkably large difference between the documentation and the implementation.

---------------------------------------------------------------------------------

Weird things in CLI.

While experimenting with the CLI, I tried changing the baud rate with the parameter br=500. The default is 375 kHz, and I wanted to increase it. The tool appears to accept only multiples of some base frequency: a value such as 500 kHz is silently ignored, and the frequency remains at 375 kHz.

Another quirk is the argument order. In the command above, -c port=SPI is specified after -vb 3. With that ordering it was not possible to add the br parameter; doing so resulted in an error message (Warning: Baudrate cannot be equal to zero), even for a valid frequency like 750 kHz.

That is, this does not work:
STM32_Programmer_CLI.exe -vb 3 -c port=SPI br=750 -OB displ
Warning: Baudrate cannot be equal to zero

However, it works if I change the order of the commands.
STM32_Programmer_CLI.exe -c port=SPI br=750 -vb 3 -OB displ
OK!

Another parameter I never got to work was the delay.

delay: FEW_MICROSEC

STM32_Programmer_CLI.exe -c port=SPI br=500 -delay=nodelay -vb 3

A delay of 1 ms is inserted between bytes, which makes the oscilloscope capture hard to overview. According to the datasheet, 15 µs is sufficient, although I am not sure that this figure refers to the delay between bytes; the documentation does not provide such details.

But no matter what I tried, my delay values were ignored.

---------------------------------------------------------------------------------

Conflicting documentation.

According to AN4286 (section 2.6, Go command), the peripheral registers are reset to their default values:

When the address is valid and the checksum is correct, the bootloader firmware performs the following actions:
• Initializes the registers of the peripherals used by the bootloader to their default reset values.
• Initializes the user application main stack pointer.
• Jumps to the memory location programmed in the received ‘address + 4’

But in AN2606 (page 33), it states:

When executing the Go command, the peripheral registers used by the bootloader are not initialized to their default reset values before jumping to the user application. They must be reconfigured in the user application if they are used.

I don’t know what to believe.

---------------------------------------------------------------------------------

Weak protocol.

Aside from the exceedingly inefficient handling of ACK described above, the checksum used to verify communication is also very weak. An XOR checksum detects any single-bit error, but two errors in the same bit position of different bytes cancel each other out and go unnoticed. In the context of OTA, it is extremely important that what is written to flash is correct, and I do not know if I would trust this.
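
The weakness is easy to demonstrate. The function below is a plain byte-wise XOR as used for the bootloader frames; the name `xor_checksum` is my own.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte-wise XOR over a buffer. */
uint8_t xor_checksum(const uint8_t *data, size_t len)
{
    uint8_t sum = 0;
    size_t  i;

    for (i = 0; i < len; i++) {
        sum ^= data[i];
    }
    return sum;
}
```

The payloads {0x12, 0x34, 0x56, 0x78} and {0x13, 0x34, 0x57, 0x78} differ in two bits (bit 0 of the first and third bytes), yet both give the checksum 0x08. Flipping only one of those bits changes the checksum to 0x09, so the single error is caught but the double error is not.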

---------------------------------------------------------------------------------

Summary.

Did I manage to get it working?
I can correctly read the SPI protocol version (command 0x01) and the chip ID (command 0x02). But I never got GO to work; I get an ACK, but the program does not start.

If GO is preceded by another command, for example Get ID, I get neither ACK nor NACK on GO.

I have now reached a point where I consider that the bootloader does not meet my needs, and I will not spend any more time on it. But I am attaching working source code for anyone who wants to continue working on this.
