
NUCLEO-L476RG invalidated firmware update behavior

ac_gd
Associate II

Hi all,

We have been testing firmware updates with our 2-image SBSFU configuration on a NUCLEO-L476RG. We encountered 2 scenarios in which the steps taken by the bootloader were not entirely clear to us. We could use your help to clarify this.

In both scenarios we started by flashing our firmware to the STM32 in the normal way. Nothing special.

Subsequently, in the first scenario, we do a hardware reset and transfer new firmware to the STM32 over YMODEM. When this update is complete, we do another hardware reset and transfer new firmware over YMODEM again, before the previous firmware has been validated (this was intentional, to test this scenario). You can find the output log of this scenario in the attached file output-log-scenario-1.txt.

We have 2 questions about this scenario and its output:

  • Are we correct to assume that the following happens?
    1. In the first update, the new FW is downloaded into the DWL slot and successfully swapped with the ACTIVE slot. After the installation is complete, the FW in the ACTIVE slot is not validated yet.
    2. Then, the hardware reset triggers the second update. After that second YMODEM transfer, the STM32 resets, and the bootloader realizes that the ACTIVE slot was not validated yet, so a rollback should happen.
    3. It wants to roll back to the backed-up FW that is normally in the DWL slot, but this slot now contains the FW that was just downloaded (and not any backed-up FW). It therefore says "backed up FW not identified", and the rollback fails.
    4. It resets, and after the reset it finds a newly downloaded FW in the DWL slot, so it just installs that one. Is this a correct interpretation?
  • What does it mean that "backed-up FW is not identified"? Does it mean that it is not validated, or does it have another meaning?
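
To make the interpretation above concrete, here is a minimal sketch of the boot decision as we understand it. It is only a model of our reading of the logs, not SBSFU code: every type and function name in it (slot_info_t, boot_decide, and so on) is hypothetical, and it folds the failed rollback and the following reset into a single decision.

```c
#include <stdbool.h>

/* Hypothetical model of the flow described above; not taken from SBSFU. */

typedef enum {
    IMG_SELFTEST, /* installed, but not yet validated by the application */
    IMG_VALID     /* validated by the application */
} img_state_t;

typedef struct {
    bool present;      /* slot holds a recognizable image */
    img_state_t state; /* only meaningful for the ACTIVE slot */
    bool is_backup;    /* DWL slot only: image is the backup of the previous FW */
} slot_info_t;

typedef enum {
    BOOT_RUN_ACTIVE,
    BOOT_ROLLBACK,
    BOOT_INSTALL_DWL,
    BOOT_LOCAL_DOWNLOAD
} boot_action_t;

boot_action_t boot_decide(const slot_info_t *active, const slot_info_t *dwl)
{
    /* A validated active image is simply executed. */
    if (active->present && active->state == IMG_VALID) {
        return BOOT_RUN_ACTIVE;
    }
    /* A non-validated active image is rolled back, but only if the DWL slot
     * really holds the backup of the previous firmware. */
    if (active->present && active->state == IMG_SELFTEST &&
        dwl->present && dwl->is_backup) {
        return BOOT_ROLLBACK;
    }
    /* Rollback not possible ("backed up FW not identified") or no active
     * image: a freshly downloaded image in the DWL slot gets installed. */
    if (dwl->present && !dwl->is_backup) {
        return BOOT_INSTALL_DWL;
    }
    /* Nothing usable anywhere: wait for a local download. */
    return BOOT_LOCAL_DOWNLOAD;
}
```

In this model, scenario 1 corresponds to a non-validated ACTIVE image plus a DWL slot that holds a new image rather than a backup, which yields BOOT_INSTALL_DWL.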

In the second scenario, for which you can find the logs in output-log-scenario-2.txt, a similar process happens. The first FW update installs successfully and starts running. However, the second FW update is started before the first FW update was validated (again, this was intentional for this scenario) and fails during the YMODEM transfer.

Here we have 4 questions:

  • Is this the correct interpretation?
    1. In the first update, the new FW is downloaded into the DWL slot and successfully swapped with the ACTIVE slot.
    2. After the installation is complete, the FW in the ACTIVE slot is not validated yet. Then, the hardware reset triggers the second update.
    3. The second update then fails during the YMODEM transfer (the reason for the YMODEM failure is not important for our question) and the STM32 resets. After resetting, the bootloader realizes that the FW in ACTIVE1 is not validated yet and wants to roll back. For some reason, the backed-up FW is now identified correctly (why? Shouldn't this be the same as in scenario 1, something like "backed-up FW not identified"?) and the rollback procedure seems to complete without errors. However, the rollback did not happen yet.
    4. Then it resets, and it first erases ACTIVE1 (why?), and then it says "No valid FW found in the active slots nor new FW to be installed" because the YMODEM transfer into the DWL slot failed (so we think that now nothing is in the DWL slot?). So its only option is to wait for new FW.
  • Why don't we get the "backed up FW not identified" message? Shouldn't this part of the scenario be similar to scenario 1? Is it because the YMODEM transfer failed and the DWL slot may not contain anything useful?
  • Is it correct that when the YMODEM transfer fails, there is nothing in the DWL slot and therefore it says "No valid FW found in the active slots nor new FW to be installed"?
  • Why is the FW in ACTIVE1 erased? We think it happens at the following lines in sfu_boot.c:
      /* 3- No active firmware candidate for execution ==> Local download */
      if (m_ActiveSlotToExecute == 0U)
      {
        /* Control if all active slot are empty */
        for (i = 0U; i < SFU_NB_MAX_ACTIVE_IMAGE; i++)
        {
          if (SlotStartAdd[SLOT_ACTIVE_1 + i] != 0U)       /* Slot configured ? */
          {
            if (SFU_IMG_VerifyEmptyActiveSlot(SLOT_ACTIVE_1 + i) != SFU_SUCCESS)
            {
              /*
               * We should never reach this code.
               * Could come from an attack ==> as an example we invalidate current firmware.
               */
              TRACE("\r\n\t  Slot SLOT_ACTIVE_%d not empty : erasing ...", SLOT_ACTIVE_1 + i);
              (void)SFU_IMG_InvalidateCurrentFirmware(SLOT_ACTIVE_1 + i); /* If this fails we continue anyhow */
            }
          }
        }
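
For context on what an "empty slot" check could look like: the sketch below treats a flash region as empty when every byte reads back as 0xFF, the value of erased flash on this device. This is only an illustration of the idea. We did not check how SFU_IMG_VerifyEmptyActiveSlot is actually implemented, and the helper name region_is_erased is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the SBSFU implementation: returns true when
 * every byte in [start, start + len) holds the erased-flash value 0xFF. */
bool region_is_erased(const uint8_t *start, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (start[i] != 0xFFU) {
            return false; /* at least one programmed byte: not empty */
        }
    }
    return true;
}
```

With a check of this kind, a slot that contains any leftover data is reported as "not empty", even when no valid firmware header was detected in it. That is the combination the quoted branch reacts to.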

It would be great if you could give us some insight into the process behind this. Of course, we also checked the code, but we could not figure out some of these details.

Thanks in advance!

ac_gd

8 REPLIES
Bubbles
ST Employee

Hello @ac_gd​ ,

I'll start with the first scenario.

From the context I assume you are testing a configuration with ENABLE_IMAGE_STATE_HANDLING enabled.

If you look at UM2262, the flow is described in Appendix J, and your understanding seems to match the description there.

The error message you received is returned by several different pre-rollback checks, but generally it means the header contents do not match. That is expected, as the image in the DWL slot is not a backup.

Jarda

To give better visibility on the answered topics, please click on Accept as Solution on the reply which solved your issue or answered your question.

Bubbles
ST Employee

I'll look at the second scenario in detail tomorrow. It's obviously a different branch, and it is probably related to the fact that the first image was still the actual one, never replaced. In the meantime, please confirm what I wrote earlier.

Hi @JHOUD​, thanks for your answer and for the reference to Appendix J. I did not know it existed.

I've attached the flow image, combined with poor drawing skills. So looking at my output log and the flow, I guess my scenario follows the red line. A couple of follow-up questions:

  • So it checks "Is previous firmware valid?". I guess the "previous firmware" is the one in the DWL slot.
    • Is it somewhere during this test that in my scenario the "backed-up FW is not identified" message is generated? (Or is it in the next "roll-back step"?)
    • However, no local download is started (as indicated in the flow when the test fails). Is this because there is already a firmware in the DWL slot? I think this part in my output logs is rather weird: it says the rollback failed (because of error), so I guess it won't rollback to the image in the DWL slot, but on the next reset it does install that firmware from the DWL slot.
  • You mention: "generally it means the header contents do not match". Do you mean the header contents of the images in the ACTIVE and DWL slots? Why do those header contents have to match for a rollback? Can't these be different images, in which case the header contents would differ? I probably do not fully understand the meaning of the header contents.

0693W00000BZl0RQAT.png 

Thanks in advance,

ac_gd

Hi @JHOUD​, thanks again for your help. How I see it:

  • After first (successful) update (and thus swapping): ACTIVE slot contains the not validated new FW, the DWL slot contains the original validated FW.
  • After the second (not successful) update/YMODEM transfer: ACTIVE slot still contains the not validated FW (which is observed on the reset after the YMODEM transfer), but what does the DWL slot contain now? I guess: the original validated FW or nothing useful (because the transfer failed). If it contains the original validated FW, why erase the ACTIVE slot and wait for a new local download instead of just rolling back to this FW in the DWL slot?

Hi @ac_gd​ ,

Yes, the red line shows the flow that I believe best describes your case.

The flow is high-level. It's easy to see that not all code branches are included, only those needed to understand the underlying principle. So in reality I'd say the error message is generated in between.

I think I know exactly what's happening, but I don't have time to test it today. Please hold.

For the header-slot relationship, I'd point you to Appendix B, which describes the dual slot configuration. The header contains both "measurements" (length, integrity) and the life cycle information.

Rgds,

Jarda


Hello @ac_gd​ ,

I failed to replicate the exact same situation that your logs describe, so I'm not completely sure.

But I see that the code that erases the active slot carries a comment saying this case should actually never happen. It covers the case when NO FW is found in the ACTIVE slot and yet the slot is not empty. So basically I'd say the SBSFU function SFU_IMG_DetectFW failed to recognise your App as valid FW. But I do not know why that happened.

I'll take another look if you can send over a zip file with your project.


Hi @JHOUD​ , sorry for the delay.

Too bad that you can't replicate the scenario. In the meantime, we also investigated the second scenario in more detail. I will repeat and clarify what I have done to create this scenario 2 and elaborate on what we found out. A first remark is that I always work with the exact same FW, for the original flashing and also for updates. The process:

  1. I flash my FW to the STM.
  2. I do a FW update over YMODEM. This first FW update is successful, meaning that the logs tell me that ACTIVE1 now contains the new FW and the DWL slot contains the original, flashed FW. The UserApp is started.
  3. After a while, I start a new FW update over YMODEM, but - and this is important - the FW that is currently in ACTIVE1 (from the first FW update) is not validated yet, meaning the SE_APP_ValidateFw() function has not been called yet.
  4. This second FW update fails. We (intentionally) sent a YMODEM abort signal in the very beginning of the transfer (not even a data packet has been sent). This results in a COM ERROR as can be seen in the logs.
  5. In the reset that follows, the bootloader notices the fact that the FW in ACTIVE1 is not validated yet, so it initiates a rollback.
  6. This rollback actually seems to succeed - in contrast to what I said in my original message. We find it very weird that this rollback can succeed**.
  7. In the next reset, the bootloader first notices that ACTIVE1 (which should now contain the result of the rollback) is not empty and erases it. This is also weird, as it found the image good enough to roll back to...

** It is strange that the bootloader can roll back to the image in the DWL slot, because after the COM ERROR, we found out that the SFU_IMG_EraseDownloadedImg() function in SFU_BOOT_SM_DownloadNewUserFw() is called successfully (return code is SFU_SUCCESS). So this means it is actually rolling back to an empty image. Is it possible that the headers are not erased, and that this causes the bootloader to think that the contents of the DWL slot are also valid and to simply proceed with the rollback (which seems to be a plain copy)?
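
One way we could test this hypothesis is to dump the DWL slot after the COM ERROR and classify what is left in it. The sketch below is a minimal, hypothetical diagnostic, not SBSFU code. It assumes the header sits at the start of the slot and that erased flash reads as 0xFF. The name classify_slot and the header_len parameter are our own, and the real header location and size would have to be taken from the SBSFU sources.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical diagnostic; names and slot layout are assumptions. */
typedef enum {
    CONTENT_EMPTY,       /* header and body both erased */
    CONTENT_HEADER_ONLY, /* header still programmed, body erased */
    CONTENT_BODY_ONLY,   /* header erased, body still programmed */
    CONTENT_FULL         /* header and body both programmed */
} slot_content_t;

/* Returns 1 when all bytes in [p, p + len) read as erased flash (0xFF). */
static int all_erased(const uint8_t *p, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0xFFU) {
            return 0;
        }
    }
    return 1;
}

slot_content_t classify_slot(const uint8_t *slot, size_t slot_len,
                             size_t header_len)
{
    if (header_len > slot_len) {
        header_len = slot_len; /* clamp so the body range stays valid */
    }
    int header_blank = all_erased(slot, header_len);
    int body_blank = all_erased(slot + header_len, slot_len - header_len);

    if (header_blank && body_blank) {
        return CONTENT_EMPTY;
    }
    if (!header_blank && body_blank) {
        return CONTENT_HEADER_ONLY;
    }
    if (header_blank) {
        return CONTENT_BODY_ONLY;
    }
    return CONTENT_FULL;
}
```

A CONTENT_HEADER_ONLY result for the DWL slot would support the idea that a surviving header lets the rollback checks pass while the image body is gone. CONTENT_EMPTY would rule it out.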

What are your thoughts about this?

Due to company policy, I can't share our SBSFU project. However, in the coming week (or the week after) I will try to replicate the scenario on a dev board with clean slate SBSFU.

Thanks for the effort,

ac_gd

Hi @ac_gd​ ,

I attached a log of my attempt to recreate the behavior you described. You can see that after the unsuccessful rollback attempt the active slot is empty and the SBSFU asks for FW to be sent, which is the expected behavior. Maybe I misunderstood something, so please check the log.

Frankly I find it difficult to understand how the rollback to a non-existing App image could appear to pass.

I'm genuinely interested in any demo you could possibly share with me.

Thank you,

J.
