David Wallén
Associate III
March 8, 2019
Question

STM32F107 Software Reset breaks Interrupt functionality

I have FreeRTOS set up on an STM32F107 running several tasks of different priorities in "parallel".

There is one task of IDLE priority that blinks an LED and one task of Normal priority that receives communication over CAN (triggered by external interrupts). There are also some tasks that read values off the ADC and use DMA.

There is also a bootloader in place that can receive new firmware via CAN.

When the MCU is powered it starts in the bootloader, and a custom CAN message kicks it to the application, and everything works fine. The application can receive messages from CAN, blink the LED and execute calculations just fine.

Another custom message can bring it from the application back to the bootloader (via a call to NVIC_SystemReset()), which works. But it is when attempting to bring the MCU back to the application (after NVIC reset) that things start to behave weirdly:

Roughly 60% of the time, the MCU seems not to enable interrupts. The LED blinks just fine (indicating that the RTOS is still running its lowest-priority task), but the application does not respond to CAN messages, which to me indicates that interrupts have not been set up correctly.

I have tried putting LED indications on hardware fault handlers and failure to initialize CAN without any success, so I am not sure what the MCU is doing.

I also tried disabling IRQ and all system clocks before calling the NVIC reset without any luck.

Do you have any ideas on how to troubleshoot this further?

Is there any recommended procedure to perform before calling NVIC_SystemReset?

Thanks in advance!

BR,

David

This topic has been closed for replies.

6 replies

AvaTar
Senior III
March 8, 2019

Sounds like a quite complex system ...

I'm not sure about the reset behavior of peripheral registers. A SW reset differs from a power-up.
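One way to see which kind of reset actually happened is to read the reset flags in RCC_CSR at startup. The sketch below decodes a raw RCC_CSR value using the bit positions of the STM32F1 family as I understand them; the enum and function names are made up for illustration, and the bit layout should be checked against the reference manual for the exact part.

```c
#include <stdint.h>

/* RCC_CSR reset flags, STM32F1 layout (assumption: verify in the reference manual). */
#define CSR_PINRSTF   (1u << 26)  /* NRST pin reset */
#define CSR_PORRSTF   (1u << 27)  /* power-on / power-down reset */
#define CSR_SFTRSTF   (1u << 28)  /* software reset, e.g. NVIC_SystemReset() */
#define CSR_IWDGRSTF  (1u << 29)  /* independent watchdog reset */
#define CSR_WWDGRSTF  (1u << 30)  /* window watchdog reset */
#define CSR_LPWRRSTF  (1u << 31)  /* low-power management reset */

/* Hypothetical classification, not an ST or CMSIS type. */
typedef enum {
    RESET_POWER_ON,
    RESET_SOFTWARE,
    RESET_WATCHDOG,
    RESET_LOW_POWER,
    RESET_PIN_ONLY,
    RESET_UNKNOWN
} reset_cause_t;

/* Classify a raw RCC_CSR value. PINRSTF is checked last because it is
 * normally set together with the internal reset sources. */
reset_cause_t classify_reset(uint32_t rcc_csr)
{
    if (rcc_csr & CSR_PORRSTF)                   return RESET_POWER_ON;
    if (rcc_csr & CSR_SFTRSTF)                   return RESET_SOFTWARE;
    if (rcc_csr & (CSR_IWDGRSTF | CSR_WWDGRSTF)) return RESET_WATCHDOG;
    if (rcc_csr & CSR_LPWRRSTF)                  return RESET_LOW_POWER;
    if (rcc_csr & CSR_PINRSTF)                   return RESET_PIN_ONLY;
    return RESET_UNKNOWN;
}
```

On the target, the value would come from RCC->CSR. The flags accumulate across resets until software clears them by setting the RMVF bit, so they should be cleared once they have been read.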

> Sometimes, roughly ~60% of the time, ...

Which sounds like either a runtime issue, or dependence on another unknown state.

Perhaps CAN messages got stuck during the reset, or some error flags were not reset?

BTW:

> There is also a bootloader in place that can receive new firmware via CAN.

> When the MCU is powered it starts in the bootloader, and a custom CAN message kicks it to the application, and everything works fine.

This is not the approach in our company's controller firmware.

The BL starts the application if a valid one is detected, and a special "stay in BL" flag is not set.

The request for a firmware update must be initiated in the application, and the controller power-cycled. The firmware update is then executed by the BL.
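The decision described above can be sketched roughly as follows. This is only an illustration of the idea, not our actual firmware: the application base address, the flash and SRAM sizes, and the function names are all assumptions, and a real bootloader would usually also verify a checksum.

```c
#include <stdbool.h>
#include <stdint.h>

/* All addresses and sizes below are assumptions for illustration. */
#define SRAM_BASE_ADDR  0x20000000u
#define SRAM_SIZE_BYTES (64u * 1024u)
#define APP_BASE_ADDR   0x08008000u   /* hypothetical start of the application */
#define FLASH_END_ADDR  0x08040000u   /* hypothetical end of a 256 KB flash */

/* Plausibility check on the first two words of the application's vector table. */
bool app_looks_valid(uint32_t initial_sp, uint32_t reset_vector)
{
    /* The initial stack pointer must lie in SRAM and be word aligned. */
    bool sp_ok = initial_sp > SRAM_BASE_ADDR &&
                 initial_sp <= SRAM_BASE_ADDR + SRAM_SIZE_BYTES &&
                 (initial_sp & 0x3u) == 0u;
    /* The reset vector must be a Thumb address (bit 0 set) inside the application area. */
    bool pc_ok = (reset_vector & 0x1u) != 0u &&
                 reset_vector >= APP_BASE_ADDR &&
                 reset_vector < FLASH_END_ADDR;
    return sp_ok && pc_ok;
}

/* Start the application only if it looks valid and nobody asked to stay in the BL. */
bool should_start_app(uint32_t initial_sp, uint32_t reset_vector, bool stay_in_bl)
{
    return !stay_in_bl && app_looks_valid(initial_sp, reset_vector);
}
```

With erased flash both words read back as 0xFFFFFFFF, so the check fails and the bootloader stays active.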

David Wallén
Associate III
March 8, 2019

Hi AvaTar, thanks for your reply!

What do you mean by "got stuck during the reset"? I thought NVIC_SystemReset would clear all buffers as well as restore all variables to their initial values?

I know there can be multiple ways to deal with flashing a new application with the bootloader. We also have the firmware upload executed by the BL, just that an overall host controller has the ability to toggle between bootloader and application.

Is there any special reason you require a power cycle to get to the bootloader? Perhaps most platforms are built this way precisely because of issues like the one I am seeing (that a software reset can't be fully trusted)?

BR,

David

Tesla DeLorean
Guru
March 8, 2019

Do you have anything driving the pin externally?

David Wallén
Associate III
March 8, 2019

Hi Clive!

What pin do you mean exactly?

The NRST pin is connected via a 100 nF capacitor to GND.

And the boot0 pin is connected directly to GND.

BR,

David

Alexey Trifonov
Associate
March 8, 2019

Hi David,

The NVIC_SystemReset() call triggers a system reset by setting the SYSRESETREQ bit in the Cortex-M3 AIRCR register.

In this case, the MCU starts from scratch.

The RAM memory may stay the same.

Have you tried to clean the RAM?

Best Regards,

Alexey

David Wallén
Associate III
March 11, 2019

Hi Alexey, thanks for your reply!

What do you mean by "may stay the same"?

I would hope that NVIC_SystemReset() would result in predictable (deterministic) behaviour by the MCU.

Is there anywhere I can read what it actually does?

Neither the ST manual nor the ARM documentation I have read so far outlines in much detail what goes on and how it differs from a power cycle.

BR,

David
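For reference, the CMSIS-Core implementation of NVIC_SystemReset() for Cortex-M3 is short: a data synchronization barrier, one write to SCB->AIRCR, another barrier, and an endless loop until the reset takes effect. It does not involve the watchdog peripheral. The sketch below only reproduces how the written value is composed, so that it can be checked on a PC; the function name is made up for illustration.

```c
#include <stdint.h>

#define AIRCR_VECTKEY        (0x5FAu << 16)  /* write key; writes without it are ignored */
#define AIRCR_PRIGROUP_MASK  (0x7u << 8)     /* priority grouping, kept unchanged */
#define AIRCR_SYSRESETREQ    (1u << 2)       /* requests a system reset */

/* Compose the value that is written to SCB->AIRCR to request a reset,
 * given the current register contents. Hypothetical helper name. */
uint32_t aircr_reset_request(uint32_t current_aircr)
{
    return AIRCR_VECTKEY |
           (current_aircr & AIRCR_PRIGROUP_MASK) |
           AIRCR_SYSRESETREQ;
}
```

On STM32 parts a system reset returns all registers to their reset values except the reset flags in RCC_CSR and the registers in the backup domain. SRAM contents are not cleared by hardware; variables only get their initial values because the C startup code rewrites them.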

Alexey Trifonov
Associate
March 11, 2019

Hi David,

You are right, all the registers and peripherals will be in their default state.

I am not sure that there is any hardware that clears the RAM.

So, some variables may have the same value from the last run.

mem_addr = 0x20000000;

Just run memset((void *)mem_addr, 0, mem_size); before calling NVIC_SystemReset().

We can see if it helps.

Regards,

Alexey

Piranha
Principal III
March 10, 2019

Are you going from bootloader to application through NVIC_SystemReset() as well? That is the safest way, because it removes unnecessary dependencies. I've described that approach in detail here:

https://community.st.com/s/question/0D50X0000AFpTmUSQV/using-nvicsystemreset-in-bootloaderapplication-jumps
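A common way to route both directions through a reset is a small "mailbox" word in RAM that survives the reset and is checked very early in the bootloader. The sketch below shows that general pattern only and may differ from what the linked post describes; the magic value and function names are invented, and on the target the mailbox would have to live in a RAM section that the startup code does not initialize.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical magic value meaning "start the application after this reset". */
#define BOOT_REQ_RUN_APP  0xB007A99Au

/* Called before resetting: leave a request in the mailbox.
 * On the target this would be followed by NVIC_SystemReset(). */
void request_app_after_reset(volatile uint32_t *mailbox)
{
    *mailbox = BOOT_REQ_RUN_APP;
}

/* Called first thing after reset, before any peripheral is touched.
 * Returns true if the application was requested, and clears the request
 * so that it is honoured only once. */
bool consume_app_request(volatile uint32_t *mailbox)
{
    bool requested = (*mailbox == BOOT_REQ_RUN_APP);
    *mailbox = 0u;
    return requested;
}
```

Because the jump then happens before the bootloader has configured clocks, CAN or interrupts, the application starts from the same hardware state as after any other system reset.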

David Wallén
Associate III
March 13, 2019

Hi Piranha,

Thank you for your reply and the links to your suggestions.

Our bootloader is written by a third party, so I can not comment exactly on how it jumps to the application.

But from initial inspection it works by changing the vector table offset (to match where the application starts) and then calling the reset ISR.

It feels like the jumping procedure in general works and is reliable. I am just missing something that I should be doing before calling NVIC_SystemReset() when going from application to bootloader. Is there any such recommended procedure? For example: always disable interrupts before calling NVIC_SystemReset, or something similar?
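A jump of this kind usually reads the first two words of the application's vector table: the initial stack pointer and the reset handler address. The sketch below covers only that reading and a basic sanity check, so it can run on a PC; the type and function names are invented, and it is a guess at the general technique rather than the third-party bootloader's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the two words needed to start the application. */
typedef struct {
    uint32_t initial_sp;     /* word 0 of the vector table */
    uint32_t reset_handler;  /* word 1 of the vector table */
} app_entry_t;

/* Read the entry information from a vector table.
 * Returns false if the table looks erased or the handler is not a Thumb address. */
bool read_app_entry(const uint32_t *vector_table, app_entry_t *out)
{
    uint32_t sp = vector_table[0];
    uint32_t pc = vector_table[1];

    if (sp == 0xFFFFFFFFu || (pc & 0x1u) == 0u) {
        return false;
    }
    out->initial_sp = sp;
    out->reset_handler = pc;
    return true;
}

/* On the target (not compiled here), the jump itself would look roughly like:
 *   __disable_irq();
 *   SCB->VTOR = app_base;
 *   __set_MSP(entry.initial_sp);
 *   ((void (*)(void))entry.reset_handler)();
 */
```

Note that a direct jump like this is not a reset: whatever the bootloader left enabled (peripheral interrupts, SysTick, pending NVIC bits, clock settings) is inherited by the application.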

BR,

David

Bob S
Super User
March 13, 2019

Goodness gracious NO! Calling memset() to clear ALL of RAM will overwrite your stack. So where will the return from the memset() call go? Probably to address zero, which, if you haven't changed the memory mapping, holds the initial stack pointer value from the interrupt vector table. That is likely not a valid instruction, and neither is the reset vector that follows.

Presuming both your bootloader and main app use the normal "C" startup code, the startup code (that runs before calling your main() function) initializes the RAM areas that the compiler knows about - that is, variables in RAM that are initialized (set to their initial values) and those that are not initialized (set to zero). The only "uninitialized" RAM is areas that the compiler doesn't think you are using.
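In rough terms, that part of the startup code does the following. This is a simplified sketch written as an ordinary function so it can be tried on a PC; in a real startup file the region bounds come from linker symbols (GCC-based ST startup files typically use names such as _sidata, _sdata, _edata, _sbss and _ebss), and the function name here is invented.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified model of C startup RAM initialization:
 * copy initial values for initialized variables from flash to RAM,
 * then zero the variables that have no initializer. */
void startup_init_ram(uint32_t *data_dst, const uint32_t *data_src, size_t data_words,
                      uint32_t *bss, size_t bss_words)
{
    for (size_t i = 0; i < data_words; ++i) {
        data_dst[i] = data_src[i];   /* .data: load image in flash -> RAM */
    }
    for (size_t i = 0; i < bss_words; ++i) {
        bss[i] = 0u;                 /* .bss: cleared to zero */
    }
}
```

Anything outside those two regions, such as the heap, the unused part of the stack, or sections deliberately excluded from initialization, keeps whatever it contained before the reset.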

Does it ALWAYS work when you power on, enter bootloader, get CAN message to run the app and then run the app? Many many many times in a row?

Do you have any external devices that the CPU communicates with that may not be getting fully re-initialized after the soft reboot?

Does the bootloader disable the CAN interface before running the main app? Does the main app re-initialize the CAN interface or does it expect it to be already configured by the bootloader? Does your main app re-initialize the clock tree? Is the CAN port actually running at the correct baud rate (you will probably have to put a scope on the serial lines and look at any data the CPU may transmit, and you may have to force the CPU to transmit something so you can see it)?

David Wallén
Associate III
March 13, 2019

Hi Bob,

Thanks for the answer! Yes, it does indeed ALWAYS work when I start from a powered down state, enter bootloader, get a CAN message to run the app. Then I can always get more CAN messages and do different things and interact with peripherals.

There are external devices, but these are managed by other tasks, which are not blocking anything (they should be preempted anyway when an interrupt comes in for a CAN message). So I really think it is software related.

I am in conversation with the guy who wrote the bootloader, so I can not answer directly whether it deactivates CAN or not (I will add this information as soon as I have it). But I definitely initialize it when the app is starting up, and I have an assertion on the CAN initialization which lights a blue LED if it went OK (according to HAL) and a red one if it fails. When running the test case it lights blue, both when the application is started for the first time and when it is started for the second time after switching to the bootloader first. This shows me that CAN has been properly initialized (at least from the HAL driver's point of view).

I also had a look with an oscilloscope at the bits on the wire from the CAN hardware driver chip to the STM32 RX pin and they look OK, and still no interrupt is generated on the MCU.
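If the bits arrive at the RX pin but no interrupt fires, one way to narrow it down is to dump a few registers in the failing state and walk the path from the CAN peripheral to the core. The sketch below does this on a captured snapshot. It assumes messages are routed to FIFO 0, that CAN1_RX0 is IRQ number 20 on this part, and the bit positions of the bxCAN registers as I understand them; the struct, enum and function names are invented, and it does not check BASEPRI, which FreeRTOS also uses to mask interrupts.

```c
#include <stdint.h>

/* Hypothetical snapshot of the registers on the CAN1 RX0 interrupt path. */
typedef struct {
    uint32_t can_msr;     /* CAN1->MSR: INAK = bit 0, SLAK = bit 1 */
    uint32_t can_ier;     /* CAN1->IER: FMPIE0 = bit 1 */
    uint32_t can_fmr;     /* CAN1->FMR: FINIT = bit 0 */
    uint32_t can_fa1r;    /* CAN1->FA1R: one activation bit per filter bank */
    uint32_t nvic_iser0;  /* NVIC->ISER[0]: enable bits for IRQ 0..31 */
    uint32_t primask;     /* PRIMASK: bit 0 set masks all configurable interrupts */
} can_rx_snapshot_t;

typedef enum {
    RX_PATH_OK,
    RX_CAN_NOT_IN_NORMAL_MODE,
    RX_FILTERS_IN_INIT_MODE,
    RX_NO_ACTIVE_FILTER,
    RX_FMPIE0_DISABLED,
    RX_NVIC_IRQ_DISABLED,
    RX_PRIMASK_SET
} rx_path_status_t;

#define CAN1_RX0_IRQ_NUMBER 20u  /* assumption: check the device header */

/* Return the first stage that would prevent a FIFO 0 receive interrupt. */
rx_path_status_t check_can_rx_path(const can_rx_snapshot_t *s)
{
    if (s->can_msr & 0x3u)      return RX_CAN_NOT_IN_NORMAL_MODE; /* init or sleep acknowledged */
    if (s->can_fmr & 0x1u)      return RX_FILTERS_IN_INIT_MODE;   /* reception deactivated */
    if (s->can_fa1r == 0u)      return RX_NO_ACTIVE_FILTER;       /* nothing can reach a FIFO */
    if (!(s->can_ier & 0x2u))   return RX_FMPIE0_DISABLED;        /* peripheral-level enable */
    if (!(s->nvic_iser0 & (1u << CAN1_RX0_IRQ_NUMBER)))
                                return RX_NVIC_IRQ_DISABLED;      /* NVIC-level enable */
    if (s->primask & 0x1u)      return RX_PRIMASK_SET;            /* core-level masking */
    return RX_PATH_OK;
}
```

Comparing a snapshot taken after a power-on start with one taken after the software reset should show which stage differs between the working and the failing case.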

Please let me know what you think.

Best Regards,

David

Bob S
Super User
March 13, 2019

You need to look at the TX line to know what baud rate (and UART clock/prescaler) is actually being used by your CPU. The RX line tells you what OTHER devices on the CAN bus are using. I'm not that familiar with CAN, but you may have to modify things to get your board to send something.

Do you have a serial port that you can output debug info to? Or can you run the STLINK debugger while running this code? You need to find out how the system clock is configured and check all the relevant CAN config registers. Presuming of course that this is the actual problem.
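Once the APB1 clock and the CAN bit timing register have been read out, the actual bit rate follows from a short calculation. The sketch below assumes the bxCAN CAN_BTR layout (BRP in bits 9:0, TS1 in bits 19:16, TS2 in bits 22:20, each stored as the value minus one); the function name is invented and the example numbers are illustrative, not taken from this board.

```c
#include <stdint.h>

/* Compute the CAN bit rate from the APB1 clock and a raw CAN_BTR value.
 * One bit consists of 1 sync quantum plus the two bit segments. */
uint32_t bxcan_bitrate_hz(uint32_t pclk1_hz, uint32_t can_btr)
{
    uint32_t prescaler  = (can_btr & 0x3FFu) + 1u;         /* BRP + 1 */
    uint32_t bs1_tq     = ((can_btr >> 16) & 0xFu) + 1u;   /* TS1 + 1 */
    uint32_t bs2_tq     = ((can_btr >> 20) & 0x7u) + 1u;   /* TS2 + 1 */
    uint32_t tq_per_bit = 1u + bs1_tq + bs2_tq;

    return pclk1_hz / (prescaler * tq_per_bit);
}
```

For example, a BTR value of 0x003C0003 gives 500000 bit/s with a 36 MHz APB1 clock, but only about 111111 bit/s if the clock tree was left at 8 MHz, so the same CAN configuration can succeed in the HAL and still produce the wrong rate on the wire.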

Alexey Trifonov
Associate
March 13, 2019

Hi Bob,

Why do you need the stack if the next command is a reset?

The memory could probably be cleared with a for loop instead.

Best Regards,

Alexey

Tesla DeLorean
Guru
March 13, 2019

Depends on the memset() implementation and whether it pushes registers and the return address onto the stack.

One might want to have helpful code in the HardFault Handler to determine if it cratered there.
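As an illustration of what such helper code could look at: on exception entry a Cortex-M3 stacks eight words (r0 to r3, r12, lr, pc, xPSR), and the stacked pc shows where the fault occurred. The sketch below only decodes that frame from a pointer so it can be tested on a PC; the names are invented, and on the target a small assembly shim in the HardFault handler would have to pass in the right stack pointer.

```c
#include <stdbool.h>
#include <stdint.h>

/* The eight words stacked by the core on exception entry, in stacking order. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} fault_frame_t;

/* Copy the stacked registers out of the exception stack frame. */
fault_frame_t decode_fault_frame(const uint32_t *stacked)
{
    fault_frame_t f = {
        stacked[0], stacked[1], stacked[2], stacked[3],
        stacked[4], stacked[5], stacked[6], stacked[7]
    };
    return f;
}

/* Bit 2 of the EXC_RETURN value (in LR inside the handler) tells which
 * stack holds the frame: set means PSP, clear means MSP. */
bool frame_is_on_psp(uint32_t exc_return)
{
    return (exc_return & 0x4u) != 0u;
}
```

Under FreeRTOS the tasks run on the process stack, so a fault inside a task would normally have its frame on PSP, while a fault inside an interrupt handler would have it on MSP.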

Bob S
Super User
March 13, 2019

And I think "uninitialized memory" is a red herring (not the real issue). As I said, the "C" startup code initializes all of RAM that it THINKS your program is using. If some RAM is really not getting initialized (like any dynamically allocated memory) then clearing memory at the start of the program (or before rebooting) only masks the actual problem.