Why does adding a fourth module stop the firmware working as required?

ron_w · ‎2022-01-10

Hello

I'm developing firmware for a STM32F413 on a PCBA. The controller needs to:

- control two ADCs via one SPI,

- control an LCD via another SPI,

- de-bounce a switch using a timer and

- communicate with a UI over a UART.

The SPIs and the UART use DMA.

I am able to develop the firmware in the two following ways before an error occurs:

1) Get data from the ADCs, display pages to the LCD and communicate over the UART.

When the de-bounced switch is added, then

a) HAL_SPI_Transmit_DMA(lcd->spiHandle, data, size); never results in the following callback being called:

HAL_SPI_TxCpltCallback(SPI_HandleTypeDef *hspi)

The callback should set 'spiTX_cpltFL' to 'true', so the firmware is left stuck in the following loop:

    // Wait for end of DMA transfer
    while(!lcd->spiTX_cpltFL){};

The above occurs the first time that data to display a page is sent to the LCD.

2) Get data from the ADCs, de-bounce the switch and communicate over the UART.

When the LCD is added the 'HardFault_Handler' is called:

a) BFARVALID and PRECISERR are set;

b) the programme counter (PC) value stacked on the MSP is 0x08002774, which points to:

    // ** stm32f4xx_hal_spi.c 
    // line 830 onwards
    if ((hspi->Instance->CR1 & SPI_CR1_SPE) != SPI_CR1_SPE)
    {
      /* Enable SPI peripheral */
      __HAL_SPI_ENABLE(hspi);
    }
 
    // ** stm32f4xx_hal_spi.h
    // line 455
    #define __HAL_SPI_ENABLE(__HANDLE__)  SET_BIT((__HANDLE__)->Instance->CR1, SPI_CR1_SPE)

Can anyone suggest what I might be doing wrong to generate these errors, please? The various pieces of code appear to work but not all together, so it appears that I've caused a conflict somewhere.

Regards

Ron

TDK · ‎2022-01-10

The DMA IRQ needs to be serviced in order for it to call HAL_SPI_TxCpltCallback. Possibly your other interrupts are overloading the system. Be sure they fire at a reasonable rate and that each handler clears the appropriate flags so it doesn't re-enter immediately.

If __HAL_SPI_ENABLE is hard faulting, probably the hspi handle is invalid or otherwise not initialized.

ron_w · ‎2022-01-10

Thanks for those @TDK.

As the various modules do work, but not all together, I think that overloading or an invalid hspi handle may be the most likely cause.

I'm not sure what you mean by "servicing the DMA IRQ"; could you give me an idea of where I need to do this, please?

Tesla DeLorean · ‎2022-01-10

Perhaps get a better Hard Fault handler which can output the exact (precise) instruction that is failing, and the registers at that time so a disassembly of the code can show you what's amiss.

Check your stack is large enough.
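One common way to check stack headroom is to "paint" the stack region with a known pattern at startup and later count how much of the pattern survives. The following is only a minimal sketch of that idea: the function names and the fill value are made up for illustration, and the real bounds of the stack region would have to come from your linker script or startup file, which differ between toolchains.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL_WORD 0xA5A5A5A5UL  /* arbitrary, recognisable pattern (assumption) */

/* Hypothetical helper: fill a stack region with the pattern.
   On a target this would run early at startup and must stop
   below the live stack pointer. */
void stack_paint(uint32_t *lowest, size_t words)
{
    for (size_t i = 0; i < words; ++i) {
        lowest[i] = STACK_FILL_WORD;
    }
}

/* Hypothetical helper: count untouched words from the lowest address upwards.
   A full-descending stack (as on Cortex-M) grows towards 'lowest',
   so the result is the headroom that has never been used. */
size_t stack_unused_words(const uint32_t *lowest, size_t words)
{
    size_t unused = 0;
    while (unused < words && lowest[unused] == STACK_FILL_WORD) {
        ++unused;
    }
    return unused;
}
```

If the reported headroom is zero or only a few words after the firmware has run for a while, the stack is probably too small.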

Make sure variables changed under interrupt/callback are volatile where need be.
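Applied to a completion flag such as 'spiTX_cpltFL' from the original post, the volatile advice might look like the sketch below. It is not the poster's code: the names are hypothetical, the callback is a stand-in for the body of HAL_SPI_TxCpltCallback(), and the bounded poll count is an assumption added so that a missing interrupt shows up as a timeout rather than a hang.

```c
#include <stdbool.h>
#include <stdint.h>

/* Written in interrupt context, read by the main loop: volatile forces the
   compiler to re-read the flag on every pass of the wait loop. */
static volatile bool spi_tx_done = false;

/* Hypothetical stand-in for the body of HAL_SPI_TxCpltCallback(). */
void on_spi_tx_complete(void)
{
    spi_tx_done = true;
}

/* Wait for the flag, giving up after max_polls checks.
   Returns true if the transfer completed, false on timeout. */
bool wait_tx_done(uint32_t max_polls)
{
    while (!spi_tx_done) {
        if (max_polls-- == 0U) {
            return false;   /* callback never fired */
        }
    }
    spi_tx_done = false;    /* re-arm for the next transfer */
    return true;
}
```

On the target, a time-based limit using HAL_GetTick() would be the more usual choice than a poll count. Without volatile, an optimising compiler is allowed to read the flag once and spin forever.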

The various IRQHandlers in startup.s typically call C code you provide, which subsequently calls back into the HAL, which in turn calls your callbacks.

If you sit in infinite loops in callbacks, or wait on a tick that you are blocking, you are apt to dead-lock the machine.

The pick-n-mix presentation of the code makes it hard to grasp the interplay of the code, or if any of the variables are correctly/suitably defined.

Step#1 might be to get some diagnostic data from a UART, and getting the Hard Fault Handler to output details using that.
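As a rough sketch of what such a handler could report, the function below formats the stacked PC, LR and R0 together with the two fault bits mentioned in the original post. The helper name and output layout are hypothetical. The frame order and the CFSR bit positions follow the ARMv7-M architecture, and the resulting string could be sent out over whatever UART routine you already have.

```c
#include <stdint.h>
#include <stdio.h>

/* Bit positions within SCB->CFSR (BusFault status byte). */
#define CFSR_PRECISERR (1UL << 9)   /* precise data bus error */
#define CFSR_BFARVALID (1UL << 15)  /* BFAR holds the faulting address */

/* Order in which a Cortex-M core stacks registers on exception entry. */
enum { FRAME_R0, FRAME_R1, FRAME_R2, FRAME_R3,
       FRAME_R12, FRAME_LR, FRAME_PC, FRAME_XPSR };

/* Hypothetical helper: format a one-line fault report into buf.
   Returns the snprintf() result (length that was, or would have been, written). */
int fault_format(char *buf, size_t len, const uint32_t *frame,
                 uint32_t cfsr, uint32_t bfar)
{
    char bfar_text[24] = "";

    /* BFAR is only meaningful when BFARVALID is set. */
    if ((cfsr & CFSR_BFARVALID) != 0U) {
        snprintf(bfar_text, sizeof bfar_text, " BFAR=0x%08lX",
                 (unsigned long)bfar);
    }

    return snprintf(buf, len,
                    "PC=0x%08lX LR=0x%08lX R0=0x%08lX CFSR=0x%08lX%s%s",
                    (unsigned long)frame[FRAME_PC],
                    (unsigned long)frame[FRAME_LR],
                    (unsigned long)frame[FRAME_R0],
                    (unsigned long)cfsr,
                    ((cfsr & CFSR_PRECISERR) != 0U) ? " PRECISERR" : "",
                    bfar_text);
}
```

On the target, the frame pointer would come from MSP or PSP (selected by bit 2 of the EXC_RETURN value in LR), usually via a short assembly shim that is not shown here, and cfsr and bfar would be read from SCB->CFSR and SCB->BFAR.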

Then instrument your code so you can understand dynamic flow and interaction.

Andrew Neil · ‎2022-01-11

@Community member - "get a better Hard Fault handler"

Lots of references on debugging Cortex-M Hard Faults:

https://community.arm.com/support-forums/f/embedded-forum/3257/debugging-a-cortex-m0-hard-fault

TDK · ‎2022-01-11

With CubeMX generated code, the IRQ handler is placed in the stm32f4xx_it.c file. It should call the HAL IRQ handler HAL_DMA_IRQHandler which clears flags and calls functions as appropriate.

If you're getting a hard fault, it doesn't sound like that's the issue here. Sounds more like an out of bounds write or other memory mismanagement issue.

ron_w · ‎2022-01-19

Thanks for the replies.

In the end I went back to an older version of the code that controlled the UART, ADCs and LCD. To that version I added the de-bounced switch and got all four working together as required. During this development I noticed that I had two issues occurring at once:

1) While trying to get some inline assembly code working to get some hard fault feedback, I changed the compiler from ARM version 5 to version 6. With ARM 6 the firmware does not appear to call 'HAL_SPI_TxCpltCallback()', whereas it works with ARM 5. I don't know why, but I am staying with version 5.

2) It looks like my initial bug was creating two similarly named variables for the same purpose, then using one in one place and the other elsewhere. Unfortunately, I don't have time to prove whether that was the reason.

ron_w · ‎2022-01-19

@Andrew Neil - thanks for the link to those resources. I used some of them, but thought the following might be useful for others - it does have a useful video to go with it:

https://interrupt.memfault.com/blog/cortex-m-fault-debug