stm32l5xx_hal_mmc.c driver crashing non-deterministically

DBark.2
Associate III

When trying to read/write to the connected eMMC device, the eMMC driver will sometimes fail with an error. After failing in this way, every call to HAL_MMC_ReadBlocks or HAL_MMC_WriteBlocks will fail until the microcontroller is restarted. 

If the first call in a series of reads succeeds, all following calls complete without issue *UNLESS* they target a non-sequential region of eMMC memory, in which case they might fail.

A partial work-around I’ve found is to put HAL_Delay calls before each call to the eMMC memory, but this does not always work. 

Similarly, stepping slowly through the process in the debugger always works, and subsequent calls at full speed then run without issue.

The HAL_MMC_GetCardState function always returns 4 (HAL_MMC_CARD_TRANSFER). If I try to wait for the card state to become HAL_MMC_CARD_READY before making a call, it stalls indefinitely.
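
For context, here is a minimal sketch of how that state value is encoded, assuming the standard MMC R1 card-status layout in which bits 12:9 hold CURRENT_STATE and bit 8 is READY_FOR_DATA. Under that encoding, 4 is the transfer ("tran") state in which a selected card accepts read and write commands, while 1 ("ready") only occurs during card identification, which would be consistent with a wait for HAL_MMC_CARD_READY never returning. The helper names below (`mmc_current_state`, `mmc_ready_for_data`, `mmc_state_name`) are hypothetical and not part of the HAL.

```c
#include <stdint.h>

/* Extract CURRENT_STATE (bits 12:9) from an R1 card-status word. */
static uint32_t mmc_current_state(uint32_t card_status)
{
    return (card_status >> 9) & 0x0FU;
}

/* Extract READY_FOR_DATA (bit 8) from an R1 card-status word. */
static int mmc_ready_for_data(uint32_t card_status)
{
    return (int)((card_status >> 8) & 0x1U);
}

/* Map a state number to its short name; values above 8 are reported as "other". */
static const char *mmc_state_name(uint32_t state)
{
    static const char *const names[] = {
        "idle", "ready", "ident", "stby", "tran", "data", "rcv", "prg", "dis"
    };
    return (state < sizeof names / sizeof names[0]) ? names[state] : "other";
}
```

For example, a status word of 0x00000900 decodes to state 4 ("tran") with READY_FOR_DATA set, so seeing HAL_MMC_CARD_TRANSFER between commands does not by itself indicate a driver fault.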

My hypothesis is that the stm32l5xx_hal_mmc driver is not correctly reading the eMMC card state, and therefore is not correctly waiting for the eMMC memory to be ready for a transmission before trying to send/receive data.

Because the card I’m using has a small processor and optimizes the speed for sequential blocks, it may be that I only see issues with non-sequential accesses because they take longer, so the card isn’t yet ready for the new command.

I’ve not been able to find any documentation on this issue. Any assistance would be greatly appreciated.

1 ACCEPTED SOLUTION

Accepted Solutions
DBark.2
Associate III

Hello @ChahinezC​ ,

I'm actually going to have to apologize here. This was our mistake. We suspected the software driver because the issue occurred intermittently but only on the first call (all subsequent calls worked without issue), and because the driver didn't appear to be maintaining status correctly, which seemed to rule out signal-integrity issues. It turns out the cause was a missing pin pull-up. The driver is working; sorry again.

I don't know if this is at all helpful, but as an apology, here's a different issue that I uncovered while working on this, so you have something for your time: https://community.st.com/s/question/0D53W00000ghIQHSA2/bug-found-solution-octospi-configuration-fails-when-sending-only-data-without-instruction-address-or-alternate-byte-or-dummycycle-in-regularcommand-protocol


6 REPLIES 6
Piranha
Chief II

The whole HAL/Cube bloatware is crashing non-deterministically... 😉

ChahinezC
Lead

Hello @DB.7arke​,

Can you please provide me with your project (or a minimum code)?

It may help me point to the problem you are facing.

Chahinez.

DBark.2
Associate III

Hello @ChahinezC​ ,

Thanks for reaching out!

I've built a specific project for the purpose of demonstrating this issue. I'll attach the .ioc and main.c files (the only ones that have been modified) and write the main code here as well for anyone interested.

In case it's helpful, the device I'm using is a Kingston e*MMC 5.1 EMMC64G-IB29-90F02.

I'll add the additional files in the following comments, because this only seems to allow me to attach one file at a time.

  // Variable Definitions
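  HAL_StatusTypeDef HAL_ret = HAL_OK;            // Use the HAL's own status type rather than a raw uint8_t
  HAL_MMC_CardStateTypeDef card_state = 0;       // Matches the return type of HAL_MMC_GetCardState()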
 
  // eMMC optimizes card speed for repeated operations on the same memory location
  //    so we're randomizing the read/write location to try to reproduce error more
  //    frequently
  uint32_t Block_Address = HAL_GetTick() % 10;
 
  __disable_irq();  // Disabling or enabling interrupts does not seem to have an effect.
                    //    But I'm doing this here to eliminate as many error causes as possible
  for(uint8_t i=0; i<3; ++i)  // Retry writing up to 3 times to show that after a failure, it is impossible to reconnect
  {
     HAL_ret = HAL_MMC_WriteBlocks(&hmmc1, buf, Block_Address*32, 32, 100);
 
      // While I often encounter the error in my project at the first call to the emmc, I'm also adding this second call to
      //    cause the error to come up more often for the purposes of the test
     Block_Address = HAL_GetTick() % 10;   // Re-randomize the existing variable (no shadowing redeclaration)
     HAL_ret = (HAL_OK == HAL_ret) ? HAL_MMC_WriteBlocks(&hmmc1, buf, Block_Address*32, 32, 100) : HAL_ret;
             // This second call may reproduce the issue a little more consistently.  If you want to see connection succeed, just
             //      comment out this second call and it will succeed most times because the card isn't overwhelmed as quickly.
 
     if(HAL_OK == HAL_ret)
     {
        break;
     }
     else
     {
        __enable_irq();
        //HAL_MMC_Abort(&hmmc1);   // I have tried an absurd number of solutions to "reset" the card.
                                 //     None have been successful.  Once the card has failed, the only method
                                 //     I've found to consistently fix it is to restart and run the "HAL_MMC_ReadBlocks()"
                                 //     in the debugger at slow speeds.  Re-flashing the board often doesn't even fix the issue
 
                                 //     Completely DeInit-ing the mmc driver and re-initing it doesn't solve the issue
                                 //     (This attempt fails at the PowerOn() call in the Init() function)
 
        HAL_Delay(1000);         // Calling HAL_Delay with a sufficient timeout has had good results in preventing
                                 //    the error from occurring.  So has increasing the clock divider; unfortunately
                                 //    I need a fast system.  (I'm already using mmc DMA in the working project)
 
        while(hmmc1.State != HAL_MMC_STATE_READY){     // This was my initial attempt at trying to fix this error
                                                       //  Unfortunately, the hmmc1 value only gives the state of the driver
                                                       //     not the state of the card that's being written to
 
           card_state = HAL_MMC_GetCardState(&hmmc1);  // This is it right here.  The function that should allow me to
                                                       //    check the state of the physical mmc device before writing.
                                                       //    I imagine if I could use this, I'd be able to query the card,
                                                       //    wait for it to be ready for it to receive messages, and then
                                                       //    work as intended.  Unfortunately, this only ever returns "HAL_MMC_CARD_TRANSFER"
                                                       //    no matter how long I wait.  And even then, if waiting for the card to be ready
                                                       //    was the problem, why does the FIRST call to the card fail so often?
                                                       //    I've tried this on multiple boards.
 
        }
        __disable_irq();
        // Try re-init-ing eMMC, then wait a little
     }
  }
  __enable_irq();
 
  // If everything worked perfectly, try again.  This is non-deterministic
  // If I use proper timeouts I'm able to avoid this error about 80% of the time
  //    unfortunately that's hardly acceptable for a release.
 
 
  for(uint8_t i=0; i<20; ++i)                 // I ran this at the beginning with the same result, but that made the error
                                              //    harder to reproduce so I moved this down here
  {
     card_state = HAL_MMC_GetCardState(&hmmc1);
     if(card_state != HAL_MMC_CARD_TRANSFER)
     {
        break;
     }
  }
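
A bounded wait could replace the open-ended polling in the snippet above. The following is a minimal sketch, assuming the goal is to wait until the card reports the transfer state (value 4) before the next command and to give up after a timeout instead of stalling. To keep it compilable off-target, the card-state and tick sources are passed in as function pointers; on the target they would presumably wrap `HAL_MMC_GetCardState(&hmmc1)` and `HAL_GetTick()`. All names here (`wait_for_transfer_state`, `card_state_fn`, `tick_fn`, and the `fake_*` stand-ins) are hypothetical and not part of the HAL.

```c
#include <stdbool.h>
#include <stdint.h>

#define CARD_STATE_TRANSFER 4U  /* same numeric value as HAL_MMC_CARD_TRANSFER */

typedef uint32_t (*card_state_fn)(void *ctx);  /* hypothetical: returns current card state */
typedef uint32_t (*tick_fn)(void);             /* hypothetical: returns a millisecond tick */

/* Poll until the card reports the transfer state or timeout_ms elapses.
 * Returns true if the transfer state was seen in time, false on timeout. */
static bool wait_for_transfer_state(card_state_fn get_state, void *ctx,
                                    tick_fn get_tick, uint32_t timeout_ms)
{
    const uint32_t start = get_tick();

    do {
        if (get_state(ctx) == CARD_STATE_TRANSFER) {
            return true;
        }
        /* Unsigned subtraction stays correct across tick wrap-around. */
    } while ((uint32_t)(get_tick() - start) < timeout_ms);

    return false;
}

/* Off-target stand-ins, used only to exercise the helper without hardware. */
struct fake_card { uint32_t busy_polls; };  /* polls that report "programming" first */

static uint32_t fake_state(void *ctx)
{
    struct fake_card *card = ctx;
    if (card->busy_polls > 0U) {
        card->busy_polls--;
        return 7U;  /* programming state */
    }
    return CARD_STATE_TRANSFER;
}

static uint32_t fake_ticks;  /* advances by one on every read */
static uint32_t fake_tick(void) { return fake_ticks++; }
```

One caveat if this is adapted to the test above: with the default HAL time base, the tick is advanced from the SysTick interrupt, so a tick-based timeout like this one only expires while interrupts are enabled.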

DBark.2
Associate III

See the main file attached below. Additionally, here's a picture of the main .ioc configurations in question here:

[Image: screenshot of the .ioc configuration]

ChahinezC
Lead

Hello @DB.7arke​,

Can you provide me with your full main file?

I would like to check the SDMMC peripheral initialization.

Chahinez.
