STM32U5xx: OCTALSPI (as QSPI) fails for faster speed, with 1V8 it fails more drastically

tjaekel
Senior III

This is more a bug report than a question.

Background:

I want to use OCTALSPI as QuadSPI (QSPI), esp. with VDD set to 1V8. I want to get (at least) 30 MHz SCLK working on QSPI.

But I cannot go faster than 12.195 MHz, which corresponds to an OCTALSPI clock divider of 13.
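To put numbers on the divider values, here is a minimal sketch. It assumes the OCTOSPI kernel clock is SYSCLK running at 160 MHz (an assumption; the post does not state the SYSCLK frequency) and that "clock divider" means the effective integer division ratio. The function names are hypothetical helpers, not HAL calls.

```c
#include <stdint.h>

/* Assumption: OCTOSPI kernel clock = SYSCLK = 160 MHz (not stated in the post). */
#define EX_OSPI_KERNEL_HZ 160000000u

/* SCLK frequency in Hz for an effective integer divider (divider >= 1). */
uint32_t ospi_sclk_hz(uint32_t kernel_hz, uint32_t divider)
{
    return kernel_hz / divider;
}

/* Smallest divider whose SCLK does not exceed target_hz (ceiling division). */
uint32_t ospi_divider_for(uint32_t kernel_hz, uint32_t target_hz)
{
    return (kernel_hz + target_hz - 1u) / target_hz;
}
```

Under these assumptions, `ospi_divider_for(EX_OSPI_KERNEL_HZ, 30000000u)` yields 6, which gives about 26.67 MHz, while a divider of 5 already gives 32 MHz. So an exact 30 MHz would need a different kernel clock frequency. The figures quoted in the post (12.195 MHz, 27.778 MHz) are the author's own values and need not match these computed ones exactly.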

Symptoms:

VDD = 3V3:

- down to clock divider = 13: all looks fine
- lower clock divider (for faster speed): SCLK looks different but might be OK (still reasonable, even with gaps)
- the minimum clock divider with VDD = 3V3 is 6 (resulting in 27.778 MHz; I want to see 30 MHz working)
- anything faster (smaller clock divider) FAILS: there is no SCLK anymore

VDD = 1V8:

- all fine down to clock divider = 13, the same as with 3V3
- clock divider = 12: FAILS completely! No SCLK anymore!
- with VDD = 1V8, I cannot even reach the same speed as with 3V3

Details on FW:

I use the standard HAL functions. My QSPI code ends up calling "HAL_QSPI_Transmit()". This runs in "polling mode": the data is written to the FIFO register in a loop, waiting each time until it has been sent (no INT, no DMA):

   status = OSPI_WaitFlagStateUntilTimeout(hospi, HAL_OSPI_FLAG_FT, SET, tickstart, Timeout);

The OCTALSPI clock source is:

   PeriphClkInit.PeriphClockSelection = RCC_PERIPHCLK_OSPI;
   PeriphClkInit.OspiClockSelection = RCC_OSPICLKSOURCE_SYSCLK;

Waveforms:

STM32U5A5_QSPI_issues3.png

STM32U5A5_QSPI_issues4.png

STM32U5A5_QSPI_issues1.png

STM32U5A5_QSPI_issues2.png

STM32U5A5_QSPI_issues5.png

VDD = 1V8:
It fails immediately with CLKDIV = 12, where it was still working (somewhat) with 3V3!

What is wrong?

The goal is:

- QSPI with VDD = 1V8 and 30 MHz
- the SCLK should look constant (no gaps, continuous)
- if the gaps on SCLK are caused by "SW polling mode", OK: how do I change to the DMA-based functions?

15 REPLIES

It is so strange...

I thought: "OK, check the FIFO threshold, handled in file "stm32u5xx_hal_ospi.c". I assume it flags as soon as one byte is free, even though I always write 32-bit words (4 bytes). Add a delay and wait for the FIFO to have more bytes free, e.g. 4, or even to be completely empty."

But this code fails completely as well: the transaction stalls and never completes:

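      do
      {
        /* Wait till fifo threshold flag is set to send data */
        status = OSPI_WaitFlagStateUntilTimeout(hospi, HAL_OSPI_FLAG_FT, SET, tickstart, Timeout);

        //HAL_Delay(500); //doing this fails!

        if (status != HAL_OK)
        {
          break;
        }

        /* Write one byte to the data register (FIFO), then advance */
        *((__IO uint8_t *)data_reg) = *hospi->pBuffPtr;
        hospi->pBuffPtr++;
        hospi->XferCount--;
      } while (hospi->XferCount > 0U);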

This HAL_Delay(500); makes it fail! WHY?

It would tell me that the OCTOSPI is very timing sensitive: I have to provide data fast enough to keep it going. If I "stall" while sending data, e.g. because of another INT or the RTOS scheduling something else, it would fail as well. "rrrrrr"

I was looking for a register in OCTOSPI telling me how many bytes are free (or occupied) in the FIFO: I did not find one (just the threshold indication flag). So, writing 32-bit words while the FIFO is byte-based can result in a FIFO overflow: HAL_OSPI_FLAG_FT only says that one byte is free, but I write 4 bytes every time, and if I keep going this way it should overflow the FIFO in the end.
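For what it is worth, the OCTOSPI status register (OCTOSPI_SR) does appear to expose a FIFO level field, FLEVEL in bits 13:8, counting the valid bytes currently held in the 32-byte FIFO. The sketch below decodes that field from a raw SR value. The `EX_` macros and function names are local to this example (hypothetical), and the bit layout should be checked against the reference manual for the exact part.

```c
#include <stdint.h>

/* Local mirror of the FLEVEL field in OCTOSPI_SR (bits 13:8); EX_ names are
   made up for this sketch, verify the layout in the reference manual. */
#define EX_OSPI_SR_FLEVEL_POS 8u
#define EX_OSPI_SR_FLEVEL_MSK (0x3Fu << EX_OSPI_SR_FLEVEL_POS)
#define EX_OSPI_FIFO_SIZE     32u /* FIFO depth in bytes */

/* Number of bytes currently held in the FIFO. */
uint32_t ospi_fifo_used_bytes(uint32_t sr)
{
    return (sr & EX_OSPI_SR_FLEVEL_MSK) >> EX_OSPI_SR_FLEVEL_POS;
}

/* Number of bytes that can still be written without overflowing. */
uint32_t ospi_fifo_free_bytes(uint32_t sr)
{
    uint32_t used = ospi_fifo_used_bytes(sr);
    return (used > EX_OSPI_FIFO_SIZE) ? 0u : (EX_OSPI_FIFO_SIZE - used);
}

/* Non-zero when a full 32-bit word (4 bytes) fits into the FIFO. */
int ospi_fifo_has_room_for_word(uint32_t sr)
{
    return ospi_fifo_free_bytes(sr) >= 4u;
}
```

On the target this would be called as `ospi_fifo_has_room_for_word(hospi->Instance->SR)` before each 32-bit write. Alternatively, setting the HAL `Init.FifoThreshold` to 4 should make the FT flag fire only when at least 4 bytes are free in indirect write mode, which is worth confirming in the reference manual.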

But the code works fine at slower speeds (without HAL_Delay, OCTOSPI DIV at least 13). At faster speeds (DIV = 2) it creates these byte bursts (which I want to get rid of).

I give up on STM OCTALSPI (for QSPI). Looking for another MCU for QSPI...

David Littell
Senior III

Following this with interest but @tjaekel definitely wins the award for "Most Creative Font and Color Use".  😎

OK, I am closing this ticket.

Observations:

  1. the "byte bursts" depend on Debug code vs. Release code:
    if I set optimization to -g0 and -O3, it happens later.
    But it still happens: with a longer QSPI transaction or a faster speed, the later data words come out as "byte bursts" (not as word bursts, no matter what the FIFO setting is).
  2. The GPIO Speed setting FOR SURE has a dramatic impact on whether it works or not (Speed set to "fastest" makes it work at faster QSPI speeds; a slower GPIO speed makes the QSPI fail).

I am closing this ticket, even though I have no clue how to avoid the "byte bursts" or why the GPIO speed setting matters so much for seeing an SCK signal on the scope.

KDJEM.1
ST Employee

Hello @tjaekel ,

Thank you for these interesting details and explanations.

It is mentioned in the datasheet that the Octo-SPI pins support 'very high' and 'high' functionality. I think AN5050, specifically section 6.2.3 "OCTOSPI GPIOs and clocks configuration", can help you configure the OCTOSPI GPIO pins.

For the byte burst issue, could you please try disabling the optimization? Note that it is recommended to use compiler optimization level -O0 when building a project that must be debugged. Debugging with optimization level -Og may work, but higher optimization levels are hard to debug because of compiler code optimization. For more details please refer to UM2609 "STM32CubeIDE user guide", section 3 "Debug".

For STM32U5, any of four different clock sources (SYSCLK, MSIK, pll1_q_ck, pll2_q_ck) can be used as the OCTOSPI clock source. So, PLL3 can't be used as the OCTOSPI clock source.

Thank you for your contribution in STCommunity 🙂.

Kaouthar

 

Yes, AN5050 says clearly on page 27:

Note:     All GPIOs have to be configured in very high-speed configuration.

This is the answer (and confirms what I have realized).
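As an illustration of what "very high speed" means at register level: each pin has a 2-bit field in the port's GPIOx_OSPEEDR register, and the value 0b11 selects very high speed (in the HAL this corresponds to `GPIO_InitStruct.Speed = GPIO_SPEED_FREQ_VERY_HIGH`). The helpers below are a hypothetical, self-contained sketch operating on a plain register value, not on real hardware.

```c
#include <stdint.h>

/* 2-bit OSPEEDR encoding for "very high speed" (0b11). */
#define EX_GPIO_OSPEED_VERY_HIGH 0x3u

/* Return the OSPEEDR value with pin (0..15) switched to very high speed. */
uint32_t gpio_ospeedr_with_very_high(uint32_t ospeedr, unsigned pin)
{
    uint32_t shift = (uint32_t)pin * 2u;   /* two bits per pin */
    ospeedr &= ~(0x3u << shift);           /* clear the old speed setting */
    ospeedr |= EX_GPIO_OSPEED_VERY_HIGH << shift;
    return ospeedr;
}

/* Non-zero when pin (0..15) is configured for very high speed. */
int gpio_pin_is_very_high(uint32_t ospeedr, unsigned pin)
{
    uint32_t shift = (uint32_t)pin * 2u;
    return ((ospeedr >> shift) & 0x3u) == EX_GPIO_OSPEED_VERY_HIGH;
}
```

A check like `gpio_pin_is_very_high(GPIOx->OSPEEDR, pin)` for every OCTOSPI pin (CLK, NCS, IO0..IO3) can serve as a quick sanity test at start-up. The achievable I/O toggle rate for each speed setting generally drops at lower VDD, which would be consistent with the 1V8 case failing earlier than 3V3; the datasheet I/O characteristics give the actual limits.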

I am aware of the debug and optimization flags (for debug I use -g3 and -Og).
I was trying to get rid of the byte bursts on QSPI, and this optimization setting has a small influence: the byte bursts happen later now (when set to -none and -O3).

I have just not yet managed to get a "gap-less" stream of words on QSPI. After a while it turns into "byte bursts" (two clock cycles for 8 bits, but a gap between these bytes, even though everything is written as 32-bit words). Using the FIFO (other thresholds) does not help.
But OK: it still works (the waveform is correct). All fine for now.

DMA.  The Core can't keep up at the higher QSPI clock rates.
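To put a rough number on that, here is a back-of-the-envelope sketch. It assumes the OCTOSPI kernel clock equals the core clock (SYSCLK as source, AHB prescaler 1) and single data rate transfers; the function names are hypothetical.

```c
#include <stdint.h>

/* SCLK cycles needed to shift one byte over data_lines lines (1, 2, 4 or 8),
   single data rate. */
uint32_t sclk_cycles_per_byte(uint32_t data_lines)
{
    return 8u / data_lines;
}

/* Core clock cycles available per byte when SCLK = core clock / divider. */
uint32_t core_cycles_per_byte(uint32_t divider, uint32_t data_lines)
{
    return divider * sclk_cycles_per_byte(data_lines);
}
```

In quad mode this gives 26 core cycles per byte at divider 13, 12 at divider 6 and only 4 at divider 2. A polling loop that checks a flag through a function call, tests the status and then writes one byte will likely need more than that at the small dividers, so the FIFO runs empty and SCLK shows gaps. The HAL's DMA-based counterpart is `HAL_OSPI_Transmit_DMA()`, which needs a DMA channel linked to the OSPI handle beforehand.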