I have a board using the STM32F765, which uses the QSPI interface to fetch data from an FPGA. The normal sequence of operation is:
- A timer in the FPGA causes it to perform a sequence of events which produce a chunk of data, which is stored in internal FPGA memory
- FPGA asserts an output indicating to the STM32 that data is available
- This assertion causes an EXTI interrupt, which kicks off a chain of events that use the QSPI interface to read data from the FPGA. (The FPGA is programmed to emulate a serial flash memory.)
- The QSPI interface is set to indirect read mode, and DMA2 is used to read data from the QSPI FIFO and write it into standard (not DTCM) RAM.
- Once each DMA transfer is complete, the ISR cleans and invalidates the relevant portions of the data cache (by address).
- There are usually several blocks of data to read at a time. Once they've all been read, the FPGA is reset (via a separate SPI interface), and a flag is set to indicate to the main application that a block of data is available to process.
- The main application processes the data, then sits and waits for the next block, and so on.
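To illustrate the cache maintenance step in the sequence above, here is a minimal sketch of rounding a DMA buffer out to whole cache lines before doing maintenance by address. It assumes the Cortex-M7's 32-byte data cache line; `cache_range_t`, `cache_align_range`, `rx_buf` and `rx_len` are hypothetical names for illustration, not the actual code from the ISR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cortex-M7 data cache line size in bytes. */
#define DCACHE_LINE_SIZE 32u

/* Hypothetical helper type: a cache-line-aligned address range. */
typedef struct {
    uintptr_t start;  /* rounded down to a cache line boundary */
    size_t    length; /* multiple of the line size, covers the whole buffer */
} cache_range_t;

/* Expand [addr, addr + len) outwards so it covers whole cache lines. */
static cache_range_t cache_align_range(uintptr_t addr, size_t len)
{
    assert(len > 0);
    uintptr_t mask  = (uintptr_t)DCACHE_LINE_SIZE - 1u;
    uintptr_t start = addr & ~mask;                /* round start down */
    uintptr_t end   = (addr + len + mask) & ~mask; /* round end up */
    cache_range_t r = { start, (size_t)(end - start) };
    return r;
}

/* On the target, the DMA-complete handler could then do something like:
 *
 *   cache_range_t r = cache_align_range((uintptr_t)rx_buf, rx_len);
 *   SCB_CleanInvalidateDCache_by_Addr((uint32_t *)r.start, (int32_t)r.length);
 *
 * rx_buf and rx_len are placeholders for the DMA destination buffer.
 */
```

If the buffer itself is not aligned to a 32-byte boundary, the rounded range also covers whatever shares its first and last cache lines, so it may be worth aligning and padding the DMA buffers.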
The problem I'm seeing is that, just occasionally, the QSPI chip select goes active after all the data has been read from the FPGA. The next time the QSPI interface is used, its status register indicates that it's busy and has a full FIFO, as if a read operation has been started, but no DMA has been set up to actually put the results somewhere.
I've spent the last day or so using a scope and some GPIO signals to determine what is happening when, and here's where it gets really interesting. I now know that the spurious QSPI activation is not caused by the code which intentionally initiates QSPI reads. Instead, three conditions must be met in order to cause it:
1. The main code must be actually accessing the data from the FPGA, which is in cacheable RAM (SRAM1);
2. The data cache must be enabled;
3. The SysTick interrupt handler must have just exited.
The SysTick handler is very simple; usually it just sets a few flags and increments some counters, and occasionally it generates some debug output (though this has no effect on whether the spurious QSPI event is triggered). Nevertheless, QSPI CS goes low within a few nanoseconds of the handler exiting.
If I turn the data cache off, then all is well and there are no spurious QSPI events.
If I leave it on while data is being fetched from the FPGA, but turn it off while processing, then that's OK too.
Calling SCB_CleanInvalidateDCache() at the end of the SysTick handler makes no difference.
Putting the data from the FPGA into DTCM RAM does fix the symptoms, but I don't know why.
My ISR normally leaves the QSPI interface enabled. If I turn QSPI off by writing 0 to the control register once the ISR has finished with it, this does prevent spurious QSPI transactions from occurring. But since I don't know why they're occurring in the first place, I can't be sure there isn't something else bad also happening for the same reason.
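For anyone wanting to reproduce the check and the workaround, here is a rough sketch of decoding the "busy with a full FIFO" state from the status register and of the "write 0 to the control register" step. It is written against a stand-in register struct so it can be compiled off-target; `qspi_regs_t` and the helper names are hypothetical, and the bit positions (EN in bit 0 of CR, BUSY in bit 5 of SR, FLEVEL starting at bit 8 of SR) are my reading of the reference manual and should be checked against the device header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the two QUADSPI registers used here. This is NOT the real
 * register layout; on the target, use QUADSPI->CR and QUADSPI->SR from the
 * CMSIS device header instead. */
typedef struct {
    volatile uint32_t CR; /* control register */
    volatile uint32_t SR; /* status register */
} qspi_regs_t;

/* Assumed bit positions; verify against the reference manual. */
#define QSPI_CR_EN         (1u << 0)
#define QSPI_SR_BUSY       (1u << 5)
#define QSPI_SR_FLEVEL_POS 8u
#define QSPI_SR_FLEVEL_MSK (0x3Fu << QSPI_SR_FLEVEL_POS)

/* True if the peripheral reports an operation in progress. */
static bool qspi_is_busy(const qspi_regs_t *q)
{
    return (q->SR & QSPI_SR_BUSY) != 0u;
}

/* Number of bytes currently held in the FIFO. */
static uint32_t qspi_fifo_level(const qspi_regs_t *q)
{
    return (q->SR & QSPI_SR_FLEVEL_MSK) >> QSPI_SR_FLEVEL_POS;
}

/* The workaround described above: write 0 to the control register,
 * which also clears the enable bit. */
static void qspi_disable(qspi_regs_t *q)
{
    q->CR = 0u;
}
```

Logging `qspi_is_busy()` and `qspi_fifo_level()` at the start of each intentional read is one way to catch a spurious transaction that happened since the previous block.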
I never get a spurious QSPI event when the main loop is sitting waiting for new data to arrive; only while it's actually working on that data. This is actually quite a short window of time; if I turn off hardware floating point support, which makes the processing take longer, then spurious QSPI events can occur within a wider time window after each block is received.
So, I have a few workarounds for the spurious QSPI events: turn off the data cache (at least while the data is being processed), move the data into DTCM, or turn off the QSPI interface when it's not being used. None of these really explains the problem, though each does make the symptom go away.
My best guess is that exiting the SysTick handler while the cache contains data from SRAM1 is causing a number of cache operations to occur, and one of these is writing to QUADSPI->AR, or triggering the QSPI interface in some other way.
It *almost* feels like some obscure erratum, i.e. "QSPI interface can be triggered by data cache operations on return from interrupts", but I'd rather fix my code than blame it on something that's "clearly" a hardware bug that nobody else seems to have noticed!
Any suggestions please, experts?