2019-02-22 12:37 AM
I've run across what looks like a bug in the STM32H743IIK (144pin BGA) related to SPI DMA. It is a very strange one, and you can see the history of this issue on the ChibiOS forums here:
http://www.chibios.com/forum/viewtopic.php?f=16&t=4140&start=80
I do not believe this issue matches any existing errata.
The issue is that for some very specific clock tree settings (which are seen as valid in STM32Cube), the result of a SPI DMA will interleave zero bytes between each valid output byte. So if you expect to get 0xaa 0xbb 0xcc in a SPI response from a SPI peripheral then in fact you will see 0x00 0xaa 0x00 0xbb 0x00 0xcc in the DMA receive memory.
I have only reproduced this on a board with a 16MHz HSE. I have two other boards, one with a 24MHz HSE (a LFQ part) and of course the Nucleo with an 8MHz HSE.
The critical setting that causes this issue is D2PRE2, which controls the APB2 peripheral and timer clocks. I've only seen the issue when this is set to DIV2. I attach the full STM32Cube config file to this issue.
If I change D2PRE2 to DIV1 then it all works perfectly, and all SPI buses handle DMA fine.
With DIV2 at least SPI buses 1 and 2 get the interleaved zero issue.
The bug was reproduced under ChibiOS 19.1-stable. I can provide a git repo with full source if needed.
Kind regards,
Andrew Tridgell
ArduPilot dev team
2019-02-22 12:51 AM
Can you please post the problem in terms of the relevant RCC, SPI and DMA register contents, possibly as read out while the problem persists?
JW
PS Oh, H7, so also DMAMUX
2019-02-22 03:04 AM
well, I could gather all the register values if really needed (if so in what format? hex dumps? named vars?)
but critical things are:
This is code that runs great on two other boards with different HSEs, and also runs great if D2PRE2 is DIV1. When I run with the 8 or 24MHz HSE I use the same clock tree, but change DIVM1, DIVM2 and DIVM3 to keep the rest of the clock tree the same. Changing D2PRE2 from DIV1 to DIV2 just changes the peripheral clock for APB2 from 96MHz to 48MHz.
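To make the clock arithmetic concrete, here is a minimal sketch of how the APB2 peripheral clock follows from the D2PRE2 divider. The function name `apb2_pclk_hz` and its parameters are made up for illustration and are not from ChibiOS or the ST HAL. The 96MHz input is simply the figure quoted above.

```c
#include <stdint.h>

/* Hypothetical helper (not a ChibiOS or HAL function): returns the APB2
 * peripheral clock for a given input clock and D2PRE2 divider value.
 * The divider is passed as the plain ratio (1, 2, 4, ...), not as the
 * raw register field encoding. */
uint32_t apb2_pclk_hz(uint32_t apb2_input_hz, uint32_t d2pre2_div)
{
    return apb2_input_hz / d2pre2_div;
}
```

With a 96MHz input, `apb2_pclk_hz(96000000u, 1u)` gives 96000000 and `apb2_pclk_hz(96000000u, 2u)` gives 48000000, which is the only difference between the working and failing configurations.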
2019-02-22 03:08 AM
oh also, happens with AXI SRAM, SRAM1, SRAM2, SRAM3 and SRAM4. DCache is enabled. Appropriate flush and invalidate ops are used. Register values for SPI and DMA all confirmed with debugger at initiation of transfer.
2019-02-22 05:15 AM
> well, I could gather all the register values if really needed
I personally am not going to jump on this as I'm not interested in the 'H7 at the moment; but providing *complete* information increases chances that somebody will be willing to look at it.
For example you hint that this is clock tree dependent, but you don't give definitive information on how this is set (don't expect everybody uses CubeMX and/or is willing to guess what other unpublished code might impact the clocks settings).
You also said that "DMAMUX setup doesn't seem to matter" but how do you know? For me this looks like two DMA transfers per one trigger from SPI, and it's DMAMUX which processes triggers from peripherals and forwards them to DMA(s).
> (if so in what format? hex dumps? named vars?)
Any, provided the info is complete; but if you can make it decoded to names, it makes life easier.
JW
PS. If you "manually" read two bytes out of SPI upon each incoming byte, is the second read byte 0x00?
2019-02-22 10:33 AM
One more question: the DMA's NDTR indicates what, the number of bytes arrived on SPI, or the (twice as much) number of bytes stored in RAM?
JW
2019-02-23 10:49 PM
@JW, your question made me realise I'd explained the issue rather badly. I explained it a bit better in the (long) ChibiOS forum thread, but my description above is quite misleading.
Using the example of a SPI slave device that will return bytes of 0xaa 0xbb 0xcc 0xdd etc etc, then the following happens:
Also note that in all these examples it is full-duplex SPI against a device that has a register based access method. So when I say "read 3 bytes" it actually does a 4 byte SPI transfer where the first byte is the register number in the SPI slave. This is important as it means that the DMA rx engine is actually being given 4 bytes to feed into memory, specifically 0x00 0xaa 0xbb 0xcc. The actual bytes that appear in memory when doing a 6 byte read (which is a 7 byte transfer) are 0x00 0x00 0xaa 0x00 0xbb 0x00 0xcc. The DMA engine and the SPI peripheral have no way of knowing the register read conventions of the SPI slave device.
So the SPI with DMA transfer is injecting extra 0x00 bytes into the transfer after each correct byte, and then stops the transfer when the total number of bytes requested is reached.
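The behaviour described above can be sketched as a small host-side model. This is only an illustration of the observed symptom, not an explanation of the hardware cause. The names `model_interleaved_rx` and `looks_interleaved` are hypothetical and do not exist in ChibiOS or the ST HAL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Model of the symptom: each byte clocked in on the wire is stored,
 * followed by an injected 0x00, until n bytes have been written to mem.
 * Returns the number of bytes written to mem. */
size_t model_interleaved_rx(const uint8_t *wire, size_t wire_len,
                            uint8_t *mem, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (i % 2 == 0) {
            if (i / 2 >= wire_len) {
                break;              /* ran out of wire bytes */
            }
            mem[i] = wire[i / 2];   /* correct byte */
        } else {
            mem[i] = 0x00;          /* injected zero */
        }
    }
    return i;
}

/* Heuristic check for the symptom: every odd-indexed byte is 0x00.
 * Genuine 0x00 payload bytes can cause false positives. */
bool looks_interleaved(const uint8_t *mem, size_t n)
{
    if (n < 2) {
        return false;
    }
    for (size_t i = 1; i < n; i += 2) {
        if (mem[i] != 0x00) {
            return false;
        }
    }
    return true;
}
```

Feeding the model a 7 byte wire stream starting 0x00 0xaa 0xbb 0xcc with n = 7 gives 0x00 0x00 0xaa 0x00 0xbb 0x00 0xcc, matching what shows up in the DMA receive memory.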
My apologies for the lousy explanation in the original posting.