SPI DMA request triggered by Timer-compare

FredS · ‎2024-06-11

Hello everyone,

As an engineer who started exploring the world of MCUs years after my retirement, I regularly encounter problems that puzzle me for some time. However, since I have been stuck for several days now, I decided to ask for assistance on this forum.

For my project, I selected a high-performance MCU (STM32H723VGT6) mounted on a WeAct test board, because of its price and attractive form factor.
I want to digitize the analog output of a linear CCD with an ADC that has an SPI interface. This ADC (ADS8319) needs a rising edge to start a conversion, and >1.6us later it expects a series of 16 clock pulses to export its data. I have read an attractive solution for such a challenge on the 'StackExchange' site: it sends dummy uint16 data to an SPI with DMA, triggered by a timer.
An SPI in Full-Duplex mode produces 16 clock pulses to transmit a uint16 word. The timer must produce PWM pulses at the frequency wanted for the CCD readout. The rising edge of the pulse triggers the ADC conversion, while the falling edge triggers the DMA transfer of a dummy uint16 to the SPI, which produces the clock pulses the ADC needs to export its data. The ADC's uint16 data arrives at the SPI MISO pin and is transferred to the global data array by another DMA action. The scheme below shows the timing:

Unfortunately, the organization of DMA on my MCU (BDMA, DMA with Mux, MDMA) is different from the STM32F746 on the 'StackExchange' forum, meaning the shown code snippets cannot be copied.

I configured SPI2 as Master in Full-Duplex mode with HW NSS signal handling, set 16bit frame-size, the baudrate and MSB-first. Despite many changes and different approaches, I have not been able to make SPI2 produce Clock-pulses, while the task looks quite simple: define and start a DMA action which, triggered by a Timer-compare event, writes a uint16 value to the SPI2 TXDR register.

I sincerely hope I made myself clear and that some readers are willing to think with me about how to crack this problem. Thank you in advance,

Fred Schimmel

RomainR. · ‎2024-06-14

Hello @FredS

I suggest you start with a CubeMX STM32H723 project with an SPI2 + DMA1 configuration. Then check the full-duplex data transfer and the ADS8319 response with an oscilloscope (using a known analog input voltage).

After that, use CubeMX to implement a TIM8 OC or PWM interrupt handler to trigger the ADS8319 externally at the falling edge, and a second one to start the SPI2 DMA transfer. Again, use an instrument to synchronize the rising/falling OC edges on the GPIO that triggers the ADS8319 with SPI2 CLK/NSS.

In the meantime, could you share the project you already have, so I can check it to see what is wrong?

Best regards,

Romain

To give better visibility on the answered topics, please click on Accept as Solution on the reply which solved your issue or answered your question.

FredS · ‎2024-06-15

Dear Romain,

Thank you for your reply to my question; I really appreciate that you took the time for it.

I'm sorry to say that I don't understand how the suggestion in the first two lines of your reply should work. I can configure SPI2 with DMA for transmission (Tx) and reception (Rx), but I don't have a clue how to connect my ADS8319 ADC to this configuration to achieve readouts of the ADC (which should be about a fixed number for a known fixed analog input).

In your second paragraph, you mention "implement TIM8 OC or PWM interrupt handler". My question arose precisely from the wish to avoid interrupts in this time-critical application.

I would prefer a solution with a tiny, fixed delay, identical to the subject of a thread on another forum:

https://electronics.stackexchange.com/questions/353152/stm32f-how-to-config-dma-transfer-to-spi-triggered-by-timer

From the suggested solution I understood the following concept:

- configure a timer (TIM8 in my case) to create PWM pulses,
- configure DMA for TIM8,
- define the period equal to the ADC repetition rate,
- connect the PWM output to the ADC 'CNVST' input (= trigger for conversion),
- set the pulse length to the time difference between the ADC trigger (= rising edge) and the start event of the TIM-DMA (falling edge),
- configure the TIM-DMA to retrieve a constant uint16 value (from a fixed address, no pointer increment) and transfer this value to the SPI Tx data register 'hspi2.Instance->TXDR',
- configure the SPI2-DMA-Rx to receive uint16 values on its RXDR register and transfer each value to the next index of a global data array 'g_CCD_Buff',
- a value written to the SPI TXDR register means the peripheral will transmit the bit pattern on its MOSI pin (= not connected), synchronous with clock pulses on the SPI_CLK pin, which is connected to the ADC_CLK pin,
- if the duration of the TIM8 pulse is longer than the ADC conversion time, the ADC will output its conversion value at its 'SDO' pin, in the rhythm of the pulses on its CLK input,
- the ADC 'SDO' pin is connected to the SPI_MISO pin (so hspi2.Instance->RXDR receives the ADC output),
- as SPI2 has a DMA configured for its Rx channel, the received value will be transferred to the next index of the global data array.

I understand most actions described above and know how to implement them. But I cannot figure out how to define a Timer-DMA that performs its action on an SPI peripheral.

Or, in case I understand it all wrong: how can I let a TIM8 OC1 event trigger an SPI-Tx DMA transfer?

I hope I made myself more clear now,

best regards,

Fred Schimmel

BarryWhit · ‎2024-06-15

Hi Fred,

Your post is quite complex and I've spent a good long while attempting to figure out what you're actually trying to do.

[Heavily pared down to try and simplify matters]

Is it fair to summarize your post as:

"I want to read one 16bit word from SPI (with CSn asserted) every N microseconds,

and I want to use DMA to do it so that the incoming data is stored into a memory buffer"

...Is that it?

- If someone's post helped resolve your issue, please thank them by clicking "Accept as Solution".
- Please post an update with details once you've solved your issue. Your experience may help others.

FredS · ‎2024-06-15

Dear Barry,

Thank you for your reply.

I will try to give my reactions on the four sections of your reply, one by one.

About 1): I have learnt the trick of transmitting a dummy word on a Full-Duplex SPI to force the SPI peripheral to produce the clock pulses the ADC needs to output its conversion word. The SPI-Rx side then receives the ADC word and transfers that data to an array. It was mentioned that "Receive Only Master" expects external clock-pulses fed to its SCLK pin to synchronize. I may experiment with your suggestion, but in that setup I still need to trigger the SPI-Rx action to read the ADC data (which I don't know how to do).

About 2): In my text I didn't explain the complete setup of the application. 120,000us are needed for the complete readout of 3700 CCD pixels. It takes 8us to convert a pixel, so 3700*8us = 29,600us for digitizing them all. In the past I encountered problems when exporting the collected data as an ASCII stream while converting new data, so I decided to do it sequentially. Writing 3700 uint16 values as hex ASCII chars on a 921,600 Baud port takes ~80.3ms => one complete readout takes 109.9ms, rounded up to 120ms.
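
The numbers above can be reproduced with a short host-side C sketch. The function names are made up for illustration, and 8N1 framing (10 bits on the wire per byte) is an assumption, not something stated in the post. Under that assumption the ~80.3ms figure corresponds to 2 bytes per value; four hex ASCII characters per value would take about twice as long.

```c
#include <assert.h>
#include <stdint.h>

/* Time in microseconds to digitize n_pixels at pixel_period_us each. */
uint32_t digitize_time_us(uint32_t n_pixels, uint32_t pixel_period_us)
{
    return n_pixels * pixel_period_us;
}

/* Time in microseconds to send n_values over a UART (rounded to nearest).
 * bytes_per_value: 2 for raw binary uint16, 4 for four hex ASCII digits.
 * bits_per_byte:   10 assumes 8N1 framing (start + 8 data + stop bit). */
uint32_t uart_time_us(uint32_t n_values, uint32_t bytes_per_value,
                      uint32_t bits_per_byte, uint32_t baud)
{
    assert(baud > 0u);
    uint64_t bits = (uint64_t)n_values * bytes_per_value * bits_per_byte;
    return (uint32_t)((bits * 1000000u + baud / 2u) / baud);
}
```

For example, `digitize_time_us(3700, 8) + uart_time_us(3700, 2, 10, 921600)` gives 109,895us, i.e. the ~109.9ms mentioned above.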

About 3): Apparently, I failed to explain my reasoning (valid or invalid). I want to use the TIM8 PWM for two things:

1. the rep. rate of the ADC trigger: each new pulse should trigger a new conversion of a CCD pixel voltage, where the period is determined by the ARR register,
2. the delay between the ADC trigger (start of the conversion on the rising edge) and the readout of the ADC data (by triggering an SPI-DMA action on the falling edge): this delay equals the pulse length, determined by CCR1.
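
As a rough sketch of how the two register values follow from the timing, the helpers below convert a period and a pulse length into ARR and CCR1. The function names are hypothetical, PWM mode 1 with an upcounting counter and no prescaler is assumed, and the 200 MHz timer clock used in the usage note is an assumed value (the real one depends on the clock tree).

```c
#include <assert.h>
#include <stdint.h>

/* Convert a duration in nanoseconds to timer ticks (rounded to nearest). */
uint32_t ns_to_ticks(uint32_t timer_clk_hz, uint32_t ns)
{
    return (uint32_t)(((uint64_t)timer_clk_hz * ns + 500000000u) / 1000000000u);
}

/* ARR for the PWM period: the counter runs 0..ARR, so ARR = ticks - 1. */
uint32_t pwm_arr(uint32_t timer_clk_hz, uint32_t period_ns)
{
    uint32_t ticks = ns_to_ticks(timer_clk_hz, period_ns);
    assert(ticks >= 2u && ticks <= 65536u);   /* TIM8 has a 16-bit counter */
    return ticks - 1u;
}

/* CCR1 for the pulse length: in PWM mode 1 (upcounting) the output is
 * active while CNT < CCR1, so CCR1 equals the pulse length in ticks. */
uint32_t pwm_ccr(uint32_t timer_clk_hz, uint32_t pulse_ns)
{
    return ns_to_ticks(timer_clk_hz, pulse_ns);
}
```

With the assumed 200 MHz timer clock, `pwm_arr(200000000, 8000)` yields 1599 and `pwm_ccr(200000000, 1600)` yields 320, i.e. an 8us period with a 1.6us pulse.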

The SPI clock frequency only needs to be fast enough to read 16 bits from the ADC within 8us minus the ADC conversion time (= ~6.2us), but not so fast that my HW can no longer keep the pulses separated. And, as I already made clear, I want to read 3700 samples per cycle, not one.

About 4): Maybe I'm wrong, but in my perception the suggested solution in the stackexchange example defines a Timer-DMA that is triggered by TIM8.OC1REF, just as you describe in the first two sentences of your point 3.

The next step is to configure this DMA transfer for another peripheral (SPI2), by specifying the address of a uint16 constant as 'source pointer' and the address of SPI2.TXDR as 'destination pointer'. Up to here I think I understand this approach, but I am missing how to program the last part: how to start the Timer-DMA.

I hope I finally managed to make my ideas clear,

many greetings from the Netherlands,

Fred Schimmel

BarryWhit · ‎2024-06-16

> It was mentioned that "Receive Only Master" expects external clock-pulses fed to its SCLK pin to synchronize

This is wrong. The definition of "Master" in SPI is the side that controls SCLK. CubeMX offers both "Receive-only master" and "Receive-only slave". The difference between them is precisely whether the MCU is in charge of SCLK.

> I still need to trigger the SPI-Rx action to read the ADC data (which I don't know how).

A DMA read from the proper SPI register will trigger an SPI read by the peripheral, and after the peripheral responds with the data, the DMA will place the result in memory.

> In the past I encountered problems with the export of collected data as ASCII stream, converting new data, so I decided to do it sequentially.

> Writing 3700 uint16 values as hex ASCII chars

I don't understand this. You mean you had trouble transferring binary data over a serial terminal? That's a software issue on the host side, not on the STM. At least on Linux, you can set the terminal to "raw mode" for binary data. I'm sure there's an equivalent in every OS.

> It takes 8us to convert a pixel,

Where did this number come from? The datasheet says max conversion time is 1400ns, and min acquisition time (data readout time) is 600ns. In particular, you can clock out the 16 bits at 33 MHz (30ns min SCLK period for 3.3V VDD), which the STM32H723 can easily do, and this takes ~600ns. Perhaps it's not realistic to expect to hit this optimal point exactly with an MCU (you could with an FPGA), but I think your timing budget is off.
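
A small sketch to check the readout time for a given SPI clock. The function name is hypothetical; it only counts SCLK cycles and ignores any setup or hold margins from the datasheet. At 33 MHz, 16 clocks take about 485ns, which fits inside the 600ns window quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Duration in nanoseconds of n_bits SCLK cycles at sclk_hz (rounded up). */
uint32_t spi_readout_ns(uint32_t n_bits, uint32_t sclk_hz)
{
    assert(sclk_hz > 0u);
    return (uint32_t)(((uint64_t)n_bits * 1000000000u + sclk_hz - 1u) / sclk_hz);
}
```

For instance, `spi_readout_ns(16, 33000000)` returns 485 and `spi_readout_ns(16, 32000000)` returns 500.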

> on a 921,600 Baud port takes ~80.3ms => one complete readout takes 109.9ms, rounded to 120ms.

FYI, USB->Serial ports can easily run at 1/2/4Mbps and even 8Mbps. The STM32H723 also has HS USB which can easily do 16bit*500ksps .

If your timing budget is tight, there's no need to separate SPI reception and UART transmission into distinct phases, it can in principle be done concurrently (word by word for example).

It is HIGHLY recommended that your egress channel (UART or maybe USB in the future) be at least a bit faster than your ingress channel (SPI). The required throughput depends on your target sample rate for the ADC.

It might be possible to cobble together ADC->DMA->(SPI->MEM)->DMA->(MEM->UART) using the DMAMUX's DMA request generator and request chaining (no CPU involvement). I'm not sure, but it would be an interesting exercise to try. For this, egress throughput must be higher than ingress (real-time streaming).

> Each new pulse should trigger a new conversion of a CCD pixel voltage, where the period is determined by the ARR register.

The DS is a little hard to parse, but it looks like this ADC has several modes.

In "3-wire without busy" mode, IIUC, a conversion is started whenever CSn is deasserted (the SPI bus is idle). If you delay your SPI read until at least tcnv later, you can have the STM32 read out the data at 33 MHz. All you need to do is figure out the right period in which to trigger the reads, and anything more than ~2us should be ok. I really think that might be all that's required.

Alternatively, in "3-wire with busy" mode you can have the ADC output an interrupt signal when data is ready, and you can configure the DMAMUX request generator to use this signal to trigger a DMA write. This could work as well.

I just don't think using "Timer Output Compare" is a good solution here.

> I miss how to program the last part: how to start the Timer-DMA.

I'm gonna assume you'll relinquish the output compare idea.

For a timer to periodically trigger a DMA transfer, follow the StackExchange code.

The important steps are:

1. link the DMA to the timer with __HAL_LINKDMA() (or some such).

2. set the DMA mode (Peripheral-To-Memory).

3. program the DMA with src/dest addresses and length with HAL_DMA_Start().

4. use __HAL_TIM_ENABLE_DMA(&htim, TIM_DMA_UPDATE); to allow the update event of the timer to trigger DMA (this is important).

5. start the timer.

You cannot do (all of) this with CubeMX code generation.

Every time the timer overflows, the DMA will issue a DMA request to the SPI peripheral, which will read the number of bits configured at the baud rate configured. When the data is read, it will signal the DMA peripheral, which will read the data and store it in memory. That's it.


waclawek.jan · ‎2024-06-16

You appear to want to run before having learned to walk.

The 'H7 are overcomplicated beasts and now you have to cope with many concepts at once.

> I cannot figure out how to define a Timer-DMA to perform its action on a SPI peripheral.

First, have a look at Figure 1 System architecture in the RM. The MDMA is probably not very helpful in this situation; it's mostly aimed at heavy lifting in the AXI domain. BDMA is mainly aimed at working autonomously within the low-power domain when the rest of the chip is asleep, and it doesn't have access to peripherals beyond that domain (except AHB3/APB3).

Thus, assuming you are going to use DMA1 or DMA2, set up the trigger first. In the timer, enable the DMA request from the source you want (Update, or one of the Capture/Compare channels) by setting the respective TIMx_DIER.UDE/CCxDE bit. In DMAMUX1, look up the timer's trigger in Table 118. DMAMUX1: assignment of multiplexer inputs to resources, and write it to one of the DMAMUX1_CxCR.DMAREQ_ID fields; the x there then determines which DMA Stream this request is routed to.

You then set up that DMA Stream to perform the transfers from memory (beware of caching issues) to given SPI data register or FIFO (the 'H7 SPI is again an overcomplicated beast, much more complicated than SPI in 'F7 or other families; and I am not familiar with it so can't give specific clues for that) by setting the SPI data register address in DMA Stream's Peripheral Address register, memory buffer address in Memory Address register, number of transfers in NDTR, set the appropriate direction, transfer size, etc. in Control register. You don't need to use FIFO at this point.
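
A minimal sketch of the register values involved, written as plain C so it can be checked on a PC before touching hardware. The macro and function names are hypothetical. The bit positions follow the usual TIMx_DIER and DMA_SxCR layouts and should be verified against the RM for the exact part. It assumes the stream writes one fixed 16-bit dummy word to the SPI data register on each timer request; the DMAMUX request ID and the addresses are deliberately left out, since they must come from the RM table and the project.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed DMA_SxCR bit fields (verify against the RM). */
#define SKETCH_DMA_CR_DIR_M2P   (1u << 6)   /* DIR = 01: memory-to-peripheral */
#define SKETCH_DMA_CR_CIRC      (1u << 8)   /* circular mode */
#define SKETCH_DMA_CR_PSIZE_16  (1u << 11)  /* PSIZE = 01: half-word */
#define SKETCH_DMA_CR_MSIZE_16  (1u << 13)  /* MSIZE = 01: half-word */

/* TIMx_DIER mask enabling the DMA request of compare channel ch (1..4).
 * Assumes UDE is bit 8 and CC1DE..CC4DE are bits 9..12. */
uint32_t tim_dier_ccde(unsigned ch)
{
    assert(ch >= 1u && ch <= 4u);
    return 1u << (8u + ch);
}

/* CR value for a stream that writes the same 16-bit word on every request:
 * memory-to-peripheral, 16-bit on both sides, PINC and MINC left clear so
 * neither address increments, EN left clear (set it last, after the
 * address and NDTR registers are programmed). */
uint32_t dma_cr_dummy_tx(bool circular)
{
    uint32_t cr = SKETCH_DMA_CR_DIR_M2P
                | SKETCH_DMA_CR_PSIZE_16
                | SKETCH_DMA_CR_MSIZE_16;
    if (circular) {
        cr |= SKETCH_DMA_CR_CIRC;
    }
    return cr;
}
```

Here `tim_dier_ccde(1)` gives 0x200 and `dma_cr_dummy_tx(false)` gives 0x2840; with a non-circular stream, NDTR would hold the number of words to send (3700 in this thread).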

This may or may not be that simple to click in CubeMX. I don't know, I don't use Cube/CubeMX. Generally, Cube inevitably implements only a fraction of what the hardware is capable of - whatever Cube's authors deemed "typical" - and is helpful as long as you want something from that fraction. Otherwise it may get in your way more than it helps. Now you've been warned.

JW

FredS · ‎2024-06-16

Good day Jan,

Thank you very much for your straightforward response to my help request. Over time I have read many of your replies to questions posed by me and by other members. From those texts I conclude you are a "no-nonsense" guru, with a broad overview and willing to advise people of all kinds of skill levels. And despite your aversion to CUBEIDE, you respect people who need such a framework (like me) to get their application running, and you still guide them towards understanding and/or a solution. Thank you very much for such an attitude!

BarryWhit and you gave me a lot of directions, corrections, and advice which I have to chew on. It will take me some time to comprehend the new information and then implement and test a new approach, so a lack of updates on this forum doesn't mean I am ignoring you.

Thank you both a lot for your attempts to get me on the right track,

many greetings,

Fred Schimmel.

FredS · ‎2024-06-16

Dear Barry,

Thank you for your extensive responses to my reactions.

About "It was mentioned that "Receive Only Master" expects external clock-pulses fed to its SCLK pin to synchronize. This is wrong. The definition of "Master" in SPI is the side that controls SCLK.":

This is new to me and very good to know, thanks.

About " > In the past I encountered problems with the export of collected data as ASCII stream, converting new data, so I decided to do it sequentially. & > Writing 3700 uint16 values as hex ASCII chars":

I didn't express myself clearly. I could create a stream of bytes that, grouped 2 by 2, represent the uint16 values from an ADC. My main issues were: a) how to group the bytes into uint16 values, b) how to distinguish the start of a new CCD readout in the stream of numbers and c) how to recover from hiccups (rare, but nasty).

About "> It takes 8us to convert a pixel, Where did this number come from?":

This was a wrong statement. You are right, the conversion + readout only takes 1400 + 600ns. Instead, it is the timing of the CCD itself that determines the 8us rep. rate. This can be increased, but at this stage I chose to take my time for the various steps; this may be changed later on when I get the system working.

About "> on a 921,600 Baud port takes ~80.3ms => one complete readout takes 109.9ms, rounded to 120ms. &

FYI, USB->Serial ports can easily run at 1/2/4Mbps and even 8Mbps. The STM32H723 also has HS USB which can easily do 16bit*500ksps.":

I experienced such high-speed streaming before, with a 'USB_OTG_FS' peripheral, but didn't manage to reproduce it, so I chose to use a UART, whose documented max speed is 921,600 Baud.

About "> Each new pulse should trigger a new conversion of a CCD pixel voltage, where the period is determined by the ARR register.":

The mode you suggest ("3-wire without busy") was/is my choice also.

You state: a conversion is started whenever CSn is deasserted (the SPI bus is idle). If you time your SPI read till at least tcnv later, you can have the STM32 read out the data at 33Mhz. All you need to do is figure out the right period in which to trigger the reads, and anything more than ~2us should be ok.

This is exactly what I want to use TIM8-CH1 for: the PWM pulse is connected as CSn (also indicated as CNVST) and the OC1REF event must invoke the SPI-RX DMA (I learned this from you, so the SPI mode must be 'Receive, Master Mode') to read the ADC data. The pulse length is tuned to the required delay (<=2us).

About "> I miss how to program the last part: how to start the Timer-DMA.":

I will follow your instructions carefully and report my results after testing.

Thank you very much for your attention to my problem, I really appreciate it.

Greetings from the Netherlands,

Fred Schimmel

BarryWhit · ‎2024-06-16

Dear Fred,

>> a conversion is started whenever CSn is deasserted

> This is just where I want to apply TIM8-CH1

Fred, I've done my best to communicate to you my belief (it is always possible that I'm the one in the wrong, of course) that you have a misunderstanding of how this part works. I'll try one last time. You seem to think that the 1.4us delay required between toggling CSn and the start of clocking out the data means that you have to toggle CSn separately, carefully manage a delay of 1.4us and only then trigger an SPI read. This is not what the (obfuscating) datasheet says or what its timing diagrams show. You're thinking about it in the wrong way.

The CSn logic starts the conversion when you *deassert* CSn (i.e. when it goes High) and it is only when you issue an SPI read that the SPI peripheral will assert CSn (set it low). So *The conversion doesn't start just before you issue a read, it starts as soon as the previous read concludes*, i.e. when the SPI peripheral releases the bus. Thus, you don't have to do anything special to trigger a conversion, except to space your SPI reads sufficiently apart for the ADC to complete a full conversion during the "Idle time" between SPI transfers.
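
The spacing rule above can be written down as a small check. This is only a sketch with hypothetical names; it assumes CSn is low for the whole transfer and that the conversion needs at most tcnv_max_ns of idle time, and it ignores any extra margins from the datasheet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Idle time in ns between two SPI transfers that start every period_ns,
 * each keeping CSn low for transfer_ns. */
uint32_t idle_gap_ns(uint32_t period_ns, uint32_t transfer_ns)
{
    assert(period_ns > transfer_ns);
    return period_ns - transfer_ns;
}

/* True if the idle gap leaves room for one full conversion. */
bool spacing_ok(uint32_t period_ns, uint32_t transfer_ns, uint32_t tcnv_max_ns)
{
    return idle_gap_ns(period_ns, transfer_ns) >= tcnv_max_ns;
}
```

With an 8us period, an assumed 500ns transfer and the 1400ns conversion time from the datasheet, `spacing_ok(8000, 500, 1400)` is true with 7500ns of idle time; a 1.8us period with the same transfer would fail the check.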

I hope this drives the point home. If it doesn't - I surrender. :)

Finally, I will also counsel you that ST has parts (cheaper, simpler, and less power-hungry ones than the H723) that include two 4msps 12bit (16bit with oversampling) ADCs. I've recently used the G431 for a pet project and was very happy with it (the G474 is its beefy bigger brother). WeAct sell very affordable boards with both these parts, and switching to those might just simplify your design a great deal. Of course, this is assuming your CCD isn't part of some specialized module that bakes in the ADC chip as the sole interface to the sensor.

Good luck.
