2024-07-22 06:06 AM
Hello,
I am fairly new to STM32 (and ARM MCUs in general), and I need a bit of help with DMA/MDMA transfers.
I am using an STM32H747 dual-core MCU and I intend to do some high-speed data sampling on the ADC (currently ~3 MSPS, going to 7-ish MSPS in the future) and process the data in the M7 core. Currently all peripheral configuration, communication etc. is done on the M4 core in order to keep the M7 free for signal processing.
Basically, I am getting 64 data samples from the ADC (clocked by a free-running timer), and whenever a 64-sample (128-byte) chunk is ready, I want to move the data into DTCM and trigger the processing on the M7 core using inter-core interrupts. The processing runs in parallel with the ADC getting the next 64 samples.
When data processing has completed, the resulting data should be placed back into SRAM (Accessible for the M4 core) for sending out over USART/whatever.
My current solution samples the input signal but moves it into shared SRAM (ridiculously slow) before doing the processing and data back-and-forth - so it is conceptually working, but slow. I can get it running much faster by activating caching on the M7 core, but this
a) requires me to handle some cache-coherency issues that I have not mastered yet
b) seems like an unreliable hack-type solution for a concrete problem
Can anybody help me allocate DTCM space for the data and set up the MDMA controller to transfer the data to and from the SRAM upon ADC buffer conversion complete?
I should mention that for now I am sticking with the STM32CubeIDE and bare-metal (HAL) programming, until I’ve familiarized myself a bit more with the architecture.
Cheers,
Davla
2024-09-07 05:09 AM - edited 2024-09-20 06:25 AM
Hello
@Davla_CEKO wrote:
but moves it into shared SRAM (ridiculously slow)
I'm wondering how you concluded that and how you quantified this slowness. I invite you to look at this application note: AN4891, STM32H72x, STM32H73x, and single-core STM32H74x/75x system architecture and performance.
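One way to put a number on "slow" is to compare the measured processing time against the time available per chunk. The sketch below only does the arithmetic for the figures in the question (64 samples per chunk at 3 or 7 MSPS). The function names are made up for illustration, and the 480 MHz core clock used in the usage note is an assumption, not a value taken from the original post.

```c
#include <assert.h>
#include <stdint.h>

/* Time between two completed chunks, in nanoseconds (rounded down). */
uint32_t chunk_period_ns(uint32_t samples_per_chunk, uint32_t sample_rate_hz)
{
    assert(sample_rate_hz > 0u);
    return (uint32_t)(((uint64_t)samples_per_chunk * 1000000000ull) / sample_rate_hz);
}

/* CPU cycles available to process one chunk before the next one arrives. */
uint32_t chunk_cycle_budget(uint32_t samples_per_chunk,
                            uint32_t sample_rate_hz,
                            uint32_t cpu_hz)
{
    assert(sample_rate_hz > 0u);
    return (uint32_t)(((uint64_t)samples_per_chunk * cpu_hz) / sample_rate_hz);
}
```

With an assumed 480 MHz M7 clock, chunk_cycle_budget(64, 3000000, 480000000) gives 10240 cycles per chunk, and the same call at 7 MSPS gives 4388 cycles. Any copy plus processing has to fit inside that budget, whichever memory the data ends up in.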
@Davla_CEKO wrote:
but slow. I can get it running much faster by activating caching on the M7 core, but this
a) requires me to handle some cache-coherency issues that I have not mastered yet
You need to handle cache coherency if the buffer is in AXI SRAM or any other SRAM except DTCM. Unfortunately, DMA (DMA1/DMA2) has no access to DTCM. Please refer to AN4839, Level 1 cache on STM32F7 Series and STM32H7 Series, especially section 3.2, Example for cache maintenance and data coherency. However, you can use MDMA to move the data that DMA has collected in another SRAM into DTCM for the CM7 to process. In that case, since the data is in DTCM, no cache maintenance is needed, as this memory has no cache on its path.
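If the buffer does stay in a cached SRAM, the maintenance is done on whole cache lines, so the range handed to the maintenance call should be rounded out to 32-byte boundaries (the Cortex-M7 D-cache line size). A minimal sketch of that rounding follows; cache_range_t and cache_align_range are hypothetical helper names, not part of CMSIS or the HAL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DCACHE_LINE_SIZE 32u  /* Cortex-M7 D-cache line size in bytes */

typedef struct {
    uintptr_t start;  /* address rounded down to a cache-line boundary */
    size_t    size;   /* length covering whole cache lines */
} cache_range_t;

/* Round [addr, addr + len) outwards to cache-line boundaries. */
cache_range_t cache_align_range(uintptr_t addr, size_t len)
{
    const uintptr_t mask = (uintptr_t)DCACHE_LINE_SIZE - 1u;
    cache_range_t r;

    assert(len > 0u);
    r.start = addr & ~mask;                           /* round start down */
    r.size  = (size_t)(((addr + len + mask) & ~mask)  /* round end up */
                       - r.start);
    return r;
}
```

On the target, the result would typically be passed to the CMSIS routines: SCB_InvalidateDCache_by_Addr((void *)r.start, (int32_t)r.size) before the M7 reads a buffer that DMA has written, and SCB_CleanDCache_by_Addr before another bus master reads data the M7 has written. Placing and sizing the buffer on 32-byte boundaries in the first place avoids invalidating unrelated variables that share a line with it.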
@Davla_CEKO wrote:
Can anybody help me allocate DTCM space for the data and set up the MDMA controller to transfer the data to and from the SRAM upon ADC buffer conversion complete?
You can refer to the AN5001 STM32Cube Expansion Package for STM32H7 Series MDMA
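As a rough illustration of the allocation side, the sketch below reserves two 128-byte chunk buffers and alternates between them, so the M7 can process one while the next is being filled. Several things here are assumptions: ".dtcm_data" is a hypothetical section name that the linker script would have to map into DTCM (0x20000000 on the H7), stage_chunk is a made-up helper, and the memcpy is only a host-side stand-in for the MDMA block transfer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_SAMPLES 64u  /* 64 x 16-bit samples = 128 bytes, as in the question */

#if defined(__arm__) && defined(__GNUC__)
/* Hypothetical section name: the linker script must place it in DTCM. */
#define DTCM_BUF __attribute__((section(".dtcm_data"), aligned(32)))
#else
#define DTCM_BUF  /* host build: ordinary RAM */
#endif

/* Two chunk buffers: one being processed, one being filled. */
static uint16_t dtcm_chunk[2][CHUNK_SAMPLES] DTCM_BUF;
static unsigned write_index = 0u;

/* Copy one finished chunk into the buffer that is not in use and
 * return a pointer to it. The memcpy stands in for the MDMA transfer. */
const uint16_t *stage_chunk(const uint16_t *src)
{
    uint16_t *dst = dtcm_chunk[write_index];

    assert(src != NULL);
    memcpy(dst, src, sizeof dtcm_chunk[0]);
    write_index ^= 1u;  /* the next chunk goes to the other buffer */
    return dst;
}
```

On the target, the memcpy would be replaced by something like HAL_MDMA_Start_IT(&hmdma, (uint32_t)src, (uint32_t)dst, 128, 1), with the index flip and the inter-core notification moved into the MDMA transfer-complete callback. Here hmdma is an assumed, already initialized MDMA_HandleTypeDef; the channel configuration itself is what AN5001 and the Cube MDMA examples cover.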
2024-09-07 04:19 PM
The ADC can store measurements over DMA directly to RAM accessible by the M7 (with proper cache management, yes). Since DMA runs in parallel with the MCU core, no copying is needed.
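A common way to do this without any copying is a circular DMA buffer that holds two chunks, using the half-transfer and transfer-complete events to tell which half is safe to read. The following is only a sketch of the bookkeeping; dma_event_t and ready_chunk are hypothetical names, and the buffer is assumed to have been handed to the ADC in circular mode.

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK_SAMPLES 64u

/* DMA target: two chunks back to back, filled in circular mode. */
uint16_t adc_dma_buf[2u * CHUNK_SAMPLES];

typedef enum { DMA_EVT_HALF, DMA_EVT_FULL } dma_event_t;

/* Return the half of the buffer that DMA has just finished writing.
 * While the caller processes it, DMA is filling the other half. */
const uint16_t *ready_chunk(dma_event_t evt)
{
    assert(evt == DMA_EVT_HALF || evt == DMA_EVT_FULL);
    return (evt == DMA_EVT_HALF) ? &adc_dma_buf[0]
                                 : &adc_dma_buf[CHUNK_SAMPLES];
}
```

With the HAL, the acquisition would be started with HAL_ADC_Start_DMA, and ready_chunk(DMA_EVT_HALF) and ready_chunk(DMA_EVT_FULL) would be called from HAL_ADC_ConvHalfCpltCallback and HAL_ADC_ConvCpltCallback respectively. If the buffer sits in a cached region, the returned half still has to be invalidated in the D-cache before the M7 reads it.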