2013-06-04 07:59 AM
Hi all,
I just want to clear up my understanding of DMA and how it works with respect to communication peripherals such as SPI and I2C.
First, let's say I have a function that does some matrix multiplication and saves the result to some value 'multResult'. Now I want to send that value over I2C to some other device in the network. I can use DMA to tell the micro to move this value into the I2C2->DR reg to save a little CPU time, but in order to actually SEND the data over the bus, I need to initiate the 'start condition' from the main thread, therefore causing the I2C communication to take up CPU time. The DMA transfer cannot actually handle the communication itself, correct?
On the other hand, if I sent the data over an SPI bus as opposed to the I2C bus, the DMA controller could write 'multResult' to the SPI1 Tx register, therefore triggering an SPI transfer. Am I correct in assuming that the SPI transfer taking place would still use CPU cycles in order to send the data across the bus?
Or maybe I'm completely wrong, and I actually can use DMA to trigger and handle these transactions and save myself a LOT of CPU time. If this is the case, please let me know! That would mean that I just don't have my DMA configured quite right and I'll post some code. Thanks in advance!
Edit: I'm using an STM32L152RB micro, by the way.
2013-06-04 08:39 AM
Am I correct in assuming that the SPI transfer taking place would still use CPU cycles in order to send the data across the bus?
Well the CPU is going to keep running, but the SPI is a PERIPHERAL device, running from an APB clock which is likely ticking slower than the CPU, and further divided for lower baud rates. While the SPI clocks are synchronous and derived from the CPU's clocks, you should perhaps think about them as being relatively independent.
Honestly I wouldn't bother with DMA on I2C unless you plan on bursting several dozen bytes; it would surely be more efficient to build a state machine in the interrupt handler and spin the CPU in a WFI loop if you had no useful work to do.
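For readers unfamiliar with the interrupt-handler state machine idea, here is a minimal host-side sketch of its shape. It is not STM32 code: the state names, the `tx_ctx_t` type and both functions are hypothetical, and the real I2C event sequence (start bit, address acknowledge, byte-transfer-finished) is deliberately simplified. On the target, `tx_on_event` would be the body of the I2C event interrupt handler and the returned byte would be written to the data register, while the main loop sits in WFI whenever it has nothing else to do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transmit states; names are illustrative, not from any ST library. */
typedef enum { TX_IDLE, TX_ADDR, TX_DATA, TX_STOP } tx_state_t;

typedef struct {
    tx_state_t     state;
    const uint8_t *buf;   /* payload to send */
    size_t         len;   /* payload length in bytes */
    size_t         pos;   /* index of the next payload byte */
    uint8_t        addr;  /* 7-bit slave address */
} tx_ctx_t;

/* Called from the main thread: queue a buffer and enter the address phase.
 * On real hardware this is where the start condition would be requested. */
static void tx_begin(tx_ctx_t *c, uint8_t addr, const uint8_t *buf, size_t len)
{
    assert(c != NULL && c->state == TX_IDLE && buf != NULL && len > 0);
    c->addr  = addr;
    c->buf   = buf;
    c->len   = len;
    c->pos   = 0;
    c->state = TX_ADDR;
}

/* Called once per bus event (the body of an event interrupt handler).
 * Returns 1 and stores the next byte in *out when a byte should be written
 * to the data register; returns 0 when the transfer is finishing or idle. */
static int tx_on_event(tx_ctx_t *c, uint8_t *out)
{
    switch (c->state) {
    case TX_ADDR:
        *out = (uint8_t)(c->addr << 1);      /* address plus write bit (0) */
        c->state = TX_DATA;
        return 1;
    case TX_DATA:
        *out = c->buf[c->pos++];
        if (c->pos == c->len) {
            c->state = TX_STOP;              /* last byte handed over */
        }
        return 1;
    case TX_STOP:                            /* hardware would issue STOP here */
        c->state = TX_IDLE;
        return 0;
    default:
        return 0;                            /* idle: nothing to do */
    }
}
```

The CPU cost is one short handler call per byte, which is why this tends to be competitive with DMA for short transfers and loses out as the burst length grows.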
2013-06-04 08:54 AM
Thanks for your response clive1. Helpful as always.
If I understand you correctly, you're saying the SPI peripheral will continue to run because the APB clock is constantly ticking away, even if I'm not explicitly handling it in my main thread.
As for your second comment, I actually have a lot of important work for the CPU to be doing. From a high level, my program is receiving data via SPI from an external device, doing a bunch of matrix math on that data, and sending the result to another external device so it can do its thing. This data is coming in very quickly, and my goal is to give as much time as possible to the CPU for all the math it's doing. Right now, I'm running out of time before the next set of data starts coming in. And yes, I actually do plan on bursting about 2 or 3 dozen bytes every 7ms or so.
I found some more info buried in the reference manual about how to use DMA with I2C that I must've overlooked. I'll make some changes based on that and see where that takes me.
2013-06-04 09:24 AM
As for your second comment, I actually have a lot of important work for the CPU to be doing.
And that's fine; my point was that a WFI in an idle task addresses the CPU clock continuously ticking, and consumes the least power. It can also mitigate the higher clock rates used to get the real work done.
The use of DMA is going to be a balancing act; the cost of managing it is not insignificant. Using it to transmit several dozen bytes is definitely an ideal use case: it will deal with the peripheral in the background, and you can apply the CPU cycles to real work/computation. SPI also requires a lot less micro-management.
2013-06-04 01:21 PM
Ok, I'm having some trouble getting an SPI transmission to start with DMA. In the reference manual, it states the following:
"In transmission, a DMA request is issued each time TXE is set to 1. The DMA then writes to the SPI_DR register (this clears the TXE flag)."
But the TXE flag is set by default. Does this mean that I have to start the SPI using a call to SPI_I2S_SendData(SPI1, data) in order to kickstart the setting of the flag? Do I have to do some manual bit manipulation? Can I make this work by simply enabling/disabling the SPI peripheral when I don't want the program to be looking for TXE=1 (i.e. between transfers)?
Here's some code:
SPI Initialization
void initSpi(void)
{
  GPIO_InitTypeDef GPIO_InitStructure;
  SPI_InitTypeDef spiInitStruct;
  // Configure SPI comms for Titan3 and Taconite3
  GPIO_InitStructure.GPIO_Pin = ((GPIO_Pin_3) | (GPIO_Pin_4) | (GPIO_Pin_5));
  GPIO_InitStructure.GPIO_Mode = GPIO_Mode_AF;
  GPIO_InitStructure.GPIO_OType = GPIO_OType_PP;
  GPIO_InitStructure.GPIO_PuPd = GPIO_PuPd_UP;
  GPIO_InitStructure.GPIO_Speed = GPIO_Speed_40MHz;
  GPIO_Init(GPIOB, &GPIO_InitStructure);
  // Set AF to SPI1
  GPIO_PinAFConfig(GPIOB, GPIO_PinSource4, GPIO_AF_SPI1);
  GPIO_PinAFConfig(GPIOB, GPIO_PinSource5, GPIO_AF_SPI1);
  GPIO_PinAFConfig(GPIOB, GPIO_PinSource3, GPIO_AF_SPI1);
  // Configure SPI chip-select for Taconite3
  GPIO_InitStructure.GPIO_Pin = GPIO_Pin_3;
  GPIO_InitStructure.GPIO_Mode = GPIO_Mode_OUT;
  GPIO_InitStructure.GPIO_OType = GPIO_OType_PP;
  GPIO_InitStructure.GPIO_PuPd = GPIO_PuPd_UP;
  GPIO_InitStructure.GPIO_Speed = GPIO_Speed_2MHz; // may want a faster speed depending on update rates? ~0.5us period @2MHz
  GPIO_Init(GPIOA, &GPIO_InitStructure);
  spiInitStruct.SPI_Direction = SPI_Direction_2Lines_FullDuplex;
  spiInitStruct.SPI_Mode = SPI_Mode_Master; // uCon works as master in SPI network (both ASICs are slaves)
  spiInitStruct.SPI_DataSize = SPI_DataSize_8b;
  spiInitStruct.SPI_CPOL = SPI_CPOL_Low; // clock idle state level (Low = 0 when idle)
  spiInitStruct.SPI_CPHA = SPI_CPHA_2Edge; // 1st edge or 2nd edge synchronization
  spiInitStruct.SPI_NSS = SPI_NSS_Soft; // NSS managed in software
  spiInitStruct.SPI_BaudRatePrescaler = SPI_BaudRatePrescaler_2; // Fpclk = 1MHz -- / prescaler gives spiclk of 500kHz (maximum allowed by ASICs)
  spiInitStruct.SPI_FirstBit = SPI_FirstBit_MSB; // first bit transmitted is MSB
  spiInitStruct.SPI_CRCPolynomial = 0x125; // supposedly optimal 8-bit crc for 32-bit words -- x^8+x^5+x^2+1
  SPI_Init(SPI1, &spiInitStruct);
  SPI1->CR2 = 0x0087; // configure interrupts, frame format, and DMA
  SPI_I2S_ITConfig(SPI1, SPI_IT_TXE, ENABLE);
  SPI_I2S_ITConfig(SPI1, SPI_IT_RXNE, ENABLE);
}
Relevant DMA initialization
void initDma(void)
{
  DMA_InitTypeDef DMA_InitStructure;
  NVIC_InitTypeDef NVIC_InitStructure;
  /* DMA1 channel 2 configuration ---------------------------------------------*/
  /* Enable DMA1 clock -------------------------------------------------------*/
  RCC_AHBPeriphClockCmd(RCC_AHBPeriph_DMA1, ENABLE);
  DMA_DeInit(DMA1_Channel2);
  DMA_InitStructure.DMA_PeripheralBaseAddr = (uint32_t) SPI1_DR_ADDR;
  DMA_InitStructure.DMA_MemoryBaseAddr = (uint32_t) &spiRxBuffer;
  DMA_InitStructure.DMA_DIR = DMA_DIR_PeripheralSRC;
  DMA_InitStructure.DMA_BufferSize = BUFFER_SIZE;
  DMA_InitStructure.DMA_PeripheralInc = DMA_PeripheralInc_Enable;
  DMA_InitStructure.DMA_MemoryInc = DMA_MemoryInc_Enable;
  DMA_InitStructure.DMA_PeripheralDataSize = DMA_PeripheralDataSize_Byte;
  DMA_InitStructure.DMA_MemoryDataSize = DMA_MemoryDataSize_Byte;
  DMA_InitStructure.DMA_Mode = DMA_Mode_Circular;
  DMA_InitStructure.DMA_Priority = DMA_Priority_High;
  DMA_InitStructure.DMA_M2M = DMA_M2M_Disable;
  DMA_Init(DMA1_Channel2, &DMA_InitStructure);
  /* DMA1 channel 3 configuration ---------------------------------------------*/
  /* Enable DMA1 clock -------------------------------------------------------*/
  RCC_AHBPeriphClockCmd(RCC_AHBPeriph_DMA1, ENABLE);
  DMA_DeInit(DMA1_Channel3);
  DMA_InitStructure.DMA_PeripheralBaseAddr = (uint32_t) SPI1_DR_ADDR;
  DMA_InitStructure.DMA_MemoryBaseAddr = (uint32_t) &spiTxBuffer;
  DMA_InitStructure.DMA_DIR = DMA_DIR_PeripheralDST;
  DMA_InitStructure.DMA_BufferSize = BUFFER_SIZE;
  DMA_InitStructure.DMA_PeripheralInc = DMA_PeripheralInc_Enable;
  DMA_InitStructure.DMA_MemoryInc = DMA_MemoryInc_Enable;
  DMA_InitStructure.DMA_PeripheralDataSize = DMA_PeripheralDataSize_Byte;
  DMA_InitStructure.DMA_MemoryDataSize = DMA_MemoryDataSize_Byte;
  DMA_InitStructure.DMA_Mode = DMA_Mode_Circular;
  DMA_InitStructure.DMA_Priority = DMA_Priority_High;
  DMA_InitStructure.DMA_M2M = DMA_M2M_Disable;
  DMA_Init(DMA1_Channel3, &DMA_InitStructure);
  DMA_Cmd(DMA1_Channel2, ENABLE);
  DMA_Cmd(DMA1_Channel3, ENABLE);
}
Then, my SPI transmission is triggered by an external interrupt (that part works just fine). This is the ISR I am now using to catch the interrupt, load the data into the DMA 'memory', and *hopefully* start the transaction.
void EXTI1_IRQHandler(void)
{
  uint8_t tempRx = 0x00;
  if(EXTI_GetITStatus(EXTI_Line1) != RESET)
  {
    // write to DMA buffer
    spiTxBuffer[0] = readVd1; // read vd1
    spiTxBuffer[1] = dummy;
    spiTxBuffer[2] = dummy;
    spiTxBuffer[3] = dummy;
    EXTI_ClearITPendingBit(EXTI_Line1);
  }
}
So just to recap... I was hoping writing to the DMA channel's memory location (spiTxBuffer) would start off the SPI, but this doesn't seem to be happening. Am I in the ballpark at least?
Thanks again.
2013-06-04 01:53 PM
At a quick glance
Surely
DMA_InitStructure.DMA_MemoryBaseAddr = (uint32_t) &spiTxBuffer[0];
or
DMA_InitStructure.DMA_MemoryBaseAddr = (uint32_t) spiTxBuffer;
Likely not circular, unless you want a continuous stream
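A side note on the address spellings above, as a small host-side check rather than target code. It assumes `spiTxBuffer` is declared as a true array (`uint8_t spiTxBuffer[4]`), which the thread does not show directly; `txPtr` and the helper function names are hypothetical. For a true array, `&spiTxBuffer`, `&spiTxBuffer[0]` and `spiTxBuffer` all yield the same numeric address and differ only in type, so the original spelling should only go wrong if the buffer is really a pointer variable.

```c
#include <assert.h>
#include <stdint.h>

static uint8_t  spiTxBuffer[4];          /* assumed declaration: a true array */
static uint8_t *txPtr = spiTxBuffer;     /* hypothetical pointer alias */

/* Address of the array object itself (type: pointer to array of 4). */
static uintptr_t addr_of_array(void)   { return (uintptr_t)&spiTxBuffer; }

/* Address of the first element. */
static uintptr_t addr_of_first(void)   { return (uintptr_t)&spiTxBuffer[0]; }

/* Array name decayed to a pointer to its first element. */
static uintptr_t addr_decayed(void)    { return (uintptr_t)spiTxBuffer; }

/* Address of the pointer variable, NOT of the data it points to.
 * Handing this to the DMA would stream the pointer's own bytes. */
static uintptr_t addr_of_pointer(void) { return (uintptr_t)&txPtr; }

/* Value held in the pointer variable: the buffer's address again. */
static uintptr_t addr_via_pointer(void) { return (uintptr_t)txPtr; }
```

The first three helpers return the same value; only `addr_of_pointer` differs, which is the case the more explicit spellings protect against.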
2013-06-04 02:21 PM
Some more potentially-helpful info:
SPI is sending data in 8-bit packets. DMA memory location is a 4-element array containing uint8_t objects. I wasn't enabling the DMA1 AHB clock before. Fixed that now, but no change in operation.
2013-06-04 05:21 PM
DMA_InitStructure.DMA_PeripheralInc = DMA_PeripheralInc_Enable;
// No don't be incrementing.
Don't enable the interrupts, unless you're ready to handle them.
Not sure of the CR2 thing.
/* Enable the SPI Tx and Rx DMA requests */
SPI_I2S_DMACmd(SPI1, SPI_I2S_DMAReq_Tx, ENABLE);
SPI_I2S_DMACmd(SPI1, SPI_I2S_DMAReq_Rx, ENABLE);
..
DMA_Cmd(DMA1_Channel2, ENABLE);
DMA_Cmd(DMA1_Channel3, ENABLE);
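To make the points above concrete (fixed peripheral address, no circular mode, channel enabled only once the buffer is ready), here is a register-level sketch of arming a transmit channel for one burst. It is written so it compiles on a host: `dma_chan_t` is a stand-in for one channel's register block, and `dma_tx_arm` and the `CCR_*` macros are hypothetical names. The bit positions follow the STM32 DMA channel layout as I understand it and should be checked against the reference manual before use on the target. The SPI side still needs its Tx DMA request enabled separately.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for one DMA channel's registers (CCR, CNDTR, CPAR, CMAR).
 * On the target, the vendor header's channel type would be used instead. */
typedef struct {
    volatile uint32_t CCR;    /* configuration */
    volatile uint32_t CNDTR;  /* number of data items left to transfer */
    volatile uint32_t CPAR;   /* peripheral address */
    volatile uint32_t CMAR;   /* memory address */
} dma_chan_t;

/* Assumed CCR bit positions; verify against the reference manual. */
#define CCR_EN    (1u << 0)   /* channel enable */
#define CCR_DIR   (1u << 4)   /* 1 = read from memory, write to peripheral */
#define CCR_CIRC  (1u << 5)   /* circular mode */
#define CCR_PINC  (1u << 6)   /* peripheral address increment */
#define CCR_MINC  (1u << 7)   /* memory address increment */
#define CCR_PL_HI (2u << 12)  /* priority level: high */

/* Arm a channel for a single memory-to-peripheral burst of `len` bytes. */
static void dma_tx_arm(dma_chan_t *ch, uint32_t periph_addr,
                       const uint8_t *buf, uint16_t len)
{
    assert(ch != NULL && buf != NULL && len > 0);
    ch->CCR  &= ~CCR_EN;                       /* must be off while reloading */
    ch->CPAR  = periph_addr;                   /* data register: never increments */
    ch->CMAR  = (uint32_t)(uintptr_t)buf;      /* truncates on a 64-bit host */
    ch->CNDTR = len;
    ch->CCR   = CCR_DIR | CCR_MINC | CCR_PL_HI; /* byte sizes (0), no PINC, no CIRC */
    ch->CCR  |= CCR_EN;                        /* requests are now serviced */
}
```

Because the channel is not circular, it stops after `len` bytes and has to be re-armed for the next burst, for example from the interrupt that signals new data.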
2013-06-05 02:05 PM
Got it working. Moved all the DMA enabling into the EXTI1 ISR after loading the values I want to send.
The only thing left is to play with the timing between bytes on the SPI bus. The device I'm sending data to requires a 7us delay between bytes and the DMA only ends up with a 3.8us delay at this clock speed. My initial thought for implementing this delay is to make each byte a separate DMA request, but this seems like it could take up more CPU time than I want. Any other ideas would be much appreciated, but I think I'll play around with that idea for a while and see where it gets me. Thanks again clive1
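One idea that avoids a CPU-driven request per byte is to pace the DMA writes from a timer instead of from TXE, so that one byte is written to the SPI data register per timer period. Whether a suitable timer DMA request is available on a usable channel of the STM32L152 is an assumption to check against the DMA request mapping in the reference manual. The arithmetic for the timer period is simple enough to sketch: the period must cover one frame on the wire plus the required gap. The function name and parameters below are hypothetical, and the 500 kHz SPI clock used in the example figures is taken from the comment in the init code, not measured.

```c
#include <assert.h>
#include <stdint.h>

/* Auto-reload value for a timer that should fire once per
 * (frame time + inter-byte gap). Both parts are rounded up to whole
 * timer ticks so the gap is never shorter than requested. */
static uint32_t pacing_reload(uint32_t timer_hz, uint32_t spi_hz,
                              uint32_t bits_per_frame, uint32_t gap_us)
{
    assert(timer_hz > 0 && spi_hz > 0 && bits_per_frame > 0);

    /* Ticks needed to clock one frame out: ceil(bits * timer_hz / spi_hz). */
    uint64_t frame_ticks =
        ((uint64_t)bits_per_frame * timer_hz + spi_hz - 1u) / spi_hz;

    /* Ticks for the idle gap: ceil(gap_us * timer_hz / 1e6). */
    uint64_t gap_ticks =
        ((uint64_t)gap_us * timer_hz + 999999u) / 1000000u;

    uint64_t total = frame_ticks + gap_ticks;
    assert(total >= 1u && total - 1u <= UINT32_MAX);

    return (uint32_t)(total - 1u);   /* a timer counts 0..reload inclusive */
}
```

With a 1 MHz timer clock, 8-bit frames at 500 kHz and a 7 us gap, `pacing_reload(1000000u, 500000u, 8u, 7u)` gives 22, i.e. one byte every 23 us. The actual gap seen on the bus will also depend on DMA and bus latency, so it is worth confirming on a scope.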