SPI Daisy Chain: Is it possible to capture a 10-byte DMA array transmission with ten 1-byte transmissions?

Ole Thomsen · ‎2019-09-25

Hi there,

I'm trying to create a daisy-chain setup with STM32 Nucleo boards using the HAL library.

My master should send a command to 10 slaves. It would be best to use DMA on the master to avoid blocking the MCU.

HAL_SPI_Transmit_DMA(&hspi1, command_array, 10);

The slaves must capture the bytes one by one and pass each byte on to the next slave for the fastest transmission. Is this possible?

I tried to use

HAL_SPI_TransmitReceive_IT(&hspi1, &dataTx, &dataRx, 1);

and catch the interrupt with

void HAL_SPI_TxRxHalfCpltCallback(SPI_HandleTypeDef *hspi) {
    if (hspi->Instance == hspi1.Instance) {
        temp = dataTx;
        dataTx = dataRx;
        dataRx = temp;
        HAL_SPI_TransmitReceive_IT(&hspi1, &dataTx, &dataRx, 1);
    }
}

This approach does not seem to be fast enough: the transferred data is not correct.

Thanks for any help 🙂

Tesla DeLorean · ‎2019-09-25

That, and each slave will be one byte behind.

Not sure there's a need for the swap; the current RX byte becomes the next TX byte.

I'd imagine you could just use the same DMA buffer, but the TX side pulling one byte earlier.
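
To make the one-byte-per-slave delay concrete, here is a minimal host-side sketch in plain C (no HAL, no hardware). The names chain_slave_t and chain_clock_byte are hypothetical, and the model is idealized: it assumes every slave manages to load its received byte into the transmitter before the next byte starts, which is exactly the part that is hard to guarantee on real hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* One simulated slave (hypothetical model, not HAL code):
 * the byte it will shift out during the next byte slot. */
typedef struct {
    uint8_t tx_next;
} chain_slave_t;

/* Clock one byte slot through a chain of n slaves.
 * master_out is the byte the master sends in this slot.
 * Returns the byte that leaves the last slave in the same slot. */
uint8_t chain_clock_byte(chain_slave_t *slaves, size_t n, uint8_t master_out)
{
    uint8_t on_wire = master_out;

    for (size_t i = 0; i < n; ++i) {
        uint8_t out = slaves[i].tx_next; /* byte this slave drives now    */
        slaves[i].tx_next = on_wire;     /* current RX becomes next TX    */
        on_wire = out;                   /* feeds the next slave's input  */
    }
    return on_wire;
}
```

With three slaves initialized to 0 and the master sending 1, 2, 3, 4, 5, 6, the far end of the chain sees 0, 0, 0, 1, 2, 3: each slave adds one byte slot of delay, and no swap is needed.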


Ole Thomsen · ‎2019-09-25

You are right, the swap is pointless, and it is true that there is a one-byte delay, but that is not exactly my problem. The second byte that the slave sends should be the first byte from the master, but it is a different value.

Ole Thomsen · ‎2019-09-25

I tested it with one slave that should pass on a 5-byte array from the master. The data the slave sends is not merely shifted by one byte; the values themselves are wrong.

S.Ma · ‎2019-09-25

Well, actually, except on the H7's SPI, the assumption that the STM32 has a single shift register with bits going in and out may not be right.

Some drawings show one incoming shift register and a separate outgoing shift register, with no HW link between in and out.

The workaround that works for me is to set up the DMA (on the slaves) with a cyclic buffer of, say, 64 bytes. The DMA writes to and reads from this buffer (the same one is used for the Rx and Tx channels). This emulates in HW a 64-byte-long shift register for the SPI slave, which makes it much more convenient to shift data around in a single round transaction with all slaves. Works like a charm on the STM32L4 family (SPI with 32-bit FIFOs) with 1 master and up to 12 slaves.

Unfortunately, until there is a repo to drop code examples into, it's a bit difficult to share in this forum.

Here's an extract (using 16-bit SPI mode; this runs in an interrupt-based state machine, no blocking):

HAL_StatusTypeDef SPIP_SlaveSerialTransferStart_ISR(SPIP_t* pSPIP){
  (...)
  /* Same buffer for TX and RX; the count is in 16-bit words, hence the /2 */
  if(HAL_SPI_TransmitReceive_DMA(pSPIP->hspi,
                                 (uint8_t*)pSPIP->pSlaveSerialPseudoRegister,
                                 (uint8_t*)pSPIP->pSlaveSerialPseudoRegister,
                                 sizeof(SPIP_SerialRegs_t)/2) != HAL_OK)
    TrapError();
  return HAL_OK;
}

Fill the buffer with what you want to send before the NSS falling edge.

Read the buffer after the NSS rising edge.

The payload here is fixed at 64 bytes (to avoid reaching the insane SW complexity limit).
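
The shared cyclic buffer idea can be checked on a PC with a small simulation. This is only a sketch under assumptions: ring_slave_t, ring_chain_clock and RING_LEN are hypothetical names, the buffer is shortened to 4 bytes for readability (the post uses 64), and the model ignores the SPI FIFOs, so on real hardware the TX DMA will read a few bytes ahead of the RX DMA.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_LEN 4u /* hypothetical size; the post above uses 64 bytes */

/* One simulated slave: a single buffer shared by the TX and RX side. */
typedef struct {
    uint8_t buf[RING_LEN];
    size_t  idx; /* position both "DMA channels" are currently at */
} ring_slave_t;

/* One byte slot for one slave: send the old content of the current
 * position, then overwrite that position with the received byte. */
static uint8_t ring_slave_shift(ring_slave_t *s, uint8_t in)
{
    uint8_t out = s->buf[s->idx];
    s->buf[s->idx] = in;
    s->idx = (s->idx + 1u) % RING_LEN;
    return out;
}

/* Clock one byte slot through n chained slaves.
 * Returns the byte arriving back at the master (MISO). */
uint8_t ring_chain_clock(ring_slave_t *slaves, size_t n, uint8_t master_out)
{
    uint8_t on_wire = master_out;

    for (size_t i = 0; i < n; ++i) {
        on_wire = ring_slave_shift(&slaves[i], on_wire);
    }
    return on_wire;
}
```

In this model, each slave behaves like a RING_LEN-byte shift register. After the master has clocked n x RING_LEN bytes, it has read back every slave's prefilled buffer (last slave first), and the first RING_LEN bytes it sent sit in the last slave's buffer.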

S.Ma · ‎2019-09-25

With SPI at 12 Mbps, one transaction covering 12 slaves takes 64 x 12 x 8 / 12 = 512 usec to exchange 64 bytes, full duplex, with each slave.

The time needed to toggle NSS in SW then becomes an order of magnitude smaller than the data transfer time.
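
The same arithmetic as a small helper, in case you want to try other payload sizes or clock rates. The function name chain_transfer_time_us is hypothetical, and the result covers only the clocked bits, not NSS handling or any gaps between bytes.

```c
#include <stdint.h>

/* Time in microseconds (rounded up) to clock one full round through
 * the chain: bytes_per_slave * n_slaves * 8 bits at sck_hz bits/s.
 * Returns 0 if sck_hz is 0. */
uint32_t chain_transfer_time_us(uint32_t bytes_per_slave,
                                uint32_t n_slaves,
                                uint32_t sck_hz)
{
    if (sck_hz == 0u) {
        return 0u;
    }
    uint64_t bits = (uint64_t)bytes_per_slave * n_slaves * 8u;
    return (uint32_t)((bits * 1000000u + sck_hz - 1u) / sck_hz);
}
```

For the numbers above, chain_transfer_time_us(64, 12, 12000000) gives 512.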

A simple example of full-duplex SPI use is shuffling data between the USART SW FIFOs of different slaves over SPI. The master can move slave 3 RX to slave 5 TX and slave 6 TX, with some additional coding. The average baud rate will depend on how often the SPI transaction occurs in the background. Of course, there must be SW FIFOs between USART and SPI for this to work.