How to make QSPI transmission without delay between Command and Data?

Irek
Associate III

I need to communicate with an FPGA over Quad-SPI. I use the NUCLEO-H743ZI2 with FreeRTOS and STM32CubeIDE.
My code to write data to the FPGA looks like this:

uint32_t qspi_write(const uint8_t* RXbuf, uint8_t* TXbuf, uint16_t* tx_length) {
	sCommand.InstructionMode 	= QSPI_INSTRUCTION_NONE; /* Specifies the Instruction Mode: value of @ref QSPI_InstructionMode */
	sCommand.Instruction		= 0; /* Specifies the Instruction to be sent: value (8-bit) between 0x00 and 0xFF */
	sCommand.AddressMode		= QSPI_ADDRESS_4_LINES; /* Specifies the Address Mode: value of @ref QSPI_AddressMode */
	sCommand.AddressSize		= QSPI_ADDRESS_32_BITS; /* Specifies the Address Size: value of @ref QSPI_AddressSize */
	sCommand.Address			= *(uint32_t *) RXbuf; /* Specifies the Address to be sent (Size from 1 to 4 bytes according AddressSize): value (32-bits) between 0x0 and 0xFFFFFFFF */
	sCommand.AlternateByteMode	= QSPI_ALTERNATE_BYTES_NONE; /* Specifies the Alternate Bytes Mode: value of @ref QSPI_AlternateBytesMode */
	sCommand.AlternateBytesSize	= QSPI_ALTERNATE_BYTES_8_BITS; /* Specifies the Alternate Bytes Size: value of @ref QSPI_AlternateBytesSize */
	sCommand.AlternateBytes		= 0; /* Specifies the Alternate Bytes to be sent (Size from 1 to 4 bytes according AlternateBytesSize): value (32-bits) between 0x0 and 0xFFFFFFFF */
	sCommand.DummyCycles		= 1; /* Specifies the Number of Dummy Cycles: number between 0 and 31 */
	sCommand.DataMode			= QSPI_DATA_4_LINES; /* Specifies the Data Mode (used for dummy cycles and data phases): value of @ref QSPI_DataMode */
	sCommand.NbData				= (uint32_t) *tx_length; /* Specifies the number of bytes to transfer: value between 0 and 0xFFFFFFFF (0 means undefined length until end of memory)*/
	sCommand.DdrMode			= QSPI_DDR_MODE_DISABLE; /* Specifies the double data rate mode for address, alternate byte and data phase: value of @ref QSPI_DdrMode */
	sCommand.DdrHoldHalfCycle	= QSPI_DDR_HHC_ANALOG_DELAY; /* Specifies if the DDR hold is enabled: value of @ref QSPI_DdrHoldHalfCycle */
	sCommand.SIOOMode			= QSPI_SIOO_INST_EVERY_CMD; /* Specifies the send instruction only once mode: value of @ref QSPI_SIOOMode */

	configPRINTF( ("qspi: Command to Address = 0x%08lx\n", (unsigned long) sCommand.Address) );
	HAL_StatusTypeDef result = HAL_QSPI_Command(&hqspi, &sCommand, QSPI_TIMEOUT_VALUE);
	if ( result != HAL_OK ) {
		qspi_error_handler(result);
		return result;
	}

	configPRINTF( ("qspi: Transmit NbData = %d\n", (uint16_t) *tx_length) );
//	result = HAL_QSPI_Transmit(&hqspi, TXbuf, QSPI_TIMEOUT_VALUE);
	result = HAL_QSPI_Transmit_IT(&hqspi, TXbuf);
	if ( result != HAL_OK ) {
		qspi_error_handler(result);
		return result;
	}
	return HAL_QSPI_ERROR_NONE;
}

As a result, I see the following picture on the logic analyser:
Untitled.png
The first burst of clocks corresponds to the transmission of the Command, the second to the Data. The gap between the two bursts is about 0.76 µs when using HAL_QSPI_Transmit_IT and 1.5 µs when using HAL_QSPI_Transmit, which is somewhat strange.
But the main question is: how do I organize a QSPI transmission without a delay between Command and Data?


18 REPLIES
Irek
Associate III

I have found that if I set hqspi.Init.ClockPrescaler to a relatively large value such as 200, I no longer see this gap on the logic analyser. But it is always there at low ClockPrescaler values like 4-20. In the Clock Configuration, HCLK3 = 240 MHz is the clock that feeds the QSPI.

Moreover, I ran a similar test in Mbed OS, and the behavior was exactly the same!

Therefore, it seems that the problem is in the STM32H7 QSPI hardware module: it takes at least 0.8-1.6 µs to switch from Command to Data transmission. So the figures for data transmission in Indirect Mode at high speeds in STM's manuals are not correct. Or STM should provide some explanation of how to do it without gaps...
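
One way to read the prescaler observation is to express the measured gap in serial-clock periods. The sketch below assumes the QUADSPI serial clock is the kernel clock divided by (ClockPrescaler + 1), and it takes the 240 MHz kernel clock and the 0.76 µs gap from the posts above. The helper names qspi_sclk_hz and gap_in_sclk_cycles are made up for this illustration and are not part of the HAL.

```c
#include <stdio.h>

/* Hypothetical helper: QUADSPI serial clock in Hz, assuming
 * SCLK = kernel clock / (ClockPrescaler + 1). */
static double qspi_sclk_hz(double kernel_hz, unsigned prescaler)
{
    return kernel_hz / (double)(prescaler + 1u);
}

/* Hypothetical helper: a fixed time gap (in seconds) expressed
 * as a number of SCLK periods at the given prescaler. */
static double gap_in_sclk_cycles(double gap_s, double kernel_hz, unsigned prescaler)
{
    return gap_s * qspi_sclk_hz(kernel_hz, prescaler);
}

/* Print the gap in SCLK periods for the prescalers mentioned in the thread. */
static void print_gap_table(double gap_s, double kernel_hz)
{
    const unsigned prescalers[] = { 4u, 20u, 200u };
    for (unsigned i = 0; i < sizeof prescalers / sizeof prescalers[0]; ++i) {
        printf("prescaler %3u: SCLK %.3f MHz, gap = %.2f clocks\n",
               prescalers[i],
               qspi_sclk_hz(kernel_hz, prescalers[i]) / 1e6,
               gap_in_sclk_cycles(gap_s, kernel_hz, prescalers[i]));
    }
}
```

Calling print_gap_table(0.76e-6, 240e6) under these assumptions gives roughly 36 clocks at prescaler 4, about 9 clocks at prescaler 20, and less than one clock at prescaler 200. A delay of roughly constant duration would therefore still be present at prescaler 200 but would be shorter than a single clock period, which would be consistent with it not showing up on the analyser.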

 

TDK
Super User

Don't know how much it helps, but I see no gaps using QSPI on the H723. Clock rate is something like 50 MHz. I was wondering if it could be a cache thing.

Also couldn't see anything in the RM to explain this.

If you feel a post has answered your question, please click "Accept as Solution".
Irek
Associate III

@TDK

Thank you! I'll look at the cache section of the RM, and next week I will run new tests.

Irek
Associate III

Hello,

I returned to this strange problem with delays in QSPI transmission. I succeeded in enabling the I- and D-cache in my project (the problem was with FreeRTOS+TCP: https://forums.freertos.org/t/freertos-tcp-stm32h743-dcache-doesnt-work/23167/10).
But this does not help with QSPI. Even more: I tried to send 12 bytes as data. The result is as follows:

Untitled.png

I see that the data is transmitted in 4-byte chunks separated by delays similar to the one after the Command. And this strange behaviour does not depend on the hqspi.Init.FifoThreshold value: it can be 1 or 4, it doesn't matter. The whole transmission takes 3.8 µs instead of 0.57 µs. This is awful!
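
For comparison with the measured 3.8 µs, the gap-free duration of such a frame can be estimated by counting clocks per phase. This is only a sketch: it assumes single data rate, a 32-bit address and 12 data bytes on 4 lines, and 1 dummy cycle, as in the qspi_write code above. The helper names are invented for the example, and the ideal figure depends on which phases are counted and on the prescaler in effect for that capture, neither of which the post states.

```c
#include <stdint.h>

/* Hypothetical helper: SCLK cycles needed to shift `bits` bits
 * over `lines` data lines in single-data-rate mode (rounded up). */
static uint32_t phase_cycles(uint32_t bits, uint32_t lines)
{
    return (bits + lines - 1u) / lines;
}

/* Hypothetical helper: total SCLK cycles for an address + dummy + data
 * frame with no instruction and no alternate bytes. */
static uint32_t frame_cycles(uint32_t addr_bits, uint32_t dummy_cycles,
                             uint32_t data_bytes, uint32_t lines)
{
    return phase_cycles(addr_bits, lines)
         + dummy_cycles
         + phase_cycles(data_bytes * 8u, lines);
}

/* Hypothetical helper: duration of `cycles` SCLK periods in microseconds. */
static double frame_time_us(uint32_t cycles, double sclk_hz)
{
    return (double)cycles * 1e6 / sclk_hz;
}
```

With these assumptions, frame_cycles(32, 1, 12, 4) is 8 + 1 + 24 = 33 clocks, and frame_time_us(33, 48e6) is a little under 0.7 µs for a 48 MHz serial clock (240 MHz with ClockPrescaler = 4). Either way the ideal frame is several times shorter than the measured one, so most of the 3.8 µs is idle time between chunks.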

Any idea what I can do? 

 

 

TDK
Super User

The code probably just isn't keeping up. I didn't realize you were using the blocking and IT methods. If you use DMA or memory-mapped mode, these gaps should disappear.

Irek
Associate III

@TDK 

I did not find any sentence in the RM or AN4760 saying that using the blocking and IT methods leads to such strange data transfer. But OK, I still need to find a way to make QSPI work. Today I studied how to enable MDMA with QSPI. I did everything, and the result is strange again. The HAL_QSPI_Transmit_IT(&hqspi, TXbuf) line of code transfers data on the QSPI lines as I demonstrated above. However, when I replace it with HAL_QSPI_Transmit_DMA(&hqspi, TXbuf), I cannot see any data transferred on the logic analyzer.
Any idea why that could be?

Is DMA configured?

If you feel a post has answered your question, please click "Accept as Solution".
Irek
Associate III

I think so. Here is some code:

// In the main.c 
  hqspi.Instance = QUADSPI;
  hqspi.Init.ClockPrescaler = 4;
  hqspi.Init.FifoThreshold = 1;
  hqspi.Init.SampleShifting = QSPI_SAMPLE_SHIFTING_NONE;
  hqspi.Init.FlashSize = 31;
  hqspi.Init.ChipSelectHighTime = QSPI_CS_HIGH_TIME_1_CYCLE;
  hqspi.Init.ClockMode = QSPI_CLOCK_MODE_0;
  hqspi.Init.FlashID = QSPI_FLASH_ID_1;
  hqspi.Init.DualFlash = QSPI_DUALFLASH_DISABLE;
// In the stm32h7xx_hal_msp.c
    /* QUADSPI MDMA Init */
    /* QUADSPI_FIFO_TH Init */
    hmdma_quadspi_fifo_th.Instance = MDMA_Channel0;
    hmdma_quadspi_fifo_th.Init.Request = MDMA_REQUEST_QUADSPI_FIFO_TH;
    hmdma_quadspi_fifo_th.Init.TransferTriggerMode = MDMA_BUFFER_TRANSFER;
    hmdma_quadspi_fifo_th.Init.Priority = MDMA_PRIORITY_VERY_HIGH;
    hmdma_quadspi_fifo_th.Init.Endianness = MDMA_LITTLE_ENDIANNESS_PRESERVE;
    hmdma_quadspi_fifo_th.Init.SourceInc = MDMA_SRC_INC_BYTE;
    hmdma_quadspi_fifo_th.Init.DestinationInc = MDMA_DEST_INC_DISABLE;
    hmdma_quadspi_fifo_th.Init.SourceDataSize = MDMA_SRC_DATASIZE_BYTE;
    hmdma_quadspi_fifo_th.Init.DestDataSize = MDMA_DEST_DATASIZE_WORD;
    hmdma_quadspi_fifo_th.Init.DataAlignment = MDMA_DATAALIGN_PACKENABLE;
    hmdma_quadspi_fifo_th.Init.BufferTransferLength = 1;
    hmdma_quadspi_fifo_th.Init.SourceBurst = MDMA_SOURCE_BURST_SINGLE;
    hmdma_quadspi_fifo_th.Init.DestBurst = MDMA_DEST_BURST_SINGLE;
    hmdma_quadspi_fifo_th.Init.SourceBlockAddressOffset = 0;
    hmdma_quadspi_fifo_th.Init.DestBlockAddressOffset = 0;
    if (HAL_MDMA_Init(&hmdma_quadspi_fifo_th) != HAL_OK)
    {
      Error_Handler();
    }

    if (HAL_MDMA_ConfigPostRequestMask(&hmdma_quadspi_fifo_th, 0, 0) != HAL_OK)
    {
      Error_Handler();
    }

    __HAL_LINKDMA(hqspi,hmdma,hmdma_quadspi_fifo_th);

    /* QUADSPI interrupt Init */
    HAL_NVIC_SetPriority(QUADSPI_IRQn, 0, 0);
    HAL_NVIC_EnableIRQ(QUADSPI_IRQn);



Irek
Associate III

And the function which writes data:

uint32_t qspi_write(const uint8_t* RXbuf, uint8_t* TXbuf, uint16_t* tx_length) {
	sCommand.InstructionMode 	= QSPI_INSTRUCTION_NONE;
	sCommand.Instruction		= 0;
	sCommand.AddressMode		= QSPI_ADDRESS_4_LINES;
	sCommand.AddressSize		= QSPI_ADDRESS_32_BITS;
	sCommand.Address		= *(uint32_t *) RXbuf;
	sCommand.AlternateByteMode	= QSPI_ALTERNATE_BYTES_NONE;
	sCommand.AlternateBytesSize	= QSPI_ALTERNATE_BYTES_8_BITS;
	sCommand.AlternateBytes		= 0;
	sCommand.DummyCycles		= 1; /* number between 0 and 31 */
	sCommand.DataMode		= QSPI_DATA_4_LINES;
	sCommand.NbData			= (uint32_t) *tx_length;
	sCommand.DdrMode		= QSPI_DDR_MODE_DISABLE;
	sCommand.DdrHoldHalfCycle	= QSPI_DDR_HHC_ANALOG_DELAY;
	sCommand.SIOOMode		= QSPI_SIOO_INST_ONLY_FIRST_CMD;

	HAL_StatusTypeDef result = HAL_QSPI_Command(&hqspi, &sCommand, QSPI_TIMEOUT_VALUE);
//	HAL_StatusTypeDef result = HAL_QSPI_Command_IT(&hqspi, &sCommand);
	if ( result != HAL_OK ) {
		qspi_error_handler(result);
		return result;
	}

//	result = HAL_QSPI_Transmit(&hqspi, TXbuf, QSPI_TIMEOUT_VALUE);
//	result = HAL_QSPI_Transmit_IT(&hqspi, TXbuf);
	result = HAL_QSPI_Transmit_DMA(&hqspi, TXbuf);
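	if ( result != HAL_OK ) {
		qspi_error_handler(result);
		return result;
	}
	return HAL_QSPI_ERROR_NONE; /* the function is non-void, so the success path must return a value */
}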