QSPI Quad Read Commands Only Receiving 16 Bytes and Low Read Frequencies

Jaboop
Associate II

We are creating a driver for a Winbond W25Q128JV NOR flash chip. We are working with an STM32 H743ZI2 board and communicating over the QSPI peripheral with the help of the HAL libraries. Currently the driver can successfully write to the chip using the standard Page Program (02h) and the Quad Page Program (32h). We are also successfully doing a Fast Read (0Bh) at speeds up to 72MHz and a Fast Read Dual Output (3Bh) at speeds up to 36MHz before they start to fail. We are using a 3.3V power bus, so based on the AC Electrical Characteristics in the datasheet we expect these commands to work at speeds up to 133MHz.

[Image: 0693W00000FBEtWQAX.png]

Below is the CubeMX setup we are using for the QSPI peripheral:

[Image: 0693W00000FBEyCQAX.png]

*Scaled clock down to 12MHz for logic analyzer testing

If you prefer the generated code, it is provided below.

QSPI_HandleTypeDef hqspi;
 
/* QUADSPI init function */
void MX_QUADSPI_Init(void)
{
 
  /* USER CODE BEGIN QUADSPI_Init 0 */
 
  /* USER CODE END QUADSPI_Init 0 */
 
  /* USER CODE BEGIN QUADSPI_Init 1 */
 
  /* USER CODE END QUADSPI_Init 1 */
  hqspi.Instance = QUADSPI;
  hqspi.Init.ClockPrescaler = 12 - 1;
  hqspi.Init.FifoThreshold = 4;
  hqspi.Init.SampleShifting = QSPI_SAMPLE_SHIFTING_NONE;
  hqspi.Init.FlashSize = 23;
  hqspi.Init.ChipSelectHighTime = QSPI_CS_HIGH_TIME_6_CYCLE;
  hqspi.Init.ClockMode = QSPI_CLOCK_MODE_0;
  hqspi.Init.FlashID = QSPI_FLASH_ID_1;
  hqspi.Init.DualFlash = QSPI_DUALFLASH_DISABLE;
  if (HAL_QSPI_Init(&hqspi) != HAL_OK)
  {
    Error_Handler();
  }
  /* USER CODE BEGIN QUADSPI_Init 2 */
 
  /* USER CODE END QUADSPI_Init 2 */
 
}
 
void HAL_QSPI_MspInit(QSPI_HandleTypeDef* qspiHandle)
{
 
  GPIO_InitTypeDef GPIO_InitStruct = {0};
  RCC_PeriphCLKInitTypeDef PeriphClkInitStruct = {0};
  if(qspiHandle->Instance==QUADSPI)
  {
  /* USER CODE BEGIN QUADSPI_MspInit 0 */
 
  /* USER CODE END QUADSPI_MspInit 0 */
  /** Initializes the peripherals clock
  */
    PeriphClkInitStruct.PeriphClockSelection = RCC_PERIPHCLK_QSPI;
    PeriphClkInitStruct.PLL2.PLL2M = 32;
    PeriphClkInitStruct.PLL2.PLL2N = 144;
    PeriphClkInitStruct.PLL2.PLL2P = 2;
    PeriphClkInitStruct.PLL2.PLL2Q = 2;
    PeriphClkInitStruct.PLL2.PLL2R = 2;
    PeriphClkInitStruct.PLL2.PLL2RGE = RCC_PLL2VCIRANGE_1;
    PeriphClkInitStruct.PLL2.PLL2VCOSEL = RCC_PLL2VCOWIDE;
    PeriphClkInitStruct.PLL2.PLL2FRACN = 0;
    PeriphClkInitStruct.QspiClockSelection = RCC_QSPICLKSOURCE_PLL2;
    if (HAL_RCCEx_PeriphCLKConfig(&PeriphClkInitStruct) != HAL_OK)
    {
      Error_Handler();
    }
 
    /* QUADSPI clock enable */
    __HAL_RCC_QSPI_CLK_ENABLE();
 
    __HAL_RCC_GPIOF_CLK_ENABLE();
    __HAL_RCC_GPIOB_CLK_ENABLE();
    /**QUADSPI GPIO Configuration
    PF6     ------> QUADSPI_BK1_IO3
    PF7     ------> QUADSPI_BK1_IO2
    PF8     ------> QUADSPI_BK1_IO0
    PF9     ------> QUADSPI_BK1_IO1
    PF10     ------> QUADSPI_CLK
    PB10     ------> QUADSPI_BK1_NCS
    */
    GPIO_InitStruct.Pin = GPIO_PIN_6|GPIO_PIN_7|GPIO_PIN_10;
    GPIO_InitStruct.Mode = GPIO_MODE_AF_PP;
    GPIO_InitStruct.Pull = GPIO_NOPULL;
    GPIO_InitStruct.Speed = GPIO_SPEED_FREQ_VERY_HIGH;
    GPIO_InitStruct.Alternate = GPIO_AF9_QUADSPI;
    HAL_GPIO_Init(GPIOF, &GPIO_InitStruct);
 
    GPIO_InitStruct.Pin = GPIO_PIN_8|GPIO_PIN_9;
    GPIO_InitStruct.Mode = GPIO_MODE_AF_PP;
    GPIO_InitStruct.Pull = GPIO_NOPULL;
    GPIO_InitStruct.Speed = GPIO_SPEED_FREQ_VERY_HIGH;
    GPIO_InitStruct.Alternate = GPIO_AF10_QUADSPI;
    HAL_GPIO_Init(GPIOF, &GPIO_InitStruct);
 
    GPIO_InitStruct.Pin = GPIO_PIN_10;
    GPIO_InitStruct.Mode = GPIO_MODE_AF_PP;
    GPIO_InitStruct.Pull = GPIO_NOPULL;
    GPIO_InitStruct.Speed = GPIO_SPEED_FREQ_VERY_HIGH;
    GPIO_InitStruct.Alternate = GPIO_AF9_QUADSPI;
    HAL_GPIO_Init(GPIOB, &GPIO_InitStruct);
 
    /* QUADSPI interrupt Init */
    HAL_NVIC_SetPriority(QUADSPI_IRQn, 0, 0);
    HAL_NVIC_EnableIRQ(QUADSPI_IRQn);
  /* USER CODE BEGIN QUADSPI_MspInit 1 */
 
  /* USER CODE END QUADSPI_MspInit 1 */
  }
}
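As a sanity check on the clock settings above, here is a minimal sketch of how the QUADSPI clock follows from the PLL2 dividers and `ClockPrescaler`. The 64 MHz PLL2 input is an assumption (the post does not say which oscillator feeds PLL2), and the helper names `pll2r_hz` and `qspi_clk_hz` are hypothetical, not HAL functions.

```c
#include <stdint.h>

/* ASSUMPTION: PLL2 is fed by a 64 MHz source (e.g. HSI); the post does not state this. */
#define ASSUMED_PLL2_SRC_HZ 64000000u

/* PLL2 R output frequency: src / M * N / R (fractional part FRACN assumed 0). */
uint32_t pll2r_hz(uint32_t src_hz, uint32_t m, uint32_t n, uint32_t r)
{
    return (uint32_t)(((uint64_t)src_hz * n) / m / r);
}

/* QUADSPI CLK output = kernel clock / (ClockPrescaler + 1). */
uint32_t qspi_clk_hz(uint32_t kernel_hz, uint32_t clock_prescaler)
{
    return kernel_hz / (clock_prescaler + 1u);
}
```

Under that assumption, M=32, N=144, R=2 gives a 144 MHz kernel clock, so `ClockPrescaler = 12 - 1` yields the 12 MHz mentioned above, a prescaler of 1 yields 72 MHz, and a prescaler of 0 would yield 144 MHz, which is above the 133 MHz datasheet figure.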

Here are the functions we are working with to drive the flash chip and the instruction description from the datasheet.

[Image: 0693W00000FBEyMQAX.png]

[Images: 0693W00000FBEyRQAX.png, 0693W00000FBEyWQAX.png, 0693W00000FBEybQAH.png]

Our initial thought on seeing 16 bytes from the Fast Read Quad I/O was that the Wrap Bits were not defaulting correctly, so we added a manual setting of the Wrap Bits using Set Burst with Wrap (77h) to guarantee we had the right configuration there. The function that does this is as follows.

uint8_t QSPI_SetBurstWrap_77h(uint8_t mode) {
	if (mode > 7) { // mode is unsigned, so only the upper bound needs checking
		mode = 1;
	}
 
	mode = mode<<4;
	QSPI_CommandTypeDef sCommand;
 
	/* Burst Wrap Sequence --------------------------------- */
	sCommand.Instruction = SET_BURST_WRAP_CMD;
	sCommand.InstructionMode = QSPI_INSTRUCTION_1_LINE;
	sCommand.AddressMode = QSPI_ADDRESS_NONE;
	sCommand.AddressSize = QSPI_ADDRESS_24_BITS;
	sCommand.Address = 0;
	sCommand.AlternateByteMode = QSPI_ALTERNATE_BYTES_NONE;
	sCommand.DdrMode = QSPI_DDR_MODE_DISABLE;
	sCommand.DdrHoldHalfCycle = QSPI_DDR_HHC_ANALOG_DELAY;
	sCommand.SIOOMode = QSPI_SIOO_INST_EVERY_CMD;
	sCommand.DataMode = QSPI_DATA_4_LINES;
	sCommand.NbData = 1;
	sCommand.DummyCycles = 6; // 24 dummy bits across 4 lines: 24/4 = 6 clock cycles
 
	if (HAL_QSPI_Command(&hqspi, &sCommand, HAL_QPSI_TIMEOUT_DEFAULT_VALUE)!= HAL_OK) {
		return HAL_ERROR;
	}
 
	/* Transmission of Wrap Bits */
	if (HAL_QSPI_Transmit(&hqspi, &mode, HAL_QPSI_TIMEOUT_DEFAULT_VALUE) != HAL_OK)
	{
		return HAL_ERROR;
	}
 
	return HAL_OK;
}
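To make the effect of the `mode` argument easier to check, here is a small sketch that builds the same wrap byte as the function above and decodes the resulting wrap length. The decoding (W4 = 1 disables wrap, W6:W5 select 8/16/32/64 bytes) is my reading of the W25Q128JV Set Burst with Wrap table and should be verified against the datasheet; the helper names are hypothetical.

```c
#include <stdint.h>

/* Same encoding as QSPI_SetBurstWrap_77h: out-of-range modes fall back to 1,
 * then the 3-bit mode is shifted into bits W6..W4 of the wrap byte. */
uint8_t wrap_byte_from_mode(uint8_t mode)
{
    if (mode > 7u) {
        mode = 1u;
    }
    return (uint8_t)(mode << 4);
}

/* ASSUMED datasheet decoding: W4 = 1 means no wrap; otherwise
 * W6:W5 = 00, 01, 10, 11 select 8, 16, 32, 64 bytes.
 * Returns 0 when wrap is disabled. */
uint32_t wrap_length_bytes(uint8_t wrap_byte)
{
    if (wrap_byte & 0x10u) {
        return 0u;
    }
    return 8u << ((wrap_byte >> 5) & 0x03u);
}
```

With this reading, the fallback `mode = 1` produces 0x10 (wrap disabled), while `mode = 2` would produce 0x20, a 16-byte wrap, so it is worth confirming which value is actually passed in.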

Finally, the main we are using for our tests is below. 

/* USER CODE BEGIN 0 */
uint32_t Size = 4096; //4KBytes
uint32_t ReadAddress, WriteAddr, BlockAddress = 0;
uint8_t sData[4096];
uint8_t rData[4096];
uint8_t s1Data[4096];
uint8_t checkReg;
/* USER CODE END 0 */
int main(void)
{
  /* Enable I-Cache---------------------------------------------------------*/
  SCB_EnableICache();
  /* Enable D-Cache---------------------------------------------------------*/
  SCB_EnableDCache();
  /* MCU Configuration--------------------------------------------------------*/
  /* Reset of all peripherals, Initializes the Flash interface and the Systick. */
  HAL_Init();
 
  /* USER CODE BEGIN Init */
  int i=0;
  uint8_t val= 0x00;
 
  while(i < Size){
	  sData[i] = val;
	  if ((i+1) % 1 == 0) {
		  val+=1;
	  }
	  i++;
  }
  /* USER CODE END Init */
 
  /* Configure the system clock */
  SystemClock_Config();
 
  /* Initialize all configured peripherals */
  MX_GPIO_Init();
  MX_QUADSPI_Init();
 
  /* USER CODE BEGIN 2 */
  /*Enable Quad*/
  QSPI_EnableQuad(); //Set the QE, DRV0, DRV1 bits
 
  /*Erase Functions*/
  if (QSPI_EraseSector_20h(BlockAddress) != HAL_OK) {
	  Error_Handler();
  }
  /*Read Single Data Lines*/
  if (QSPI_FastReadData_0Bh(rData, ReadAddress, Size) != HAL_OK) {
	  Error_Handler();
  }
  /*Read Quad Data Lines*/
  if (QSPI_FastReadQuadOutput_6Bh(s1Data, ReadAddress, Size) != HAL_OK) {
	  Error_Handler();
  }
 
  while (1)
  {
 
  }
  /* USER CODE END 3 */
}

Before we do any read or write operations, we write to and check the values in our status registers to ensure the QE bit is set to 1 and that DRV1 and DRV0 are set to 0, making the driver strength 100%.

Here are the register values we read out after doing our setup for Default Driver Strength 25%:

[Image: 0693W00000FBEylQAH.png]

Here are the register values we read out after doing our setup for Maximum Driver Strength:

[Image: 0693W00000FBEyqQAH.png]

Here is the logic analyzer output we are seeing for both the Fast Read Quad Output and Fast Read Quad I/O commands. They both receive the first 16 bytes of good data and then get the same bad data further down the line: random bytes, but consistently the same random bytes. Note: for all reads we are reading the first sector of the Winbond chip (4KB), which contains 16 pages (256 bytes each), each populated with the values 0-255. This data is known good, as we successfully read it out using the Fast Read command before trying the Quad Read commands.
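One way to pin down exactly where a read goes wrong is to compare the receive buffer against the known 0-255 repeating pattern and report the first bad offset. This is a minimal sketch; `first_mismatch` is a hypothetical helper, not part of the driver shown here.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the first byte that differs from the expected
 * repeating 0..255 pattern, or len if the whole buffer matches. */
size_t first_mismatch(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != (uint8_t)(i & 0xFFu)) {
            return i;
        }
    }
    return len;
}
```

Called as `first_mismatch(s1Data, Size)` after a quad read, a return value of 16 would confirm that the data goes bad right after the first 16 bytes in the buffer as well as on the analyzer.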

Logic Analyzer of Fast Read Quad Output

[Image: 0693W00000FBEyvQAH.png]

Logic Analyzer of Fast Read Quad I/O

[Image: 0693W00000FBEz0QAH.png]

The two things we are trying to find out here are: 1) Why are the Quad Read functions only returning the first 16 bytes of data instead of a whole sector? 2) What could be preventing Read commands from working at frequencies higher than 72MHz?

I attached the .h file that contains the masking of the hex instructions just in case

12 REPLIES
Jaboop
Associate II

We have also conducted some testing that might be of interest.

One thought we had was that the WP and HLD lines were not being switched to high impedance correctly. To test this, we disconnected IO3 from the Nucleo board and probed IO3 on both the Winbond chip and the Nucleo board with the logic analyzer to see all the data transmission that is happening. For the first test we had the IO3 line connected during initialization, then disconnected it and probed the data on both sides. The results of this are below.

[Image: 0693W00000FBEzUQAX.png]

Around when the bad data occurs there is some sort of communication coming from the Nucleo board on the STMD3 line. We also tested without the IO3 line connected during initialization. The results of this are below.

[Image: 0693W00000FBEzeQAH.png]

In this test it seems we were seeing good data during transmission on IO0, IO1 and IO2. It can still be seen that the Nucleo board is sending something across STMD3/IO3 at about the same time, +10 microseconds.

>>Currently we are using a 3.3V power bus, so based on the AC Electrical Characteristics seen in the datasheet we expect these commands to work up to speeds of 133MHz.

Yeah, that's not going to happen with zero consideration to signal integrity and having a whole load of stub traces.

What solder bridges have you modified?


At this point, none. This project is still in the breadboarding stage, so we are expecting slower speeds due to signal integrity, but is it expected that the dual read would only work at half the speed of the single read? Or is there some setup that is being overlooked? Our understanding is that the dual and quad reads should be two and four times as fast.

Andreas Bolsch
Lead II

What about sample shift, i.e. SSHIFT in QUADSPI_CR?

Here is what I am seeing in the hqspi registers. I have set Sample Shifting to the default (None) but have tried using the shift by half cycle and have not noticed any difference in performance.

[Image: 0693W00000FBPpBQAX.png]

Andreas Bolsch
Lead II

Well, I meant the problem with Quad-Output or Quad-I/O. The performance (or lack thereof) is a different matter and most likely due to signal integrity. So we shouldn't mix these two problems.

For the Quad problem: Do you observe wrong data already on the pins with your LA, or only in the data returned by the QSPI interface? From your description it's not totally clear to me. Unfortunately the screenshots from your LA are much too small to recognize anything. And, for the read, you might check both memory-mapped and indirect mode. The 16-byte limit indeed sounds like some problem with wrap around, either in the flash or in the QSPI FIFO.

BTW: For the Quad-IO code you stuffed the address and the M byte into the address register. Please don't do this; it is not a simple pass-through register, as you can infer from the description in the RM and the errata sheet. Either use one alternate byte or two additional dummy clocks (with pull-ups enabled).
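To illustrate why the two options are equivalent in clock count, here is a minimal sketch of the arithmetic. It assumes that Fast Read Quad I/O (EBh) on the W25Q128JV expects 4 dummy clocks after the M byte; that figure is from my reading of the datasheet and should be checked there. The helper names are hypothetical.

```c
#include <stdint.h>

/* Clocks needed to shift `bits` bits over `lines` data lines (single data rate). */
uint32_t phase_clocks(uint32_t bits, uint32_t lines)
{
    return (bits + lines - 1u) / lines;
}

/* ASSUMPTION: EBh expects 4 dummy clocks after the 8-bit M byte. */
enum { EBH_MODE_BITS = 8, EBH_DUMMY_CLOCKS_AFTER_M = 4 };

/* Dummy-cycle count to program, depending on how the M byte is handled:
 * sent as an alternate byte -> only the real dummy clocks remain;
 * not sent at all           -> its 2 clocks are covered by extra dummy clocks. */
uint32_t ebh_dummy_cycles(int m_sent_as_alternate_byte)
{
    uint32_t m_clocks = phase_clocks(EBH_MODE_BITS, 4u);
    return m_sent_as_alternate_byte
               ? (uint32_t)EBH_DUMMY_CLOCKS_AFTER_M
               : (uint32_t)EBH_DUMMY_CLOCKS_AFTER_M + m_clocks;
}
```

In HAL terms the first option would presumably mean `AlternateByteMode = QSPI_ALTERNATE_BYTES_4_LINES`, `AlternateBytesSize = QSPI_ALTERNATE_BYTES_8_BITS` and `DummyCycles = 4`, with a plain 24-bit address; the second would mean no alternate byte and `DummyCycles = 6`.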

In your code snippets there are some MODIFY_REG calls to QSPI registers between HAL_QSPI_Command and HAL_QSPI_Receive. I don't use HAL, so I don't know the implementation details, but keep in mind that most control registers can't be modified when the BUSY bit is set. If this is attempted, the result may be unpredictable. So, *before* any command is triggered, the setup must have been completed. And after the data transfer has finished, one has to wait for BUSY to get cleared again. And, it's good practice to set ABORT and wait for BUSY to be cleared prior to setting up *any* command. Maybe the HAL layer does this automatically, but it's worth checking. Moreover, the errata sheets for devices with a QSPI interface (F4, F7, H7, L4) note problems with certain combinations of instruction, address, alternate and dummy settings. Hence it's worth skimming through all these errata sheets.
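The abort-then-wait sequence described above could be sketched roughly as follows. The registers are passed in as pointers so the logic can be exercised off-target; the bit positions (ABORT = bit 1 of QUADSPI_CR, BUSY = bit 5 of QUADSPI_SR) are taken from my reading of the reference manual and should be verified, and the function and macro names are hypothetical rather than HAL or CMSIS names.

```c
#include <stdbool.h>
#include <stdint.h>

#define QSPI_CR_ABORT_BIT (1u << 1) /* assumed position of QUADSPI_CR.ABORT */
#define QSPI_SR_BUSY_BIT  (1u << 5) /* assumed position of QUADSPI_SR.BUSY  */

/* Polls the status register until BUSY clears; gives up after max_polls reads. */
bool qspi_wait_not_busy(const volatile uint32_t *sr, uint32_t max_polls)
{
    while (max_polls-- > 0u) {
        if ((*sr & QSPI_SR_BUSY_BIT) == 0u) {
            return true;
        }
    }
    return false;
}

/* Requests an abort, then waits for BUSY to clear before any new setup. */
bool qspi_abort_and_wait(volatile uint32_t *cr, const volatile uint32_t *sr,
                         uint32_t max_polls)
{
    *cr |= QSPI_CR_ABORT_BIT;
    return qspi_wait_not_busy(sr, max_polls);
}
```

On the target this would be called with `&QUADSPI->CR` and `&QUADSPI->SR` before each command setup; the bounded poll count is a stand-in for a real timeout.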

To clarify, I perform one write and then try to read it in single data mode using Fast Read, which correctly reads back all the data, confirming that the data was written to the chip. I then try the Quad-I/O or the Quad Fast Read, both of which only get 16 bytes of expected values. After that the transfer seems to hang and then returns a bunch of noisy data that, looking at the logic analyzer, seems like it might be coming from the Nucleo board's side. I've attached larger LA captures in case they help point to some obvious clues.

IO3 Connected During Initialization

[Image: 0693W00000FBXHWQA5.jpg]

IO3 Disconnected During Initialization

[Image: 0693W00000FBXHbQAP.jpg]

Thanks for the notes on things to change and for pointing me towards the errata sheet. There are two examples in there that could be causing things to hang. I will try out these changes this week and come back with the results of those changes.

Really weird ... During the dummy phase IO0-IO3 should switch from output to input, but here the previous levels are kept constant (possibly due to capacitance). Hence: Maybe you could repeat the *very* same test (upper one, all pins connected) again, once with pull-ups on IO3-IO0 enabled and once with pull-downs enabled. The pull-up/pull-down setting doesn't interfere with the alternate function setting of the GPIOs. This would easily show whether any of IO3-IO0 is driven by the H7 or the flash (we couldn't distinguish which one, but anyway) during the dummy phase and, more importantly, during the "silence phase" after the first 16 bytes. The most surprising fact is that there seems to be total silence on *all* four data pins.

BTW: You don't use PC2_C or PC3_C pins here by any chance? As these are somewhat problematic.

SPati.13
Associate

Jaboop, I am facing a similar issue with QSPI read with Winbond NOR flash as described in the forum: https://community.st.com/s/question/0D53W000016nhP7SAI/qspi-quad-read-commands-only-receiving-16-bytes-and-low-read-frequencies

Can you please post the resolution that helped to overcome the read issue in Quad mode? This may help many others working with Winbond NOR flash.

Thanks