
Need help with flashloader/driver for STM32H7A3ZIT6Q with Octo-spi FLASH (twin quad) NOR flash MT25TL256

unsigned_char_array
Senior III

I have a custom board with an STM32H7A3ZIT6Q. It has the MT25TL256 NOR flash connected to the pins of OCTOSPI1.

I'm trying to write a flashloader/driver.

In our application we do not need to execute code from this flash. We only need it for data. But if there is a way to get execution working that would be nice too.

I'm following the tutorial from ST:

ST tutorial:

https://www.st.com/content/st_com/en/support/learning/stm32-education/stm32-moocs/external_QSPI_loader.html

https://www.youtube.com/watch?v=YFIvJVsvIsE&list=PLnMKNibPkDnHIrq5BICcFhLsmJFI_ytvE&index=1

Template project: https://drive.google.com/drive/folders/1KiaqXgiubk81EvevofK-y3LxrCl9NHfi

STM git repo https://github.com/STMicroelectronics/stm32-external-loader/tree/contrib

Application note: https://www.st.com/content/ccc/resource/technical/document/application_note/group0/91/dd/af/52/e1/d3/48/8e/DM00407776/files/DM00407776.pdf/jcr:content/translations/en.DM00407776.pdf

MCU Datasheet: https://www.st.com/resource/en/datasheet/stm32h7a3ri.pdf

Reference manual: https://www.st.com/resource/en/reference_manual/rm0455-stm32h7a37b3-and-stm32h7b0-value-line-advanced-armbased-32bit-mcus-stmicroelectronics.pdf

MT25TL256: https://media-www.micron.com/-/media/client/global/documents/products/data-sheet/nor-flash/serial-nor/mt25t/generation-b/mt25t_qlhs_l_256_xba_0.pdf?rev=443b2b18a725408b9ba5b028fa46e840

Here is what I have figured out so far:

The MCU is clocked at 270 MHz.

The prescaler is set to 4 (value 3 in the register), so the MT25TL256 is clocked at 90 MHz in DTR mode.

The chip-select high time (CSHT) is hard to find in the datasheet, but since all similar timings are <= 5 ns, it rounds up to 1 clock cycle, so I set it to 1.
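
The rounding behind that CSHT choice can be sketched as below. This is only an illustration: the helper name cycles_for_time is made up, the 5 ns and 90 MHz figures are the ones quoted in the post, and whether the register field stores the cycle count directly or the count minus one should be checked against the reference manual.

```c
#include <stdint.h>

/* Hypothetical helper: smallest whole number of clock cycles that covers
 * a minimum time. t_ps is the required time in picoseconds, f_hz the
 * interface clock in Hz. The result is rounded up and is never below 1. */
static uint32_t cycles_for_time(uint64_t t_ps, uint64_t f_hz)
{
    const uint64_t ps_per_s = 1000000000000ULL;
    uint64_t cycles = (t_ps * f_hz + ps_per_s - 1u) / ps_per_s;
    return cycles == 0u ? 1u : (uint32_t)cycles;
}
```

With the numbers from the post, cycles_for_time(5000, 90000000) gives 1, because one 90 MHz cycle (about 11.1 ns) already exceeds 5 ns. If the datasheet turns out to require a longer deselect time, the same helper gives the larger cycle count.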

Our MCU is the Q variant with SMPS, so not all pins of OCTOSPI2 are available.

Here is what I'm struggling with:

In table 4 of AN5050 Rev 7 it says the flash has to be connected to OCTOSPI2, but ours is connected to OCTOSPI1.

In Table 1 of DS13195 Rev 8 it says NOR flash is not supported directly, only in multiplexed mode.

In figure 10 of AN5050 Rev 7 the Multiplexed mode is visualized. In this figure Memory 2 is connected to OCTOSPI1 and its CSn pin (S#) is connected to OCTOSPI2.

In Table 6 of RM0455 Rev 9 it seems like NOR flash could be mapped to OCTOSPI1 (I assume the first addresses are for OCTOSPI1), but that "Execute never" is set to yes, so it cannot be executed.

Since our memory has its CSn connected to OCTOSPI1 instead of OCTOSPI2, I wonder if we can get this to work, with or without support for execution, without making hardware changes.

So my questions are:

1) Can we get this chip to work without changing the hardware? If so, how?

2) What value should I use for FifoThreshold?

Kudo posts if you have the same problem and kudo replies if the solution works.
Click "Accept as Solution" if a reply solved your problem. If no solution was posted please answer with your own.
1 ACCEPTED SOLUTION

Accepted Solutions
unsigned_char_array
Senior III

I've found the bug. The debugger doesn't hard reset the processor so the reset line of the FLASH chip doesn't toggle. The FLASH chip is still in quad mode. The solution was to simply reset the chip in both quad and single mode.
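
A minimal sketch of that reset sequence, under some assumptions: 0x66 (RESET ENABLE) and 0x99 (RESET MEMORY) are the usual Micron software-reset opcodes, the order (quad first, then single) is a guess, and send_cmd_fn is a hypothetical stand-in for whatever the driver uses to send an instruction-only command on a given number of lines. The cmd_log recorder is only there so the sketch can run on a PC.

```c
#include <stddef.h>
#include <stdint.h>

#define CMD_RESET_ENABLE 0x66u
#define CMD_RESET_MEMORY 0x99u

/* Hypothetical transport hook: send an instruction-only command using
 * the given number of instruction lines (1 or 4). Returns 0 on success. */
typedef int (*send_cmd_fn)(uint8_t opcode, unsigned lines, void *ctx);

/* Issue the software reset pair in 4-line and then 1-line mode, so the
 * flash returns to its power-up state whichever mode it was left in. */
static int flash_reset_any_mode(send_cmd_fn send, void *ctx)
{
    static const unsigned modes[] = { 4u, 1u };
    int status = 0;
    for (size_t i = 0; i < sizeof modes / sizeof modes[0]; ++i) {
        int r1 = send(CMD_RESET_ENABLE, modes[i], ctx);
        int r2 = send(CMD_RESET_MEMORY, modes[i], ctx);
        if (r1 != 0 || r2 != 0) {
            status = -1; /* transport error; keep going with the next mode */
        }
    }
    return status;
}

/* Stand-in transport for host-side testing: records what was sent. */
struct cmd_log {
    uint8_t  opcode[8];
    unsigned lines[8];
    size_t   count;
};

static int log_cmd(uint8_t opcode, unsigned lines, void *ctx)
{
    struct cmd_log *log = ctx;
    if (log->count >= 8u) {
        return -1;
    }
    log->opcode[log->count] = opcode;
    log->lines[log->count] = lines;
    log->count++;
    return 0;
}
```

In a real loader the hook would wrap the OCTOSPI command call, and Init() would run this before the first Read ID, followed by the reset recovery time from the datasheet.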

I've also improved the driver structure so I can enable Discard unused sections (-Wl,--gc-sections):

	// Trick to keep the loader entry points and StorageInfo from being discarded
	// by "Discard unused sections" (-Wl,--gc-sections): reference them in a branch
	// that never runs, guarded by a volatile flag the compiler cannot optimize away.
	volatile bool useUnused = false;
	volatile unsigned long DeviceSize;

	if (useUnused)
	{
		DeviceSize = StorageInfo.DeviceSize;
		UNUSED(DeviceSize);
		SectorErase(0, 0);
		MassErase();
		Write(0, 0, NULL);
		CheckSum(0, 0, 0);
		Verify(0, 0, 0, 0);
	}

Edit: driver can be found here: https://github.com/STMicroelectronics/stm32-external-loader/pull/13


10 REPLIES 10
Andreas Bolsch
Lead II

Just to make sure: 144-pin LQFP with SMPS? In this case it is quite possible to use two *serial* NOR in dual mode, e.g. I've attached several Octal-SPI flash devices to OCTOSPI1 (to OCTOSPI2 would be possible, too) on a Nucleo-H7A3ZI-Q board, and the very same pins could be used for two Quad-SPI chips simultaneously: OCTOSPIM_P1_IO7 down to OCTOSPIM_P1_IO0, OCTOSPIM_P1_CLK, and OCTOSPIM_P1/2_NCS.

Watch out, the *external* OCTOSPIM_Px pins can be internally connected to OCTOSPI1 or OCTOSPI2, that's done within OCTOSPIM, which is sort of crossbar switch (although ST seems to prefer the term multiplexer).

The term "multiplexed mode" is rather ambiguous here. It could refer to FMC with *parallel* NOR flash (multiplexed I/O), meaning address and data are multiplexed over the same lines. Or it could refer to sharing (i.e. multiplexing) the same physical *serial* flash devices between OCTOSPI1 and OCTOSPI2 via OCTOSPIM. The two are not related in any way.

But your flash devices are *serial* ones, so the lines referring to FMC in table 1 in data sheet are irrelevant. What matters is the line "Octo-SPI interfaces" only.

Re. FifoThreshold: It depends. If using interrupt-driven data transfers, e.g. 1 would not be a good choice, as the full interrupt overhead would occur for each byte. So in this case a higher value would be advisable. For DMA-driven transfers the overhead is just one bus arbitration per burst transfer, so a lower value might be feasible. A high value would cause a long burst period ...

Executing code from serial flash is just a matter of configuring OCTOSPI1/2 in memory-mapped mode. Re. Table 6 in RM0455: the address windows of OCTOSPI1/2, i.e. starting at 0x90000000 and 0x70000000, have the default attribute "execute never" set to "no", so they are executable by default.
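
To make the two windows concrete, here is a small sketch that tells which OCTOSPI instance a memory-mapped address belongs to. The base addresses are the ones mentioned above; the 256 MB window size and the macro and function names are assumptions for illustration and should be checked against the memory map in RM0455.

```c
#include <stdint.h>

/* Names chosen for this sketch, not taken from the CMSIS headers. */
#define OSPI1_WINDOW_BASE 0x90000000u
#define OSPI2_WINDOW_BASE 0x70000000u
#define OSPI_WINDOW_SIZE  0x10000000u /* assumed 256 MB per instance */

/* Returns 1 or 2 for an address inside the OCTOSPI1/OCTOSPI2 window,
 * or 0 if the address is outside both. The unsigned subtraction wraps
 * for addresses below the base, so a single comparison covers both ends. */
static int octospi_instance_for(uint32_t addr)
{
    if (addr - OSPI1_WINDOW_BASE < OSPI_WINDOW_SIZE) {
        return 1;
    }
    if (addr - OSPI2_WINDOW_BASE < OSPI_WINDOW_SIZE) {
        return 2;
    }
    return 0;
}
```

For a 256 Mbit (32 MB) device on OCTOSPI1, the data would then appear at 0x90000000 through 0x91FFFFFF once memory-mapped mode is enabled.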

Yes. We have the 144-pin LQFP with SMPS, though we currently use the internal LDO.

It is not clear to me what mode I should use.

Dual quad or Octo? The MT25TL256 is basically two Quad dies in one package (twin quad) and not a true Octo chip. So I think I have to use Dual-Quad. Do I have to select multiplexed? In that case I don't have data pins 4:7.

My pinout is:

  • MCU_OCTOSPIM_P1_IO0: PF8
  • MCU_OCTOSPIM_P1_IO1: PF9
  • MCU_OCTOSPIM_P1_IO2: PE2
  • MCU_OCTOSPIM_P1_IO3: PF6
  • MCU_OCTOSPIM_P1_IO4: PC1
  • MCU_OCTOSPIM_P1_IO5: PC2_C (pin 30)
  • MCU_OCTOSPIM_P1_IO6: PD6
  • MCU_OCTOSPIM_P1_IO7: PG14
  • MCU_OCTOSPIM_P1_CLK: PF10
  • MCU_OCTOSPIM_P1_NCS: PG6

0693W00000Y7kYPQAZ.png


As long as you don't do fancy things, "Dual-Quad-SPI". Yes, it's essentially the same as two separate Quad-SPI chips, so eight data pins going to *ONE* OCTOSPI interface. CLK of both flash devices both connected to OCTOSPI CLK, the same for both NCS signals. Command and address would then go simultaneously to both chips, and only data would be interleaved between both devices.

"Multiplexed" would mean both chips together would be alternatingly connected to OCTOSPI1 and OCTOSPI2, that's rarely useful (except if access with different setups would be necessary without having to reconfigure the interface every now and then).

Agree with Andreas on the SPIM part; you should be able to plumb onto the P2 pins HIGH/LOW.

The TL parts are interesting in that the pinout allows for what would be DUAL BANK mode on the original QSPI peripheral.

Commands will go to both chips, with low bytes living in ONE die, and the high bytes the other.

The status needs to be read and checked in a 16-bit (two-byte) mode, and both dies must be waited on, as one might be faster than the other. I personally would probably avoid the auto-poll operations, as we've had issues with the timeout paths of the code.
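
A sketch of that two-byte check, assuming the usual status register layout where bit 0 is the write-in-progress (WIP) flag and that one status byte comes back from each die. The function name is made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define SR_WIP 0x01u /* write in progress, bit 0 of each status byte */

/* sr_pair holds one status byte per die, as read back in a single
 * 2-byte transfer. The pair is busy while either die still has WIP set. */
static bool dual_flash_busy(uint16_t sr_pair)
{
    uint8_t sr_a = (uint8_t)(sr_pair & 0xFFu);
    uint8_t sr_b = (uint8_t)(sr_pair >> 8);
    return ((sr_a | sr_b) & SR_WIP) != 0u;
}
```

A manual polling loop would read the two status bytes, call dual_flash_busy(), and repeat until it returns false or a timeout expires. Checking only one byte would miss the case where the other die is still programming or erasing.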

The page write doubles to 512-bytes. Similarly the erase granularity 4KB becomes 8KB, 32KB -> 64KB, etc.
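
The doubling can be written down as a small sketch. The struct and function names are made up, the per-die numbers in the usage note are examples (256-byte page, 4 KB and 32 KB erase sizes, 16 MB per die for a 256 Mbit twin-quad part), and which die holds the even bytes is an assumption.

```c
#include <stdint.h>

/* Sizes in bytes, for one die or for the interleaved pair. */
struct flash_geometry {
    uint32_t page_size;
    uint32_t small_erase_size;
    uint32_t large_erase_size;
    uint32_t total_size;
};

/* Two dies driven in parallel: every size seen by the MCU doubles. */
static struct flash_geometry dual_geometry(struct flash_geometry die)
{
    struct flash_geometry g = {
        die.page_size * 2u,
        die.small_erase_size * 2u,
        die.large_erase_size * 2u,
        die.total_size * 2u
    };
    return g;
}

/* Bytes are interleaved, so the lowest address bit selects the die
 * (0 or 1) and the remaining bits are the offset inside that die. */
static unsigned die_for_byte(uint32_t addr)
{
    return (unsigned)(addr & 1u);
}

static uint32_t die_offset(uint32_t addr)
{
    return addr >> 1;
}
```

Starting from a die with 256-byte pages and 4 KB / 32 KB erase blocks, dual_geometry() yields 512 bytes and 8 KB / 64 KB, which are the values the loader's StorageInfo and the erase and write loops need to use.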

At the 256Mbit density we're not looking at stacked die. Those further complicate the use of Micron parts: you have to scan more things to ensure they are all complete, and the mass erase needs to be replaced by a die erase and managed per die.

Tips, Buy me a coffee, or three.. PayPal Venmo
Up vote any posts that you find helpful, it shows what's working..
unsigned_char_array
Senior III

I'm making progress. I can initialize the part, read the ID, and I also got erase and write working at some point. I'm now having an issue with reading the ID in quad mode. The first byte of the ID reads as 0x70 instead of 0x20. So bit 0 and bit 2 of the first nibble read as high instead of low.
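
For reference, the mismatch can be mapped back to the data lines with a quick XOR. This sketch assumes the usual quad-mode bit order, where each nibble is transferred in one clock with IO0 carrying the lowest bit of the nibble and IO3 the highest; the function name is made up.

```c
#include <stdint.h>

/* Returns a 4-bit mask (bit n = IOn) of the data lines on which the
 * received byte differs from the expected one in either nibble. */
static uint8_t quad_io_mismatch(uint8_t expected, uint8_t got)
{
    uint8_t diff = (uint8_t)(expected ^ got);
    return (uint8_t)((diff >> 4) | (diff & 0x0Fu));
}
```

For expected 0x20 and received 0x70, the XOR is 0x50, so the mask is 0x5: IO0 and IO2 were sampled high during the first nibble while the second nibble was read correctly.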

I'm running the clock currently at 5MHz and sample it with a logic analyzer at 50MSPS.

Here are the logic charts.

As you can see the Logic Analyzer reads it correctly, but my MCU doesn't. The signal is clearly in the illegal zone during the rising edge. How do I fix this?

Edit: with the half-cycle delay or with the delay block it seems to work. But it is not clear to me whether the delay block will work for all frequencies. The half-cycle delay doesn't work with DTR.

0693W00000Y8BPJQA3.png
0693W00000Y8BP9QAN.png


I'm afraid that's a silicon bug; this appeared in this forum some time ago already (I don't recall whether it occurred with QSPI or OCTOSPI). The turn-around of IO2 and/or IO3 seems to be delayed, but as you noticed, SSHIFT solves this problem in SDR mode. Whether this problem occurs in DDR mode I don't know, but:

In general Read ID should be done in simplest mode only (1-line, SDR), as some flash chips don't support this command in e.g. 8-line mode at all (Adesto ATXP) or change the ID upon entering 4-line mode (Winbond). And most chips power up in 1-line SDR mode anyway, and to switch to any other mode, you have to know the type of flash chip already ...

For other read commands (read status, read register, read memory array) the dummy cycles silently hide this problem. (Yes, status read doesn't need dummy cycles, but most (all?) flash devices repeat status continuously, and this is the case for most registers, too, so one artificial dummy cycle won't harm.)

I only used Read ID for testing purposes: once before init and once after init in quad mode, because I know what data to expect. While I didn't get full DTR mode to work, the DTR commands do work, i.e. the commands that send the instruction in STR mode and the address and data in DTR mode.

Memory mapped mode also works, though I've only tested it with the STR command.

I still need to conduct more tests and test at a higher frequency, but I'm confident enough it will work.

unsigned_char_array
Senior III

The low level part of the driver now works (init, erase, program, memory map, etc.). But when I run it as a loader I get weird problems.

Memory map works, but I can only read 1 sector. After that I need to disconnect and reconnect in STM32CubeProgrammer and then I can read another sector.

Erase simply fails even if my function does nothing and only returns ok. I get this error "Error: Sector erase operation has failed at least for one of the existing specified sectors.Please verify flash memory protection."

When debugging in STM32CubeIDE I get hardfaults after return from init as the init function is directly called from the reset handler. I don't know if this behavior is desired as I don't know how STM32CubeProgrammer calls the functions.

unsigned_char_array
Senior III

I've attached the current version I have the mentioned issues with.

EDIT: I'm debugging the loader (by writing some strings in unused RAM), and I found the cause of the problem: STM32CubeProgrammer always calls the init function and this function is not re-entrant. I've tried many ways to reset the peripherals or deinitialize everything before the init, but it always fails the second time. I've even compared registers of OSPI the first and second time. The register values were identical, but somehow the read id function fails.
