
Programming External QSPI with OpenOCD via STM32H7A3/STM32H7B0

AEarl.1
Associate

First, I'll explain what I'm trying to achieve. My company plans to use an STM32H7B0 because it's the chip we were able to find in stock. For testing, I'm using an STM32H7A3 because it was available as a Nucleo board. Our primary requirement is that device updates cause as little downtime as possible. In practice, this means an update should be downloaded into a second image slot during normal operation and then swapped in quickly. This would be simple with the H7B3 or H7A3, but we cannot find available parts.

My current approach is to use MCUBoot to load the image into RAM from a slot in a QSPI NOR flash. This also lets us encrypt the image in the QSPI for IP protection, which is the primary reason I'm not pursuing XIP. In theory this approach should work, and I've already confirmed that the software runs fine out of RAM instead of flash.

The problem I'm facing is in setting up a sane development environment. Currently, OpenOCD is configured to load the program into RAM which lets me test the program just fine. But I have to be able to write the program into the QSPI in order to test the bootloader. OpenOCD actually has a flash driver called `stmqspi` which will let me access the QSPI as another flash bank. However, I've only had limited success in getting this to work due to the memory mapped mode returning garbage.

The QSPI Flash I'm using is a Winbond W25Q16DV.

Here is the TCL script I'm using to set up the OCTOSPI. It is intended to run immediately following a reset and halt.

proc qspi_init {} {
        global _CHIPNAME
 
	mmw 0x58024540 0x000007FF 0     ;# RCC_AHB4ENR |= GPIOAEN-GPIOKEN (enable clocks)
	mmw 0x58024534 0x00284000 0	;# RCC_AHB3ENR |= IOMNGREN, OSPI2EN, OSPI1EN (enable clocks)
	sleep 1				;# Wait for clock startup
        
	mww 0x5200B404 0x03010111	;# OCTOSPIM_P1CR: assign Port 1 to OCTOSPI1
	mww 0x5200B408 0x00000000	;# OCTOSPIM_P2CR: disable Port 2
 
        # Port B: PB2:AF09:V
        mmw 0x58020400 0x00000020 0x00000030    ;# MODER
        mmw 0x58020408 0x00000030 0x00000030    ;# OSPEEDR
        mmw 0x5802040C 0x00000000 0x00000030    ;# PUPDR
        mmw 0x58020420 0x00000900 0x00000F00    ;# AFRL
        # Port D: PD11:AF09:V, PD12:AF09:V, PD13:AF09:V
        mmw 0x58020C00 0x0A800000 0x0FC00000    ;# MODER
        mmw 0x58020C08 0x0FC00000 0x0FC00000    ;# OSPEEDR
        mmw 0x58020C0C 0x00000000 0x0FC00000    ;# PUPDR
        mmw 0x58020C24 0x00999000 0x00FFF000    ;# AFRH
        # Port E: PE2:AF09:V
        mmw 0x58021000 0x00000020 0x00000030    ;# MODER
        mmw 0x58021008 0x00000030 0x00000030    ;# OSPEEDR
        mmw 0x5802100C 0x00000000 0x00000030    ;# PUPDR
        mmw 0x58021020 0x00000900 0x00000F00    ;# AFRL
        # Port G: PG6:AF09:V
        mmw 0x58021800 0x00002000 0x00003000    ;# MODER
        mmw 0x58021808 0x00003000 0x00003000    ;# OSPEEDR
        mmw 0x5802180C 0x00000000 0x00003000    ;# PUPDR
        mmw 0x58021820 0x0A000000 0x0F000000    ;# AFRL
 
        mmw 0x52005000 0 1                      ;# OCTOSPI_CR: clear EN (disable the controller before reconfiguring)
 
        mww 0x52005008 0x00150100       ;# OCTOSPI_DCR1: MTYP=0x0, DEVSIZE=0x15, CSHT=0x1
        mww 0x5200500C 0x00000020       ;# OCTOSPI_DCR2: PRESCALER=0x20
        mww 0x52005010 0x00000000       ;# OCTOSPI_DCR3
        mww 0x52005014 0x00000000       ;# OCTOSPI_DCR4
        
	mww 0x52005108 0x00000004	;# OCTOSPI_TCR: SSHIFT=0, DHQC=0, DCYC=0x4
	mww 0x52005100 0x01002101	;# OCTOSPI_CCR: DMODE=0x1, ABMODE=0x0, ADSIZE=0x2, ADMODE=0x1, ISIZE=0x0, IMODE=0x1
	mww 0x52005110 0x00000003	;# OCTOSPI_IR: Read Data 03h
        
        mww 0x52005000 0x3040000B       ;# OCTOSPI_CR: FMODE=0b11 (memory-mapped), APMS=1, FTHRES=0, FSEL=0, DQM=0, TCEN=1, ABORT=1, EN=1
 
        sleep 1
 
        flash probe $_CHIPNAME.qspi
}
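One detail in the script above that may be worth double-checking is the DEVSIZE field. As far as I can tell from the reference manual, the OCTOSPI treats the device as 2^(DEVSIZE+1) bytes, so a 16 Mbit (2 MiB) W25Q16DV would call for DEVSIZE=0x14 rather than 0x15. The following is a minimal sketch of that arithmetic, assuming that formula and the same MTYP=0 and CSHT=1 settings as the script; it is unrelated to the garbage-data symptom and runs in plain tclsh or at the OpenOCD prompt.

```tcl
# Assumed capacity: W25Q16DV = 16 Mbit = 2 MiB.
set bytes [expr {16 * 1024 * 1024 / 8}]

# Find n such that 2**n == bytes; assuming bytes = 2**(DEVSIZE+1),
# DEVSIZE is then n - 1.
set n 0
while {(1 << $n) < $bytes} { incr n }
set devsize [expr {$n - 1}]

# Rebuild DCR1 with DEVSIZE in bits 20:16 and CSHT=1 in bits 10:8 (MTYP=0).
set dcr1 [expr {($devsize << 16) | (1 << 8)}]
puts [format "DEVSIZE=0x%02x DCR1=0x%08x" $devsize $dcr1]
# prints: DEVSIZE=0x14 DCR1=0x00140100
```

An oversized DEVSIZE mostly matters at the end of the address range, where reads would run past the real capacity of the part.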

You might notice that this is configured for 1-line mode for both data and address. This is because, for some reason, the stmqspi driver is unable to properly fetch the SFDP parameters or device ID, or to run commands, in quad SPI mode.

In this mode, I'm able to erase, write, and read the flash via raw commands, but the memory-mapped region just returns "garbage". I mean, it's close?

> stmqspi cmd 1 4 0x03 0 0 0
spi: 03 00 00 00 -> ef be ad de
 
> mrw 0x90000000
0xefddeafb
> mrw 0x90000004
0xffffffff
>

Even more confusing, the signals coming back from the memory mapped operation look completely fine:

[Image: capture of the signals during the memory-mapped read]

Am I missing something obvious? Do I need to explicitly disable the cache? I'm just really lost here.

Accepted Solution
Andreas Bolsch
Lead II

Most certainly you can't use W25Q16DV with 4-line mode (aka QPI-mode) with stmqspi driver: The W25Q16DV supports only some "mixed" modes, i.e. command byte always in 1-line, address 1-line or 4-line (depending on command!), and data in 1-line or 4-line mode. E.g. Read ID works *ONLY* in 1-line mode for both command and data phase. Or Quad Input Page Program 0x32: command and address in 1-line mode, data in 4-line mode.

These mixed modes are deliberately not supported by stmqspi, simply because there are far too many combinations to test and my time is somewhat limited.

So only 1-line, 2-line, 4-line and 8-line modes (identical for command, address and data phases!) are supported.

But this refers to flashing only; your application may still reconfigure memory-mapped mode at will, even for debugging. Of course, for the next flash cycle your script must revert the flash to 1-line mode.

Regarding SFDP: I have some W25Q256 devices which (according to the datasheet) should have SFDP data, but in fact there is none. Same for some devices from other manufacturers. Hence I wouldn't be surprised if this happens for an even older design, too.

BTW: W25Q16DV is already NRND. W25Q16JV-DTR (not W25Q16JV) would be an alternative, this one does support 4-line mode consistently as far as I see from datasheet.


4 REPLIES

For Quad mode you'd need to free up the pins via the QE bit in SR2 (Status Register 2).
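As a rough sketch of what setting QE could look like from the OpenOCD prompt, here is a sequence built on the same `stmqspi cmd` form used in the question (bank id, number of response bytes, then the command bytes). The opcodes and the QE position (bit 1 of SR2) are my reading of the W25Q16DV datasheet and should be checked against the actual part. Bank id 1 and the 20 ms delay are assumptions. Note that 01h rewrites SR1 as well, so the 0x00 below clears any block-protection bits.

```tcl
# Assumptions: bank 1 is the stmqspi bank, the controller is still in
# 1-line mode, and the part follows the W25Q16DV status register layout.
stmqspi cmd 1 1 0x35            ;# Read Status Register-2; QE is bit 1
stmqspi cmd 1 0 0x06            ;# Write Enable
stmqspi cmd 1 0 0x01 0x00 0x02  ;# Write Status Register: SR1=0x00, SR2=0x02 (QE=1)
sleep 20                        ;# assumed delay for the non-volatile write to finish
stmqspi cmd 1 1 0x35            ;# read back; bit 1 should now be set
```

This only frees the /WP and /HOLD pins for use as IO2/IO3; it does not make the mixed 1-line/4-line commands usable with stmqspi.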

The dummy cycles setting (DCYC=4) is causing the shift in the memory-mapped data; READ (0x03) has no dummy cycles.
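To see that the shift accounts for the exact value in the question, here is a small sketch that replays it. It assumes that in 1-line mode each dummy cycle swallows one data bit, and that the byte after the four shown by the raw read is erased (0xFF), which fits the 0xffffffff read at 0x90000004. It is plain Tcl with no target access, so it should run in tclsh or at the OpenOCD prompt.

```tcl
# Bytes on the wire for READ (03h) at address 0, as reported by
# "stmqspi cmd 1 4 0x03 0 0 0", followed by one assumed erased byte.
set raw {0xef 0xbe 0xad 0xde 0xff}

# SCLK cycles discarded before sampling; one bit per cycle in 1-line mode.
set dummy_cycles 4

# Concatenate the bytes into one MSB-first bit stream.
set stream 0
foreach b $raw {
    set stream [expr {($stream << 8) | $b}]
}
set nbits [expr {8 * [llength $raw]}]

# Drop the first $dummy_cycles bits and keep the next 32.
set sampled [expr {($stream >> ($nbits - $dummy_cycles - 32)) & 0xffffffff}]

# The first sampled byte lands at the lowest address, so assemble the
# little-endian word that a 32-bit load would return.
set word 0
for {set i 0} {$i < 4} {incr i} {
    set byte [expr {($sampled >> (8 * (3 - $i))) & 0xff}]
    set word [expr {$word | ($byte << (8 * $i))}]
}

puts [format "0x%08x" $word]
# prints: 0xefddeafb
```

That matches the `mrw 0x90000000` result: the data is 0xdeadbeef read four bits late. Writing 0 to OCTOSPI_TCR in the init script (`mww 0x52005108 0x00000000`) should remove the offset while READ 03h remains the memory-mapped command.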


Pavel A.
Evangelist III

> But I have to be able to write the program into the QSPI in order to test the bootloader.

IMHO you do not need any OpenOCD support for your flash. Thanks to the debugger, you can already run your program in RAM. Now you can just develop the read and write routines for the flash, in whatever mode suits you. Then write your bootloader and use it to put the image into the flash.

Since you don't need XIP, reading ~1 MB from the flash once per boot won't take long even in the slower modes.

I'd probably add code on the boot-loader side to write content to the QSPI so I could recover in the field. It could perhaps copy from an SD card, or I could send data in via X-MODEM.

The loader should be able to bring up clocks and external memories once, then transfer control to the app, which can then skip those steps and perhaps just bring up secondary PLLs and peripheral clocks as required.

An ST-LINK/V2 can be very slow. Some of the 512 Mbit (64 MB) devices take me 15 minutes.

There are a lot of parts floating about, and Winbond has gone through a lot of generations and die shrinks. The smaller-capacity parts typically have the fewest features, and a lot of the quirks have been worked out on the newer, larger parts.

The 1-bit programming likely isn't materially slower; most of the time is eaten in spin loops waiting for the IC to erase or write.
