STM32H7 Quad SPI broken bit when output is switching to input with NOR flash

SLesh.1
Associate III

Hi there!

I have a problem when using QSPI with NOR flash. Some read commands return data with the first 4 bits corrupted.

After investigation, I have found that the issue only appears if there is a switch from output to input without dummy cycles.

For example (see picture): I send a 4-line, 1-byte instruction (2 cycles) and read 3 bytes on 4 lines (6 cycles). I expect IO2 (blue) to switch from output to input prior to the clock's (yellow) third falling edge. Instead, it seems to be held at some intermediate ("grey") level that is sometimes sampled as 1 at the third rising edge.
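For reference, the cycle counts quoted here follow from dividing the bits in each phase by the number of data lines. A minimal Python sketch of that arithmetic; the function name `phase_cycles` is made up for illustration, and it assumes single data rate (one bit per line per clock):

```python
def phase_cycles(num_bytes, lines):
    """Clock cycles needed to move num_bytes over `lines` data lines (SDR)."""
    if lines not in (1, 2, 4):
        raise ValueError("lines must be 1, 2 or 4")
    bits = num_bytes * 8
    # Each clock moves one bit per data line.
    return bits // lines


instruction_cycles = phase_cycles(1, 4)  # 1-byte instruction on 4 lines
data_cycles = phase_cycles(3, 4)         # 3-byte read on 4 lines
total_cycles = instruction_cycles + data_cycles
```

With no dummy cycles between the two phases, the pin has to turn around between the last instruction cycle and the first data cycle.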

0693W00000Dn5RqQAJ.jpg 

The same issue was observed on STM32H7 with 2 different flash chips, using both quad-SPI and dual-SPI.

Yellow is clock, blue is IO2

I can't find anything similar in the errata.

Any ideas, guys?

29 REPLIES

Frankly speaking I can hardly recognize anything in that picture. "red" is ???, and "blue" is ???.

Sorry for the confusion. I thought it was self explaining.

Red is SPICLK and blue is SPIO3. They were scaled differently so that the signal can be recognized easily.

The command reads one byte from the slave and sampling was delayed by half cycle, which is why there are 5 clocks.

It's the same problem, I think, i.e. SPIO3 and the other data pins are not switched to input in time. There is no problem when the first bits of the response on the data pins are zeros. Only 1's are corrupted when they are the first response bits on the SPIOx pins.

More precisely, my suspicion is...

The output block of the F7 QSPI controller remains active when the input block becomes active on the 3rd clock. That makes the total impedance of the F7 SPIOx pin very small. A 44 Ohm resistor is used to suppress ringing (for termination). This termination resistor forms a voltage divider between ~3 V and the output block of the F7 SPIOx pin, and the level reads 1.2 V at this point. On the rising edge of the 3rd clock, the output block of the F7 SPIOx pin becomes inactive, the pin impedance returns to that of the input block, and so the voltage level goes back to ~3 V.
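As a rough sanity check of this divider picture, here is a small Python sketch. The ~3 V level, the 44 Ohm resistor and the 1.2 V reading come from the post; the two driver on-resistances (40 Ohm for the F7 output driving low, 16 Ohm for the slave output driving high) are assumed values picked only so the numbers land near the observed level. They are not measured or datasheet figures, and the resistor is assumed to sit in series between the two pins.

```python
def contention_levels(v_high, r_master_low, r_series, r_slave_high):
    """Model two pins fighting over one line as a simple resistor chain.

    The slave drives high through r_slave_high, then the series resistor,
    into the master pin, which still drives low through r_master_low.
    Returns (current in A, voltage at master pin, voltage at slave pin).
    """
    total = r_master_low + r_series + r_slave_high
    current = v_high / total
    v_master_pin = current * r_master_low
    v_slave_pin = current * (r_master_low + r_series)
    return current, v_master_pin, v_slave_pin


# Assumed on-resistances: 40 Ohm (master low side), 16 Ohm (slave high side).
current, v_master_pin, v_slave_pin = contention_levels(3.0, 40.0, 44.0, 16.0)
```

With these assumed values the model puts the master pin at about 1.2 V. Different on-resistances would shift the level, so this only shows the explanation is plausible, not that it is proven.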

0693W00000GYLm8QAH.png 

In Ilya Balov's case, the command ends with a 1 and the output register holds a 1, which is why the voltage level is not zero.

0693W00000GYLleQAH.png 

Equivalent measurement of the same pin with an ATSAMV71 and the same device: no half-cycle delay for sampling, and the data pin is pulled up. The sampling polarity (or whatever) is also different. There is no problem with the QSPI device.

Red is clock and blue is the data pin.

0693W00000GYLo4QAH.png 

My device counts the number of clocks, which is used internally in the device to indicate the offset into device memory. That's why the additional clock generated by the half-cycle delay causes another problem. The half-cycle delay does not work for me!!!

I can't agree with your opinion that the problem is solved if it works with a half-cycle sampling delay. I think even ST engineers wouldn't agree with you.

0693W00000GYLxVQAX.png

>>Sorry for the confusion. I thought it was self explaining.

Don't make assumptions, this is a two-month old thread, and no one here knows your specific situation, or parts involved.

You're using a W25Q32JV too, or something else? Be specific.

How is the part configured? Show the volatile/non-volatile register settings.

Ideally have the code for the BSP being used, the configuration, register setting portions, and whatever the read/write functionality is here.

Showing the ATMEL code that works would also help provide contextual detail, and insight into why it may be behaving differently.

Pretty sure this is third-party IP not some ST home-baked implementation.


My data is 1 year old. I'm not ready to provide the data you requested. But a better idea came up.

The 1.2 V level is a problem, whichever side is responsible for it. It seems possible to determine the culprit by measuring voltages. I will measure the voltages at points 1 and 2 depicted in the picture below; 2 is the output pin of the slave device. We can observe the voltages at points 1 and 2 on the SPIOx pins during the 3rd clock. I hope you understand this is not a configuration issue but an input impedance issue.

0693W00000GYMNdQAP.png

3 possible outcomes (for a transfer where the 3rd bit on SPIOx is logic high and no dummy cycles are set):

  1. 1 and 2 are both close to 3V : good
  2. 1 is 1.2V and 2 is close to 1.2V (actually a little bit higher)
  3. 1 is 1.2V and 2 is 3V

case 1 means that we have no problem.

case 2 proves that slave device is doing something wrong.

case 3 proves that F7 QSPI controller is a problem.
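The decision rule above can be written down as a small Python sketch. The function name, the ~3 V reference and the 0.3 V tolerance are assumptions made for illustration; real thresholds would have to come from the actual scope readings.

```python
def classify_measurement(v1, v2, v_high=3.0, tolerance=0.3):
    """Map the two probe readings onto the three cases from the post.

    v1: voltage at point 1, v2: voltage at point 2 (slave output pin).
    v_high and tolerance are assumed values, not datasheet figures.
    """
    def near_high(v):
        return abs(v - v_high) <= tolerance

    if near_high(v1) and near_high(v2):
        return "case 1: no problem"
    if not near_high(v1) and near_high(v2):
        return "case 3: F7 QSPI controller suspected"
    if not near_high(v1) and not near_high(v2):
        return "case 2: slave device suspected"
    # Point 1 high while point 2 is low is not one of the listed outcomes.
    return "unclassified"
```

For example, `classify_measurement(1.2, 3.0)` lands in case 3, and a reading that fits none of the three listed outcomes is reported as unclassified rather than forced into a case.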

Rough physical layout is as below.

0693W00000GYMRbQAP.png

If you agree with my experiment plan, I will proceed. It will take time to get the result because my company is based in another country and I have to arrange this.

Please let me know your opinion about my suggestion.

Hi @KShim.1738​!

I don't think you are using sample shifting correctly.

> The command reads one byte from the slave and sampling was delayed by half cycle, which is why there are 5 clocks.

This is not true. Sample shifting does not add any cycles, but changes when bits are sampled:

  • QSPI_SAMPLE_SHIFTING_NONE - on the CLK rising edge
  • QSPI_SAMPLE_SHIFTING_HALFCYCLE - halfway between CLK rising edge and CLK falling edge

It seems like you instead added dummy cycles or increased the read length.

This workaround has worked for me, yet I do think this is a hardware issue and ST staff should look into this. Unfortunately, none of them seem to be interested.

The data I had is almost 1 year old, so some of the details may not be correct. If QSPI_SAMPLE_SHIFTING_HALFCYCLE adds no additional clock, then I can work around this issue. In your case, one dummy cycle removes the issue. I don't think I added a dummy cycle; as far as I remember, the extra clock appeared after I enabled QSPI_SAMPLE_SHIFTING_HALFCYCLE. But I will check it again.

Another problem is the maximum sink current on both the master and slave sides, which is 25 mA for the F7. If the outputs of master and slave are both active and drive different logic levels, the current can exceed 25 mA. In my setup, it is about 30 mA. Even though this large current flows only very briefly, it is not good for the QSPI output MOSFETs. It has to be taken care of.
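To put rough numbers on that, a short Python sketch using Ohm's law. The 25 mA limit, the roughly 30 mA contention current and the ~3 V level are the figures quoted in this thread; treating the whole path between the two drivers as one lumped resistance is a simplifying assumption.

```python
def path_resistance(voltage, current_a):
    """Total lumped resistance (Ohm) implied by a voltage and a current."""
    return voltage / current_a


V_HIGH = 3.0        # volts, approximate logic-high level from the thread
I_OBSERVED = 0.030  # amps, "about 30 mA"
I_LIMIT = 0.025     # amps, the 25 mA figure quoted for the F7

implied_r = path_resistance(V_HIGH, I_OBSERVED)   # resistance seen today
required_r = path_resistance(V_HIGH, I_LIMIT)     # resistance needed at the limit
extra_r_needed = required_r - implied_r
```

Under these assumptions the contention path would need roughly 20 Ohm more total resistance to bring the current down to 25 mA.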

One solution for the maximum sink current is to use different modes on master and slave. I haven't tested it, but I think it will work. I've been using mode 3 on both sides. If the slave is changed to mode 1 with the half-cycle delay active on the master side, then the problematic half cycle will not be used. My slave device supports mode 1 and mode 3.

0693W00000GYOdEQAX.png 

If only half of the 3rd clock is a problem, it can be skipped by mixing modes (in my opinion).

0693W00000GYOxyQAH.png

What do you think of this maximum sink current issue and my solution?

Another solution is to use SPI in my case.

Thanks a lot, mate.

I never tried using different modes for SPI master and slave. STM32H7 does not support modes 1 and 2, but I see what you are trying to do. Are you able to configure slave mode 1 ONLY for the response (i.e. keep using mode 3 for command, address, write data)? Also note, you would be forced to use sample shifting. If all of these are OK for you, I don't see why this wouldn't work (although it looks like a dirty hack, if you ask me).

What SPI command are you using? My flash has only one QSPI command without dummy cycles - JEDEC ID. Since we only have to use it once per power cycle, we decided not to worry about possible issues with high current. If you need this for a read (or any often used command), it might become a problem.

My slave device is not a flash device. It is a UWB sensor for proximity sensing from Novelda. It transmits a Gaussian-shaped electromagnetic wave (Gaussian baseband with 8.7 GHz carrier) and records the return profile. Simply speaking, it's a small radar. I consider that device a kind of very fast ADC. The data is read from the slave through QSPI. Quite a lot of data: 4 x 128 bytes per frame, and the frame rate is 6 frames per second. 4 x 128 x 6 bytes = 3072 bytes = 24576 bits per second. That's why I stick to QSPI. Sorry, I'm not helpful with your situation.
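The data-rate figure can be checked with a few lines of Python. The frame size and frame rate are the ones stated in the post; the variable names are only for illustration.

```python
BYTES_PER_FRAME = 4 * 128   # bytes per frame, as described in the post
FRAMES_PER_SECOND = 6       # frame rate from the post

bytes_per_second = BYTES_PER_FRAME * FRAMES_PER_SECOND
bits_per_second = bytes_per_second * 8
```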

Yes, I think above setting might work for me. I can't be 100% sure before I test it.

master : mode 3 with QSPI_SAMPLE_SHIFTING_HALFCYCLE on

slave : mode 1

Then the master will sample during the first half cycle of each clock, I reckon. The master doesn't care when the slave outputs. I calculated the timing.

I didn't have any issue with flashes, because all of them had required dummy cycles. You are the one who pointed out how it affects this issue.

I think you can avoid the strange voltage issue by designing the file system or file structures accordingly. The lower 4 bits of the last byte determine the output level of the H7. So make the lowest 4 bits of the first byte from the slave match those 4 bits from the command, and make every response from the slave begin with that byte, i.e. a dummy byte. Then the two outputs from the H7 and the flash will be at the same logic level. It's just an idea.
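A minimal Python sketch of that matching idea, to see which IO lines would be driven to opposite levels at the turnaround. It assumes quad mode with the upper nibble of each byte transferred first, so the levels left on IO3..IO0 are the low nibble of the last byte sent and the first levels driven by the slave are the high nibble of its first byte. Which nibble of the response appears first depends on the device, so treat that as an assumption; the function name is hypothetical.

```python
def conflicting_lines(last_sent_byte, first_received_byte):
    """Return the IO line numbers (0..3) where master and slave would disagree.

    Assumes quad mode, upper nibble first, with bit n of each nibble on IOn.
    """
    last_master_nibble = last_sent_byte & 0x0F               # levels left on IO3..IO0
    first_slave_nibble = (first_received_byte >> 4) & 0x0F   # first levels from the slave
    diff = last_master_nibble ^ first_slave_nibble
    return [line for line in range(4) if diff & (1 << line)]
```

An empty list means both sides would drive the same level on every line, which is the situation the dummy-byte idea is trying to guarantee.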

It's certainly a silicon bug (and that's not the only one, check the various errata sheets for devices with QSPI or OCTOSPI ...). But the sample shift is a simple workaround which works in most cases. (As I said before, the PCx_C shouldn't be used as they're easily destroyed by a conflict.)

But on the other hand: The RMs say very explicitly that QSPI is intended for flash devices. For anything else you may say "Your mileage may vary". I'm afraid there is little hope ST will make a new silicon revision ...

For flash devices there is even another workaround, namely (unnecessary) dummy cycles. The only transfer mode really affected is indirect read for the status register, and here most (all) devices simply repeat its contents indefinitely, so sampling e.g. the second byte instead of the first doesn't matter. All other register reads or ID reads could and should be done in simple SPI mode before switching to QPI mode,