M95M02-DR : Too many zeros within a page produce write errors

Juergen Abel · 2022-07-15

Hello, I am working with an SPI M95M02-DR EEPROM in a Cypress PSoC Creator environment.

The M95M02-DR is running on a 1 Mbps SPI bus.

After each buffer write, a delay of 20 ms is used.

If 0xFF is written to all bytes of a page or if random values are written to all bytes of a page, no error occurs when reading the same page afterwards.

The problem starts when many 0x00 bytes are written to a page.

If 0x00 is written to all 256 bytes of a page, the first two bytes of the next page are changed.

If a buffer with 175 x 0x00 followed by 81 x 0x01 is written to a page, reading the page gives no error.

If a buffer with 176 x 0x00 followed by 80 x 0x01 is written to a page, reading the page gives an error: 177 x 0x00 and only 79 x 0x01 instead of 80.

All page writes with more than 176 bytes of 0x00 at the beginning and the rest 0x01 produce errors.
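To sweep the threshold systematically, the test pages described above can be generated in code. This is only a sketch for a host-side or firmware test harness: the helper names `make_zero_run_page` and `leading_zero_run` are made up for illustration, and the actual EEPROM write and read calls are left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 256u  /* M95M02 page size in bytes */

/* Fill a page buffer with `zeros` bytes of 0x00 followed by `fill`.
 * `zeros` is clamped to PAGE_SIZE; the clamped value is returned. */
size_t make_zero_run_page(uint8_t page[PAGE_SIZE], size_t zeros, uint8_t fill)
{
    if (zeros > PAGE_SIZE) {
        zeros = PAGE_SIZE;
    }
    memset(page, 0x00, zeros);
    memset(page + zeros, fill, PAGE_SIZE - zeros);
    return zeros;
}

/* Length of the leading 0x00 run in a page (PAGE_SIZE if all bytes are 0x00). */
size_t leading_zero_run(const uint8_t page[PAGE_SIZE])
{
    size_t n = 0;
    while (n < PAGE_SIZE && page[n] == 0x00) {
        n++;
    }
    return n;
}
```

Writing pages for `zeros` = 170..256 and comparing `leading_zero_run()` of the read-back buffer with the value written would show directly where the run grows by one byte, as in the 176 vs. 177 case.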

What is really strange is that writing all 1024 pages many thousands of times with random values does not produce a single error; only writing pages with (many) 0x00 bytes produces errors.

Of course I have tried all kinds of delays before and after CMD_WREN, CMD_WRITE, after writing all data bytes and so on.

Did anybody observe a similar pattern too?

Cheers,

Juergen

Paul1 · 2022-07-15

Possibly insufficient capacitance on the Vcc for the part, and/or too resistive a path to the IC's power pins. Maybe add more decoupling capacitors right at the IC. Check: If you have 100nF on V/Gnd of the EEPROM, solder another 100nF or even a 1uF on top of the existing decoupling capacitor.

If your device is battery powered this can be exacerbated by the battery's ESR, especially as the battery drains/ages. Also true if the power source is too small, or if the source is supplying multiple devices. If there are other devices using the same supply, and they draw extra power while you are writing lots of 0 then there may not be enough power. Try to plan your writes when nothing else is using power.

Try scoping the V directly at the EEPROM, as well as other signals at the EEPROM. Look for dips in V on EEPROM pin while writing lots of 0. Does V stay within EEPROM Datasheet limits? Maybe a different EEPROM would work under a wider V.

Remember: Writing 1 typically means doing nothing, but writing 0 actually affects the IC, so lots of 0 burns more power.
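If the zero-bit count really is what matters, it can be logged per page and compared against the failures. A minimal sketch follows; the function name `count_zero_bits` is only illustrative, and whether the error actually tracks the number of zero bits rather than zero bytes has not been established in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the bits that are 0 in a buffer, i.e. the bits that differ
 * from the all-ones (0xFF) pattern. */
size_t count_zero_bits(const uint8_t *buf, size_t len)
{
    size_t zeros = 0;
    for (size_t i = 0; i < len; i++) {
        uint8_t v = (uint8_t)~buf[i];   /* 1 where the data bit is 0 */
        while (v != 0u) {
            v &= (uint8_t)(v - 1u);     /* clear the lowest set bit */
            zeros++;
        }
    }
    return zeros;
}
```

For the failing pattern of 176 x 0x00 plus 80 x 0x01 this gives 176*8 + 80*7 = 1968 zero bits, against 1967 for the passing 175/81 pattern.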

Juergen Abel · 2022-07-15

Hello Paul, thanks for your suggestion.

There was a 100 nF capacitor between VCC/GND directly at the EEPROM. After adding 2 more 100 nF ceramic capacitors, a 47 nF tantalum and finally a 47 uF electrolytic capacitor, the error still occurs.

With the scope I measured VCC as 3.34 V and spikes of 70 mV above and below VCC.

The device is powered by a small regular power supply.

Tomorrow I will check the device with a more stable and bigger power supply, just to make sure it's not a problem of the power supply...

Juergen Abel · 2022-07-18

Using a very strong power supply produced even lower spikes, but the problem still occurs.

The reason for the problem is probably something else...

Paul1 · 2022-07-18

Have you scoped the signals to the EEPROM and verified that all timing and voltages (and noise/ripple) meet both the MCU and EEPROM datasheet normal operation specs, and continue to meet them throughout the write?

Timing: Slow down everything to the EE, maybe 1/10 speed. Might be an issue with slew rate or max data frequency.

Are you checking that each write is complete before starting the next write? There should be a "completed" flag that signals when a write is complete, so you can wait for it before starting the next one.
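On this family of EEPROMs the "completed" flag is the WIP bit (bit 0) of the status register, read with the RDSR instruction (0x05); please verify both against the M95M02-DR datasheet. Below is a rough sketch of polling it instead of using a fixed delay. The `eeprom_spi_t` hooks are hypothetical stand-ins for whatever the PSoC SPI component provides, and the small fake device exists only so the loop can be tried on a PC.

```c
#include <stdbool.h>
#include <stdint.h>

#define CMD_RDSR 0x05u  /* Read Status Register opcode (check datasheet) */
#define SR_WIP   0x01u  /* Write In Progress bit */

/* Hypothetical SPI hooks; replace with the real driver calls. */
typedef struct {
    void (*select)(void *ctx);               /* drive chip select low  */
    void (*deselect)(void *ctx);             /* drive chip select high */
    uint8_t (*xfer)(void *ctx, uint8_t out); /* exchange one byte      */
    void *ctx;
} eeprom_spi_t;

uint8_t eeprom_read_status(const eeprom_spi_t *spi)
{
    spi->select(spi->ctx);
    (void)spi->xfer(spi->ctx, CMD_RDSR);
    uint8_t sr = spi->xfer(spi->ctx, 0x00u);
    spi->deselect(spi->ctx);
    return sr;
}

/* Poll until WIP clears or max_polls is used up.
 * Returns true if the write cycle finished, false on timeout. */
bool eeprom_wait_ready(const eeprom_spi_t *spi, uint32_t max_polls)
{
    while (max_polls-- > 0u) {
        if ((eeprom_read_status(spi) & SR_WIP) == 0u) {
            return true;
        }
    }
    return false;
}

/* Fake device for host-side testing: reports WIP for busy_reads status reads. */
typedef struct { uint32_t busy_reads; } fake_eeprom_t;

static void fake_cs(void *ctx) { (void)ctx; }

static uint8_t fake_xfer(void *ctx, uint8_t out)
{
    fake_eeprom_t *dev = (fake_eeprom_t *)ctx;
    if (out == CMD_RDSR) {
        return 0x00u;                 /* opcode byte: nothing meaningful back */
    }
    if (dev->busy_reads > 0u) {
        dev->busy_reads--;
        return SR_WIP;
    }
    return 0x00u;
}

eeprom_spi_t fake_spi(fake_eeprom_t *dev)
{
    eeprom_spi_t spi = { fake_cs, fake_cs, fake_xfer, dev };
    return spi;
}
```

A timeout return is worth logging, since it would mean the write cycle never reported completion.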

Is there any repeatable pattern or near pattern to where the failed writes are in memory? (Timing)

Try: Make a completely fresh project and put in it only code to write EE and to read it back, nothing else, and make everything simple polled operations (no interrupts/DMA/etc.).

With the Power and Decoupling caps proven then I can't guess what else could be causing intermittent write fails. Even a bad trace would show up when you scope the EE signals.

Maybe the EEPROM has ESD damage, try a fresh EE IC?

Juergen Abel · 2022-07-19

Concerning the speed, I have tried many different speeds, with no success. Using 0.1 MHz instead of 1 MHz SPI speed, the problem even arises with fewer zeros on a page, i.e. it gets worse.

After trying different circuit boards, all of them show the same problem, so it's not this particular EEPROM.

Also I use a very simple PSOC board layout using as few components as possible, and I use simple polling operations, no DMA or interrupt support.

Since there are more components on the SPI bus, there is a chance that maybe the components - even though they are not selected/activated - may stress the SPI bus signals (MISO, MOSI, CLK).

On the MOSI signal, I added a 100 kOhm resistor to GND as suggested in application note AN2014, but again without success.

Last but not least, I scoped some SPI signals of the EEPROM, but couldn't find any unusual voltages.

The SPI bus uses an RX FIFO and a TX FIFO, which add an additional interface layer, so this could also be a source of the problem, as I can't control the state of the MISO, MOSI, CLK and SELECT signals directly myself. But all the other SPI devices work without a problem.

Currently I am trying to write small amounts of data, i.e. only 16 bytes at once (16 times) instead of 256 bytes...
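For reference, the splitting into smaller writes can be sketched as below. The function names are invented for illustration and the SPI write itself is omitted. The page-boundary clamp is there because page-write EEPROMs generally do not continue into the next page within one write instruction; the exact wrap behaviour should be checked in the datasheet.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 256u  /* M95M02 page size in bytes */

/* Length of the next chunk starting at addr: at most max_chunk bytes,
 * at most `remaining` bytes, and never crossing a page boundary. */
size_t next_chunk_len(uint32_t addr, size_t remaining, size_t max_chunk)
{
    size_t to_page_end = PAGE_SIZE - (size_t)(addr % PAGE_SIZE);
    size_t n = remaining;
    if (n > max_chunk) {
        n = max_chunk;
    }
    if (n > to_page_end) {
        n = to_page_end;
    }
    return n;
}

/* Number of separate write instructions needed for len bytes at addr.
 * Returns 0 for max_chunk == 0 to avoid looping forever. */
size_t count_chunks(uint32_t addr, size_t len, size_t max_chunk)
{
    size_t chunks = 0;
    if (max_chunk == 0u) {
        return 0;
    }
    while (len > 0u) {
        size_t n = next_chunk_len(addr, len, max_chunk);
        addr += (uint32_t)n;
        len -= n;
        chunks++;
    }
    return chunks;
}
```

With a page-aligned start and 16-byte chunks, one page takes 16 write instructions, each of which needs its own WREN and its own wait for completion.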

Paul1 · 2022-07-19

a) WIP status?

BP1, BP0 settings?

Other bits? Status Register?

Read and log these bits (in your diagnostics output) throughout your writes to see if any change unexpectedly, and to verify write complete before start next write.

b) Could be a mismatch in SPI Mode?

Maybe the EEPROM needs a different SPI Mode from the other devices.

If clocking on wrong edge then could sometimes miss bits.

c) Temporary patch:

Read back after each write to confirm block matches, else write again (with max tries per block, log actual tries per block).

d) Providing schematic might help (with actual population).

e) Providing pictures of scope traces may help.

f) From Datasheet @6.6:

"The instruction is not accepted, and is not executed, under the following conditions:

• if the Write enable latch (WEL) bit has not been set to 1 (by executing a Write enable instruction just before),

• if a Write cycle is already in progress,

• if the device has not been deselected, by driving high Chip select (S), at a byte boundary (after the eighth bit, b0, of the last data byte that has been latched in),

• if the addressed page is in the region protected by the Block protect (BP1 and BP0) bits."

g) Writing 16bytes and watching WIP and writing single bytes and watching WIP are worth trying.
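For the logging in a), something along these lines could decode the status byte. It is a sketch only: the bit positions (WIP = bit 0, WEL = bit 1, BP0 = bit 2, BP1 = bit 3, SRWD = bit 7) are how I read the M95 series status register and should be checked against the M95M02-DR datasheet, and the type and function names are made up.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Status register bit masks (assumed layout, verify in the datasheet). */
#define SR_WIP  0x01u  /* write in progress          */
#define SR_WEL  0x02u  /* write enable latch         */
#define SR_BP0  0x04u  /* block protect bit 0        */
#define SR_BP1  0x08u  /* block protect bit 1        */
#define SR_SRWD 0x80u  /* status register write disable */

typedef struct {
    bool wip;
    bool wel;
    bool bp0;
    bool bp1;
    bool srwd;
} eeprom_status_t;

/* Split a raw status byte into its individual flags. */
eeprom_status_t decode_status(uint8_t sr)
{
    eeprom_status_t s;
    s.wip  = (sr & SR_WIP)  != 0u;
    s.wel  = (sr & SR_WEL)  != 0u;
    s.bp0  = (sr & SR_BP0)  != 0u;
    s.bp1  = (sr & SR_BP1)  != 0u;
    s.srwd = (sr & SR_SRWD) != 0u;
    return s;
}

/* Print one diagnostic line, tagged with the point in the write sequence. */
void log_status(const char *tag, uint8_t sr)
{
    eeprom_status_t s = decode_status(sr);
    printf("%s: SR=0x%02X WIP=%d WEL=%d BP1=%d BP0=%d SRWD=%d\n",
           tag, (unsigned)sr, s.wip, s.wel, s.bp1, s.bp0, s.srwd);
}
```

Logging the byte right after WREN (expect WEL=1), right after the write instruction (expect WIP=1) and after the wait (expect WIP=0, WEL=0) would show whether the sequence in f) is being honoured.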

Paul

Juergen Abel · 2022-07-19

Thanks for all the suggestions, Paul. Some of them I have checked already, and some can be excluded as a cause.

To a): Block protect bits BP0 and BP1: the error depends only on the number of zeroes I write to a page. Writing itself works without a problem if I use random values; e.g. I can write many hundred thousand pages over the whole range of pages 0..1023 without a single error. The error occurs only if I write too many zeroes to one page.

To b): Yes, the SPI mode could be a source of the problem. I tried all possible modes that I can set in the PSoC environment, and mode Master, sub mode Motorola, CPHA=0, CPOL=0 works best.

To c): This was one of my first attempts at a solution. Unfortunately, if there is an error because of too many zeros, I can write the page one thousand times (and read back and compare each time) and the problem still exists on this page, i.e. rewriting doesn't help.

To d): The layout of the circuit board is owned and copyrighted by my customer, so unfortunately I can't publish it.

To f): Please note that the page is written and can be read back. If there are more than 175 zeroes on one page, there is an error. Normally only 1 or 2 bytes differ between the original page and the read-back page.

Example:

###############

data_org:

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 ###############

data_cmp (read back):

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 03 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 ###############

The only difference is at the end of the zero block: there is a 0x03 instead of a 0x07.
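A comparison that reports the direction of the bit errors makes this kind of difference easier to track across many runs. The following is a minimal sketch with invented names (`page_diff_t`, `diff_page`); it only compares two RAM buffers and assumes nothing about the SPI driver.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    size_t mismatches;   /* number of differing bytes                      */
    size_t first_index;  /* index of first differing byte, len if none     */
    size_t bits_cleared; /* bits written as 1 but read back as 0           */
    size_t bits_set;     /* bits written as 0 but read back as 1           */
} page_diff_t;

/* Number of set bits in one byte. */
static unsigned popcount8(uint8_t v)
{
    unsigned n = 0;
    while (v != 0u) {
        v &= (uint8_t)(v - 1u);
        n++;
    }
    return n;
}

/* Compare the written buffer with the read-back buffer byte by byte. */
page_diff_t diff_page(const uint8_t *written, const uint8_t *readback, size_t len)
{
    page_diff_t d = { 0, len, 0, 0 };
    for (size_t i = 0; i < len; i++) {
        uint8_t w = written[i];
        uint8_t r = readback[i];
        if (w != r) {
            if (d.mismatches == 0u) {
                d.first_index = i;
            }
            d.mismatches++;
            d.bits_cleared += popcount8((uint8_t)(w & (uint8_t)~r));
            d.bits_set     += popcount8((uint8_t)(r & (uint8_t)~w));
        }
    }
    return d;
}
```

For 0x07 read back as 0x03 this reports one cleared bit (bit 2) and no set bits, i.e. the stored byte has lost a 1 rather than gained one.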

Also the problem occurs on any page written (0..1023), not only on page 0, 1 or 2...

To g): Unfortunately, writing 16 times 16 bytes (and waiting 20 ms after each write) didn't help either: same error at the same location (see example below). It only took 16 times longer because of the 20 ms delay after each write ;) .

Example:

###############

data_org:

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 ###############

data_cmp (read back):

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 ###############

The only difference is at the end of the zero block: there is a 0x01 instead of a 0x07.

Now I tried some more patterns:

Writing 200 times 0x00 and 56 times 0x07 in one page always produces an error.

Now I added two 0xFF at position 127 and position 128: no error anymore.

Even with all 256 bytes set to 0x00 except positions 127 and 128 (=0xFF): no error.

What irritates me the most is that I don't change any technical parameter like delay, SPI mode, FIFO size, SPI speed et cetera; whether the error occurs depends only on the content (pattern) of a page. :)

Paul1 · 2022-07-19

I find it interesting that the error is at the same memory location whether you write in 16byte or 256byte blocks.

Is the glitch always at offset 0xBE?

That may suggest a software bug in your code, or that some other code may be writing to that memory location.

Try inserting another 256byte array just before this data, such that this block's address is shifted by 256bytes in RAM, and any other code may not clash there. (Fill the other array with a pattern to see if it gets corrupted).
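Roughly what I mean, as a sketch: the struct layout, the names and the 0xA5 fill value are arbitrary choices for illustration, and I have added a second guard after the data as well, which goes slightly beyond the suggestion above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE  256u
#define GUARD_BYTE 0xA5u  /* arbitrary, easily recognised fill pattern */

/* Page buffer surrounded by guard arrays, so that a stray write from
 * other code shows up as a damaged guard. */
typedef struct {
    uint8_t guard_before[PAGE_SIZE];
    uint8_t data[PAGE_SIZE];
    uint8_t guard_after[PAGE_SIZE];
} guarded_page_t;

void guarded_page_init(guarded_page_t *p)
{
    memset(p->guard_before, GUARD_BYTE, sizeof p->guard_before);
    memset(p->data, 0x00, sizeof p->data);
    memset(p->guard_after, GUARD_BYTE, sizeof p->guard_after);
}

/* Returns true if both guard arrays still hold the fill pattern. */
bool guards_intact(const guarded_page_t *p)
{
    for (size_t i = 0; i < PAGE_SIZE; i++) {
        if (p->guard_before[i] != GUARD_BYTE || p->guard_after[i] != GUARD_BYTE) {
            return false;
        }
    }
    return true;
}
```

Check `guards_intact()` after each write and read-back cycle. If the guards stay clean and the error follows the data pattern to its new RAM address, a RAM clash becomes unlikely.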

You might want to contact ST Support to see if any errata for your EEPROM.

Your Data:

00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F //Index

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 //00

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 //10

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 //20

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 //30

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 //40

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 //50

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 //60

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 //70

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 //80

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 //90

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 //A0

00 00 00 00 00 00 00 00 00 00 00 00 00 00 07 07 //B0 = Glitch@BE

07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 //C0

07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 //D0

07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 //E0

07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 //F0

###############

data_cmp (read back):

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 07 //B0 = Glitch@BE

07 07 07 07 07 07 07 07 07 07 07 07 07 07 07 07

###############

Paul

Juergen Abel · 2022-07-19

No, the glitch is not always at the same position, e.g. if the page is filled with 200 times 0x00 and after that with 56 times 0x07, the error position will be at position 201 (and sometimes also at 202), i.e. the first one or two bytes after the 0x00 area.

Even though a software bug is always possible, the probability is quite low, as I write and then read a page and compare the content, so I can see exactly which bytes differ. It is always the first one or two bytes after the zero block. Many times only a few bits are missing, e.g. 0x03 instead of 0x07 or 0x01 instead of 0x07.

Also, writing pseudo-random numbers never produced any error; all pages written and reread are the same, i.e. I can exactly reproduce the error by putting too many (176) zeroes into one page.

Thanks, I will ask ST for any errata or further information about the EEPROM.