Memory corruption when using USART1

el12t2o · ‎2014-10-18

Posted on October 18, 2014 at 15:09

When I write data to the USART1 register (for Tx), either via USART1->SR or USART_SendData(USART1, ...), I am experiencing an odd data corruption issue. The data out of the USART port is transmitted correctly; however, other data in SRAM is corrupted.
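For readers following along: on the STM32F4 the byte to transmit goes into the USART data register (DR), while SR is the status register that holds flags such as TXE. The sketch below mimics a polled transmit on a host machine. It uses a hypothetical `fake_usart_t` struct in place of the real `USART_TypeDef` so it compiles without the device headers, and it only illustrates the register roles; it is not the poster's code.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two USART registers involved in a polled
 * transmit. On real hardware these would be USART1->SR and USART1->DR. */
typedef struct {
    volatile uint16_t SR;  /* status register: flags only */
    volatile uint16_t DR;  /* data register: write here to transmit */
} fake_usart_t;

#define FAKE_USART_SR_TXE (1u << 7)  /* TXE is bit 7 of USART_SR on STM32F4 */

/* Wait until the transmit data register is empty, then write one byte. */
static void fake_usart_send(fake_usart_t *usart, uint8_t byte)
{
    while ((usart->SR & FAKE_USART_SR_TXE) == 0u) {
        /* busy-wait; on hardware the peripheral sets TXE */
    }
    usart->DR = byte;
}
```

In this sketch the status register is only read and the data register is the only one written.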

The most significant data corruption occurs in a large data array consisting of about 230 x 16-bit values. The values get replaced with E0 or F0 alternately in this pattern (E0 E0 F0 F0 E0 E0 F0 F0 etc.). Every value gets replaced in this pattern, not just a few values.

I have tried turning off interrupts, to no avail. One thing I noticed is that this bug only started occurring when I began using SPI with DMA to transmit data at the same time. The SPI DMA has priority over the USART. Before, when I was just transmitting data over SPI without DMA, this corruption issue did not happen. (Note: the array of values isn't what is actually transmitted. The values get translated into a variable pulse width bit stream, and the corruption appears not to affect this particular operation, which essentially replaces each 1 bit with one byte and each 0 bit with another byte. In my case the SPI data controls an RGB LED array, and the RGB LED data stream is correctly encoded from the incorrect data.)

And, if I disable all USART functions, I also do not get any memory corruption issues.

Any ideas would be much appreciated. I have read the errata sheets and cannot figure out what would cause this.

Tesla DeLorean · ‎2014-10-18

Posted on October 18, 2014 at 15:26

Ok, well I'm pretty sure the USART isn't causing this problem here.

Debuggers are somewhat invasive: viewing peripheral registers can change their state (for example, clearing status flags), and halting the core does not stop DMA from running in the background.

DMA should avoid using local/auto (stack) variables; the DMA configuration can outlive the current stack context.
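A minimal host-side sketch of that point, assuming a hypothetical `start_dma_tx` stub in place of the real DMA setup (all names here are illustrative). The transmit buffer has static storage, so the pointer handed to the DMA stays valid after the calling function returns:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stub standing in for "program the DMA stream and start it".
 * It only records the source pointer and length, as the hardware would. */
static const uint8_t *dma_src;
static size_t dma_len;

static void start_dma_tx(const uint8_t *src, size_t len)
{
    dma_src = src;
    dma_len = len;
}

/* Static storage: the buffer stays valid after queue_frame() returns,
 * so the DMA can keep reading it in the background. */
static uint8_t tx_buffer[64];

/* Copy the payload into the static buffer and hand that buffer to the DMA.
 * Returns the number of bytes queued (truncated to the buffer size). */
static size_t queue_frame(const uint8_t *payload, size_t len)
{
    if (len > sizeof tx_buffer) {
        len = sizeof tx_buffer;
    }
    memcpy(tx_buffer, payload, len);
    start_dma_tx(tx_buffer, len);
    return len;
}
```

Had `tx_buffer` been declared as a local array inside `queue_frame`, the recorded pointer would refer to stack memory that later calls are free to reuse while the transfer is still in progress.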

Circular DMA runs outside the control of the CPU, so the memory it targets will change continuously.

el12t2o · ‎2014-10-18

Posted on October 18, 2014 at 16:21

Thanks for your input.

The DMA is one-shot and initiated by a timer interrupt (although the bug still occurs if it is initiated manually). Each transfer is about 1 KB and lasts ~3 ms, repeating every 5 ms.

I am using global variables in several places in the timer interrupt to determine what data to send. The data is then sent by DMA using a pointer to a global variable. Is this bad? I couldn't imagine that I would be able to give it a variable on the stack.

It doesn't seem like the actual DMA data is being corrupted. I am sending data 11100000 (a zero) and 11111000 (a one) via SPI which encodes a pulse-width bit train. This controls the LED array via 1-bit serial. The pulse train is there and intact - but it is reading the incorrect source data. To back this idea up I forced the generation function to always return greyscale 128 codes (that's 1 followed by 7 x 0) and it does control the LEDs correctly. It just seems the global variable containing the LED intensity/colour information (R,G,B) gets completely corrupted. The global variable is an array of 230 x 3 x 16-bit values and not a pointer which could be moved - it seems like something is actually re-writing all the values in my array with this random E0 E0 F0 F0 garbage.
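The bit-to-byte expansion described above can be sketched as follows. This is an illustration only: the two SPI codes come from the post, while the function name, the 8-bit input width, and the MSB-first bit order are assumptions.

```c
#include <stdint.h>

#define SPI_CODE_ZERO 0xE0u  /* 11100000: short pulse, encodes a 0 bit */
#define SPI_CODE_ONE  0xF8u  /* 11111000: long pulse, encodes a 1 bit  */

/* Expand one 8-bit intensity into 8 SPI bytes, one per bit.
 * MSB-first order is an assumption; the post does not state it. */
static void encode_intensity(uint8_t value, uint8_t out[8])
{
    for (int bit = 0; bit < 8; ++bit) {
        uint8_t mask = (uint8_t)(0x80u >> bit);
        out[bit] = (value & mask) ? SPI_CODE_ONE : SPI_CODE_ZERO;
    }
}
```

With this sketch, the greyscale 128 test value expands to one 0xF8 byte followed by seven 0xE0 bytes.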

I have tried pinpointing it in the debugger, but it doesn't seem like any particular event in the USART is corrupting it. It's just that after the USART write, the write_led function begins writing garbage into the array. (Does this point to some kind of stack corruption? I tried extending the size of the stack, to no avail.)

carl2399 · ‎2014-10-19

Posted on October 19, 2014 at 14:38

As soon as I read your post I identified with your issue. My symptoms are a bit different, but there is enough in common to warrant a mention.

I have one SPI port transmitting (using DMA and SPI master mode) via a long chain of shift registers, and another SPI port receiving (using DMA and SPI slave mode). I'm running at about 7MHz SPI clock which is as high as I can go without getting too much slip between the SPI signals to be reliable in the circuit I'm using. I've struggled to get this working, as whenever I activate this functionality I get occasional corruptions in a separate (and critical) part of memory. Both sets of memory (SPI + corrupted area) are in the internal chip RAM (STM32F4) but not the CCM RAM, and all of it is statically assigned at compile time and not part of any stack. The area of RAM that is being corrupted is subject to a read / write access as part of a PENDSV interrupt handler - which occurs while the SPI DMA process is happening. It is as if the SPI DMA access and the normal RAM access from within the interrupt handler are somehow conflicting - and it specifically seems to be related to the slave SPI port DMA. If I disable the slave SPI port DMA then I don't get any memory corruptions.

As it turns out, the slip in the SPI signals is too great for us to use this architecture, but the memory corruption has bothered me a bit - especially considering the errata about DCMI DMA transfers. While I'm not meeting any of the supposed conditions of the errata, I still seem to be getting memory corruptions.

waclawek.jan · ‎2014-10-19

Posted on October 20, 2014 at 08:36

Most "mysterious overwrites" are simply the result of program bugs. Hardware bugs are very rare.

The debugger supports data breakpoints, which should help you find the function that overwrites the variables unexpectedly.

JW

el12t2o · ‎2014-10-20

Posted on October 20, 2014 at 23:29

Good suggestion on the data watch point.

I set it up. It doesn't trip after I run through one USART write. The corruption still occurs.

When I write to the array from my program, it does trip.

Later, I'll try seeing if reading via DMA will trigger it.
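A hardware data watchpoint on a Cortex-M core is matched against accesses made by the core itself, so a write performed by a DMA controller would not be expected to trip it. One complementary approach, sketched here with hypothetical names, is a software integrity check: record a checksum after every intentional update and compare it at checkpoints (for example, before and after the USART write). This catches a change regardless of which bus master made it, although it only narrows down the time window rather than naming the culprit.

```c
#include <stddef.h>
#include <stdint.h>

#define LED_COUNT 230u

/* Stand-in for the global LED array described in the thread. */
static uint16_t led_values[LED_COUNT * 3u];
static uint32_t led_checksum;

/* Rotate-and-xor checksum, so that element position matters. */
static uint32_t checksum16(const uint16_t *data, size_t count)
{
    uint32_t sum = 0u;
    for (size_t i = 0; i < count; ++i) {
        sum = ((sum << 1) | (sum >> 31)) ^ data[i];
    }
    return sum;
}

/* Call after every intentional update of led_values. */
static void led_commit(void)
{
    led_checksum = checksum16(led_values, LED_COUNT * 3u);
}

/* Call at checkpoints; returns 1 if the array still matches the checksum. */
static int led_intact(void)
{
    return checksum16(led_values, LED_COUNT * 3u) == led_checksum;
}
```

Placing `led_intact()` checks at a few points in the main loop and in the timer interrupt would show between which two checkpoints the array first changes.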

el12t2o · ‎2014-10-20

Posted on October 20, 2014 at 23:44

[quote]I have one SPI port transmitting (using DMA and SPI master mode) via a long chain of shift registers, and another SPI port receiving (using DMA and SPI slave mode). I'm running at about 7MHz SPI clock which is as high as I can go without getting too much slip between the SPI signals to be reliable in the circuit I'm using. I've struggled to get this working, as whenever I activate this functionality I get occasional corruptions in a separate (and critical) part of memory. Both sets of memory (SPI + corrupted area) are in the internal chip RAM (STM32F4) but not the CCM RAM, and all of it is statically assigned at compile time and not part of any stack. The area of RAM that is being corrupted is subject to a read / write access as part of a PENDSV interrupt handler - which occurs while the SPI DMA process is happening. It is as if the SPI DMA access and the normal RAM access from within the interrupt handler are somehow conflicting - and it specifically seems to be related to the slave SPI port DMA. If I disable the slave SPI port DMA then I don't get any memory corruptions. [/quote]

This is interesting, and would explain what I am seeing. The odd thing is that the corruption is so consistent. It's not garbage/random data per se; it is always E0 or F0.

I haven't yet tried moving some of the data to CCM, which would be an acceptable, although undocumented, workaround if necessary. (I believe the array is in general-purpose RAM, but I've made no special attempt to locate it anywhere; its address is 0x20000xxx.) I will try this later.
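To make the address remark concrete, here is a small host-side sketch that classifies an address as main SRAM or CCM. It assumes the STM32F405/407 memory map (64 KB of CCM data RAM at 0x10000000 and 128 KB of main SRAM at 0x20000000); other F4 parts differ, so the ranges are an assumption, and `classify_address` is a hypothetical helper. On those parts the CCM RAM is not reachable by the DMA controllers, which is what makes relocating the array there a useful experiment.

```c
#include <stdint.h>

typedef enum { REGION_OTHER, REGION_CCM, REGION_SRAM } ram_region_t;

/* Classify an address using the assumed STM32F405/407 memory map. */
static ram_region_t classify_address(uint32_t addr)
{
    if (addr >= 0x10000000u && addr <= 0x1000FFFFu) {
        return REGION_CCM;   /* 64 KB CCM data RAM, CPU access only */
    }
    if (addr >= 0x20000000u && addr <= 0x2001FFFFu) {
        return REGION_SRAM;  /* 112 KB SRAM1 + 16 KB SRAM2, DMA-accessible */
    }
    return REGION_OTHER;
}
```

An address of the form 0x20000xxx falls in the main SRAM range under this map, consistent with the array being reachable by DMA.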

As I said above, I've tried setting memory/data watches on it. The data appears to change and the watches do not trigger. I've also tried stepping through the USART write instruction. Here's what's really odd: the corruption *doesn't* occur immediately but a few instructions after the actual write to the SR register, and then it recurs on a semi-regular basis...

Going to investigate further, will update.