
Problem with STM32F407 RAM

Question asked by forrest.jonathan on Apr 20, 2012
Latest reply on Sep 29, 2012 by Bassett.David
I have recently assembled 6 PCBs with STM32F407IGT6 micros on them.
I seem to be having a problem reading from RAM.
This was first noticed when one of the boards was hard faulting and reporting an imprecise bus fault.
Tracing the fault back through the stack pointer and program counter showed that it occurred while data was being sent over USART2.
Instead of reading the status register at 0x40004400, the micro appeared to be trying to read 0x48004400, which of course is outside the peripheral register address range. This didn't happen every time, only occasionally.
We put a conditional breakpoint just before the register read, checking for 0x48004400. When it triggered we looked at the disassembly. It had a line:
ldr r0, [r4, #0]
When I looked at the memory location pointed to by r4, the memory contained 0x40004400; however, the value that ended up in r0 was 0x48004400. An extra bit had been set for no apparent reason.
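To make the single-bit difference explicit, the two values can be XORed. This is only a minimal sketch in plain C; the helper names `flipped_bits` and `lowest_set_bit` are hypothetical and not part of the original code.

```c
#include <stdint.h>

/* Hypothetical helper: mask of the bits that differ between the value
 * held in memory and the value that arrived in the register. */
static uint32_t flipped_bits(uint32_t expected, uint32_t observed)
{
    return expected ^ observed;
}

/* Hypothetical helper: index of the lowest set bit in the mask,
 * or -1 if the mask is zero (no bits differ). */
static int lowest_set_bit(uint32_t mask)
{
    int bit = 0;
    if (mask == 0u) {
        return -1;
    }
    while ((mask & 1u) == 0u) {
        mask >>= 1;
        bit++;
    }
    return bit;
}
```

For the values above, `flipped_bits(0x40004400, 0x48004400)` gives 0x08000000, which is bit 27.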

The following code was then added as the first thing the micros did after initialisation:



#include <stdint.h>

// in INTs.
#define MEM_SIZE 2048

volatile uint32_t memtest[MEM_SIZE];

volatile uint32_t mem_errors;

void setMem(uint32_t val)
{
    uint32_t i;
    for (i = 0; i < MEM_SIZE; i++)
    {
        memtest[i] = val;
    }
}

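// Invert one cell at a time and check that every other cell still reads
// back as zero while the inverted cell reads back as all ones.
void galTest(void)
{
    uint32_t testCell;
    setMem(0);
    for (testCell = 0; testCell < MEM_SIZE; testCell++)
    {
        uint32_t compare;
        // Invert the cell under test (normally 0 -> 0xFFFFFFFF).
        memtest[testCell] = ~memtest[testCell];
        for (compare = 0; compare < MEM_SIZE; compare++)
        {
            if (compare == testCell)
            {
                continue;
            }
            uint32_t cmpVal = memtest[compare];
            uint32_t cellVal = memtest[testCell];
            if (cmpVal != 0)
            {
                // A background cell no longer reads as zero.
                mem_errors++;
            }
            else if (cellVal != ~cmpVal)
            {
                // The cell under test no longer reads as all ones.
                mem_errors++;
            }
        }
        // Invert the cell under test back (normally 0xFFFFFFFF -> 0).
        memtest[testCell] = ~memtest[testCell];
    }
}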

void doMemTest()
{
    mem_errors = 0;
    gpio_off(&runup1_pin);
    while (1)
    {
        galTest();
        if (mem_errors > 0)
            gpio_on(&runup1_pin);
    }
}


On 2 boards, mem_errors stays 0 for at least half an hour of run time.
On the other 4 boards mem_errors keeps ramping up, slowly on some boards and faster on others.
When debugging on the worst of these boards, if I pause at any point and look at memtest, approximately half the values in the array are zero and the other half are 0x08000000. This is the same bit (bit 27) that was set in the bad address behind the hard fault. The corruption appears to happen when the value is loaded from RAM into a register. When the value is then stored from the register back into RAM, it stays as 0x08000000.
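One way to check whether the bit is picked up on the load or is really stored in RAM would be to read each cell twice without writing it back. The following is only a sketch under that assumption; `read_check_t` and `check_cells` are hypothetical names, not part of the test above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result record for the double-read check. */
typedef struct {
    uint32_t transient;  /* the two reads disagree: the error came and went */
    uint32_t persistent; /* both reads return the same wrong value */
} read_check_t;

/* Read every cell twice and classify any cell that does not match
 * the expected value. Nothing is written back to the buffer. */
static void check_cells(const volatile uint32_t *buf, size_t n,
                        uint32_t expected, read_check_t *out)
{
    size_t i;
    out->transient = 0;
    out->persistent = 0;
    for (i = 0; i < n; i++) {
        uint32_t first = buf[i];   /* volatile forces two separate loads */
        uint32_t second = buf[i];
        if (first == expected && second == expected) {
            continue;
        }
        if (first == second) {
            out->persistent++;
        } else {
            out->transient++;
        }
    }
}
```

Called as `check_cells(memtest, MEM_SIZE, 0, &result)` right after `setMem(0)`, a growing `transient` count would point at the read path, while a growing `persistent` count would mean the wrong value is actually sitting in RAM.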

Has anyone seen anything like this? It looks like a problem with the ICs themselves. I even swapped a good IC and a bad IC to make sure it was the micro and not the PCB that was causing the problem.
We are probably just going to buy some more micros, change them over and hope the problem goes away. But if anyone has any good ideas...
