[STM32F2xx] 120MHz + 3 wait state = bugs?

root
Associate II
Posted on August 09, 2011 at 08:41

Hello,

Since the beginning of the project on the STM32F205RG, I had hard faults from time to time.

These last days, I wrote a hard fault handler as described in Joseph Yiu's book to see why ... and the result was very unexpected.

I had hard faults from basically everywhere, on any line of code (even in the STM periph library) ... and the type of fault was even more weird ... illegal instruction, coprocessor request, etc ...
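For readers following along, here is a minimal sketch (not from the original post) of how the fault types mentioned above can be read out of the Cortex-M3 Configurable Fault Status Register, which sits at 0xE000ED28 (SCB->CFSR in CMSIS). "Illegal instruction" corresponds to the UNDEFINSTR bit and "coprocessor request" to NOCP. The helper name cfsr_decode and the table layout are assumptions made for illustration; on target you would pass it the CFSR value captured in the hard fault handler.

```c
#include <stdint.h>
#include <stdio.h>

/* One CFSR status bit and its architectural name. */
typedef struct {
    uint32_t    mask;
    const char *name;
} cfsr_flag_t;

/* Bit positions follow the ARMv7-M CFSR layout (MMFSR, BFSR, UFSR). */
static const cfsr_flag_t cfsr_flags[] = {
    { 1u << 0,  "IACCVIOL"    },
    { 1u << 1,  "DACCVIOL"    },
    { 1u << 8,  "IBUSERR"     },
    { 1u << 9,  "PRECISERR"   },
    { 1u << 10, "IMPRECISERR" },
    { 1u << 16, "UNDEFINSTR"  },  /* "illegal instruction"  */
    { 1u << 17, "INVSTATE"    },
    { 1u << 18, "INVPC"       },
    { 1u << 19, "NOCP"        },  /* "coprocessor request"  */
    { 1u << 24, "UNALIGNED"   },
    { 1u << 25, "DIVBYZERO"   },
};

/* Hypothetical helper: writes the names of all set flags into buf,
 * separated by spaces, and returns how many flags were written. */
int cfsr_decode(uint32_t cfsr, char *buf, size_t len)
{
    int    count = 0;
    size_t used  = 0;

    if (len == 0) {
        return 0;
    }
    buf[0] = '\0';

    for (size_t i = 0; i < sizeof cfsr_flags / sizeof cfsr_flags[0]; i++) {
        if ((cfsr & cfsr_flags[i].mask) == 0) {
            continue;
        }
        int n = snprintf(buf + used, len - used, "%s%s",
                         count ? " " : "", cfsr_flags[i].name);
        if (n < 0 || (size_t)n >= len - used) {
            buf[used] = '\0';  /* out of room: drop the partial name */
            break;
        }
        used += (size_t)n;
        count++;
    }
    return count;
}
```

If the decoded flags change from one fault to the next while the code itself is unchanged, that points towards corrupted instruction fetches rather than a single software bug.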

Somewhere in my code I write to FLASH ... and sometimes I got a programming error for no obvious reason. I put a breakpoint just where (in the periph library) the test for a programming error is done (a logical AND between FLASH->SR and 0xEF), and surprise: when the breakpoint triggered, the debugger showed me a FLASH->SR value of ... 0.

This could only mean a problem within the MCU ...

I was running at 120MHz with all prefetches etc and 3 flash wait states.

I switched to no optimization and 4 wait states ...

No more hard faults, no more flash programming errors; the program was running (slower, but) smoothly.

I need to figure out which optimization made the MCU act like this, but I'm a bit disappointed that ST claims some figures about flash and ART performance if they're not true ...

Regards,

Thomas.
14 REPLIES 14
Nickname12657_O
Associate III
Posted on August 09, 2011 at 15:17

Hi,

This is the first time I have heard of such a problem on the F2.

Could you please tell us which STM32 revision you are using?

Could you try running some of the examples provided with the standard peripherals library and see if this behavior is still there?

Could you tell us the data size used for flash programming (64-bit, 32-bit, 16-bit or 8-bit)?

Cheers,

STOne-32.
Posted on August 09, 2011 at 17:28

Whatever contraptions are placed in front of the flash array to hide its real latency, they are still dependent on its actual speed, as manufactured.

If changing to 4 waits, it suggests the speed is closer to 30 MHz (33 ns) than it is 40 MHz (25 ns).

Now changing the optimization, that's another matter. It makes one wonder about the compiler, and whether it is treating the hardware registers as volatile or not, or flattening loops it shouldn't.

root
Associate II
Posted on August 10, 2011 at 08:45

Hello,

Thanks for the reply.

It is difficult to run examples from the standard periph lib as is, because it's already the target PCB, not an evaluation board. I don't have the board here, I'll look at the revision later today.

The flash programming size is word (32 bits, I didn't even know I could program 64 bits at a time ...). Voltage is 3.3V (TI REG113). I'm still running the bootloader at the same speed with all optimizers on without problems ... it seems more related to interrupts (I have a lot of fast interrupts in my application). I even ran some fixed-point library benchmarks (very calculation intensive) without problems (no interrupts).

@clive : I turned off all optimizations at once and it worked, so I developed urgent features before I started looking at what exactly was the cause of the mess (flash latency or one optimizer).

As soon as possible I'll try to re-enable each optimization and flash wait states to see if the problem is related to only one parameter.

Regards,

Thomas.

root
Associate II
Posted on August 10, 2011 at 08:48

The hard faults I had were illegal instructions and coprocessor calls; these are not register related, I think.

The compiler is from Altium, TASKING for ARM.

Thomas.

root
Associate II
Posted on August 10, 2011 at 09:14

Clive: "If changing to 4 waits, it suggests the speed is closer to 30 MHz (33 ns) than it is 40 MHz (25 ns)."

In fact it already is ... 3 wait states means 4 CPU cycles per access (stated in the datasheet), so I assume the flash is running at 30 MHz max.

But as the bus width is 128 bits, each read is up to 4 instructions (8 if using 16-bit Thumb instructions, which is, I think, the case: 32-bit ARM instructions are not supported on the M3, only Thumb2 defines 32-bit instructions, and I doubt the TASKING compiler outputs 32-bit Thumb2 instructions).

So if the read is "right" each time, you need 4 CPU cycles to read 4 instructions ahead, and you are running at "virtual 0 wait state".
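The arithmetic above can be sketched as follows. This is an illustration only, assuming (as stated in the post) that an access with N wait states takes N + 1 CPU cycles and that a flash line is 128 bits wide; the function names are made up for this example.

```c
#include <stdint.h>

#define FLASH_LINE_BITS 128u  /* flash line width, as stated in the post */

/* Effective flash access rate: one access every (wait_states + 1) CPU cycles. */
uint32_t flash_access_hz(uint32_t hclk_hz, uint32_t wait_states)
{
    return hclk_hz / (wait_states + 1u);
}

/* Flash access time in picoseconds, rounded down.
 * Assumes hclk_hz is large enough that the access rate is non-zero. */
uint32_t flash_access_ps(uint32_t hclk_hz, uint32_t wait_states)
{
    return (uint32_t)(1000000000000ull / flash_access_hz(hclk_hz, wait_states));
}

/* Number of instructions of the given width that fit in one flash line. */
uint32_t instructions_per_line(uint32_t instr_bits)
{
    return FLASH_LINE_BITS / instr_bits;
}
```

At 120 MHz this gives 40 MHz (25 ns) for 2 wait states, 30 MHz (about 33.3 ns) for 3 and 24 MHz (about 41.7 ns) for 4, with 4 or 8 instructions per line depending on instruction width.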

The performance loss by disabling all optimizers seems ... huge :(

My hope is that the optimizers are not the cause, only the flash wait states; going from 3 to 4 wait states costs less performance than disabling all the other optimizers.

Thomas.

js23
Associate III
Posted on August 10, 2011 at 13:22

While using the F205 for the past 2 months now, the only strange thing I discovered was that everything works as expected. No problems with our prototype boards.

My guess for your problem: unstable power supply. Maybe oscillations of the internal regulator caused by board layout or the ESR of the capacitors connected to VCAP?

Posted on August 10, 2011 at 18:49

There are some Thumb#1 instructions that are 32-bit wide, BL springs to mind. So these or some Thumb#2 could cause them to span a flash line.
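A small sketch of the boundary condition described above, assuming the 128-bit (16-byte) flash line mentioned earlier in the thread. The helper spans_flash_line is hypothetical; it only illustrates which fetch addresses would straddle two lines and so could be worth looking for among the faulting PCs.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLASH_LINE_BYTES 16u  /* 128-bit flash line, as assumed in the thread */

/* Returns true when an instruction of size_bytes (2 or 4 for Thumb)
 * starting at addr crosses a flash line boundary. */
bool spans_flash_line(uint32_t addr, uint32_t size_bytes)
{
    uint32_t first_line = addr / FLASH_LINE_BYTES;
    uint32_t last_line  = (addr + size_bytes - 1u) / FLASH_LINE_BYTES;

    return first_line != last_line;
}
```

Since Thumb instructions are halfword aligned, only a 32-bit instruction starting at offset 0xE within a line can straddle two lines; 16-bit instructions never do.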

I'm not sure how ART passes through, or handles misses; one might realistically lose a cycle in there. All these schemes assume you can outwit the word requests. I'd suppose a sufficiently inopportune string of requests would bring it to its knees and expose the limits of the array. The speed is tied to voltage.

Reading garbled data will cause plenty of faults, registers loaded from literal pools might also be garbled, check for bit sensitivities. I guess you could observe if specific address masks show up more regularly with these faults, and if this highlights a boundary issue. I'd hope literals would be aligned, but who knows.

You can typically write 16 or 32-bit words with the library. If the F2 is anything like the F1, you certainly don't need to be banging PG up and down repeatedly for each word. There might be some tricks to writing a whole flash line before spinning on ready.

You might want to check if TASKING is handling the __IO (volatile) specified on the peripheral registers within the structures, or if it needs to be applied to the structures themselves.

#define     __IO    volatile                  /*!< defines 'read / write' permissions   */

/**
  * @brief FLASH Registers
  */
typedef struct
{
  __IO uint32_t ACR;      /*!< FLASH access control register, Address offset: 0x00 */
  __IO uint32_t KEYR;     /*!< FLASH key register,            Address offset: 0x04 */
  __IO uint32_t OPTKEYR;  /*!< FLASH option key register,     Address offset: 0x08 */
  __IO uint32_t SR;       /*!< FLASH status register,         Address offset: 0x0C */
  __IO uint32_t CR;       /*!< FLASH control register,        Address offset: 0x10 */
  __IO uint32_t OPTCR;    /*!< FLASH option control register, Address offset: 0x14 */
} FLASH_TypeDef;

#define FLASH               ((FLASH_TypeDef *) FLASH_R_BASE)

root
Associate II
Posted on August 11, 2011 at 12:38

Hello,

(damn forum ate my last message ......... as buggy as the MCUs ....)

Here are my latest investigations ... I set up my project for Atollic TrueSTUDIO (free version, based on GCC) and it enters a hard fault (illegal instruction) after only a few thousand instructions, no matter how many flash wait states I use or which optimizers are enabled.

In TASKING, my code runs but enters a hard fault (or just goes away, like nothing works: I halt the MCU and the PC is 0x0 or 0x20, etc.) after a few seconds (between 1 and 20 s max).

It's getting worse every day !!!!

My application needs to be very stable. We switched to these MCUs to get better peripherals and more processing power, and we end up with unusable I2C and a few seconds of "processing power".

I really need help on this one, even paid help.

Regards,

Thomas.
root
Associate II
Posted on August 11, 2011 at 12:44

About revision ...

Reading address 0xE0042000 with the ST-Link utility gives me 0x20016411, so I assume REV_ID is 0x2001 and the revision is revision Y.
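For reference, a minimal sketch of how that value splits up, assuming the usual DBGMCU_IDCODE layout (DEV_ID in bits 11:0, REV_ID in bits 31:16). The type and function names are made up for this example; on target the input would be the 32-bit word read from 0xE0042000.

```c
#include <stdint.h>

/* Fields of the DBGMCU_IDCODE register (layout assumed as described above). */
typedef struct {
    uint16_t dev_id;  /* bits 11:0  */
    uint16_t rev_id;  /* bits 31:16 */
} idcode_t;

/* Hypothetical helper: split a raw IDCODE word into its two fields. */
idcode_t idcode_decode(uint32_t idcode)
{
    idcode_t out;

    out.dev_id = (uint16_t)(idcode & 0x0FFFu);
    out.rev_id = (uint16_t)(idcode >> 16);
    return out;
}
```

For 0x20016411 this yields DEV_ID 0x411 and REV_ID 0x2001, matching the reading above.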

Thomas.