
Compiler Optimization Problems

raptorhal2
Lead
Posted on December 23, 2011 at 17:07

After following CJacob's 18 Dec post on Yagarto vs Raisonance problems, it is probably useful to identify known pitfalls in assuming optimized code runs the same as non-optimized code. Here is my contribution. Y'all can chime in with your experience.

Optimized code makes heavy use of registers for temporary storage of variables. This is fine as long as the compiler can see every change to a variable. But if a variable is being updated by an interrupt such as the SysTick timer, the optimized code keeps using the stale register value, and you wonder why your processing hangs. Debugging is further complicated if you trust the variables watch window, which reads memory. Put up a register watch and the discrepancy becomes visible.

Cheers, Hal

#optimisation
8 REPLIES
picguy2
Associate II
Posted on December 23, 2011 at 18:50

Optimization of incorrect code simply makes debugging harder.  baird.hal.001's ISR code example is a case of incorrect code that is hard to track down.  His thread assumes that the incorrectness only appears when the C code is optimized.  But the program is incorrect whether the C code is optimized or not; the optimized binary merely exposes it.

OTOH, compilers have been known to have bugs, optimizers may be incorrect, and faulty debug pods can introduce random errors.  Lately my debugging turns up about 50% personal coding and design errors and 50% confusing or incorrect product documentation.

I’m too lazy to track down what might be an IAR C++ overloading error.  It’s something about signed vs. unsigned when combined with int, short int and char.  Auto allocated stack variables vs. variables in RAM may also play a part.
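Something like the following is typical of the signed/unsigned surprises in that area (a minimal sketch of the usual arithmetic conversions in plain C, not the actual IAR C++ case):

#include <stdio.h>

int main(void)
{
    int i = -1;
    unsigned int u = 1;

    /* The usual arithmetic conversions turn i into a huge
       unsigned value, so the comparison goes the ''wrong'' way. */
    if (i < u)
        printf("expected: -1 < 1\n");
    else
        printf("surprise: -1 is not < 1u after conversion\n");

    return 0;
}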
raptorhal2
Lead
Posted on December 23, 2011 at 22:19

#*%&$@ forum - ate my reply.

The problem is only in the optimized binary program, not the nonoptimized version. Consider the following:

void Delay(__IO uint32_t nTime)
{
  StartCounter = *PtrTickCounter;
  do
  {
    /* Counter overflow logic here irrelevant to discussion */
  }
  while ((*PtrTickCounter - StartCounter) <= nTime);
}

void SysTick_Handler(void)
{
  TickCounter++;                       /* Increment counter at SysTick rate */
}

The above works for both optimized and nonoptimized because the compiler is forced to reference memory for every usage of TickCounter. Now consider this:

void Delay(__IO uint32_t nTime)
{
  StartCounter = TickCounter;
  do
  {
    /* Counter overflow logic here irrelevant to discussion */
  }
  while ((TickCounter - StartCounter) < nTime);
}

void SysTick_Handler(void)
{
  TickCounter++;                       /* Increment counter at SysTick rate */
}

In this case, the first referenced value of TickCounter is placed in a register for the optimized code, and the register is then used everywhere in the rest of the Delay function.
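In other words, with a non-volatile TickCounter the optimized build behaves roughly as if Delay had been rewritten like this (an illustrative sketch of the effect, not actual compiler output; the declarations are assumed, since the original post doesn't show them):

#include <stdint.h>

uint32_t TickCounter;    /* assumed non-volatile, per the scenario */
uint32_t StartCounter;

void Delay(uint32_t nTime)
{
    uint32_t cached = TickCounter;   /* read from memory exactly once */
    StartCounter = cached;
    do
    {
        /* counter overflow logic omitted, as in the original */
    }
    while ((cached - StartCounter) < nTime);   /* compares the stale register
                                                  copy: never exits when
                                                  nTime > 0 */
}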

My purpose in this discussion was to develop guidance for avoiding problems in code optimization, so:

Guidance # 1

  Use pointers or some other suitable method for synchronizing the use of variables updated by higher priority functions.

Guidance # 2, anyone?

Cheers, Hal

Posted on December 23, 2011 at 23:33

''The problem is only in the optimized binary program, not the nonoptimized version''

Wrong.

The problem is in the source code - which makes a false assumption that the compiler has to re-read the value from memory for each & every reference.

It just so happens that, in this specific case, the false assumption does not cause any visible effect without optimisation.

If you want to force the compiler to re-read the value from memory for each & every reference, then you must use the volatile keyword - which is a standard part of the 'C' language, and exists precisely for this kind of situation.
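Applied to the earlier Delay example, that means declaring the shared counter volatile (a minimal sketch; the declarations are assumed, since the original post doesn't show them, and plain volatile is used in place of the CMSIS __IO macro, which expands to volatile):

#include <stdint.h>

volatile uint32_t TickCounter;   /* shared with the ISR: every access
                                    now goes to memory */
uint32_t StartCounter;

void Delay(uint32_t nTime)
{
    StartCounter = TickCounter;
    do
    {
        /* counter overflow logic omitted, as in the original */
    }
    while ((TickCounter - StartCounter) < nTime);
}

void SysTick_Handler(void)
{
    TickCounter++;               /* increment at SysTick rate */
}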

See any good 'C' textbook for further details...

If code ''breaks'' at higher optimisation, it is (almost) invariably due to flaws in the source...

Posted on December 24, 2011 at 00:44

Indeed, any variable that might change outside normal program flow should be marked as volatile.

Marking local/automatic variables as volatile is generally pretty pointless, but I'll mention it because I've seen enough people do so. About the only reason would be a local that an interrupt accesses through a pointer you handed it, which is rather dangerous practice in itself. Another is an event/semaphore flag, but such flags, or the structures containing them, should be made static as well if they have the potential to outlive the subroutine.
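A minimal sketch of that last pattern (the identifiers are illustrative, not from any particular library):

#include <stdint.h>

static volatile uint8_t rx_done;   /* static: outlives any one call;
                                      volatile: re-read on every test */

void USART1_IRQHandler(void)       /* illustrative handler name */
{
    rx_done = 1;                   /* signal the foreground code */
}

void wait_for_rx(void)
{
    while (!rx_done)
        ;                          /* each test reads memory */
    rx_done = 0;                   /* consume the event */
}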

Using static analysis/verification tools like Lint, MISRA, or Coverity might pick up a lot of sloppy coding. GCC's -Wall is also surprisingly good, particularly with printf/scanf validations.
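For instance, -Wall's format checking flags mismatches like this one (a trivial sketch):

#include <stdio.h>

int main(void)
{
    long big = 123456789L;
    printf("%d\n", big);   /* -Wall warns: '%d' expects an int,
                              but the argument has type long */
    return 0;
}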

Optimizing compilers are real pigs to validate, but most failures I've seen in recent years are with the source being compiled, and not the compiler.

carl2399
Associate II
Posted on December 24, 2011 at 03:03

Three common gotchas with optimising compilers:

1) The volatile / non-volatile variable item just discussed.

2) Related, but slightly different:

A code segment like:

...

for(i=0;i<10;i++);  // Add a short delay

...

If i is volatile, you'll get a short delay. If not, the code will be optimised out of existence (a fixed version appears in the sketch after gotcha 3).

I know these sorts of short delays are not ideal, but sometimes when doing embedded code they are the most elegant solution. It's generally an epic fail on PCs because you don't know how fast the system is, but if there's no operating system on your embedded processor, then you generally know how fast it's going.

3) Uninitialised variables:

int foo(int bar)
{
    int ret;

    if (bar == 0)
        ret = 0;

    if (bar > 0)
        ret = 1;

    return ret;
}

So what happens when bar < 0? In non-optimised code this may lurk for years without causing a problem, but it will tend to break almost immediately when using an optimiser. The example I've given may be simplistic, but it's amazing how many times you see uninitialised variables in code. Even harder to detect is when the return result is passed back by reference.
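One way to close off both holes (a sketch; the -1 fallback is just an illustrative choice):

#include <stdint.h>

/* Gotcha 2): a volatile loop counter stops the optimiser
   from deleting the delay loop outright. */
void short_delay(void)
{
    volatile uint32_t i;
    for (i = 0; i < 10; i++)
        ;                          /* each iteration really executes */
}

/* Gotcha 3): give ret a defined fallback so every path,
   including bar < 0, returns something deterministic. */
int foo(int bar)
{
    int ret = -1;                  /* illustrative default */
    if (bar == 0)
        ret = 0;
    if (bar > 0)
        ret = 1;
    return ret;
}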

Several years ago I worked on a large embedded system. It was a hierarchical intercom system spread throughout a prison. I was called in to help them figure out why the system was unreliable during bad weather (electrical storms). Fixing items like the ones above was all that was required to make the system stable again. I make it sound a little easier than it actually was, as the code was spread out around the installation and I first had to figure out what the source code was doing in the first place.
Posted on December 24, 2011 at 11:56

''Three common gotchas with optimising compilers''

 

ISO/IEC 9899:1990 - the 'C' language standard used by most embedded compilers - states:

An implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object).

This means that compilers are perfectly entitled to generate no output code for source code that has no effect - irrespective of any ''optimisation'' setting.

Therefore these should be regarded as basic 'C' programming gotchas! - not just ''optimisation'' gotchas!

''Uninitialised variables''

These are certainly not ''optimisation'' related.

 
Posted on December 24, 2011 at 11:59

''Marking local/automatic variables as volatile is generally pretty pointless''

Indeed.

The exception is, possibly, the loop-counter for a software delay - as mentioned earlier.

But that's a whole different can of worms:

http://www.8052.com/forum/read/162556

''static analysis/verification tools like ... MISRA''

 

Note that MISRA is just a set of rules; not a tool - there are tools to check MISRA compliance, but MISRA is not a tool in itself.

''GCC's -Wall is also surprisingly good''

 

Indeed. But even -Wall omits some warnings; for a totally complete set, you also need to add -Wunused-parameter and ‑Wextra.

''most failures I've seen in recent years are with the source being compiled, and not the compiler''

Absolutely!

http://www.catb.org/~esr/faqs/smart-questions.html#id478549

mckenney
Senior
Posted on January 04, 2012 at 04:59

>The above works for both optimized and nonoptimized because the compiler is forced to reference memory for every usage of TickCounter. Now consider this:

Sorry, I have to disagree with this one. Depending on what the omitted code does, this might not prevent caching either.

The compiler is in two cases (actually more, but these are the main ones) required to assume that any/all global memory has been modified (requiring a re-load):

1) An assignment through (almost) any pointer.

2) A call to a function outside the compiler's purview -- usually this means outside the current source file, but this gets murky with a global optimizer.

If the omitted code did either of these, then that's why the code operated as expected; otherwise the compiler could have cached even the reference through a pointer. It's also possible the people who wrote your compiler decided to take a more conservative approach, but the next compiler you use might not.

''volatile'' is really what you want here.

Knowing the rules above can also be used in a positive sense, helping you write code that the optimizer can do a better job with.
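A minimal sketch of both cases (illustrative identifiers; nothing here is from Hal's original code):

#include <stdint.h>

uint32_t TickCounter;              /* deliberately NOT volatile */

extern void external_call(void);   /* defined in another source file */

void demo(uint32_t *p)
{
    uint32_t a = TickCounter;      /* may stay cached in a register */

    *p = 0;                        /* case 1: a store through a pointer --
                                      the compiler must assume TickCounter
                                      may have changed ...              */
    uint32_t b = TickCounter;      /* ... so this is re-loaded          */

    external_call();               /* case 2: a call outside this file
                                      forces the same assumption        */
    uint32_t c = TickCounter;      /* re-loaded again                   */

    (void)a; (void)b; (void)c;     /* silence unused-variable warnings  */
}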