Wrong CRC calculation with optimized code

matic · ‎2016-02-16

Posted on February 16, 2016 at 20:00

Hi.

I ran into a problem using the CRC hardware with optimized code in Keil. I use the CRC hardware inside an interrupt routine, and without optimization I get the correct result. But if I turn optimization on (level 1 is enough), I no longer get the correct result. Do you have any idea what could be wrong here?

Before the calculation (already inside the ISR) I reset the CRC and then do 6 consecutive writes to the DR register: first a half-word, then five writes of byte variables from an array. Then I store the result in a new 8-bit variable:

*(uint16_t *)&CRC->DR = half_word_var;
*(uint8_t *)&CRC->DR = byte_array[0];
*(uint8_t *)&CRC->DR = byte_array[1];
*(uint8_t *)&CRC->DR = byte_array[2];
*(uint8_t *)&CRC->DR = byte_array[3];
*(uint8_t *)&CRC->DR = byte_array[4];
*(uint8_t *)&CRC->DR = byte_array[5];
byte_result = (uint8_t)CRC->DR;

Radosław · ‎2016-02-16

Posted on February 16, 2016 at 20:23

From the RM:

The duration of the computation depends on data width:

• 4 AHB clock cycles for 32-bit
• 2 AHB clock cycles for 16-bit
• 1 AHB clock cycles for 8-bit

matic · ‎2016-02-16

Posted on February 16, 2016 at 20:45

OK, but I don't understand what this has to do with whether the calculation is correct when the code is optimized or not. As I said, if the code is not optimized (optimization level 0), I get correct values.

Tesla DeLorean · ‎2016-02-16

Posted on February 16, 2016 at 21:14

Try

*(volatile uint16_t *)&CRC->DR = half_word_var;
*(volatile uint8_t *)&CRC->DR = byte_array[0];
*(volatile uint8_t *)&CRC->DR = byte_array[1];
*(volatile uint8_t *)&CRC->DR = byte_array[2];
*(volatile uint8_t *)&CRC->DR = byte_array[3];
*(volatile uint8_t *)&CRC->DR = byte_array[4];
*(volatile uint8_t *)&CRC->DR = byte_array[5];
byte_result = (uint8_t)CRC->DR;

Tips, Buy me a coffee, or three.. PayPal Venmo
Up vote any posts that you find helpful, it shows what's working..

TDK · ‎2016-02-16

Posted on February 16, 2016 at 21:27

Optimized code runs faster (i.e. has fewer instructions) than unoptimized code. It is possible that in your optimized program the DR register gets loaded faster than the data can be processed, since SYSCLK may be faster than the AHB clock. Or you're reading it too quickly.


Tesla DeLorean · ‎2016-02-16

Posted on February 16, 2016 at 21:52

It is all synchronous, you should be able to jam data as fast as the CPU can deliver it.

The most probable issue is that the compiler is folding multiple writes to the same location into one. The volatile keyword should remedy that.
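One way to keep the qualifier from being forgotten is to wrap the narrow accesses in small helpers. The following is a minimal sketch, not ST library code: `crc_write8`, `crc_write16`, `crc_feed` and their parameter names are hypothetical, and the register is passed in as a pointer so the same code also compiles against a plain variable on a PC.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers (not part of the ST headers). The volatile-qualified
 * casts tell the compiler that every store is an observable side effect,
 * so none of them may be merged or dropped. */
static inline void crc_write8(volatile uint32_t *dr, uint8_t value)
{
    *(volatile uint8_t *)dr = value;    /* byte-wide access to the register */
}

static inline void crc_write16(volatile uint32_t *dr, uint16_t value)
{
    *(volatile uint16_t *)dr = value;   /* half-word access to the register */
}

/* Feed one half-word followed by len bytes, in that order. */
static inline void crc_feed(volatile uint32_t *dr, uint16_t half_word,
                            const uint8_t *bytes, size_t len)
{
    crc_write16(dr, half_word);
    for (size_t i = 0; i < len; i++) {
        crc_write8(dr, bytes[i]);
    }
}
```

On the target, the sequence from the first post would then presumably become crc_feed(&CRC->DR, half_word_var, byte_array, 6); followed by the read of CRC->DR.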


matic · ‎2016-02-16

Posted on February 17, 2016 at 08:09

Thanks to all!

Clive, your suggestion with volatile keywords works.

re.wolff9 · ‎2016-02-20

Posted on February 20, 2016 at 11:34

I simply created enough context to compile a function with your code. When I compile with -O0 I get:

myisr:
 @ Function supports interworking.
 @ args = 0, pretend = 0, frame = 0
 @ frame_needed = 1, uses_anonymous_args = 0
 @ link register save eliminated.
 str fp, [sp, #-4]!
 add fp, sp, #0
 ldr r2, .L2
 ldr r3, .L2+4
 ldrh r3, [r3]
 mov r3, r3, asl #16
 mov r3, r3, lsr #16
 strh r3, [r2] @ movhi
 ldr r3, .L2
 ldr r2, .L2+8
 ldrb r2, [r2] @ zero_extendqisi2
 strb r2, [r3]
 ldr r3, .L2
 ldr r2, .L2+8
 ldrb r2, [r2, #1] @ zero_extendqisi2
 strb r2, [r3]
 ldr r3, .L2
 ldr r2, .L2+8
 ldrb r2, [r2, #2] @ zero_extendqisi2
 strb r2, [r3]
 ldr r3, .L2
 ldr r2, .L2+8
 ldrb r2, [r2, #3] @ zero_extendqisi2
 strb r2, [r3]
 ldr r3, .L2
 ldr r2, .L2+8
 ldrb r2, [r2, #4] @ zero_extendqisi2
 strb r2, [r3]
 ldr r3, .L2
 ldr r2, .L2+8
 ldrb r2, [r2, #5] @ zero_extendqisi2
 strb r2, [r3]
 sub sp, fp, #0
 @ sp needed
 ldr fp, [sp], #4
 bx lr

Whereas when I compile with -O2, I get much better optimized code:

myisr:
@ Function supports interworking.
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
ldr r3, .L2
ldr r2, .L2+4
ldrh r1, [r3]
ldrb r2, [r2, #5] @ zero_extendqisi2
ldr r3, .L2+8
strh r1, [r3, #8] @ movhi
strb r2, [r3, #8]
bx lr

However, it is no longer correct: only the half-word store and the last byte store remain. Because you omitted the volatile keyword, the compiler is allowed to optimize away the writes it considers unnecessary.
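To see why the dropped stores matter, the effect can be modelled on a PC with a software CRC. This is only an illustration under assumptions: the thread does not say which polynomial or initial value the peripheral is configured with, so the sketch picks CRC-8 with polynomial 0x07 and an initial value of 0, and `crc8_update` and `crc8_bytes` are hypothetical names. It models byte writes only.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-8, polynomial x^8 + x^2 + x + 1 (0x07), MSB first.
 * Assumed parameters for illustration; the real peripheral setup may differ. */
static uint8_t crc8_update(uint8_t crc, uint8_t data)
{
    crc ^= data;
    for (int bit = 0; bit < 8; bit++) {
        crc = (crc & 0x80u) ? (uint8_t)((crc << 1) ^ 0x07u)
                            : (uint8_t)(crc << 1);
    }
    return crc;
}

/* Run the model over a sequence of byte "writes", starting from 0. */
static uint8_t crc8_bytes(const uint8_t *bytes, size_t len)
{
    uint8_t crc = 0x00u;
    for (size_t i = 0; i < len; i++) {
        crc = crc8_update(crc, bytes[i]);
    }
    return crc;
}
```

In this model, feeding all six bytes {1, 2, 3, 4, 5, 6} and feeding only the last one, which is what survives of the byte stores in the -O2 listing, give different results.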

qwer.asdf · ‎2016-02-22

Posted on February 22, 2016 at 10:25

I see this as an ST library bug: it's the library that should have defined CRC->DR as a volatile type.

Tesla DeLorean · ‎2016-02-22

Posted on February 22, 2016 at 14:32

"I see this as an ST library bug: it's the library that should have defined CRC->DR as a volatile type."

Why? The library does define it as volatile. It is the user's casts to 8-bit and 16-bit widths, which drop the qualifier, that are falling over here.
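The point about the cast can be checked on a PC with a C11 compiler. A minimal sketch, with the assumptions labeled: `FAKE_CRC_TypeDef`, `fake_crc` and `IS_VOLATILE_U8_PTR` are hypothetical stand-ins, and only the volatile `DR` member mirrors how the device header declares the register.

```c
#include <stdint.h>

/* Hypothetical stand-in for the device header's register block. */
typedef struct {
    volatile uint32_t DR;   /* data register, declared volatile */
} FAKE_CRC_TypeDef;

static FAKE_CRC_TypeDef fake_crc;

/* 1 if the expression has type "pointer to volatile uint8_t",
 * 0 if it has type "pointer to plain uint8_t" (C11 _Generic). */
#define IS_VOLATILE_U8_PTR(p) \
    _Generic((p), volatile uint8_t *: 1, uint8_t *: 0)

/* The cast from the original post: the qualifier is discarded. */
static int plain_cast_keeps_volatile(void)
{
    return IS_VOLATILE_U8_PTR((uint8_t *)&fake_crc.DR);
}

/* The corrected cast: the qualifier is kept. */
static int volatile_cast_keeps_volatile(void)
{
    return IS_VOLATILE_U8_PTR((volatile uint8_t *)&fake_crc.DR);
}
```

Here plain_cast_keeps_volatile() returns 0 and volatile_cast_keeps_volatile() returns 1, so the register declaration alone does not protect an access made through a cast that leaves volatile out.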
