Wrong CRC calculation with optimized code
‎2016-02-16 11:00 AM
Hi.
I have run into a problem using the CRC hardware with optimized code in Keil. I use the CRC hardware inside an interrupt routine, and without optimization I get the correct result. But if I turn optimization on (level 1 is enough), I no longer get the correct result. Do you have any idea what could be wrong here?
Before the calculation (already inside the ISR) I reset the CRC and then do 6 consecutive writes to the DR register. First I write a half-word, and after that come five writes of a byte variable from an array. Then I store the result in a new 8-bit variable:
*(uint16_t *)&CRC->DR = half_word_var;
*(uint8_t *)&CRC->DR = byte_array[0];
*(uint8_t *)&CRC->DR = byte_array[1];
*(uint8_t *)&CRC->DR = byte_array[2];
*(uint8_t *)&CRC->DR = byte_array[3];
*(uint8_t *)&CRC->DR = byte_array[4];
*(uint8_t *)&CRC->DR = byte_array[5];
byte_result = (uint8_t)CRC->DR;
‎2016-02-16 11:23 AM
From the RM:
The duration of the computation depends on data width:
• 4 AHB clock cycles for 32-bit
• 2 AHB clock cycles for 16-bit
• 1 AHB clock cycle for 8-bit
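To put numbers on the figures quoted from the RM, here is a minimal sketch that totals the computation cycles for a sequence of writes. The function names are hypothetical, and the sketch only restates the per-width figures above; it does not account for bus or CPU overhead.

```c
#include <assert.h>

/* Cycle counts per write width, as quoted from the RM above. */
enum {
    CRC_CYCLES_32BIT = 4,
    CRC_CYCLES_16BIT = 2,
    CRC_CYCLES_8BIT  = 1
};

/* Hypothetical helper: total AHB cycles the CRC unit spends computing
 * for a given mix of word, half-word and byte writes. */
static inline unsigned crc_total_cycles(unsigned words,
                                        unsigned half_words,
                                        unsigned bytes)
{
    return words * CRC_CYCLES_32BIT
         + half_words * CRC_CYCLES_16BIT
         + bytes * CRC_CYCLES_8BIT;
}

/* The code in the first post performs one half-word write followed by
 * six byte writes (byte_array[0] to byte_array[5]). */
static inline void crc_cycles_example(void)
{
    assert(crc_total_cycles(0u, 1u, 6u) == 8u);
}
```

For the posted sequence this comes to 2 + 6 × 1 = 8 AHB cycles of computation in total.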
‎2016-02-16 11:45 AM
OK, but I don't understand what this has to do with whether the calculation is correct when the code is optimized or not. As I said, if the code is not optimized (optimization level 0), I get correct values.
‎2016-02-16 12:14 PM
Try
*(volatile uint16_t *)&CRC->DR = half_word_var;
*(volatile uint8_t *)&CRC->DR = byte_array[0];
*(volatile uint8_t *)&CRC->DR = byte_array[1];
*(volatile uint8_t *)&CRC->DR = byte_array[2];
*(volatile uint8_t *)&CRC->DR = byte_array[3];
*(volatile uint8_t *)&CRC->DR = byte_array[4];
*(volatile uint8_t *)&CRC->DR = byte_array[5];
byte_result = (uint8_t)CRC->DR;
Up vote any posts that you find helpful, it shows what's working..
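The volatile-qualified casts suggested above can also be wrapped in small helpers, so the qualifier cannot be forgotten at an individual call site. This is only a sketch under stated assumptions: MockCRC_TypeDef, crc_write8, crc_write16 and crc_feed_frame are hypothetical names, and the mock struct stands in for the device header's CRC block so that the code builds on a host. On the target you would pass &CRC->DR instead of &mock_crc.DR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: host-side stand-in for the CRC register block. On real
 * hardware the device header provides CRC and its DR register. */
typedef struct {
    volatile uint32_t DR;
} MockCRC_TypeDef;

static MockCRC_TypeDef mock_crc;

/* Byte-wide write. The pointee stays volatile, so the compiler has to
 * emit every store, even when several target the same address. */
static inline void crc_write8(volatile uint32_t *dr, uint8_t value)
{
    *(volatile uint8_t *)dr = value;
}

/* Half-word-wide write, same idea. */
static inline void crc_write16(volatile uint32_t *dr, uint16_t value)
{
    *(volatile uint16_t *)dr = value;
}

/* Feed one half-word followed by `count` bytes, in order. */
static inline void crc_feed_frame(volatile uint32_t *dr, uint16_t half_word,
                                  const uint8_t *bytes, size_t count)
{
    crc_write16(dr, half_word);
    for (size_t i = 0; i < count; ++i) {
        crc_write8(dr, bytes[i]);
    }
}

/* Host-only smoke check against the mock. Plain RAM does not compute a
 * CRC; it simply keeps the last byte stored at the register address. */
static inline void crc_mock_smoke_check(void)
{
    static const uint8_t bytes[6] = {1, 2, 3, 4, 5, 6};
    crc_feed_frame(&mock_crc.DR, 0x1234u, bytes, 6);
    assert(*(volatile uint8_t *)&mock_crc.DR == 6);
}
```

Narrow accesses to a wider register through cast pointers are a common memory-mapped I/O idiom; the mock only confirms that the stores reach the expected address in order.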
‎2016-02-16 12:27 PM
Optimized code runs faster (i.e. has fewer instructions) than unoptimized code. It is possible that in your optimized program the DR register is loaded faster than the data can be processed, since SYSCLK may be faster than the AHB clock. Or you are reading it too quickly.
‎2016-02-16 12:52 PM
It is all synchronous; you should be able to jam data in as fast as the CPU can deliver it.
The most probable issue is that the compiler is folding multiple writes to the same location; the volatile keyword should remedy that.
‎2016-02-16 11:09 PM
Thanks to all!
Clive, your suggestion with the volatile keyword works.
‎2016-02-20 2:34 AM
myisr:
@ Function supports interworking.
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 1, uses_anonymous_args = 0
@ link register save eliminated.
str fp, [sp, #-4]!
add fp, sp, #0
ldr r2, .L2
ldr r3, .L2+4
ldrh r3, [r3]
mov r3, r3, asl #16
mov r3, r3, lsr #16
strh r3, [r2] @ movhi
ldr r3, .L2
ldr r2, .L2+8
ldrb r2, [r2] @ zero_extendqisi2
strb r2, [r3]
ldr r3, .L2
ldr r2, .L2+8
ldrb r2, [r2, #1] @ zero_extendqisi2
strb r2, [r3]
ldr r3, .L2
ldr r2, .L2+8
ldrb r2, [r2, #2] @ zero_extendqisi2
strb r2, [r3]
ldr r3, .L2
ldr r2, .L2+8
ldrb r2, [r2, #3] @ zero_extendqisi2
strb r2, [r3]
ldr r3, .L2
ldr r2, .L2+8
ldrb r2, [r2, #4] @ zero_extendqisi2
strb r2, [r3]
ldr r3, .L2
ldr r2, .L2+8
ldrb r2, [r2, #5] @ zero_extendqisi2
strb r2, [r3]
sub sp, fp, #0
@ sp needed
ldr fp, [sp], #4
bx lr
Whereas when I compile with -O2 I get much better optimized code:
myisr:
@ Function supports interworking.
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
ldr r3, .L2
ldr r2, .L2+4
ldrh r1, [r3]
ldrb r2, [r2, #5] @ zero_extendqisi2
ldr r3, .L2+8
strh r1, [r3, #8] @ movhi
strb r2, [r3, #8]
bx lr
However, it is no longer correct (because you omitted the "volatile" keyword, the compiler is allowed to optimize away writes it considers unnecessary).
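To see why the folded stores give a different CRC, the peripheral can be modelled in software on a host. The following is a sketch only. It ASSUMES an 8-bit CRC with polynomial 0x07, initial value 0x00 and no bit reversal, and that a half-word write is consumed most significant byte first; the thread does not state the actual CRC configuration, and all function names are hypothetical. It compares the sequence the source code asks for with the one in the -O2 listing, which stores only the half-word and byte_array[5].

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: CRC-8, polynomial 0x07, init 0x00, no reflection. */
#define CRC8_POLY 0x07u

/* Bitwise model of feeding one byte into the CRC. */
static inline uint8_t crc8_feed_byte(uint8_t crc, uint8_t data)
{
    crc ^= data;
    for (int bit = 0; bit < 8; ++bit) {
        if (crc & 0x80u) {
            crc = (uint8_t)((crc << 1) ^ CRC8_POLY);
        } else {
            crc = (uint8_t)(crc << 1);
        }
    }
    return crc;
}

/* ASSUMPTION: a half-word is processed most significant byte first. */
static inline uint8_t crc8_feed_halfword(uint8_t crc, uint16_t data)
{
    crc = crc8_feed_byte(crc, (uint8_t)(data >> 8));
    return crc8_feed_byte(crc, (uint8_t)(data & 0xFFu));
}

/* What the C source asks for: half-word, then all six bytes. */
static inline uint8_t crc8_all_writes(uint16_t half_word, const uint8_t bytes[6])
{
    uint8_t crc = crc8_feed_halfword(0x00u, half_word);
    for (int i = 0; i < 6; ++i) {
        crc = crc8_feed_byte(crc, bytes[i]);
    }
    return crc;
}

/* What the -O2 listing performs: half-word, then only bytes[5]. */
static inline uint8_t crc8_folded_writes(uint16_t half_word, const uint8_t bytes[6])
{
    uint8_t crc = crc8_feed_halfword(0x00u, half_word);
    return crc8_feed_byte(crc, bytes[5]);
}

/* Example input (arbitrary values): the two sequences disagree. */
static inline void crc8_fold_example(void)
{
    static const uint8_t bytes[6] = {1, 2, 3, 4, 5, 6};
    assert(crc8_all_writes(0x1234u, bytes) != crc8_folded_writes(0x1234u, bytes));
}
```

Each store to DR feeds the CRC state, so none of them is redundant from the hardware's point of view, even though they all target the same address.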
‎2016-02-22 1:25 AM
I see this as an ST library bug; it's the library that should have defined CRC->DR as a volatile type.
‎2016-02-22 5:32 AM
I see this as a ST library bug, it's the library that should've defined the CRC->DR as a volatile type.
Why? The library does define it as volatile. It is the user's casting to 8- and 16-bit widths, which drops the volatile qualifier, that is falling over here.
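The point about the cast can be checked at compile time. This is a sketch under assumptions: MockCRC_TypeDef is a hypothetical stand-in for the device header's CRC block (CMSIS-style headers typically declare DR as __IO uint32_t, where __IO expands to volatile), and POINTS_TO_VOLATILE is a made-up helper macro built on C11 _Generic.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: stand-in for the vendor register block, with DR declared
 * volatile as the library does. */
typedef struct {
    volatile uint32_t DR;
} MockCRC_TypeDef;

static MockCRC_TypeDef mock_crc;

/* Hypothetical helper: evaluates to 1 if the pointer's target type is
 * one of the volatile-qualified integer types listed, else 0. */
#define POINTS_TO_VOLATILE(p) _Generic((p), \
    volatile uint8_t  *: 1,                 \
    volatile uint16_t *: 1,                 \
    volatile uint32_t *: 1,                 \
    default: 0)

static inline void volatile_cast_example(void)
{
    /* The register as declared by the library: volatile. */
    assert(POINTS_TO_VOLATILE(&mock_crc.DR) == 1);

    /* The cast from the first post: the qualifier is gone. */
    assert(POINTS_TO_VOLATILE((uint8_t *)&mock_crc.DR) == 0);
    assert(POINTS_TO_VOLATILE((uint16_t *)&mock_crc.DR) == 0);

    /* The corrected cast keeps it. */
    assert(POINTS_TO_VOLATILE((volatile uint8_t *)&mock_crc.DR) == 1);
    assert(POINTS_TO_VOLATILE((volatile uint16_t *)&mock_crc.DR) == 1);
}
```

A cast names the complete target type, so any qualifier not written in the cast is lost, whatever the original declaration said.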
