2024-06-06 10:01 PM - last edited on 2024-06-07 05:56 AM by Tesla DeLorean
I am trying to understand what I am doing wrong here. The following image was captured during a debug session to show strange behaviour.
In the if statement, ap->enable and ap->active are both true, but ap->relay_map is 0, as shown on the right, so the condition should evaluate to false. So why does the processor execute the statement within the brackets (shown by the green highlight) when the condition fails? It actually crashes the processor, because the index evaluates to -1, which is out of bounds of the relays array.
2024-06-17 03:37 AM
I have managed to prove that although the debugger actually looks like it steps to that line under these particular circumstances, in fact it does not execute the statement it steps to.
I put a line in between to turn on the LED (line 10 of the code below). The debugger steps to that line but doesn't execute it; the LED does not turn on.
So apologies for wasting everyone's time. This is just an artefact of the debugger operation and not real.
for(ai=0;ai<(sizeof(Alarms)/sizeof(AlarmType *));ai++)
{
    //this process effectively does an OR, so multiple alarms can share one relay
    ap=Alarms[ai];
    if(ap->enable == true)
        if(ap->active == true)
        {
            if(ap->relay_map > 0)
            {
                HAL_GPIO_WritePin(GPIOC, LED_USER2_Pin, GPIO_PIN_RESET); //Turn Alarm LED on
                relays[ap->relay_map-1].relay_state=true;
            }
        }
}
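The LED check described above can also be done with a counter when no spare LED is available. This is only a minimal sketch: relay_write_hits, set_relay_if_mapped and get_relay_write_hits are hypothetical names, not part of the original project, and the relay array is reduced to plain bools. Because the counter is volatile, the compiler has to perform every increment, so watching it in the debugger shows whether the branch really ran, regardless of where the stepping highlight lands.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical execution marker. volatile forces the compiler to
   perform every write, so the value can be trusted in a watch window. */
static volatile uint32_t relay_write_hits = 0u;

/* Simplified stand-in for the body of the alarm loop.
   relay_map: 0 means "no relay", 1..num_relays selects a relay. */
bool set_relay_if_mapped(int relay_map, bool *relay_states, int num_relays)
{
    if (relay_map > 0 && relay_map <= num_relays) {
        relay_write_hits = relay_write_hits + 1u; /* proof the branch ran */
        relay_states[relay_map - 1] = true;
        return true;
    }
    return false;
}

/* Read the marker, e.g. from a test or a debugger expression. */
uint32_t get_relay_write_hits(void)
{
    return relay_write_hits;
}
```

If the counter stays at zero while the debugger appears to stop on the relay write, the stop is only a line-mapping artefact.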
2024-06-06 10:27 PM
I absolutely agree, this should work.
But you are working with an OS. Could it be that some other task jumps in between?
Just for testing, what if you put if(ap->relay_map > 0) directly before setting relay_state = true ?
And for completeness, it would be nice to tell us which compiler and version you are actually using.
2024-06-06 10:42 PM
Maybe you should also tell us which CPU you are using and, if applicable, whether the D-cache is used and whether INT or DMA is involved?
2024-06-06 11:33 PM
Any code we can't see to the right of the if?
Can you printf the values seen here?
Is optimization on or off? Does that change behavior?
An STM32F746 ?
2024-06-07 01:56 AM - edited 2024-06-07 03:04 AM
There are several ways this can happen (it has happened to me too):
I suggest turning off optimizations and experimenting with the stack size. Investigate the timing of the problem (perhaps the crash coincides with a specific event and is correlated with it). Enable all warnings and change your code until you get 0 warnings. Also add a sanity check to prove the target actually loaded the latest code (delete the build artifacts and add a print statement somewhere).
Comparing a boolean to true is redundant.
2024-06-07 03:06 AM
Thanks for all the contributions. Let me try to answer each question.
Compiler version is 1.15.0
Processor/board is STM32H735G-DK
No INT, but 2 DMA streams for ADC1 and ADC3.
Pretty sure Optimization is on, but not experienced enough to know yet how to turn it off.
I have this section protected from the other two tasks which can modify these values by Mutexes.
I modified the code in case after 30 years of writing in C I misunderstood how && works. I split it up as shown in the following. Behaviour does not change. For iterations through the loop where either enable or active are false, it does not do the relay_map test. But if these are true, it appears to execute the relays[... statement even though relay_map is set to zero.
The full task is
for(;;)
{
    if ((osMutexAcquire(Alarm_Relay_Dyn_MutexHandle, mutexTWO_TICK_DELAY) == osOK) && (osMutexAcquire(Alarm_Relay_Stat_MutexHandle, mutexTWO_TICK_DELAY) == osOK))
    {
        for (ai=0;ai<8;ai++)
            relays[ai].relay_state=false;
        for(ai=0;ai<(sizeof(Alarms)/sizeof(AlarmType *));ai++)
        {
            //this process effectively does an OR, so multiple alarms can share one relay
            ap=Alarms[ai];
            if(ap->enable == true)
                if(ap->active == true)
                {
                    if(ap->relay_map > 0)
                    {
                        relays[ap->relay_map-1].relay_state=true;
                    }
                }
        }
        osMutexRelease(Alarm_Relay_Dyn_MutexHandle);
        osMutexRelease(Alarm_Relay_Stat_MutexHandle);
        for (ai=0;ai<8;ai++)
        {
            rp=&relays[ai];
        }
    }
    osDelay(1000);
}
No apparent compilation or link errors.
I know that comparing a boolean to true is redundant. I added that in case it made a difference.
I cannot prove that it is actually executing that code line, because I don't know how to look at that memory location to see if it is being written to. But the debugger certainly steps there. It may all be just a case of the debugger stepping to a line it doesn't actually execute. I have seen it do that a lot.
I suspect the crash was coincidental, and I have yet to find the reason for that.
I will try playing with some of the things you have suggested.
2024-06-07 03:24 AM
H735...
So turn off (or do not enable in Cube) the D-cache as a test, just to be sure it's not about caching.
The optimizer usually works perfectly, so there should be no need to set it to zero (but try...).
2024-06-07 04:20 AM
@AScha.3 wrote: H735...
So turn off (or do not enable in Cube) the D-cache as a test, just to be sure it's not about caching.
Good one! D-cache can cause a lot of issues if not configured or used correctly (sometimes it needs to be flushed/invalidated). Disabling it is a good way to check.
@AScha.3 wrote: The optimizer usually works perfectly, so there should be no need to set it to zero (but try...).
My point about the optimizer wasn't that it was buggy, but that it can change the symptoms of bugs. Changing optimization settings can expose bugs or mask the effects of bugs.
2024-06-07 04:32 AM - edited 2024-06-07 05:52 AM
@CYBR wrote: Processor/board is STM32H735G-DK
I have the same board on my desk. Do you have any external hardware? If you upload your (simplified) project I can see if I can replicate your issue.
@CYBR wrote: Pretty sure Optimization is on, but not experienced enough to know yet how to turn it off.
Right-click your project in the Project Explorer, or click "Project" in the menu bar.
Then go to Properties > C/C++ Build > Settings > MCU GCC Compiler > Optimization > Optimization level.
(In Settings, select your build configuration (target), probably "Debug".)
@CYBR wrote: I cannot prove that it is actually executing that code line, because I don't know how to look at that memory location to see if it is being written to. But the debugger certainly steps there. It may all be just a case of the debugger stepping to a line it doesn't actually execute. I have seen it do that a lot.
You can try opening the disassembly window and stepping through the machine code. You can also inspect memory in the memory viewer window.
You can enable the stack overflow detection option in the FreeRTOS configuration, although it doesn't catch all types of stack overflow.
Another idea is to copy ap->relay_map to a local variable and then use that in the if and in the array brackets. It should prevent many causes of the value changing in between checking the value and using the value.
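A minimal sketch of that local-copy idea is shown here. The struct layouts are simplified assumptions (the real AlarmType and relay types are not shown in this thread), and NUM_RELAYS and apply_alarm are hypothetical names. The upper-bound check is an extra guard that is not in the original code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_RELAYS 8 /* assumption: matches the 8 used in the task loop */

/* Hypothetical, simplified stand-ins for the project's types. */
typedef struct {
    bool enable;
    bool active;
    volatile uint8_t relay_map; /* 0 = no relay, 1..NUM_RELAYS = relay number */
} AlarmType;

typedef struct {
    bool relay_state;
} RelayType;

/* Returns true if a relay was switched on for this alarm. */
bool apply_alarm(const AlarmType *ap, RelayType *relays)
{
    if (!ap->enable || !ap->active) {
        return false;
    }

    /* Read relay_map exactly once. The test and the array index now use
       the same value even if another task or DMA changes the field. */
    int map = ap->relay_map;

    if (map < 1 || map > NUM_RELAYS) {
        return false; /* 0 means unmapped; anything else is out of range */
    }

    relays[map - 1].relay_state = true;
    return true;
}
```

In the task loop this would replace the nested ifs, e.g. apply_alarm(Alarms[ai], relays), assuming Alarms holds pointers as the sizeof(AlarmType *) in the original suggests.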
2024-06-07 05:53 AM
+1. The optimizer can rearrange the code and give non-linear stepping behaviour, so there is little or no one-to-one relationship between the C source lines, the line/address metadata, and the machine code that actually runs. Data held in registers will not be visible in the memory view.
That is why I said to Print It Out, to force the code to express what IT'S SEEING.
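As a rough sketch of what printing it out could look like: the helper below formats the values the if statement tests into a buffer. format_alarm_trace is a hypothetical name, and sending the buffer anywhere (UART, SWO, semihosting) assumes that output has already been set up in the project, which is not shown in this thread.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: writes the values the condition actually sees
   into buf. Returns the snprintf result (characters that would be
   written, excluding the terminating NUL). */
int format_alarm_trace(char *buf, size_t len, unsigned ai,
                       bool enable, bool active, int relay_map)
{
    return snprintf(buf, len, "ai=%u enable=%d active=%d relay_map=%d",
                    ai, (int)enable, (int)active, relay_map);
}
```

Calling it just before the relay_map test, e.g. format_alarm_trace(buf, sizeof buf, ai, ap->enable, ap->active, ap->relay_map), and then sending buf out makes the code report the values it is really working with, rather than what the debugger's variable view shows.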
A disassembly of the code would also be instructive.