2019-05-23 07:41 AM
Hello all, I am still attempting to create my project on the STM32F411RE that is capable of taking in 3 sinusoidal signals and returning the frequencies and amplitudes of these signals. For this I need the CMSIS-DSP pack, but it keeps giving me all sorts of seemingly random errors which I cannot seem to fix.
Because I cannot get my original code to work yet, I took a step back and made a very simple program to find the frequency and bin for a combined sinus signal. Steps:
When I run my code (https://pastebin.com/37n0fS1F) I get a Hard Fault Error which I cannot find out how to resolve:
Bus, memory management or usage fault (FORCED)
Precise data access violation (PRECISERR)
Bus Fault Address Register (BFAR): 0x20020000
For this I looked at the Fault analyzer in the screenshot above, but I cannot figure out which line in my program is causing the error. I have been studying this document, but even with it I cannot solve my problem.
I am starting to get discouraged by the STM32, as even implementing basic examples seems to be fraught with error. Debugging is all part of the game, but this seems largely excessive.
For example, when I include the code
status = arm_rfft_fast_init_f32(&inst, fftLen);
which is supposed to tell me the status of the mathematical operations, I can no longer upload the code to my STM32 board via the CubeProgrammer. Simply commenting out this line allows me to upload it again.
I also attempted to use multiple calls of the arm_max_f32() function to find the multiple maxima of the input signal, but this obviously won't work, as each call just finds the same maximum again. I tried looking up how to access specific frequencies inside the FFT output to find the associated magnitudes, but I cannot find how to do this either.
2019-05-23 07:51 AM
So blowing out the top of memory, suggestive that you have a stack based local/auto variable, and then other code walking it beyond scope.
Check dimension parameters.
Check the faulting code. What is the code at/immediately prior to the fault doing? What memory/array is it indexing?
You should have all the source, walk back the code tree, look at the parameters, confirm the expectations.
Here we usually start by having a USART/SWV debug channel working, and a Hard Fault Handler that can decompose the fault to the terminal.
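As a rough illustration of what "decompose the fault" can mean, here is a small sketch that turns the fault status registers into readable text. The bit positions are the Cortex-M3/M4 ones for HFSR and CFSR. The function name describe_fault and the macro names are hypothetical, and only a handful of flags are decoded. On the target the three values would be read from SCB->HFSR, SCB->CFSR and SCB->BFAR inside the Hard Fault Handler and the string sent out over USART/SWV.

```c
#include <stdint.h>
#include <stdio.h>

/* Selected bits of the Configurable Fault Status Register (CFSR). */
#define CFSR_IBUSERR     (1u << 8)   /* instruction bus error            */
#define CFSR_PRECISERR   (1u << 9)   /* precise data bus error           */
#define CFSR_IMPRECISERR (1u << 10)  /* imprecise data bus error         */
#define CFSR_BFARVALID   (1u << 15)  /* BFAR holds the faulting address  */
#define CFSR_NOCP        (1u << 19)  /* coprocessor (FPU) not enabled    */
/* HardFault Status Register (HFSR). */
#define HFSR_FORCED      (1u << 30)  /* escalated from another fault     */

/* Write a one-line summary of the fault registers into buf.
   Returns the number of characters written (as snprintf does). */
static int describe_fault(char *buf, size_t len,
                          uint32_t hfsr, uint32_t cfsr, uint32_t bfar)
{
    int n = snprintf(buf, len, "HFSR=0x%08lX CFSR=0x%08lX%s%s%s%s%s",
                     (unsigned long)hfsr, (unsigned long)cfsr,
                     (hfsr & HFSR_FORCED)      ? " FORCED"      : "",
                     (cfsr & CFSR_IBUSERR)     ? " IBUSERR"     : "",
                     (cfsr & CFSR_PRECISERR)   ? " PRECISERR"   : "",
                     (cfsr & CFSR_IMPRECISERR) ? " IMPRECISERR" : "",
                     (cfsr & CFSR_NOCP)        ? " NOCP"        : "");

    /* BFAR is only meaningful while BFARVALID is set. */
    if ((cfsr & CFSR_BFARVALID) && n > 0 && (size_t)n < len) {
        n += snprintf(buf + n, len - (size_t)n, " BFAR=0x%08lX",
                      (unsigned long)bfar);
    }
    return n;
}
```

Fed with the values from the first post (FORCED, PRECISERR, BFAR 0x20020000), this produces a line containing those three items, which is the same information the IDE's fault analyzer shows.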
2019-05-23 08:20 AM
@Community member By using breakpoints and stepping through the code it appears that the line that is causing issues is
arm_rfft_fast_f32(&inst, signalCombined, fftCombined, 0);
which is the initial fft function for the input signal. So I cannot even get the initial DSP function to work correctly :(
Up to that point the sin() signals are all behaving as expected with correct length and inputs, so I don't understand why this function is causing issues. I am not doing anything spectacular, I am just feeding data into the function.
From the CMSIS documentation for this function (LINK) I thought that maybe the buffers need to be passed as pointers as well, as stated there, but
arm_rfft_fast_f32(&inst, &signalCombined, &fftCombined, 0);
// function( FFT instance pointer, input pointer, output pointer, forward / backward FFT)
returns the exact same Hard Fault, so it evidently does not matter whether it's a pointer or not.
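The reason both calls behave the same is ordinary C: an array name decays to a pointer to its first element, and taking the address of the whole array yields the same address with a different type. A small sketch to show this, where FFT_LEN is an assumed length chosen for illustration and same_address is a hypothetical helper:

```c
#include <stddef.h>

#define FFT_LEN 32u   /* assumed length, for illustration only */

static float signalCombined[FFT_LEN];

/* Returns 1 when the decayed array and the address-of expression
   point at the same location in memory. */
static int same_address(void)
{
    const void *decayed    = signalCombined;    /* type: float *            */
    const void *address_of = &signalCombined;   /* type: float (*)[FFT_LEN] */
    return decayed == address_of;
}
```

So passing signalCombined is already passing a pointer. The &signalCombined form only changes the pointer type, which the compiler will normally warn about, and cannot change where the FFT reads and writes.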
2019-05-23 08:39 AM
You're going to have to STEP IN to the function, and get closer to the point of the fault. Or look EXPLICITLY at the PC reported as the faulting location.
It could fault immediately upon entry if the stack pointer is in the weeds, or there is a VPUSH without the FPU being enabled.
Without getting pulled too deeply in, my guess would be that there is an iterated loop blowing the bounds of either the input or output array. Guessing the input.
Memory alignment can also be an issue for CM4 doing 64-bit unaligned reads, don't see that here.
Hard Faults tend to be gross errors, rather than random.
2019-05-23 11:40 AM
Any clues on the call stack? Stack and heap size (linker file) value, error between pointer and real value.
Do check all compiler warnings too.
2019-05-23 12:30 PM
It is a GNU toolchain, so the stack most likely sits at the top of RAM.
2019-05-23 09:50 PM
Hmm, the local arrays take 96kB (6 * 4 * 4096) alone. I hope the library doesn't use much more... The device seems to have 128 kB of RAM.
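The arithmetic behind these numbers, as a small sketch. It assumes 4-byte floats and the usual STM32F4 SRAM base of 0x20000000 with the 128 kB mentioned above; the macro and function names are made up for illustration.

```c
#include <stdint.h>

#define RAM_BASE_ADDR  0x20000000u        /* assumed SRAM start address */
#define RAM_SIZE_BYTES (128u * 1024u)     /* 128 kB of SRAM             */

/* First address past the end of SRAM. */
static uint32_t ram_end(void)
{
    return RAM_BASE_ADDR + RAM_SIZE_BYTES;
}

/* Bytes used by `count` float arrays of `len` elements each. */
static uint32_t array_bytes(uint32_t count, uint32_t len)
{
    return count * len * (uint32_t)sizeof(float);
}

/* Share of SRAM taken by `used` bytes, in whole percent. */
static uint32_t ram_percent(uint32_t used)
{
    return used * 100u / RAM_SIZE_BYTES;
}
```

ram_end() evaluates to 0x20020000, the same value as the BFAR in the first post, so the faulting access landed exactly one byte past the end of RAM. Six arrays of 4096 floats come to 98304 bytes, or 75% of SRAM, before the stack, the heap and any other globals are counted.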
2019-05-24 02:19 AM
@Community member it took me a while to step through all the function calls and machine code, but I have finally identified the specific line at which the program conks out. The following function calls are made in the code:
arm_rfft_fast_f32.c //called in the main code to generate the fft of input signal
arm_cfft_f32.c //is called in the rfft_fast_f32.c file when the ifft flag is set to 0
One of 3 functions is then called depending on which fft length was specified, via switch case with fall through behaviour:
lengths (16, 128, 1024) result in a call to arm_cfft_radix8by2_f32
lengths (32, 256, 2048) result in a call to arm_cfft_radix8by4_f32
lengths (64, 512, 4096) result in a call to arm_radix8_butterfly_f32
Varying the FFT length to end up in a different function call still crashes the program, so at least it is nicely reproducible. For this example I went with length 32 to get arm_cfft_radix8by4_f32.
After the switch case (with fall-through behaviour), arm_cfft_f32.c does something called a bit reversal flag check. I looked up what bit reversal does and I understand that, but I do not quite see why it is done here. It does appear to be what is causing the issue.
At any rate, to do this operation it calls arm_bitreversal_32(), which ends up in arm_bitreversal2.S (at line 142).
file: arm_bitreversal2.S // reached via the function call from arm_cfft_f32.c
arm_bitreversal_32 PROC // this is the entry point that arm_cfft_f32.c jumps to when it calls arm_bitreversal_32
ADDS r3,r1,#1
CMP r3,#1
IT LS
BXLS lr
PUSH {r4-r9}
ADDS r1,r2,#2
LSRS r3,r3,#2
arm_bitreversal_32_0 LABEL ;/* loop unrolled by 2 */
LDRH r8,[r1,#4]
LDRH r9,[r1,#2]
LDRH r2,[r1,#0]
LDRH r12,[r1,#-2]
ADD r8,r0,r8
ADD r9,r0,r9
ADD r2,r0,r2
ADD r12,r0,r12
LDR r7,[r9,#0] // When stepping through from this command at line 159 the hard fault is generated (program crash)
LDR r6,[r8,#0]
So it turns out the command
(Line 159): LDR r7,[r9,#0]
is what causes my issues. The problem is that I do not know these ARM instructions well enough to see why this is happening, beyond the fact that LDR is described as "Load with immediate offset, pre-indexed immediate offset, or post-indexed immediate offset." [LINK].
I unfortunately also cannot view what is stored inside these registers with the variables window, being ARM instructions and all
2019-05-24 02:28 AM
Thanks for that remark! I looked it up and whilst the arrays were taking a lot of space, the overall usage seemed to still be within reasonable limits (approx. 78%).
Nevertheless, I scaled down the number of samples (approx. 12% memory usage) for now to make debugging easier.
2019-05-24 08:19 AM
Most debuggers of merit can show CPU registers.
This looks to be adding a 16-bit index, to a pointer, and in doing so exceeds the scope of the array. Look at the value of R9 and R0 after it executes line 15 immediately prior to the fault. Or the value in r9 at the fault, likely 0x20020000
arm_bitreversal_32(); Review parameter #3, parameter #4 is the size
Like I said, we normally dump register content in our Hard Fault Handler so we can determine a cause.
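To make the "16-bit index added to a pointer" point concrete, here is a sketch of a host-side check. It assumes, from reading the listing above, that each table entry is a byte offset which is added to the buffer base, and that two 32-bit words (real and imaginary part) are accessed at that offset. check_bitrev_table is a hypothetical helper and not part of CMSIS-DSP.

```c
#include <stddef.h>
#include <stdint.h>

/* Check that every offset in a bit reversal table stays inside a buffer
   of buf_bytes bytes. Each entry is treated as a byte offset at which two
   32-bit words (one complex sample) are read and written.
   Returns 0 if all entries fit. Otherwise returns -1 and, if bad_index
   is not NULL, stores the index of the first offending entry. */
static int check_bitrev_table(size_t buf_bytes,
                              const uint16_t *table, size_t table_len,
                              size_t *bad_index)
{
    for (size_t i = 0; i < table_len; ++i) {
        if ((size_t)table[i] + 2u * sizeof(uint32_t) > buf_bytes) {
            if (bad_index != NULL) {
                *bad_index = i;
            }
            return -1;
        }
    }
    return 0;
}
```

If the table pointer or length inside the FFT instance is wrong, the offsets are effectively arbitrary and base plus offset can run past the end of RAM. One way that could happen is an instance that was never initialised, for example because the arm_rfft_fast_init_f32() call was commented out, but that is a guess that needs confirming in the debugger. Another is a buffer that is smaller than the FFT length the instance was set up for.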