2023-07-20 01:51 PM - edited 2023-07-20 01:52 PM
Hello,
I am currently developing a DSP application on an STM32F439ZI Nucleo board in combination with a Pmod I2S2 audio codec board. So far, basic testing and filtering worked fine until I started to try FFT processing. I isolated the issue in a simple test program (code attached). I am using the CMSIS DSP library for the FFT and IFFT.
This is what the code does:
Audio sample data is transferred over I2S using DMA in circular mode. The DMA buffer itself is deliberately small so that the TxRxHalfComplete and TxRxComplete ISRs are called each time one audio sample from the left and right channel has been received. I need this because I have an FIR filter function that has to be called for every sample transmitted and received (tested and working, but not used inside the test program). Inside the ISRs the incoming samples are converted from 24-bit signed PCM into float32 and copied into a bigger circular input buffer. As soon as one half of that buffer is filled with data, a flag is set. Inside the main while loop the sample block is then transformed into the frequency domain using arm_rfft_fast_f32 and immediately transformed back into the time domain. The output buffer's current index is delayed to ensure there is enough time to perform the FFT calculations (verified).
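To make the buffering scheme above concrete, here is a minimal host-side sketch of the conversion and the half-buffer flag logic. It is not the attached code: the names (pcm24_to_float, push_sample, in_buf, half_ready, BLOCK_SIZE) are made up for illustration, BLOCK_SIZE is tiny on purpose, and it assumes the 24-bit sample has already been assembled into the low 24 bits of a 32-bit word.

```c
#include <stdint.h>
#include <stdbool.h>

#define BLOCK_SIZE 4u                 /* hypothetical; the FFT length in the real program */
#define BUF_SIZE   (2u * BLOCK_SIZE)  /* two halves: one being filled, one being processed */

static float in_buf[BUF_SIZE];
static volatile uint32_t wr_idx = 0;
/* volatile because the flags are written in ISR context and read in the main loop */
static volatile bool half_ready[2] = { false, false };

/* Sign-extend a 24-bit two's-complement sample (low 24 bits of raw)
 * and scale it to the range [-1.0, 1.0). */
static float pcm24_to_float(uint32_t raw)
{
    int32_t s = (int32_t)(raw & 0x00FFFFFFu);
    if (s & 0x00800000) {
        s -= 0x01000000;              /* bit 23 set: value is negative */
    }
    return (float)s / 8388608.0f;     /* 2^23 */
}

/* Called once per received sample (from the DMA callbacks on the target). */
static void push_sample(uint32_t raw)
{
    in_buf[wr_idx] = pcm24_to_float(raw);
    wr_idx = (wr_idx + 1u) % BUF_SIZE;

    if (wr_idx == BLOCK_SIZE) {
        half_ready[0] = true;         /* first half is complete */
    } else if (wr_idx == 0u) {
        half_ready[1] = true;         /* second half is complete */
    }
}
```

The main loop would then poll half_ready[0] / half_ready[1], clear the flag, and process the corresponding half while the ISR keeps filling the other one.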
Here's the issue:
It works perfectly fine for transform sizes of 1024 and 2048. If I make the transform / buffer size any larger or smaller, the audio output gets corrupted. The output buffer delay is adjusted appropriately. On top you can see the 1 kHz sine wave that is fed into the unit, below that is the corrupted output of the left channel, and below that is the output of the right channel (feeding samples straight through inside the ISR).
What I have already done trying to fix the bug:
I verified that the FFT and IFFT are working correctly by feeding a known array with a known output into the transform functions - works.
Simply copying the input buffer to the output buffer in the while loop works as well.
The really weird thing is, if I simply uncomment line 155 where I memcpy just the input buffer into another totally unused buffer, that has nothing to do with the transform, it suddenly works for all transform sizes.
So I assume this must be some kind of memory access issue. However, at this point I am absolutely clueless about what could cause this behavior or how I could debug it. I would highly appreciate it if somebody could point me in the right direction!
Best regards
2023-08-01 12:15 PM
Your assumption is correct. I cannot capture whole blocks because it's going to be a real-time (nearly) zero delay application. There will be smaller FFT transforms as well, just testing the largest cases now to verify the fundamentals are working correctly.
Okay, so here's what I got right now. Works perfectly fine. However, as soon as I take this over into a FreeRTOS context, I get thrown into the HardFault_Handler with PRECISERR set. If I choose a smaller transform size than 4096 it works fine as well. All buffers are allocated statically and not on the FreeRTOS heap or inside the task. Any ideas what the issue could be?
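For narrowing down a PRECISERR fault like the one described above, one option is to look at the bus fault address. The helper below is only a sketch with made-up names (precise_fault_address, CFSR_PRECISERR, CFSR_BFARVALID); it takes the register values as plain arguments so it can be tried on a host. On a Cortex-M4 target the values would come from SCB->CFSR and SCB->BFAR.

```c
#include <stdint.h>
#include <stdbool.h>

/* Bit positions inside the Configurable Fault Status Register (CFSR).
 * The BusFault status byte occupies bits 15:8. */
#define CFSR_PRECISERR  (1u << 9)    /* precise data bus error */
#define CFSR_BFARVALID  (1u << 15)   /* BFAR holds the faulting address */

/* Returns true and writes the faulting address to *addr_out when the
 * fault was a precise bus error and the address register is valid. */
static bool precise_fault_address(uint32_t cfsr, uint32_t bfar, uint32_t *addr_out)
{
    if ((cfsr & CFSR_PRECISERR) && (cfsr & CFSR_BFARVALID)) {
        *addr_out = bfar;
        return true;
    }
    return false;
}
```

Called from the fault handler, the returned address can be compared against the linker map to see which buffer or stack the access was aimed at.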
Also, I realized I have to introduce another magic delay before doing the FFT because otherwise the direct FIR filtering will read wrong data (because the FFT modifies the input buffer). I am not really convinced the way I am handling this is smart. I can't just copy the buffers all the time because it's a waste of processing power, and I won't have enough RAM to do that once the whole algorithm is running with multiple tasks / instances.
The modulo operations inside the direct FIR filter function are probably stupid, but I don't know of any other way to make the buffer wrap around backwards.
I appreciate any suggestions on how to change the structure of the buffering!
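For reference, two common ways to walk a circular buffer backwards without a modulo are sketched below. This is not the code from the project: fir_circular, step_back, HIST_LEN and HIST_MASK are hypothetical names, and the first variant assumes the buffer length is a power of two.

```c
#include <stdint.h>

#define HIST_LEN  8u                 /* hypothetical; must be a power of two */
#define HIST_MASK (HIST_LEN - 1u)

/* Variant 1: power-of-two length. Unsigned subtraction wraps modulo 2^32,
 * so masking with (length - 1) yields the correct wrapped index. */
static float fir_circular(const float *hist, uint32_t newest,
                          const float *taps, uint32_t num_taps)
{
    float acc = 0.0f;
    for (uint32_t k = 0; k < num_taps; k++) {
        /* taps[k] multiplies the sample that is k steps older than the newest */
        acc += taps[k] * hist[(newest - k) & HIST_MASK];
    }
    return acc;
}

/* Variant 2: arbitrary length. Step back one element with a compare
 * instead of a division. */
static uint32_t step_back(uint32_t idx, uint32_t len)
{
    return (idx == 0u) ? (len - 1u) : (idx - 1u);
}
```

The mask version costs a single AND per tap; the compare version works for any length but has to be applied once per step rather than once per arbitrary offset.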
Thanks.