
Suggestions on analyzing the performance of an application on STM32F746NG-Disco using STM32CubeIDE

HNall.1
Associate III

Hi All,

I'm working on a standalone application that performs many matrix multiplications and one softmax operation. The softmax operation seems to consume more time than expected in the application. Ideally, softmax should account for about 5% of the total runtime.

I'm using STM32CubeIDE and I'm trying to deploy on the STM32F746NG-DISCO board. Also, I have used the existing QSPI_Perf project to get the runtime of my application.

If I benchmark the same softmax operation on its own, outside the application, it performs well for the same input dimensions and data.

I would like to understand why the softmax operation performs poorly inside the application but well when run standalone. I'm not sure what causes this behavior. Could anyone point out potential reasons?

How can we fix these kinds of problems? Is there any tool available in STM32CubeIDE to identify these issues easily?

Any suggestions would be highly appreciated.

Note: Softmax operation uses expf() function.

Regards,

Hari

2 REPLIES
HNall.1
Associate III

Any Updates on this?

Evidently not..

Assume you'll need to dig into this yourself; perhaps use the DWT CYCCNT cycle counter to count processor cycles for given operations or algorithms.
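A minimal sketch of the DWT CYCCNT approach might look like the following. It assumes a Cortex-M7 built with GCC (as STM32CubeIDE does) and uses raw register addresses from the ARMv7-M architecture rather than the CMSIS `DWT->CYCCNT` names, so it does not depend on any particular header. The helper names (`cycles_init`, `cycles_now`, `cycles_elapsed`, `measure_cycles`, `example_workload`) are hypothetical, and the `#else` branch is only a host-side stand-in so the file also compiles off-target.

```c
#include <stdint.h>

#if defined(__ARM_ARCH_7EM__) || defined(__ARM_ARCH_7M__)
/* Cortex-M debug registers (ARMv7-M architectural addresses). */
#define DEMCR      (*(volatile uint32_t *)0xE000EDFCu)
#define DWT_CTRL   (*(volatile uint32_t *)0xE0001000u)
#define DWT_CYCCNT (*(volatile uint32_t *)0xE0001004u)
#define DWT_LAR    (*(volatile uint32_t *)0xE0001FB0u)

/* Enable the cycle counter; call once after reset. */
static void cycles_init(void)
{
    DEMCR |= (1u << 24);      /* TRCENA: enable the trace/debug blocks */
    DWT_LAR = 0xC5ACCE55u;    /* unlock DWT; needed on some Cortex-M7 parts */
    DWT_CYCCNT = 0u;          /* reset the counter */
    DWT_CTRL |= 1u;           /* CYCCNTENA: start counting core cycles */
}

static uint32_t cycles_now(void) { return DWT_CYCCNT; }
#else
/* Host fallback (assumption): clock() ticks instead of core cycles. */
#include <time.h>
static void cycles_init(void) { }
static uint32_t cycles_now(void) { return (uint32_t)clock(); }
#endif

/* Unsigned subtraction stays correct across one 32-bit wrap-around. */
static uint32_t cycles_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Time a single call of fn and return the elapsed count. */
static uint32_t measure_cycles(void (*fn)(void))
{
    uint32_t start = cycles_now();
    fn();
    return cycles_elapsed(start, cycles_now());
}

/* Hypothetical stand-in for the code under test (e.g. the softmax call). */
static void example_workload(void)
{
    volatile uint32_t sink = 0u;
    for (uint32_t i = 0u; i < 1000u; ++i) {
        sink += i;
    }
}
```

Call `cycles_init()` once at startup, then wrap the softmax call both in the full application and in the standalone benchmark, for example `uint32_t c = measure_cycles(example_workload);`, and compare the two counts. The counter is 32 bits wide, so at 216 MHz it wraps after roughly 20 seconds; longer measurements need to be split into shorter intervals.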
