How can I reduce cache misses to improve energy efficiency?
I need to deploy a lightweight deep-learning-based algorithm on a low-power STM System-on-Chip built around an Arm Cortex-M-class MCU or an Arm Cortex-A-class MPU. Is there a way to reduce cache misses, or a way to reduce the CPU stalls on c...