Quantized Gemma Model Inference on STM32MP257F-DK Board
Hi,

Could you share documentation or examples for running quantized foundational models (e.g. Google Gemma) on the STM32MP257F-DK, first in Python, then in C/C++ using the STM32MP2 NPU? Specifically:

Does the STM32MP2 NPU support transformer-based archi...