
Project Feasibility: Multimodal Medical Robot on STM32H723ZG (Audio/Vision AI + USB Host)

AMIRA11
Associate

Hello ST Community,

I am working on my final year project: an autonomous "Multimodal Medical Assistant Robot" based on the NUCLEO-H723ZG (Cortex-M7, 550 MHz). I would like your expert opinion on the feasibility and technical architecture.

Key Features:

  1. 100% Offline AI (Edge AI):

    • Audio: Keyword Spotting (7 commands) using INMP441 (I2S) and CMSIS-DSP (MFCC).

    • Vision: Fall detection using OV2640 (DCMI) and a light MobileNet model.

  2. Software Stack: X-CUBE-AI, TensorFlow Lite Micro, FatFS.

  3. Innovation (Dynamic Loading): We plan to load the AI models (.tflite) and voice responses (.wav) dynamically from a USB Flash Drive (USB Host MSC) into RAM at boot time.

Specific Questions:

  • Is the NUCLEO-H723ZG powerful enough to run both Audio and Vision inference concurrently while managing motor control (PWM) and USB Host?

  • Regarding the USB Host / RAM loading: Does X-CUBE-AI support "relocatable weights" loaded into RAM from an external storage device via FatFS? Any specific memory alignment tips for the H7 AXI SRAM?

  • Any advice on DMA priorities between DCMI (Vision) and I2S (Audio) to avoid data loss?

Thank you for your help!

3 Replies
Tuomas95
Associate III

You would probably want to use the STM32N6 as it has an NPU.

yessine
Associate III

Hello @AMIRA11 

I don't see why it can't be done using a Nucleo H723ZG (144 pins), but let’s see what you really need to do in this case:

  • Microphone: Nucleo boards don’t have a microphone, so you should implement a microphone module (then port a working example from another board to the H723).

  • Vision: The Nucleo board doesn’t have an LCD screen or a single working example for it. So first, you need to implement an LCD screen (good luck configuring FMC correctly), then you need to port a working example from the H747.

  • USB MSC Host: You need to implement an SD card reader (at least here the H723 has a USB MSC Host example).

On the other hand, the H747 DISCO already has an on-board microphone and an SD card slot, both with working examples, which lets you skip the porting tasks.

It is also very well supported for Edge AI applications. Many Edge AI functionalities were first developed on the H747 DISCO and later on the N6 when it was launched.

I can recommend the N6, but personal advice: stay away from it as much as you can.

To conclude, it is feasible using the H723 or N6, but with your STM32 experience, which I estimate at around one year since this is your final-year project (2nd year of engineering school), it will be very challenging.

The H747 DISCO will make it more doable (still nothing guaranteed; you may encounter many issues during your journey).

NB: For vision, take a look at these; they may help you: FP-AI Vision, and Teachable Machine (I don't know if it still works, but it lets you create convenient TFLite models that you can later flash using the Model Zoo).

 
BR

Hi @AMIRA11,

 

It is difficult to answer, as it will highly depend on your models.

I would suggest working on your models first and checking that they compile well with our ST Edge AI Core tool.

You can use STM32CubeAI Studio, the new tool replacing X-CUBE-AI, to easily compile and benchmark your model: Introducing STM32CubeAI Studio - STMicroelectronics Community

 

Then, depending on the needs of your models, select an MCU with enough memory and processing power.

 

Note that the N6 and its NPU are mainly designed to accelerate convolutions, so they are very useful if your model uses many such layers, but won't help much otherwise.

 

Have a good day,

Julian

