STM32N6 NPU inference stuck at LL_ATON_RT_RunEpochBlock (no IRQ fired)

seokjs
Associate

Hello,

I am currently using the STM32N657-DK board with X-CUBE-AI (ST Edge AI Core v2.2.0).
I have converted a TensorFlow Lite model to run on the NPU, and I can build and flash the firmware successfully.

seokjs_0-1758778760868.png

However, when I call:

 ret = LL_ATON_RT_RunEpochBlock(&NN_Instance_Default);

the function never returns.

What I have already configured

  1. Clock and reset

    • Enabled the NPU clock and released the reset:

      __HAL_RCC_NPU_CLK_ENABLE();
      __HAL_RCC_NPU_FORCE_RESET();
      __HAL_RCC_NPU_RELEASE_RESET();

    • In set_clk_sleep_mode(), the sleep-mode clock is also enabled.

  2. Interrupt routing

    • In the Secure project:

      NVIC_DisableIRQ(NPU3_IRQn);
      NVIC_ClearPendingIRQ(NPU3_IRQn);
      NVIC_SetTargetState(NPU3_IRQn); // route to NonSecure

    • In the NonSecure project:

      HAL_NVIC_SetPriority(NPU3_IRQn, 0, 0);
      HAL_NVIC_EnableIRQ(NPU3_IRQn);

      void NPU3_IRQHandler(void)
      {
        printf(">> NPU IRQ triggered\r\n");
        ATON_STD_IRQHandler();
      }

  3. RIF / RISAF configuration

    • Configured the NPU master/slave attributes as NonSecure + privileged.

    • Added RISAF regions so that NonSecure code can access NPU RAM3 to RAM6 (0x3420_0000 to 0x343C_0000):

      RISAF_ConfigRegion(3, 0x34200000, 0x70000, RISAF_ATTR_NONSECURE | RISAF_ATTR_PRIV);
      RISAF_ConfigRegion(4, 0x34270000, 0x70000, RISAF_ATTR_NONSECURE | RISAF_ATTR_PRIV);
      RISAF_ConfigRegion(5, 0x342E0000, 0x70000, RISAF_ATTR_NONSECURE | RISAF_ATTR_PRIV);
      RISAF_ConfigRegion(6, 0x34350000, 0x70000, RISAF_ATTR_NONSECURE | RISAF_ATTR_PRIV);

  4. Activation buffer

    • Declared in the .noncacheable section with 32-byte alignment.


Problem

Despite all of the configuration above:

  • LL_ATON_RT_RunEpochBlock() never finishes.

  • ret never reaches LL_ATON_RT_DONE.

  • The NPU IRQ (NPU3_IRQn) does not appear to be triggered.


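Questions

  1. Is any additional RISAF or RIF configuration required to allow the NPU to generate interrupts in a NonSecure environment?

  2. To enable Epoch Controller interrupts, do I need to explicitly configure the ATON_INTCTRL registers (e.g. ATON_INTCTRL_CTRL_SET_EN, ATON_INTCTRL_INTORMSK0_SET), or should X-CUBE-AI handle this automatically?

  3. Could this issue be related to the memory attributes of the NPU region (cacheable vs. non-cacheable)? If so, what configuration is recommended?

  4. Are there any known issues with NPU interrupt routing (NPU3_IRQn) in Secure/NonSecure TrustZone projects on STM32N6?


Thank you for your support.
Thank you.
[seokjs]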

 

5 REPLIES
seokjs
Associate

@PedroDeOliveira 

Please help me......

Imen.D
ST Employee

Hello @seokjs ,

Please write in English, since most people in this community speak English but not Korean.
Please also follow the posting tips in this article, which explains how to post properly and how to insert source code: How to write your question to maximize your chances to find a solution.

When your question is answered, please close this topic by clicking "Accept as Solution".
Thanks
Imen
seokjs
Associate

Hello,

I am currently using the STM32N657-DK board with X-CUBE-AI (ST Edge AI Core v2.2.0).
I have successfully converted a TensorFlow Lite model to run on the NPU and can build and flash the firmware without issues.

seokjs_0-1759100975297.png

 

Problem
When I call the function

ret = LL_ATON_RT_RunEpochBlock(&NN_Instance_Default);

it never returns.
The variable ret never reaches LL_ATON_RT_DONE, and it seems that the NPU interrupt (NPU3_IRQn) is not triggered.

Current configuration

Clock and Reset

  • Enabled the NPU clock and released the reset:

     
    __HAL_RCC_NPU_CLK_ENABLE();
    __HAL_RCC_NPU_FORCE_RESET();
    __HAL_RCC_NPU_RELEASE_RESET();
     
  • In set_clk_sleep_mode(), the NPU sleep-mode clock is also enabled.

Interrupt routing

  • In the Secure project:

    NVIC_DisableIRQ(NPU3_IRQn);
    NVIC_ClearPendingIRQ(NPU3_IRQn);
    NVIC_SetTargetState(NPU3_IRQn); // route to NonSecure
     
  • In the NonSecure project:

    HAL_NVIC_SetPriority(NPU3_IRQn, 0, 0);
    HAL_NVIC_EnableIRQ(NPU3_IRQn);
     
    void NPU3_IRQHandler(void)
    {
      printf(">> NPU IRQ triggered\r\n");
      ATON_STD_IRQHandler();
    }
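
A possible way to confirm whether the handler is entered at all, without relying on printf (which can be slow or blocking inside a handler, depending on how stdout is retargeted), is a volatile counter that the main loop or a debugger watch can read. The sketch below is a host-buildable illustration only: `npu_irq_count`, `npu3_irq_handler_sketch`, `npu_irq_count_get` and `aton_std_irq_handler_stub` are hypothetical names, and the stub stands in for the runtime's ATON_STD_IRQHandler().

```c
#include <stdint.h>

/* Hypothetical counter, incremented once per NPU interrupt. */
static volatile uint32_t npu_irq_count = 0;

/* Stand-in for the runtime's ATON_STD_IRQHandler() so the sketch
 * builds on a host; on target, call the real handler instead. */
static void aton_std_irq_handler_stub(void)
{
}

/* Handler body: record the event first, then hand over to the runtime. */
void npu3_irq_handler_sketch(void)
{
    npu_irq_count++;
    aton_std_irq_handler_stub();
}

/* Read from the main loop, or inspect npu_irq_count in a debugger. */
uint32_t npu_irq_count_get(void)
{
    return npu_irq_count;
}
```

If the counter stays at zero while the call is hanging, the interrupt is not reaching the NonSecure NVIC at all; if it increases, the problem is more likely further along in the runtime.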
     

RIF / RISAF configuration

  • Configured NPU master/slave attributes to NonSecure + privileged.

  • Added RISAF regions so that NonSecure code can access NPU RAM3–RAM6 (0x3420_0000–0x343C_0000):

     
    RISAF_ConfigRegion(3, 0x34200000, 0x70000, RISAF_ATTR_NONSECURE | RISAF_ATTR_PRIV);
    RISAF_ConfigRegion(4, 0x34270000, 0x70000, RISAF_ATTR_NONSECURE | RISAF_ATTR_PRIV);
    RISAF_ConfigRegion(5, 0x342E0000, 0x70000, RISAF_ATTR_NONSECURE | RISAF_ATTR_PRIV);
    RISAF_ConfigRegion(6, 0x34350000, 0x70000, RISAF_ATTR_NONSECURE | RISAF_ATTR_PRIV);
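
As a sanity check on the address arithmetic, the four regions can be verified to be back to back and to end at 0x343C_0000. The sketch below runs on a host and assumes that the third argument of RISAF_ConfigRegion() is a size in bytes, which the post does not state; `region_t`, `regions_contiguous` and `regions_end` are hypothetical helper names.

```c
#include <stddef.h>
#include <stdint.h>

/* One region as (base address, size in bytes). Assumption: the third
 * argument of RISAF_ConfigRegion() in the post is a size. */
typedef struct {
    uint32_t base;
    uint32_t size;
} region_t;

/* The four NPU RAM regions exactly as listed in the post. */
static const region_t npu_ram_regions[4] = {
    { 0x34200000u, 0x70000u },
    { 0x34270000u, 0x70000u },
    { 0x342E0000u, 0x70000u },
    { 0x34350000u, 0x70000u },
};

/* Returns 1 if every region starts exactly where the previous one ends. */
int regions_contiguous(const region_t *r, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (r[i - 1].base + r[i - 1].size != r[i].base) {
            return 0;
        }
    }
    return 1;
}

/* First address past the last region (n must be at least 1). */
uint32_t regions_end(const region_t *r, size_t n)
{
    return r[n - 1].base + r[n - 1].size;
}
```

With the values from the post, the regions are contiguous and the range ends at 0x343C0000, matching the stated 0x3420_0000–0x343C_0000 window (4 x 448 KiB).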

Activation buffer

  • Declared in the .noncacheable section with 32-byte alignment.
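
For reference, a declaration along these lines could look like the sketch below. The buffer name and size are hypothetical, and the section attribute is only applied when building for a 32-bit Arm target so that the snippet also compiles on a host. Note that the section name alone does not make the memory non-cacheable: the linker script has to place the section, and the MPU has to mark that range accordingly.

```c
#include <stdint.h>

#define ACT_BUF_SIZE 1024u /* hypothetical size, not taken from the post */

/* Place the buffer in ".noncacheable" only on an Arm target build;
 * on a host the attribute is dropped so the sketch still compiles. */
#if defined(__GNUC__) && defined(__arm__)
#define ACT_BUF_SECTION __attribute__((section(".noncacheable")))
#else
#define ACT_BUF_SECTION
#endif

/* 32-byte aligned activation buffer (C11 _Alignas). */
static _Alignas(32) uint8_t activation_buf[ACT_BUF_SIZE] ACT_BUF_SECTION;

/* Returns 1 if the pointer is aligned to a 32-byte boundary. */
int is_aligned_32(const void *p)
{
    return ((uintptr_t)p % 32u) == 0u;
}

/* Accessor so other code (or a test) can check the buffer address. */
const void *activation_buf_addr(void)
{
    return activation_buf;
}
```

Checking the address in the map file or at run time with `is_aligned_32(activation_buf_addr())` confirms that both the alignment and the placement ended up where expected.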

Questions

  1. Is any additional RISAF or RIF configuration required to allow the NPU to generate interrupts in a NonSecure environment?

  2. To enable Epoch Controller interrupts, do I need to explicitly configure ATON_INTCTRL registers (e.g. ATON_INTCTRL_CTRL_SET_EN, ATON_INTCTRL_INTORMSK0_SET), or should X-CUBE-AI handle this automatically?

  3. Could this issue be related to the memory attributes of the NPU region (cacheable vs. non-cacheable)? If so, what configuration is recommended?

  4. Are there any known issues with NPU interrupt routing (NPU3_IRQn) in Secure/NonSecure TrustZone projects on STM32N6?

Thank you very much for your support.

I am attaching all the files I have modified so far.

@Imen.D 

I have updated the post in English with the revised files.
Please help me resolve the issue.

VitorWagner
Associate

Greetings @seokjs,

I ran into a similar problem. I was triggering my inference from an interrupt, but other, higher-priority interrupts were firing while the inference was running and cutting it short. I see you have already set the NPU NVIC priority and enabled the interrupt, but if your project behaves like mine did, I would suggest checking your NVIC priorities. Once I set them up properly in my project, I was able to run an inference without any major issues.
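
To make the priority rule concrete: on Cortex-M, a lower numerical priority value means a higher priority, and a pending interrupt preempts running code only if its preempt priority value is strictly lower. So if the inference is started from inside a handler, the NPU interrupt must have a lower value than that handler, or it cannot fire while the handler waits. A minimal sketch, ignoring sub-priorities, with the hypothetical helper name `can_preempt`:

```c
#include <stdint.h>

/* Cortex-M preemption rule (sub-priority ignored in this sketch):
 * a pending interrupt preempts the running handler only if its
 * preempt priority value is numerically lower. Equal values do
 * not preempt. */
int can_preempt(uint32_t pending_prio, uint32_t running_prio)
{
    return pending_prio < running_prio;
}
```

For example, with the NPU interrupt at priority 0 and the inference started from a handler at priority 5, `can_preempt(0, 5)` is true and the NPU interrupt can be served. If both are at the same level, `can_preempt(5, 5)` is false and the wait would never be released.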

Did the model you used come from the Model Zoo, or did you convert it to a .tflite file yourself and then use the X-CUBE-AI package to generate the network.c file? With a custom model, you might have run into a generation error in network.c. This is probably a stretch, but you could look through the layers in network.c and check that there is a final output layer. The same goes for network.h, which usually names the last layer.