
Linux 6.1.82 RT patch: cannot disable CONFIG_CPU_IDLE, kernel Oops triggered

bugman
Associate III

Dear all:

I am currently using the following setup:

Distribution version V5, kernel 6.1.82 with the RT patch (6.1.82-rt), on an STM32MP257.

The wiki says that CONFIG_CPU_IDLE needs to be disabled, but in practice it cannot be turned off.

[Screenshots: bugman_0-1763435271710.png, bugman_1-1763435557738.png]

The reason I am looking into this is that I hit a kernel Oops at runtime:

login: [  370.478552] BUG: scheduling while atomic: swapper/1/0/0x00000002
[  370.478575] Modules linked in: cfg80211 rfkill stm32_adc stm32_timer_trigger stm32_lptimer_trigger mcp251xfd crct10dif_ce nvme nvme_core stusb160x typec stm32_adc_core stm32_crc32 spi_stm32 m_can_platform m_can can_dev cdc_acm optee_rng rng_core sch_fq_codel 8021q garp mrp bridge stp llc sch_prio ch343(O) acm(O) edgx_pfm_lkm(O) stm32_deip(O) ip_tables x_tables ipv6
[  370.478682] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G           O       6.1.82-rt27 #1
[  370.478691] Hardware name: MYiR STM32MP257x Evaluation Board 2 (DT)
[  370.478696] Call trace:
[  370.478700]  dump_backtrace+0xdc/0x130
[  370.478720]  show_stack+0x18/0x30
[  370.478731]  dump_stack_lvl+0x64/0x80
[  370.478742]  dump_stack+0x18/0x34
[  370.478750]  __schedule_bug+0x54/0x70
[  370.478760]  __schedule+0x5e4/0x6b0
[  370.478771]  schedule_rtlock+0x28/0x60
[  370.478782]  rtlock_slowlock_locked+0x384/0xc1c
[  370.478792]  rt_spin_lock+0x88/0xb0
[  370.478800]  genpd_lock_nested_spin+0x1c/0x30
[  370.478814]  genpd_power_on+0x64/0x170
[  370.478822]  genpd_runtime_resume+0xcc/0x2d0
[  370.478831]  __rpm_callback+0x48/0x1ac
[  370.478840]  rpm_callback+0x6c/0x80
[  370.478848]  rpm_resume+0x460/0x6d0
[  370.478856]  __pm_runtime_resume+0x48/0x8c
[  370.478865]  __psci_enter_domain_idle_state.constprop.0+0xd4/0xe0
[  370.478878]  psci_enter_domain_idle_state+0x18/0x2c
[  370.478888]  cpuidle_enter_state+0x270/0x2fc
[  370.478897]  cpuidle_enter+0x38/0x50
[  370.478905]  do_idle+0x264/0x300
[  370.478915]  cpu_startup_entry+0x38/0x40
[  370.478924]  secondary_start_kernel+0x130/0x15c
[  370.478935]  __secondary_switched+0xb0/0xb4
[  370.478952] bad: scheduling from the idle thread!
[  370.478956] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W  O       6.1.82-rt27 #1
[  370.478964] Hardware name: MYiR STM32MP257x Evaluation Board 2 (DT)
[  370.478967] Call trace:
[  370.478970]  dump_backtrace+0xdc/0x130
[  370.478981]  show_stack+0x18/0x30
[  370.478992]  dump_stack_lvl+0x64/0x80
[  370.479000]  dump_stack+0x18/0x34
[  370.479008]  dequeue_task_idle+0x30/0x60
[  370.479018]  __do_set_cpus_allowed+0x94/0x174
[  370.479026]  __schedule+0x62c/0x6b0
[  370.479037]  schedule_rtlock+0x28/0x60
[  370.479048]  rtlock_slowlock_locked+0x384/0xc1c
[  370.479057]  rt_spin_lock+0x88/0xb0
[  370.479065]  genpd_lock_nested_spin+0x1c/0x30
[  370.479077]  genpd_power_on+0x64/0x170
[  370.479085]  genpd_runtime_resume+0xcc/0x2d0
[  370.479093]  __rpm_callback+0x48/0x1ac
[  370.479102]  rpm_callback+0x6c/0x80
[  370.479110]  rpm_resume+0x460/0x6d0
[  370.479118]  __pm_runtime_resume+0x48/0x8c
[  370.479127]  __psci_enter_domain_idle_state.constprop.0+0xd4/0xe0
[  370.479138]  psci_enter_domain_idle_state+0x18/0x2c
[  370.479148]  cpuidle_enter_state+0x270/0x2fc
[  370.479157]  cpuidle_enter+0x38/0x50
[  370.479166]  do_idle+0x264/0x300
[  370.479174]  cpu_startup_entry+0x38/0x40
[  370.479183]  secondary_start_kernel+0x130/0x15c
[  370.479193]  __secondary_switched+0xb0/0xb4
[  370.479210] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[  370.479215] Mem abort info:
[  370.479217]   ESR = 0x0000000086000006
[  370.479220]   EC = 0x21: IABT (current EL), IL = 32 bits
[  370.479226]   SET = 0, FnV = 0
[  370.479229]   EA = 0, S1PTW = 0
[  370.479232]   FSC = 0x06: level 2 translation fault
[  370.479236] user pgtable: 4k pages, 48-bit VAs, pgdp=000000008699d000
[  370.479243] [0000000000000000] pgd=080000008699e003, p4d=080000008699e003, pud=080000008699f003, pmd=0000000000000000
[  370.479260] Internal error: Oops: 0000000086000006 [#1] PREEMPT_RT SMP
[  370.479266] Modules linked in: cfg80211 rfkill stm32_adc stm32_timer_trigger stm32_lptimer_trigger mcp251xfd crct10dif_ce nvme nvme_core stusb160x typec stm32_adc_core stm32_crc32 spi_stm32 m_can_platform m_can can_dev cdc_acm optee_rng rng_core sch_fq_codel 8021q garp mrp bridge stp llc sch_prio ch343(O) acm(O) edgx_pfm_lkm(O) stm32_deip(O) ip_tables x_tables ipv6
[  370.479356] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W  O       6.1.82-rt27 #1
[  370.479364] Hardware name: MYiR STM32MP257x Evaluation Board 2 (DT)
[  370.479368] pstate: 000000c5 (nzcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  370.479377] pc : 0x0
[  370.479385] lr : __do_set_cpus_allowed+0xcc/0x174
[  370.479393] sp : ffff800009a0b970
[  370.479396] x29: ffff800009a0b970 x28: 0000000000000000 x27: 0000000000000000
[  370.479409] x26: ffff800009a0ba50 x25: ffff800009a0ba78 x24: ffff0000041340f0
[  370.479422] x23: 0000000000000002 x22: ffff800008ecaa28 x21: ffff000004136740
[  370.479435] x20: ffff000004133b00 x19: ffff00007238e640 x18: 00000000fffffffd
[  370.479448] x17: 0000000000000000 x16: 0000000000000000 x15: ffff800009a0af20
[  370.479461] x14: 0000000000000000 x13: ffff800009a0b14a x12: 656c646920656874
[  370.479473] x11: fffffffffffe0000 x10: 000000000000000a x9 : ffff800009a0b070
[  370.479486] x8 : ffff800009753020 x7 : ffff800009a0b5c0 x6 : 000000000000000c
[  370.479498] x5 : ffff000004134528 x4 : 0000000000000000 x3 : 0000000000000000
[  370.479510] x2 : 000000000000000a x1 : ffff000004133b00 x0 : ffff00007238e640
[  370.479522] Call trace:
[  370.479525]  0x0
[  370.479530]  __schedule+0x62c/0x6b0
[  370.479541]  schedule_rtlock+0x28/0x60
[  370.479552]  rtlock_slowlock_locked+0x384/0xc1c
[  370.479561]  rt_spin_lock+0x88/0xb0
[  370.479569]  genpd_lock_nested_spin+0x1c/0x30
[  370.479581]  genpd_power_on+0x64/0x170
[  370.479589]  genpd_runtime_resume+0xcc/0x2d0
[  370.479597]  __rpm_callback+0x48/0x1ac
[  370.479605]  rpm_callback+0x6c/0x80
[  370.479613]  rpm_resume+0x460/0x6d0
[  370.479622]  __pm_runtime_resume+0x48/0x8c
[  370.479630]  __psci_enter_domain_idle_state.constprop.0+0xd4/0xe0
[  370.479641]  psci_enter_domain_idle_state+0x18/0x2c
[  370.479651]  cpuidle_enter_state+0x270/0x2fc
[  370.479660]  cpuidle_enter+0x38/0x50
[  370.479668]  do_idle+0x264/0x300
[  370.479677]  cpu_startup_entry+0x38/0x40
[  370.479686]  secondary_start_kernel+0x130/0x15c
[  370.479696]  __secondary_switched+0xb0/0xb4
[  370.479713] Code: bad PC value
[  371.023423] ---[ end trace 0000000000000000 ]---
[  371.023431] Kernel panic - not syncing: Attempted to kill the idle task!
[  371.023437] SMP: stopping secondary CPUs
[  372.106774] SMP: failed to stop secondary CPUs 0-1
[  372.106782] Kernel Offset: disabled
[  372.106785] CPU features: 0x00000,00000080,0000421b
[  372.106790] Memory Limit: none
E/TC:1   Panic 'Watchdog' at /usr/src/debug/optee-os-stm32mp/3.19.0-stm32mp-r2-r0/core/drivers/stm32_iwdg.c:193 <stm32_iwdg_it_handler>
E/TC:1   TEE load address @ 0x82000000
E/TC:1   Call stack:
E/TC:1    0x8200831c
E/TC:1    0x82030254
E/TC:1    0x82019eb0
E/TC:1    0x8202f198
E/TC:1    0x82013fd4
E/TC:1    0x820017dc
NOTICE:  CPU: STM32MP257DAL Rev.Y
NOTICE:  Model: STMicroelectronics STM32MP257F-EV1 Evaluation Board
INFO:    Reset reason (0x2104):
INFO:      IWDG1 system reset (rst_iwdg1)
INFO:    PMIC2 version = 0x11
INFO:    PMIC2 product ID = 0x21
INFO:    FCONF: Reading TB_FW firmware configuration file from: 0xe011000
INFO:    FCONF: Reading firmware configuration information for: stm32mp_io
INFO:    FCONF: Reading firmware configuration information for: stm32mp_fuse
INFO:    Using EMMC
INFO:      Instance 2
INFO:    Boot used partition fsbl1
NOTICE:  BL2: v2.8-stm32mp2-r2.0(debug):a668299(a6682993)
NOTICE:  BL2: Built : 09:46:55, Dec 22 2024
INFO:    BL2: Loading image id 26
INFO:    Loading image id=26 at address 0xe041000
INFO:    Image id=26 loaded: 0xe041000 - 0xe049650
INFO:    BL2: Doing platform setup
INFO:    RAM: LPDDR4 1x16Gbits 1x32bits 1200MHz
INFO:    Memory size = 0x80000000 (2048 MB)
INFO:    BL2: Loading image id 1
INFO:    Loading image id=1 at address 0xe000000
INFO:    Image id=1 loaded: 0xe000000 - 0xe000326
INFO:    FCONF: Reading FW_CONFIG firmware configuration file from: 0xe000000
INFO:    FCONF: Reading firmware configuration information for: risaf_config
INFO:    RISAF2: No configuration in DT, use default
INFO:    FCONF: Reading firmware configuration information for: dyn_cfg
INFO:    BL31 max size = 0x17000 (94208B)
INFO:    BL2: Loading image id 3
INFO:    Loading image id=3 at address 0xe000000
INFO:    Image id=3 loaded: 0xe000000 - 0xe0157c0
INFO:    BL2: Loading image id 19
INFO:    Loading image id=19 at address 0x81fc0000
INFO:    Image id=19 loaded: 0x81fc0000 - 0x81fc39ec
INFO:    BL2: Loading image id 4
INFO:    Loading image id=4 at address 0x82000000
INFO:    Image id=4 loaded: 0x82000000 - 0x8200001c
INFO:    OPTEE ep=0x82000000
INFO:    OPTEE header info:
INFO:          magic=0x4554504f
INFO:          version=0x2
INFO:          arch=0x1
INFO:          flags=0x0
INFO:          nb_images=0x1
INFO:    BL2: Loading image id 8
INFO:    Loading image id=8 at address 0x82000000
INFO:    Image id=8 loaded: 0x82000000 - 0x820cc238
INFO:    BL2: Loading image id 2
INFO:    Loading image id=2 at address 0x84400000
INFO:    Image id=2 loaded: 0x84400000 - 0x84419790
INFO:    BL2: Loading image id 5
INFO:    Loading image id=5 at address 0x84000000
INFO:    Image id=5 loaded: 0x84000000 - 0x841975f0
NOTICE:  BL2: Booting BL31
INFO:    Entry point address = 0xe000000
INFO:    SPSR = 0x3cd
INFO:    ARM GICv2 driver initialized
NOTICE:  BL31: v2.8-stm32mp2-r2.0(debug):a668299(a6682993)
NOTICE:  BL31: Built : 09:46:55, Dec 22 2024
INFO:    BL31: Initializing runtime services
INFO:    BL31: Initializing BL32
I/TC: Early console on UART#2
I/TC:

 

 

Christophe Guibout
ST Employee

Hello @bugman,

 

Thanks for your post. It seems I have been able to reproduce the problem on my side; I will keep you posted.
BR,

Christophe

 

In order to give better visibility on the answered topics, please click on 'Accept as Solution' on the reply which solved your issue or answered your question.

Hello @Christophe Guibout,

I am very happy to receive your reply.

This issue has been bothering me for a long time, and I look forward to hearing what you find.

Thank you.

Christophe Guibout
ST Employee

Hello @bugman,

 

CPU_IDLE is kept enabled on Arm64 due to dependencies with ACPI, so the wiki needs to be updated accordingly, as well as the RT ARM64 kernel fragment (arch/arm64/configs/fragment-07-rt.config).
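
To confirm on a given target that CPU_IDLE is still enabled in the kernel actually running, the config can be inspected directly. The following is only a minimal sketch: it assumes the kernel exposes /proc/config.gz (which requires CONFIG_IKCONFIG_PROC); otherwise pass the path of the build's .config. The helper name show_idle_config is illustrative, not part of any ST tooling.

```shell
#!/bin/sh
# Print the CPU_IDLE / ACPI related lines of a kernel config file.
# Usage: show_idle_config [path]   (default: /proc/config.gz)
show_idle_config() {
    cfg="${1:-/proc/config.gz}"
    if [ ! -r "$cfg" ]; then
        echo "cannot read $cfg" >&2
        return 1
    fi
    # Decompress only when the file is gzip-compressed, then filter.
    case "$cfg" in
        *.gz) zcat "$cfg" ;;
        *)    cat "$cfg" ;;
    esac | grep -E '^(# )?CONFIG_(CPU_IDLE|ACPI|ACPI_PROCESSOR_IDLE|ARM_PSCI_CPUIDLE)(=| is not set)'
}

# Example (on the target): show_idle_config
# Example (on the build host): show_idle_config /path/to/build/.config
```

A line such as CONFIG_CPU_IDLE=y in the output means the option is still forced on, whatever the RT fragment requests.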

That makes the kernel panic the relevant thing to investigate. I have not been able to trigger it when testing with cyclictest. Could you please describe how to reproduce it?

BR,

Christophe


Hello @Christophe Guibout,

I am using the MCP2518FD, an SPI-to-CAN chip, with the driver that is already included in the kernel (mcp251xfd).

I found that the kernel panic occurs intermittently while the MCP2518FD is running under light load.

From the log, my rough understanding is that the fault is triggered by the power management (idle) path. I tried turning off PSCI and other kernel options, but none of that helped, and I do not know how to proceed.

Thank you.

Hello @bugman,

 

I don't have a magic recipe, so I would first recommend disabling the cpuidle states and checking whether the issue is still there:

echo 1 > /sys/devices/system/cpu/cpu0/cpuidle/state0/disable
echo 1 > /sys/devices/system/cpu/cpu0/cpuidle/state1/disable
echo 1 > /sys/devices/system/cpu/cpu1/cpuidle/state1/disable
echo 1 > /sys/devices/system/cpu/cpu1/cpuidle/state0/disable
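
The four commands above hard-code two CPUs and two states. The same idea can be written as a loop so that no state is missed if the board exposes more of them. This is a minimal sketch: the helper name set_cpuidle_disable and its optional root argument (defaulting to /sys/devices/system/cpu) are illustrative additions, and it has to be run as root on the target.

```shell
#!/bin/sh
# Write VALUE (1 = disable, 0 = re-enable) to every cpuidle state of every CPU.
# Usage: set_cpuidle_disable VALUE [cpu_root]
set_cpuidle_disable() {
    value="$1"
    root="${2:-/sys/devices/system/cpu}"
    for f in "$root"/cpu[0-9]*/cpuidle/state[0-9]*/disable; do
        [ -e "$f" ] || continue          # the glob did not match anything
        echo "$value" > "$f" && echo "$f <- $value"
    done
}

# Example (on the target, as root):
#   set_cpuidle_disable 1    # disable all idle states
#   set_cpuidle_disable 0    # restore them after the test
```

The setting is not persistent, so it has to be reapplied after each reboot before starting the CAN traffic.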

Then you can also try taking one core offline to check whether you are hitting a race condition:

echo 0 > /sys/devices/system/cpu/cpu1/online
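
To check that the core really went offline, and to bring it back after the test, something like the following could be used. It is only a sketch: the helper name cpu_online_state and its optional root argument are illustrative, while the per-CPU online files are the standard sysfs CPU hotplug interface.

```shell
#!/bin/sh
# Print the online flag (1 = online, 0 = offline) of each hot-pluggable CPU.
# Usage: cpu_online_state [cpu_root]
cpu_online_state() {
    root="${1:-/sys/devices/system/cpu}"
    for f in "$root"/cpu[0-9]*/online; do
        [ -r "$f" ] || continue          # CPUs without an 'online' file are skipped
        cpu="${f%/online}"               # strip the trailing /online
        echo "${cpu##*/}: $(cat "$f")"   # keep only the cpuN directory name
    done
}

# Example (on the target):
#   cpu_online_state
#   echo 1 > /sys/devices/system/cpu/cpu1/online   # bring CPU1 back afterwards
```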

 

BR,

Christophe

 

 


Hello @Christophe Guibout,

Thank you for your reply. I will try to narrow down the cause of the issue using the method you suggested.

I will keep you posted.