[BUG][LNL] DSP panic during the stress capture test #8414

@keqiaozhang

Describe the bug
This issue occurs on LNL-NOCODEC platforms. The DSP panics after dozens of capture test iterations, and the reproduction rate is almost 100%.

The issue does not reproduce in the stress playback test.

dmesg

[ 2519.683135] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx done : 0x13000003|0x0: GLB_SET_PIPELINE_STATE
[ 2519.683141] kernel: snd_sof:sof_ipc4_set_pipeline_state: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc4 set pipeline instance 0 state 4
[ 2519.683145] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx      : 0x13000004|0x0: GLB_SET_PIPELINE_STATE
[ 2519.686193] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc rx      : 0x1b0a0000|0x0: GLB_NOTIFICATION|EXCEPTION_CAUGHT
[ 2519.686197] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: ------------[ DSP dump start ]------------
[ 2519.686201] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: DSP panic!
[ 2519.686203] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: fw_state: SOF_FW_BOOT_COMPLETE (7)
[ 2519.686216] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: ROM status: 0x0, ROM error: 0x0
[ 2519.686218] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: ROM debug status: 0x0, ROM debug error: 0x0
[ 2519.686221] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: ROM feature bit enabled
[ 2519.686252] kernel: snd_sof:sof_ipc4_find_debug_slot_offset_by_type: sof-audio-pci-intel-lnl 0000:00:1f.3: Slot type 0x4c455400 is not available in debug window
[ 2519.686254] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: ------------[ DSP dump end ]------------
[ 2519.686256] kernel: snd_sof:sof_set_fw_state: sof-audio-pci-intel-lnl 0000:00:1f.3: fw_state change: 7 -> 8
[ 2519.686321] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc rx done : 0x1b0a0000|0x0: GLB_NOTIFICATION|EXCEPTION_CAUGHT
[ 2520.188239] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc timed out for 0x13000004|0x0
[ 2520.188251] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: Attempting to prevent DSP from entering D3 state to preserve context
[ 2520.188255] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: ------------[ IPC dump start ]------------
[ 2520.188285] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: Host IPC initiator: 0x93000004|0x0|0x0, target: 0x1b0a0000|0x0|0x0, ctl: 0x3
[ 2520.188289] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: ------------[ IPC dump end ]------------
[ 2520.188291] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: IPC timeout
[ 2520.188315] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: ASoC: error at soc_dai_trigger on SSP2 Pin: -110
[ 2520.188323] kernel:  Port2: ASoC: error at dpcm_be_dai_trigger on Port2: -110
[ 2520.188327] kernel:  Port2: ASoC: trigger FE cmd: 1 failed: -110
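For readers matching the hex headers in the dmesg above to their meaning, here is a minimal decoding sketch. The bit layout (message type in bits 24-28, pipeline ID in bits 16-23 and state in bits 0-15 for GLB_SET_PIPELINE_STATE, notification type in bits 16-23 for GLB_NOTIFICATION) is an assumption based on my reading of the Linux SOF IPC4 header definitions; the helper name `decode_primary` is hypothetical. Only the message names that actually appear in this log are mapped.

```python
# Assumed IPC4 primary-header layout (not stated in this issue):
#   bits 24-28: global message type
#   GLB_SET_PIPELINE_STATE: bits 16-23 pipeline ID, bits 0-15 target state
#   GLB_NOTIFICATION:       bits 16-23 notification type
GLB_TYPES = {0x13: "GLB_SET_PIPELINE_STATE", 0x1B: "GLB_NOTIFICATION"}
NOTIFY_TYPES = {0x0A: "EXCEPTION_CAUGHT"}


def decode_primary(header: int) -> dict:
    """Decode the primary word of an IPC4 header as printed in dmesg."""
    msg_type = (header >> 24) & 0x1F
    out = {"type": GLB_TYPES.get(msg_type, f"unknown({msg_type:#x})")}
    if msg_type == 0x13:
        out["pipeline_id"] = (header >> 16) & 0xFF
        out["state"] = header & 0xFFFF
    elif msg_type == 0x1B:
        notif = (header >> 16) & 0xFF
        out["notification"] = NOTIFY_TYPES.get(notif, f"unknown({notif:#x})")
    return out


if __name__ == "__main__":
    # The last request sent by the host, and the notification that answered it.
    for hdr in (0x13000004, 0x1B0A0000):
        print(f"{hdr:#010x}: {decode_primary(hdr)}")
```

Decoded this way, 0x13000004 matches the "set pipeline instance 0 state 4" line, and 0x1b0a0000 is the EXCEPTION_CAUGHT notification that arrives about 3 ms later. The request is never acknowledged, so the host reports the IPC timeout and -110 (ETIMEDOUT) about 500 ms afterwards.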

mtrace

[   75.929828] <inf> ipc: ipc_cmd: rx	: 0x13000003|0x0
[   75.930236] <inf> ipc: ipc_cmd: rx	: 0x13000004|0x0
[   75.930285] <err> os: xtensa_excint1_c:  ** FATAL EXCEPTION
[   75.930296] <err> os: xtensa_excint1_c:  ** CPU 0 EXCCAUSE 6 (divide by zero)
[   75.930298] <err> os: xtensa_excint1_c:  **  PC 0xa0061c98 VADDR (nil)
[   75.930301] <err> os: xtensa_excint1_c:  **  PS 0x60c20
[   75.930305] <err> os: xtensa_excint1_c:  **    (INTLEVEL:0 EXCM: 0 UM:1 RING:0 WOE:1 OWB:12 CALLINC:2)
[   75.930308] <err> os: z_xtensa_dump_stack:  **  A0 0xa00702a7  SP 0xa00e1e80  A2 (nil)  A3 0xa00c60e8
[   75.930311] <err> os: z_xtensa_dump_stack:  **  A4 0x2  A5 0xa00c60e8  A6 0x1  A7 0xa00e1fc0
[   75.930315] <err> os: z_xtensa_dump_stack:  **  A8 0xa0061c7c  A9 0xa00e1e60 A10 (nil) A11 0xa0100540
[   75.930316] <err> os: z_xtensa_dump_stack:  ** A12 (nil) A13 (nil) A14 0x1 A15 0x1
[   75.930320] <err> os: z_xtensa_dump_stack:  ** LBEG 0xa0057ab3 LEND 0xa0057ac3 LCOUNT 0xa0057973
[   75.930323] <err> os: z_xtensa_dump_stack:  ** SAR 0x20

Backtrace:0xa0061c95:0xa00e1e80 0xa00702a4:0xa00e1ea0 0xa0070721:0xa00e1ec0 0xa006e550:0xa00e1ee0 0xa006dcae:0xa00e1f10 0xa006dba5:0xa00e1f40 0xa006258d:0xa00e1f80 0xa0062fdb:0xa00e1fa0 0xa0062f69:0xa00e1fc0 0xa0062d91:0xa00e2030 0xa003afb0:0xa00e2070 0xa00635d6:0xa00e20a0 0xa005a87f:0xa00e20f0 

[   75.930413] <err> os: z_fatal_error: >>> ZEPHYR FATAL ERROR 0: CPU exception on CPU 0
[   75.930418] <err> os: z_fatal_error: Current thread: 0x400f8d78 (unknown)
[   75.932920] <err> zephyr: k_sys_fatal_error_handler: Halting system
Terminated
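To locate the faulting division, the Backtrace line above can be split into PC:SP pairs and the PCs resolved against the firmware ELF. The sketch below only does the parsing; resolving symbols is assumed to be done separately with an Xtensa-capable addr2line and the ELF from the exact build under test, neither of which is named in this issue. The function name `parse_backtrace` is hypothetical.

```python
import re

# Backtrace line copied verbatim from the mtrace output above.
BACKTRACE = (
    "Backtrace:0xa0061c95:0xa00e1e80 0xa00702a4:0xa00e1ea0 "
    "0xa0070721:0xa00e1ec0 0xa006e550:0xa00e1ee0 0xa006dcae:0xa00e1f10 "
    "0xa006dba5:0xa00e1f40 0xa006258d:0xa00e1f80 0xa0062fdb:0xa00e1fa0 "
    "0xa0062f69:0xa00e1fc0 0xa0062d91:0xa00e2030 0xa003afb0:0xa00e2070 "
    "0xa00635d6:0xa00e20a0 0xa005a87f:0xa00e20f0"
)

PAIR_RE = re.compile(r"(0x[0-9a-fA-F]+):(0x[0-9a-fA-F]+)")


def parse_backtrace(line: str) -> list[tuple[int, int]]:
    """Return (pc, sp) integer pairs, innermost frame first."""
    return [(int(pc, 16), int(sp, 16)) for pc, sp in PAIR_RE.findall(line)]


if __name__ == "__main__":
    # One PC per line, ready to paste as arguments to an addr2line tool.
    for pc, _sp in parse_backtrace(BACKTRACE):
        print(f"{pc:#010x}")
```

The innermost frame, 0xa0061c95, sits 3 bytes below the reported exception PC 0xa0061c98, so both addresses should resolve to the same function.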

To Reproduce
~/sof-test/test-case/check-capture.sh -d 1 -l 100 -r 1

Reproduction Rate
Almost 100%

Environment

  1. Branch name and commit hash of the 2 repositories: sof (firmware/topology) and linux (kernel driver).
  2. Name of the topology file
    • Topology: {development/sof-lnl-nocodec.tplg}
  3. Name of the platform(s) on which the bug is observed.
    • Platform: {LNL-RVP-NOCODEC}

dmesg.txt

mtrace.txt

Metadata

Labels

    • DSP panic: DSP panic observed
    • I2S: Applies to I2S bus for codec connection
    • LNL: Applies to Lunar Lake platform
    • P1: Blocker bugs or important features
    • bug: Something isn't working as expected
