Skip to content

[MTL][SDW] rmmod stuck on kmod-load-unload tests #5015

@plbossart

Description

@plbossart

I've seen repeated issues with the kmod-load-unload not completing on MTL Dell SKU 0CC7

Updating the firmware didn't help. I can still log-in remotely but the device has to be rebooted for audio tests

 starting test_kmod-load-unload 
+ test_kmod-load-unload
+ /root/sof-test/test-case/check-kmod-load-unload.sh -l 1
2024-05-24 17:26:54 UTC Sub-Test: [INFO] ===== Starting iteration 1 of 1 =====
2024-05-24 17:26:54 UTC Sub-Test: [INFO] wait dsp power status to become suspended
2024-05-24 17:26:55 UTC Sub-Test: [INFO] run kmod/sof-kmod-remove.sh

WARNING: running as root is not supported

Specified filename /sys/kernel/debug/sof/trace does not exist.
SKIP	snd_usb_audio  	not loaded
SKIP	snd_hda_intel  	not loaded
SKIP	snd_sof_pci_intel_tng  	not loaded
SKIP	snd_sof_pci_intel_skl  	not loaded
SKIP	snd_sof_pci_intel_apl  	not loaded
SKIP	snd_sof_pci_intel_tgl  	not loaded
SKIP	snd_sof_pci_intel_icl  	not loaded
SKIP	snd_sof_pci_intel_cnl  	not loaded
SKIP	snd_sof_pci_intel_lnl  	not loaded
RMMOD	snd_sof_pci_intel_mtl
[  427.318607] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx      : 0x47000000|0x0: MOD_SET_DX [data size: 8]
[  427.422240] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx reply: 0x67000000|0x0: MOD_SET_DX
[  427.422258] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx done : 0x47000000|0x0: MOD_SET_DX [data size: 8]
[  427.422264] Message payload: 00000000: 00000001 00000000
[  614.731740] INFO: task rmmod:7293 blocked for more than 122 seconds.
[  614.731760]       Not tainted 6.9.0-rc5-test-07960-ga15f31d16f50 #205
[  614.731764] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  614.731767] task:rmmod           state:D stack:13384 pid:7293  tgid:7293  ppid:7292   flags:0x00004002
[  614.731780] Call Trace:
[  614.731784]  <TASK>
[  614.731792]  __schedule+0x385/0xb00
[  614.731812]  schedule+0x3a/0x130
[  614.731817]  schedule_timeout+0x131/0x140
[  614.731828]  __wait_for_common+0xb9/0x1d0
[  614.731833]  ? __pfx_schedule_timeout+0x10/0x10
[  614.731841]  remove_one+0xe8/0x110
[  614.731850]  simple_recursive_removal+0x1e0/0x280
[  614.731856]  ? __pfx_remove_one+0x10/0x10
[  614.731862]  debugfs_remove+0x44/0x70
[  614.731872]  snd_sof_device_remove+0xfb/0x180 [snd_sof]
[  614.731904]  sof_pci_remove+0x1d/0x50 [snd_sof_pci]
[  614.731913]  pci_device_remove+0x3f/0xb0
[  614.731922]  device_release_driver_internal+0x1a9/0x210
[  614.731931]  driver_detach+0x4b/0x90
[  614.731936]  bus_remove_driver+0x70/0xf0
[  614.731941]  pci_unregister_driver+0x3f/0x90
[  614.731948]  __do_sys_delete_module+0x1e0/0x2c0
[  614.731961]  do_syscall_64+0x9d/0x1a0
[  614.731967]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  614.731975] RIP: 0033:0x7f7c360577cb
[  614.731980] RSP: 002b:00007ffc0a379158 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[  614.731985] RAX: ffffffffffffffda RBX: 000055a9a1e8f700 RCX: 00007f7c360577cb
[  614.731988] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 000055a9a1e8f768
[  614.731990] RBP: 00007ffc0a379180 R08: 1999999999999999 R09: 0000000000000000
[  614.731992] R10: 00007f7c360c8ac0 R11: 0000000000000206 R12: 0000000000000000
[  614.731994] R13: 00007ffc0a3793e0 R14: 000055a9a1e8f700 R15: 0000000000000000
[  614.732004]  </TASK>
[  614.732009] 
               Showing all locks held in the system:
[  614.732022] 1 lock held by khungtaskd/154:
[  614.732025]  #0: ffffffff89d6a0c0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x36/0x120
[  614.732101] 1 lock held by dmesg/3479:
[  614.732104]  #0: ffff93f70174a0e0 (&user->lock){....}-{3:3}, at: devkmsg_read+0x6c/0x230
[  614.732117] 2 locks held by rmmod/7293:
[  614.732120]  #0: ffff93f7029251b0 (&dev->mutex){....}-{3:3}, at: device_release_driver_internal+0x3d/0x210
[  614.732130]  #1: ffff93f70655beb8 (&sb->s_type->i_mutex_key#2){....}-{3:3}, at: simple_recursive_removal+0x15f/0x280

[  614.732144] =============================================

[  737.608739] INFO: task rmmod:7293 blocked for more than 245 seconds.
[  737.608753]       Not tainted 6.9.0-rc5-test-07960-ga15f31d16f50 #205
[  737.608757] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  737.608760] task:rmmod           state:D stack:13384 pid:7293  tgid:7293  ppid:7292   flags:0x00004002
[  737.608770] Call Trace:
[  737.608773]  <TASK>
[  737.608779]  __schedule+0x385/0xb00
[  737.608795]  schedule+0x3a/0x130
[  737.608800]  schedule_timeout+0x131/0x140
[  737.608811]  __wait_for_common+0xb9/0x1d0
[  737.608816]  ? __pfx_schedule_timeout+0x10/0x10
[  737.608824]  remove_one+0xe8/0x110
[  737.608831]  simple_recursive_removal+0x1e0/0x280
[  737.608836]  ? __pfx_remove_one+0x10/0x10
[  737.608843]  debugfs_remove+0x44/0x70
[  737.608850]  snd_sof_device_remove+0xfb/0x180 [snd_sof]
[  737.608878]  sof_pci_remove+0x1d/0x50 [snd_sof_pci]
[  737.608886]  pci_device_remove+0x3f/0xb0
[  737.608894]  device_release_driver_internal+0x1a9/0x210
[  737.608902]  driver_detach+0x4b/0x90
[  737.608907]  bus_remove_driver+0x70/0xf0
[  737.608912]  pci_unregister_driver+0x3f/0x90
[  737.608918]  __do_sys_delete_module+0x1e0/0x2c0
[  737.608930]  do_syscall_64+0x9d/0x1a0
[  737.608936]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  737.608943] RIP: 0033:0x7f7c360577cb
[  737.608947] RSP: 002b:00007ffc0a379158 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[  737.608952] RAX: ffffffffffffffda RBX: 000055a9a1e8f700 RCX: 00007f7c360577cb
[  737.608954] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 000055a9a1e8f768
[  737.608957] RBP: 00007ffc0a379180 R08: 1999999999999999 R09: 0000000000000000
[  737.608959] R10: 00007f7c360c8ac0 R11: 0000000000000206 R12: 0000000000000000
[  737.608961] R13: 00007ffc0a3793e0 R14: 000055a9a1e8f700 R15: 0000000000000000
[  737.608970]  </TASK>
[  737.608975] 
               Showing all locks held in the system:
[  737.608986] 1 lock held by khungtaskd/154:
[  737.608988]  #0: ffffffff89d6a0c0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x36/0x120
[  737.609019] 1 lock held by dmesg/3479:
[  737.609021]  #0: ffff93f70174a0e0 (&user->lock){....}-{3:3}, at: devkmsg_read+0x6c/0x230
[  737.609032] 2 locks held by rmmod/7293:
[  737.609035]  #0: ffff93f7029251b0 (&dev->mutex){....}-{3:3}, at: device_release_driver_internal+0x3d/0x210
[  737.609045]  #1: ffff93f70655beb8 (&sb->s_type->i_mutex_key#2){....}-{3:3}, at: simple_recursive_removal+0x15f/0x280

[  737.609059] =============================================


Metadata

Metadata

Assignees

No one assigned

    Labels

    MTLApplies to Meteor Lake platform.SDWApplies to SoundWire bus for codec connection

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions