Skip to content

Conversation

@marc-hb
Copy link
Collaborator

@marc-hb marc-hb commented May 9, 2020

Re-adding missing goto err: fixes the crash below. It was accidentally
dropped when rewriting commit e150ef4 ("ASoC: SOF: Introduce
extended manifest") after revert commit d8e25a1 ("ASoC: SOF: Fix
build")

May 08 17:20:22 up2 kernel: sof-audio-pci 0000:00:0e.0: use msi interrupt mode
May 08 17:20:22 up2 kernel: sof-audio-pci 0000:00:0e.0: hda codecs found, mask 4
May 08 17:20:22 up2 kernel: sof-audio-pci 0000:00:0e.0: using HDA machine driver skl_hda_dsp_generic now
May 08 17:20:22 up2 kernel: sof-audio-pci 0000:00:0e.0: DMICs detected in NHLT tables: 0
May 08 17:20:22 up2 kernel: sof-audio-pci 0000:00:0e.0: Direct firmware load for intel/sof/community/sof-apl.ri failed with error -2
May 08 17:20:22 up2 kernel: sof-audio-pci 0000:00:0e.0: error: request firmware intel/sof/community/sof-apl.ri failed err: -2
May 08 17:20:22 up2 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
May 08 17:20:22 up2 kernel: #PF: supervisor read access in kernel mode
May 08 17:20:22 up2 kernel: #PF: error_code(0x0000) - not-present page
May 08 17:20:22 up2 kernel: PGD 0 P4D 0
May 08 17:20:22 up2 kernel: Oops: 0000 [#1] SMP NOPTI
May 08 17:20:22 up2 kernel: CPU: 3 PID: 86 Comm: kworker/3:1 Not tainted 5.7.0-rc3-sof+ #12
May 08 17:20:22 up2 kernel: Hardware name: AAEON UP-APL01/UP-APL01, BIOS UPA1AM40 08/06/2018
May 08 17:20:22 up2 kernel: Workqueue: events sof_probe_work [snd_sof]
May 08 17:20:22 up2 kernel: RIP: 0010:snd_sof_load_firmware_raw+0x7b/0x190 [snd_sof]
May 08 17:20:22 up2 kernel: Code: 85 c0 0f 84 2b 01 00 00 48 8b 13 48 89 c6 48 89 ef e8 d9 e3 d5 da 41 89 c4 85 c0 0f 88 24 92 00 00 0f 1f 4>
May 08 17:20:22 up2 kernel: RSP: 0018:ffffa8db80223e18 EFLAGS: 00010246
May 08 17:20:22 up2 kernel: RAX: 0000000000000000 RBX: ffff928dee8a6828 RCX: 0000000000000000
May 08 17:20:22 up2 kernel: RDX: 0000000000000007 RSI: 0000000000000002 RDI: ffff928dfbd98820
May 08 17:20:22 up2 kernel: RBP: ffff928deebe7c28 R08: 00000001383405ae R09: ffffffff9c8c1374
May 08 17:20:22 up2 kernel: R10: 0000000000000393 R11: 000000000002cfc0 R12: 00000000fffffffe
May 08 17:20:22 up2 kernel: R13: ffff928deebe7c28 R14: 0000000000000000 R15: ffff928de8d79bc0
May 08 17:20:22 up2 kernel: FS:  0000000000000000(0000) GS:ffff928dfbd80000(0000) knlGS:0000000000000000
May 08 17:20:22 up2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 08 17:20:22 up2 kernel: CR2: 0000000000000000 CR3: 0000000178fb4000 CR4: 00000000003406e0
May 08 17:20:22 up2 kernel: Call Trace:
May 08 17:20:22 up2 kernel:  sof_probe_work+0x81/0x220 [snd_sof]
May 08 17:20:22 up2 kernel:  process_one_work+0x1d2/0x3a0
May 08 17:20:22 up2 kernel:  worker_thread+0x45/0x3c0
May 08 17:20:22 up2 kernel:  kthread+0xf6/0x130
May 08 17:20:22 up2 kernel:  ? process_one_work+0x3a0/0x3a0
May 08 17:20:22 up2 kernel:  ? kthread_park+0x80/0x80
May 08 17:20:22 up2 kernel:  ret_from_fork+0x35/0x40
May 08 17:20:22 up2 kernel: Modules linked in: snd_hda_codec_hdmi snd_soc_hdac_hdmi snd_soc_dmic snd_sof_pci snd_sof_intel_hda_common snd_so>
May 08 17:20:22 up2 kernel: CR2: 0000000000000000
May 08 17:20:22 up2 kernel: ---[ end trace 33baffe6567bf588 ]---
May 08 17:20:22 up2 kernel: RIP: 0010:snd_sof_load_firmware_raw+0x7b/0x190 [snd_sof]
May 08 17:20:22 up2 kernel: Code: 85 c0 0f 84 2b 01 00 00 48 8b 13 48 89 c6 48 89 ef e8 d9 e3 d5 da 41 89 c4 85 c0 0f 88 24 92 00 00 0f 1f 4>
May 08 17:20:22 up2 kernel: RSP: 0018:ffffa8db80223e18 EFLAGS: 00010246
May 08 17:20:22 up2 kernel: RAX: 0000000000000000 RBX: ffff928dee8a6828 RCX: 0000000000000000
May 08 17:20:22 up2 kernel: RDX: 0000000000000007 RSI: 0000000000000002 RDI: ffff928dfbd98820
May 08 17:20:22 up2 kernel: RBP: ffff928deebe7c28 R08: 00000001383405ae R09: ffffffff9c8c1374
May 08 17:20:22 up2 kernel: R10: 0000000000000393 R11: 000000000002cfc0 R12: 00000000fffffffe
May 08 17:20:22 up2 kernel: R13: ffff928deebe7c28 R14: 0000000000000000 R15: ffff928de8d79bc0
May 08 17:20:22 up2 kernel: FS:  0000000000000000(0000) GS:ffff928dfbd80000(0000) knlGS:0000000000000000
May 08 17:20:22 up2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 08 17:20:22 up2 kernel: CR2: 0000000000000000 CR3: 0000000178fb4000 CR4: 00000000003406e0

Signed-off-by: Marc Herbert marc.herbert@intel.com

Re-adding missing goto err: fixes the crash below. It was accidentally
dropped when rewriting commit e150ef4 ("ASoC: SOF: Introduce
extended manifest") after revert commit d8e25a1 ("ASoC: SOF: Fix
build")

May 08 17:20:22 up2 kernel: sof-audio-pci 0000:00:0e.0: use msi interrupt mode
May 08 17:20:22 up2 kernel: sof-audio-pci 0000:00:0e.0: hda codecs found, mask 4
May 08 17:20:22 up2 kernel: sof-audio-pci 0000:00:0e.0: using HDA machine driver skl_hda_dsp_generic now
May 08 17:20:22 up2 kernel: sof-audio-pci 0000:00:0e.0: DMICs detected in NHLT tables: 0
May 08 17:20:22 up2 kernel: sof-audio-pci 0000:00:0e.0: Direct firmware load for intel/sof/community/sof-apl.ri failed with error -2
May 08 17:20:22 up2 kernel: sof-audio-pci 0000:00:0e.0: error: request firmware intel/sof/community/sof-apl.ri failed err: -2
May 08 17:20:22 up2 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
May 08 17:20:22 up2 kernel: #PF: supervisor read access in kernel mode
May 08 17:20:22 up2 kernel: #PF: error_code(0x0000) - not-present page
May 08 17:20:22 up2 kernel: PGD 0 P4D 0
May 08 17:20:22 up2 kernel: Oops: 0000 [thesofproject#1] SMP NOPTI
May 08 17:20:22 up2 kernel: CPU: 3 PID: 86 Comm: kworker/3:1 Not tainted 5.7.0-rc3-sof+ thesofproject#12
May 08 17:20:22 up2 kernel: Hardware name: AAEON UP-APL01/UP-APL01, BIOS UPA1AM40 08/06/2018
May 08 17:20:22 up2 kernel: Workqueue: events sof_probe_work [snd_sof]
May 08 17:20:22 up2 kernel: RIP: 0010:snd_sof_load_firmware_raw+0x7b/0x190 [snd_sof]
May 08 17:20:22 up2 kernel: Code: 85 c0 0f 84 2b 01 00 00 48 8b 13 48 89 c6 48 89 ef e8 d9 e3 d5 da 41 89 c4 85 c0 0f 88 24 92 00 00 0f 1f 4>
May 08 17:20:22 up2 kernel: RSP: 0018:ffffa8db80223e18 EFLAGS: 00010246
May 08 17:20:22 up2 kernel: RAX: 0000000000000000 RBX: ffff928dee8a6828 RCX: 0000000000000000
May 08 17:20:22 up2 kernel: RDX: 0000000000000007 RSI: 0000000000000002 RDI: ffff928dfbd98820
May 08 17:20:22 up2 kernel: RBP: ffff928deebe7c28 R08: 00000001383405ae R09: ffffffff9c8c1374
May 08 17:20:22 up2 kernel: R10: 0000000000000393 R11: 000000000002cfc0 R12: 00000000fffffffe
May 08 17:20:22 up2 kernel: R13: ffff928deebe7c28 R14: 0000000000000000 R15: ffff928de8d79bc0
May 08 17:20:22 up2 kernel: FS:  0000000000000000(0000) GS:ffff928dfbd80000(0000) knlGS:0000000000000000
May 08 17:20:22 up2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 08 17:20:22 up2 kernel: CR2: 0000000000000000 CR3: 0000000178fb4000 CR4: 00000000003406e0
May 08 17:20:22 up2 kernel: Call Trace:
May 08 17:20:22 up2 kernel:  sof_probe_work+0x81/0x220 [snd_sof]
May 08 17:20:22 up2 kernel:  process_one_work+0x1d2/0x3a0
May 08 17:20:22 up2 kernel:  worker_thread+0x45/0x3c0
May 08 17:20:22 up2 kernel:  kthread+0xf6/0x130
May 08 17:20:22 up2 kernel:  ? process_one_work+0x3a0/0x3a0
May 08 17:20:22 up2 kernel:  ? kthread_park+0x80/0x80
May 08 17:20:22 up2 kernel:  ret_from_fork+0x35/0x40
May 08 17:20:22 up2 kernel: Modules linked in: snd_hda_codec_hdmi snd_soc_hdac_hdmi snd_soc_dmic snd_sof_pci snd_sof_intel_hda_common snd_so>
May 08 17:20:22 up2 kernel: CR2: 0000000000000000
May 08 17:20:22 up2 kernel: ---[ end trace 33baffe6567bf588 ]---
May 08 17:20:22 up2 kernel: RIP: 0010:snd_sof_load_firmware_raw+0x7b/0x190 [snd_sof]
May 08 17:20:22 up2 kernel: Code: 85 c0 0f 84 2b 01 00 00 48 8b 13 48 89 c6 48 89 ef e8 d9 e3 d5 da 41 89 c4 85 c0 0f 88 24 92 00 00 0f 1f 4>
May 08 17:20:22 up2 kernel: RSP: 0018:ffffa8db80223e18 EFLAGS: 00010246
May 08 17:20:22 up2 kernel: RAX: 0000000000000000 RBX: ffff928dee8a6828 RCX: 0000000000000000
May 08 17:20:22 up2 kernel: RDX: 0000000000000007 RSI: 0000000000000002 RDI: ffff928dfbd98820
May 08 17:20:22 up2 kernel: RBP: ffff928deebe7c28 R08: 00000001383405ae R09: ffffffff9c8c1374
May 08 17:20:22 up2 kernel: R10: 0000000000000393 R11: 000000000002cfc0 R12: 00000000fffffffe
May 08 17:20:22 up2 kernel: R13: ffff928deebe7c28 R14: 0000000000000000 R15: ffff928de8d79bc0
May 08 17:20:22 up2 kernel: FS:  0000000000000000(0000) GS:ffff928dfbd80000(0000) knlGS:0000000000000000
May 08 17:20:22 up2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 08 17:20:22 up2 kernel: CR2: 0000000000000000 CR3: 0000000178fb4000 CR4: 00000000003406e0

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
@marc-hb marc-hb requested a review from ktrzcinx May 9, 2020 20:04
@marc-hb
Copy link
Collaborator Author

marc-hb commented May 9, 2020

The only checkpatch warnings are:

  • Possible unwrapped commit description (prefer a maximum 75 chars per line) because of the call trace
  • Unknown commit ID because of the git clone --depth 20
    This is only a fixup! commit anyway.

@marc-hb
Copy link
Collaborator Author

marc-hb commented May 9, 2020

The pr-device-test failures are platform-specific whereas this patch is not. The failures are also long after loading the firmware.

@marc-hb marc-hb marked this pull request as ready for review May 9, 2020 20:56
@marc-hb
Copy link
Collaborator Author

marc-hb commented May 9, 2020

Looking at git diff d8e25a10ef87~1 -- sound/soc/sof/loader.c I think some other error handling was dropped too but I don't understand that code enough to tell if that other "drop" was accidental too or intentional.

Do we have any "missing or invalid firmware" test? I didn't find any.

@marc-hb marc-hb added Boot Firmware boot or code signing related. P1 Blocker bugs or important features labels May 9, 2020
@ktrzcinx
Copy link
Member

This goto was added as separate fixup PR also in first attempt, it's why it's missed.

Copy link
Collaborator

@lyakh lyakh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering how this got caught? We don't check this in our automated testing? Was it just by a lucky coincidence too?

@kv2019i
Copy link
Collaborator

kv2019i commented May 11, 2020

Phew, good catch @marc-hb !

@kv2019i kv2019i merged commit 6da1967 into thesofproject:topic/sof-dev May 11, 2020
@marc-hb marc-hb deleted the nofw-crash-fixup branch May 11, 2020 22:38
@marc-hb
Copy link
Collaborator Author

marc-hb commented May 12, 2020

Looking at git diff d8e25a10ef87~1 -- sound/soc/sof/loader.c

Another suspicious difference compared to the version that was reverted is for instance this one:

--- a/sound/soc/sof/loader.c
+++ b/sound/soc/sof/loader.c
@@ -283,11 +274,11 @@ static int snd_sof_fw_ext_man_parse(struct snd_sof_dev *sdev,
        if (remaining) {
                dev_err(sdev->dev, "error: sof_ext_man header is inconsistent\n");
                ret = -EINVAL;
        }
 
-       return ret;
+       return ret ? ext_man_size : ret;
 }
 
 /*
  * IPC Firmware ready.
  */

Should it be return ret ? ret : ext_man_size ?

Was this error handling code not tested either?

@ktrzcinx
Copy link
Member

@marc-hb Good catch. I send fixup - #2095

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Boot Firmware boot or code signing related. P1 Blocker bugs or important features

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants