-
Notifications
You must be signed in to change notification settings - Fork 140
ASoC: SOF: ipc: Log unexpected DSP replies as debug messages instead of error #1575
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
sound/soc/sof/ipc.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you clarify what you are trying to fix?
Is this that the error level is too high?
This doesn't change any functionality so why change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@plbossart yes it isnt really an error. We see this every time the FW boots when the DSP resumes and it is misleading
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then let's describe what the change really is in the commit message...
@plbossart i've updated the commit message now. Is this better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ranj063 I find the updated commit still confusing, specifically this part "In fact, the host just ignores this message
and returns an error in such cases. But, the error message could be
misleading when trying to debug other issues."
It's odd to mention an 'error in such cases' when it's removed here? Something's not quite right.
|
Google is asking to fix the error print, they understood it is not causing functionality issue. Also, question we got it is: |
then let's describe what the change really is in the commit message... |
…of error During FW boot, the host receives a reply message from the DSP when it is not really expected. This reply message has not been seen to cause any issues. In fact, the host just ignores this message and returns an error in such cases. But, the error message could be misleading when trying to debug other issues. So, modify the snd_sof_ipc_reply()'s return type to void and log the unexpected reply as a debug message. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
5ce7206 to
d0716a4
Compare
|
@ranj063 @plbossart I prefer keep this error message and eliminate the root cause why this happen, we already had PR #1458 to fix it. |
@keyonjie I'm all for fixing the root cause but didn't we revert that commit from you? I can't recollect why though. Also, even if you want to keep the error, what's the point of returning an error that's not cared about anyway? |
Oh We thought it lead to regression and Pierre reverted it, though that's unjust for it. :) after #1524 merged, let's get #1458 back, then this "unexpected" message at boot time should not happen anymore.
I think if there is still this error happen again after #1458 applied, we should treat it as an issue, root cause and fix it. That's the value why we should keep the error message here. |
@keyonjie I am not touching ANY PRs related to IPC until there are PR #1524 is not in a state where it can be merged, this entire S0ix effort is unfortunately going in circles, mostly because we keep tying D0ix to suspend/resume and not enabling it on plain vanilla Ubuntu platforms. I voiced my concerns many months ago and we are still contemplating what needs to happen... |
plbossart
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ranj063 can you update the commit message to state that the return value of snd_sof_ipc_reply is currently ignored? the wording "the host just ignores this message
and returns an error in such cases" is a bit confusing
|
SOFCI TEST |
|
@plbossart sorry I didnt update the commit message at all. I am seeing some weird devce test results with CI, so I just re-triggered this one to see whats going on. In fact, I think I should close this PR. The right fix for the issue is #1458 and it should be re-applied after the S0ix patches are appalied. Sorry for the confusion |
The driver does not check if the cooling state passed to gpio_fan_set_cur_state() exceeds the maximum cooling state as stored in fan_data->num_speeds. Since the cooling state is later used as an array index in set_fan_speed(), an array out of bounds access can occur. This can be exploited by setting the state of the thermal cooling device to arbitrary values, causing for example a kernel oops when unavailable memory is accessed this way. Example kernel oops: [ 807.987276] Unable to handle kernel paging request at virtual address ffffff80d0588064 [ 807.987369] Mem abort info: [ 807.987398] ESR = 0x96000005 [ 807.987428] EC = 0x25: DABT (current EL), IL = 32 bits [ 807.987477] SET = 0, FnV = 0 [ 807.987507] EA = 0, S1PTW = 0 [ 807.987536] FSC = 0x05: level 1 translation fault [ 807.987570] Data abort info: [ 807.987763] ISV = 0, ISS = 0x00000005 [ 807.987801] CM = 0, WnR = 0 [ 807.987832] swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000001165000 [ 807.987872] [ffffff80d0588064] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000 [ 807.987961] Internal error: Oops: 96000005 [#1] PREEMPT SMP [ 807.987992] Modules linked in: cmac algif_hash aes_arm64 algif_skcipher af_alg bnep hci_uart btbcm bluetooth ecdh_generic ecc 8021q garp stp llc snd_soc_hdmi_codec brcmfmac vc4 brcmutil cec drm_kms_helper snd_soc_core cfg80211 snd_compress bcm2835_codec(C) snd_pcm_dmaengine syscopyarea bcm2835_isp(C) bcm2835_v4l2(C) sysfillrect v4l2_mem2mem bcm2835_mmal_vchiq(C) raspberrypi_hwmon sysimgblt videobuf2_dma_contig videobuf2_vmalloc fb_sys_fops videobuf2_memops rfkill videobuf2_v4l2 videobuf2_common i2c_bcm2835 snd_bcm2835(C) videodev snd_pcm snd_timer snd mc vc_sm_cma(C) gpio_fan uio_pdrv_genirq uio drm fuse drm_panel_orientation_quirks backlight ip_tables x_tables ipv6 [ 807.988508] CPU: 0 PID: 1321 Comm: bash Tainted: G C 5.15.56-v8+ #1575 [ 807.988548] Hardware name: Raspberry Pi 3 Model B Rev 1.2 (DT) [ 807.988574] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 807.988608] pc : set_fan_speed.part.5+0x34/0x80 [gpio_fan] [ 807.988654] lr : gpio_fan_set_cur_state+0x34/0x50 [gpio_fan] [ 807.988691] sp : ffffffc008cf3bd0 [ 807.988710] x29: ffffffc008cf3bd0 x28: ffffff80019edac0 x27: 0000000000000000 [ 807.988762] x26: 0000000000000000 x25: 0000000000000000 x24: ffffff800747c920 [ 807.988787] x23: 000000000000000a x22: ffffff800369f000 x21: 000000001999997c [ 807.988854] x20: ffffff800369f2e8 x19: ffffff8002ae8080 x18: 0000000000000000 [ 807.988877] x17: 0000000000000000 x16: 0000000000000000 x15: 000000559e271b70 [ 807.988938] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 [ 807.988960] x11: 0000000000000000 x10: ffffffc008cf3c20 x9 : ffffffcfb60c741c [ 807.989018] x8 : 000000000000000a x7 : 00000000ffffffc9 x6 : 0000000000000009 [ 807.989040] x5 : 000000000000002a x4 : 0000000000000000 x3 : ffffff800369f2e8 [ 807.989062] x2 : 000000000000e780 x1 : 0000000000000001 x0 : ffffff80d0588060 [ 807.989084] Call trace: [ 807.989091] set_fan_speed.part.5+0x34/0x80 [gpio_fan] [ 807.989113] gpio_fan_set_cur_state+0x34/0x50 [gpio_fan] [ 807.989199] cur_state_store+0x84/0xd0 [ 807.989221] dev_attr_store+0x20/0x38 [ 807.989262] sysfs_kf_write+0x4c/0x60 [ 807.989282] kernfs_fop_write_iter+0x130/0x1c0 [ 807.989298] new_sync_write+0x10c/0x190 [ 807.989315] vfs_write+0x254/0x378 [ 807.989362] ksys_write+0x70/0xf8 [ 807.989379] __arm64_sys_write+0x24/0x30 [ 807.989424] invoke_syscall+0x4c/0x110 [ 807.989442] el0_svc_common.constprop.3+0xfc/0x120 [ 807.989458] do_el0_svc+0x2c/0x90 [ 807.989473] el0_svc+0x24/0x60 [ 807.989544] el0t_64_sync_handler+0x90/0xb8 [ 807.989558] el0t_64_sync+0x1a0/0x1a4 [ 807.989579] Code: b9403801 f9402800 7100003f 8b35cc00 (b9400416) [ 807.989627] ---[ end trace 8ded4c918658445b ]--- Fix this by checking the cooling state and return an error if it exceeds the maximum cooling state. Tested on a Raspberry Pi 3. Fixes: b5cf88e ("(gpio-fan): Add thermal control hooks") Signed-off-by: Armin Wolf <W_Armin@gmx.de> Link: https://lore.kernel.org/r/20220830011101.178843-1-W_Armin@gmx.de Signed-off-by: Guenter Roeck <linux@roeck-us.net>
We currently ignore the reply messages from the DSP
when they are not expected but call it out as an error
and the return value of snd_sof_ipc_reply() is never
checked. So, change it to a void function and handle
replies only when they are expected.
Signed-off-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com