-
Notifications
You must be signed in to change notification settings - Fork 140
ASoC: SOF: add spin_lock to protect stream allocation function #279
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
keyonjie
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, only minor comment, shall we use spin_lock/unlock_irq() which can disable the preempted and changed by irq handler?
8754817 to
1a4dac8
Compare
|
yes, it should be spin_lock_irq. I was misleaded by some code. |
|
yes, I search the code, spin_lock_irq is used except one case which is copied by me...... |
|
In addition reg_lock is not needed when SOF_HDA is not needed...so this patch is getting in the way of the work done to have a better split between HDA and non-HDA platforms. |
|
Actually it is needed for all case no matter of HD or non-HD. For I2S codec, stream is used for host dma |
|
yes, as Rander stated, for non-HD case, we still need manage host-DMA and host streams, so better to initialize bus->reg_lock for all cases(put it out of SOF_HDA braces). |
|
ok, so it's needed, but why are we not using spin_lock_irqsave() in hda_dsp_ipc_irq_handler() and hda_dsp_stream_interrupt()? All other platforms do so in interrupt context, so is this a miss or something intentional? |
|
Hi Pierre, IMHO, spin_lock_irqsave() is useful when nested lock is hold with local irq disabled in interrupt context, in our case, both hda_dsp_ipc_irq_handler() and hda_dsp_stream_interrupt() are holding only one lock(and different lock), so spin_lock() there should be fine. Switching to use spin_lock_irqsave() pairs may lead to a little more time cost at irq and preempt store/restore, but I am fine with the change if that make us more released from concerning about dead lock possibility. |
|
@keyonjie I am utterly confused by your answer. The first part says there is no problem but the second part says we might have deadlock cases, which hints at scarier and fundamental issues with the design? Can you please clarify your point? |
|
@plbossart my point is that spin_lock() now in hda_dsp_ipc_irq_handler() and hda_dsp_stream_interrupt() are fine. |
I can't find where we are calling hda_dsp_stream_get() and hda_dsp_stream_put(). I believe the spin_lock is trying to protect bus->stream_list. However, there are still other functions changing it, like hda_dsp_stream_get_pstream() and etc. Any reason we don't need the spin_lock in those functions? |
|
@libinyang I believe you are looking into old code base, xx_get_pstream() is not existed anymore. Per my understanding, using spin_lock_irq() in thread context is fine, but in interrupt context, we need to be careful when using it(spin_lock_irq), spin_unlock_irq() there will enable irq which may confuse embedded spin_lock_irq() within the same irq handler. To avoid this kind of deadlock, we can use spin_lock_irqsave(). |
It seems my code is not sync with yours. Thanks for point it out. |
|
@libinyang thanks for education, so looks spin_lock() in hda_dsp_ipc_irq_handler() and hda_dsp_stream_interrupt() are fine, but in hda_dsp_stream_get/put() what @RanderWang's PR listed, better to switch to spin_lock_irqsave(). |
|
@keyonjie you will need to educate me further on why we can use spin_lock() in hda_dsp_ipc_irq_handler() when all other interrupt handling uses spin_lock_irq_save() in the SOF code base. If I understand well, the default is to use spin_lock_irq_save() unless you absolutely now that spin_lock() is risk-free, and I am not getting a good explanation as to why it is? |
|
@plbossart taking hda_dsp_ipc_irq_handler() as example, the resource we are locking here is sdev->hw_lock, we are holding this lock in these places of SOF code: $ grep "hw_lock" sound/soc/sof/* -r a. interrupt context: So, the scenarios I can imagine here are:
For both #1 and #2, I can't imagine any possibility for dead lock happening, so that's why I think we can keep using spin_lock() in hda_dsp_ipc_irq_handler(). |
|
Yes, it is good case for us to make clear everything. I also study lock in Linux kernel, there is a good article https://kernel.readthedocs.io/en/sphinx-samples/kernel-locking.html @plbossart @keyonjie The irq handler does not to use spin_lock_irq(), because the softirq (or other low priority thread) cannot run while the irq handler is running: it can use spin_lock(), which is slightly faster. The only exception would be if a different hardware irq handler uses the same lock: spin_lock_irq() will stop that from interrupting us. I study the code and my idea is why spin_lock of hw_lock is needed in hda_dsp_ipc_irq_handler ? Actually only HDA_DSP_REG_ADSPIC is accessed by irq_thread function which is called after irq_handler function. So no need to call spin_lock and it is not a good idea to call spin_lock in IRQ function |
|
And I also check all the usage of bus->reg_lock. Two types of scenarios, to protect stream between two user threads in kernel or user threads with a softirq function: azx_irq_pending_work. It is never used in IRQ function, so spin_lock_irq is suitable, and of course spin_lock_irqsave is always right. Actually, in HDA framework, spin_lock_irqsave is never used on bus->reg_lock |
According to hda framework, stream should be protected by reg_lock Signed-off-by: Rander Wang <rander.wang@linux.intel.com>
1a4dac8 to
c2625cb
Compare
Hi @RanderWang, I agree we actually can delete them, but with minor change as below at the same time: |
|
Sorry, @keyonjie @RanderWang your explanations are going in all sorts of directions without answering to my very simple question. I was asking whether "reg_lock" needs to be protected by spin_lock() or or spin_lock_irq() or spin_lock_irqsave(). How "hw_lock" is managed is a different discussion, let's take things one step at a time, shall we? Also your answers are sometimes quite difficult to follow, please re-read your write up to make sure there aren't missing verbs or that the sentence is easy to understand. Using 'it' or 'that' after 3 lines of explanations including multiple entities is a sure way to lose the reader. It may be perfectly clear to you, it's totally ambiguous to me. |
|
I checked all the usage of bus->reg_lock in Linux kernel. Two types of scenarios are found, protecting stream between two user threads in kernel or protecting stream between user threads and a softirq function: azx_irq_pending_work. bus->reg_lock is never used in IRQ function, so spin_lock_irq is suitable. Actually, in HDA framework, spin_lock_irqsave is never used on bus->reg_lock. |
|
@RanderWang thanks for double-checking, merging now |
|
@xiulipan I have no idea what the CI issues are, will merge anyways. |
|
@plbossart |
Raw packet from PF_PACKET socket ontop of an IPv6-backed ipvlan device will hit WARN_ON_ONCE() in sk_mc_loop() through sch_direct_xmit() path. WARNING: CPU: 2 PID: 0 at net/core/sock.c:775 sk_mc_loop+0x2d/0x70 Modules linked in: sch_netem ipvlan rfkill cirrus drm_shmem_helper sg drm_kms_helper CPU: 2 PID: 0 Comm: swapper/2 Kdump: loaded Not tainted 6.9.0+ #279 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 RIP: 0010:sk_mc_loop+0x2d/0x70 Code: fa 0f 1f 44 00 00 65 0f b7 15 f7 96 a3 4f 31 c0 66 85 d2 75 26 48 85 ff 74 1c RSP: 0018:ffffa9584015cd78 EFLAGS: 00010212 RAX: 0000000000000011 RBX: ffff91e585793e00 RCX: 0000000002c6a001 RDX: 0000000000000000 RSI: 0000000000000040 RDI: ffff91e589c0f000 RBP: ffff91e5855bd100 R08: 0000000000000000 R09: 3d00545216f43d00 R10: ffff91e584fdcc50 R11: 00000060dd8616f4 R12: ffff91e58132d000 R13: ffff91e584fdcc68 R14: ffff91e5869ce800 R15: ffff91e589c0f000 FS: 0000000000000000(0000) GS:ffff91e898100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f788f7c44c0 CR3: 0000000008e1a000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> ? __warn (kernel/panic.c:693) ? sk_mc_loop (net/core/sock.c:760) ? report_bug (lib/bug.c:201 lib/bug.c:219) ? handle_bug (arch/x86/kernel/traps.c:239) ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1)) ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:621) ? sk_mc_loop (net/core/sock.c:760) ip6_finish_output2 (net/ipv6/ip6_output.c:83 (discriminator 1)) ? nf_hook_slow (net/netfilter/core.c:626) ip6_finish_output (net/ipv6/ip6_output.c:222) ? __pfx_ip6_finish_output (net/ipv6/ip6_output.c:215) ipvlan_xmit_mode_l3 (drivers/net/ipvlan/ipvlan_core.c:602) ipvlan ipvlan_start_xmit (drivers/net/ipvlan/ipvlan_main.c:226) ipvlan dev_hard_start_xmit (net/core/dev.c:3594) sch_direct_xmit (net/sched/sch_generic.c:343) __qdisc_run (net/sched/sch_generic.c:416) net_tx_action (net/core/dev.c:5286) handle_softirqs (kernel/softirq.c:555) __irq_exit_rcu (kernel/softirq.c:589) sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1043) The warning triggers as this: packet_sendmsg packet_snd //skb->sk is packet sk __dev_queue_xmit __dev_xmit_skb //q->enqueue is not NULL __qdisc_run sch_direct_xmit dev_hard_start_xmit ipvlan_start_xmit ipvlan_xmit_mode_l3 //l3 mode ipvlan_process_outbound //vepa flag ipvlan_process_v6_outbound ip6_local_out __ip6_finish_output ip6_finish_output2 //multicast packet sk_mc_loop //sk->sk_family is AF_PACKET Call ip{6}_local_out() with NULL sk in ipvlan as other tunnels to fix this. Fixes: 2ad7bf3 ("ipvlan: Initial check-in of the IPVLAN driver.") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240529095633.613103-1-yuehaibing@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
According to hda framework, stream should be protected by reg_lock
Signed-off-by: Rander Wang rander.wang@linux.intel.com