-
Notifications
You must be signed in to change notification settings - Fork 140
S0ix rework #1631
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
S0ix rework #1631
Conversation
4055f3e to
32dd5f0
Compare
|
SOFCI TEST |
keyonjie
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ranj063 Thank you for working on this, it looks work to me, my biggest open is that should we put these D0I3 related stuff in SOF generic or better to put them into Intel cAVS platform only.
The idea of scheduling D0I3 delayed work on each IPC sent worth more discussions from stakeholders, concerns from me include how should we handle those received IPC(from FW) and the power state changes during the delayed timeout or concurrent read/write without protection.
32dd5f0 to
d58d4d6
Compare
|
@keyonjie I have added the mutex now and consolidated the DSP power state transition to one place in hda_dsp_set_power_state(). There should be no other place where we should be handling the DSP state transition other than this. Let me know what you think of this and then we can take it to Pierre |
8b2c8d4 to
46152b8
Compare
|
@plbossart I think this PR is ready for review now. Keyon and I agree upon everything except the placement of the d0i3_work. While I have it as part of the top-level SOF device, he'd like it to be in the hda device. We could use your advise on what you think. |
plbossart
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry @ranj063 and @keyonjie you are not going to like my feedback. I realize there's a lot of work but I wonder if some of the directions are the right ones.
My main concern is that you chose design patterns that lead to mistakes, such as a 3-value state which you test with an if/else, or overloading state and transitions. if you test state != D3, then the else covers two cases. if you are IN S0, it's different from doing a suspend TO S0 (which really means S0ix which oddly is not even defined)
I am also not sold on the notion of removing the D0 substrates, this is a convenient change on paper that leads to quite a few confusions. If you want to store all the states in one variable, then maybe flags are required.
I really expected more discipline in hardening concepts, this needs a lot more work I am afraid.
|
@plbossart Thanks for your feedback but I have a simple request from you. This series is not easy to review without understanding the entire flow. Some of the patches probably need squashing but I Ieft them separate to make it easier to read. The first few patches only prepare for whats coming later without actually meddling with the flow. So what looks like hard-coded values will be later on changed as and when the flow changes. I could add some comments to maybe be easier to follow. Please stay tuned for a better commented version. |
well that's not a simple request if |
46152b8 to
cf54b20
Compare
|
@plbossart @keyonjie I've update this PR now with the suggested flow changes, name changes and some comments to make it a bit easier to follow. Let me know what you think |
@plbossart well unfortunately with this series there are too many fixes/additions/removals. So yes, a bit of overview of the overall flow is needed. I said this before as well, as far as the S0ix implementation goes, with the number of flags and helpers we have, its hard for anyone outside of the 3 of us to tell whats going on. Anyway, with the latest version of the patches, I've added some FIXME's in places where the changes are made in the subsequent patches. Let me know if this makes better sense |
cf54b20 to
eb8a873
Compare
|
@keyonjie please take a look now. I've modified IPC tx to use mod_delayed_work instead of schedule_delayed_work to avoid multiple workqueue items. I've not added the kernel module parameter yet. I'll add it after discussing it with Pierre. |
9cc864c to
5849e10
Compare
plbossart
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The initial code rework could be done differently.
What you want first is to remove the D0i3-specific parts form snd_sof_suspend /snd_sod_resume
That would really unify the code first, and then you can start changing the functionality.
I am not happy that I let this initial code go through, it was a bad design to have this multi-stage decision making. Anyway, the past is the past, let's fix this.
Sorry @keyonjie I don't see the need for S0ix, there was the opportunistic D0i3 support that could be tested... I can't figure out why we are so fixated on S0ix when there is a way to test without it... |
|
@plbossart opportunistic D0i3 is one scenario, we have S0ix scenarios also. We have a test case list to verify this PR, in which S0ix WoV ones included. |
We had two weeks to test this PR. Nothing happened apparently, so I am leaning on merging this PR to force the validation work to happen... |
|
@plbossart ack |
I am fine if you are aligned on that. |
ff67bc2 to
bf00b5e
Compare
lgirdwood
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall well written and well commented code. Just questions on the D0I3 entry.
|
SOFCI TEST |
Unify the suspend/resume routines for both the D0I3/D3 DSP targets in sof_suspend()/sof_resume(). Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Setting the prepared flag to false marks the streams for the hw_params to be reset upon resuming. In the case of the D0i3-compatible streams that ignored suspend to keep the pipeline active in the DSP during suspend, this should not be done. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Add the system_suspend_target field to struct snd_sof_dev to track the intended system suspend power target. This will be used as one of the criteria for determining the final DSP power state. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Add a new enum sof_dsp_power_states for all the possible the DSP device states. The SOF driver currently handles only the D0 and D3 states and support for other states will be added later as needed. Also, add a helper to determine the target DSP power state based on the system suspend target. The snd_sof_dsp_d0i3_on_suspend() function is renamed to snd_sof_stream_suspend_ignored() to be more indicative of what it does and it used to determine the target DSP state during system suspend. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
The DSP device substates such as D0I0/D0I3 are platform-specific. Therefore, the d0_substate field of struct snd_sof_dev is replaced with the dsp_power_state field which represents the current state of the DSP. This field holds both the device state and the platform-specific substate values. With the DSP device substates being platform-specific, the DSP power state transitions need to be performed in the platform-specific suspend/resume ops as well. In order to achieve this, the ops signature has to be modified to pass the target device state as an argument. The target substate will be determined by the platform-specific ops before performing the transition. For example, in the case of the system suspending to S0IX, the top-level SOF device suspend callback needs to only determine if the DSP will be entering D3 or remain in D0. The target substate in case the device needs to remain in D0 (D0I0 or D0I3) will be determined by the platform-specific suspend op. With the addition of the extended set of power states for the DSP, the set_power_state op for HDA platforms has to be extended to handle only the appropriate state transitions. So, the implementation for the Intel HDA platforms is also modified. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Add a helper function to check if only D0i3-compatible streams are active. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Amend the DSP state transition diagram in preparation for introducing the feature to support opportunistic DSP D0I3 state when the system is in S0. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
This patch implements support for DSP D0i3 when the system is in S0. The basic idea is to schedule a delayed work after every successful IPC TX that checks if there are only D0I3-compatible streams active and if so transition the DSP to D0I3. With the introduction of DSP D0I3 in S0, we need to ensure that the DSP is in D0I0 before sending any new IPCs. The exception for this would be the compact IPCs that are used to set the DSP in D0I3/D0I0 states. Signed-off-by: Keyon Jie <yang.jie@linux.intel.com> Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
…debug Trace DMA is disabled by default when the DSP is in D0I3. Add a debug option to keep trace DMA enabled when the DSP is in D0I3 during S0. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
|
SOFCI TEST |
|
Tested on several platforms:
|
|
ok, let's merge, it's been reviewed to death already. |
Memory regions (MR) of type DM (device memory) do not have an associated umem. In the __mlx5_ib_dereg_mr() -> mlx5_free_priv_descs() flow, the code incorrectly takes the wrong branch, attempting to call dma_unmap_single() on a DMA address that is not mapped. This results in a WARN [1], as shown below. The issue is resolved by properly accounting for the DM type and ensuring the correct branch is selected in mlx5_free_priv_descs(). [1] WARNING: CPU: 12 PID: 1346 at drivers/iommu/dma-iommu.c:1230 iommu_dma_unmap_page+0x79/0x90 Modules linked in: ip6table_mangle ip6table_nat ip6table_filter ip6_tables iptable_mangle xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry ovelay rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core fuse mlx5_core CPU: 12 UID: 0 PID: 1346 Comm: ibv_rc_pingpong Not tainted 6.12.0-rc7+ thesofproject#1631 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:iommu_dma_unmap_page+0x79/0x90 Code: 2b 49 3b 29 72 26 49 3b 69 08 73 20 4d 89 f0 44 89 e9 4c 89 e2 48 89 ee 48 89 df 5b 5d 41 5c 41 5d 41 5e 41 5f e9 07 b8 88 ff <0f> 0b 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 66 0f 1f 44 00 RSP: 0018:ffffc90001913a10 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810194b0a8 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001 RBP: ffff88810194b0a8 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f537abdd740(0000) GS:ffff88885fb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f537aeb8000 CR3: 000000010c248001 CR4: 0000000000372eb0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ? __warn+0x84/0x190 ? iommu_dma_unmap_page+0x79/0x90 ? report_bug+0xf8/0x1c0 ? handle_bug+0x55/0x90 ? exc_invalid_op+0x13/0x60 ? asm_exc_invalid_op+0x16/0x20 ? iommu_dma_unmap_page+0x79/0x90 dma_unmap_page_attrs+0xe6/0x290 mlx5_free_priv_descs+0xb0/0xe0 [mlx5_ib] __mlx5_ib_dereg_mr+0x37e/0x520 [mlx5_ib] ? _raw_spin_unlock_irq+0x24/0x40 ? wait_for_completion+0xfe/0x130 ? rdma_restrack_put+0x63/0xe0 [ib_core] ib_dereg_mr_user+0x5f/0x120 [ib_core] ? lock_release+0xc6/0x280 destroy_hw_idr_uobject+0x1d/0x60 [ib_uverbs] uverbs_destroy_uobject+0x58/0x1d0 [ib_uverbs] uobj_destroy+0x3f/0x70 [ib_uverbs] ib_uverbs_cmd_verbs+0x3e4/0xbb0 [ib_uverbs] ? __pfx_uverbs_destroy_def_handler+0x10/0x10 [ib_uverbs] ? lock_acquire+0xc1/0x2f0 ? ib_uverbs_ioctl+0xcb/0x170 [ib_uverbs] ? ib_uverbs_ioctl+0x116/0x170 [ib_uverbs] ? lock_release+0xc6/0x280 ib_uverbs_ioctl+0xe7/0x170 [ib_uverbs] ? ib_uverbs_ioctl+0xcb/0x170 [ib_uverbs] __x64_sys_ioctl+0x1b0/0xa70 do_syscall_64+0x6b/0x140 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f537adaf17b Code: 0f 1e fa 48 8b 05 1d ad 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed ac 0c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffff218f0b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007ffff218f1d8 RCX: 00007f537adaf17b RDX: 00007ffff218f1c0 RSI: 00000000c0181b01 RDI: 0000000000000003 RBP: 00007ffff218f1a0 R08: 00007f537aa8d010 R09: 0000561ee2e4f270 R10: 00007f537aace3a8 R11: 0000000000000246 R12: 00007ffff218f190 R13: 000000000000001c R14: 0000561ee2e4d7c0 R15: 00007ffff218f450 </TASK> Fixes: f18ec42 ("RDMA/mlx5: Use a union inside mlx5_ib_mr") Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Link: https://patch.msgid.link/2039c22cfc3df02378747ba4d623a558b53fc263.1738587076.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
This PR replaces #1524