@wangzhimin1179
Contributor

Add suspend and resume support for smmuv3. The smmu is stopped when suspending and started when resuming.

When the smmu is suspended, it is powered off and its registers are cleared, so the msi_msg context is saved during MSI interrupt initialization of the smmu. On resume, arm_smmu_device_reset() is called to restore the registers.

Closes: https://gitee.com/openeuler/kernel/issues/I4DZ7Q
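
The save-and-restore idea described above can be sketched in user space as follows. This is a minimal illustration under stated assumptions, not the driver code from this PR: every `mock_*` name, the two-register layout, and the choice to save all three message fields are hypothetical and exist only to show why a copy kept in RAM survives a power-off that clears the registers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct msi_msg. */
struct mock_msi_msg {
    uint32_t address_lo;
    uint32_t address_hi;
    uint32_t data;
};

/* Hypothetical register file for one MSI: doorbell address and payload. */
struct mock_smmu_regs {
    uint64_t irq_cfg0; /* doorbell address */
    uint32_t irq_cfg1; /* payload */
};

struct mock_smmu {
    struct mock_smmu_regs regs; /* cleared when the device loses power */
    struct mock_msi_msg saved;  /* copy kept in RAM across suspend */
};

/* Program the registers from a message (assumed layout: hi word in the upper 32 bits). */
static void mock_program_regs(struct mock_smmu *smmu, const struct mock_msi_msg *msg)
{
    smmu->regs.irq_cfg0 = ((uint64_t)msg->address_hi << 32) | msg->address_lo;
    smmu->regs.irq_cfg1 = msg->data;
}

/* MSI initialization: remember the message, then write it to the registers. */
static void mock_write_msi_msg(struct mock_smmu *smmu, const struct mock_msi_msg *msg)
{
    smmu->saved = *msg;
    mock_program_regs(smmu, msg);
}

/* Suspend: model the power-off by clearing every register. */
static void mock_suspend(struct mock_smmu *smmu)
{
    memset(&smmu->regs, 0, sizeof(smmu->regs));
}

/* Resume: reprogram the registers from the saved copy. */
static void mock_resume(struct mock_smmu *smmu)
{
    mock_program_regs(smmu, &smmu->saved);
}
```

Calling `mock_write_msi_msg()`, then `mock_suspend()`, then `mock_resume()` on a zero-initialized `struct mock_smmu` leaves the registers holding the same values they had before the suspend.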

@deepin-ci-robot

Hi @wangzhimin1179. Thanks for your PR.

I'm waiting for a deepin-community member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

/* Saves the msg (base addr of msi irq) and restores it during resume */
desc->msg.address_lo = msg->address_lo;
desc->msg.address_hi = msg->address_hi;
desc->msg.data = msg->data;
Member

Line 3222 below only uses address_hi and address_lo; data is not used there. Does it need to be assigned at all?

@wangzhimin1179 force-pushed the linux-6.6.y-arm-smmu-v3 branch from 2215d1c to 4043b2e on June 26, 2024 02:52
@wangzhimin1179
Contributor Author

Updated.

@deepin-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign zccrs for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@opsiff
Member

opsiff commented Jun 26, 2024

Add suspend and resume support for smmuv3. The smmu is stopped when
suspending and started when resuming.

When the smmu is suspended, it is powered off and its registers are
cleared, so the msi_msg context is saved during MSI interrupt
initialization of the smmu. On resume, arm_smmu_device_reset() is called
to restore the registers.

Closes: https://gitee.com/openeuler/kernel/issues/I4DZ7Q

Signed-off-by: Bixuan Cui <cuibixuan@huawei.com>
Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Wang Xu <wangxu@phytium.com.cn>
Signed-off-by: Li Mingzhe <limingzhe1839@phytium.com.cn>
Signed-off-by: Wang Zhimin <wangzhimin1179@phytium.com.cn>
@wangzhimin1179 force-pushed the linux-6.6.y-arm-smmu-v3 branch from 4043b2e to 9ce9ded on June 26, 2024 03:07
@opsiff
Member

opsiff commented Jun 26, 2024

/ok-to-test

@opsiff merged commit 14b1159 into deepin-community:linux-6.6.y on Jun 26, 2024
@deepin-ci-robot

deepin pr auto review

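Key summary:

  • In arm_smmu_write_msi_msg, the assignments to desc->msg.address_lo and desc->msg.address_hi should come after the doorbell is computed, to ensure the correct address.
  • In arm_smmu_resume_msis, the restore of desc->msg.address_lo and desc->msg.address_hi should be performed only when smmu->dev->msi.domain is non-zero, so that an invalid MSI message is not restored.
  • In arm_smmu_setup_unique_irqs, the call to arm_smmu_resume_msis should be skipped when !resume is true, to avoid performing the restore in resume mode.
  • In arm_smmu_setup_irqs, the call to arm_smmu_resume_msis should be skipped when !resume is true, to avoid performing the restore in resume mode.
  • In arm_smmu_device_reset, the call to arm_smmu_setup_irqs should be skipped when !smmu->bypass is true, to avoid performing the setup in bypass mode.
  • In the arm_smmu_pm_ops structure, the arm_smmu_suspend and arm_smmu_resume functions should be called when CONFIG_PM_SLEEP is true, to ensure the device is managed correctly in the sleep state.

Is immediate modification recommended: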

opsiff pushed a commit that referenced this pull request May 30, 2025
[ Upstream commit f95d186 ]

[BUG]
When trying read-only scrub on a btrfs with rescue=idatacsums mount
option, it will crash with the following call trace:

  BUG: kernel NULL pointer dereference, address: 0000000000000208
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  CPU: 1 UID: 0 PID: 835 Comm: btrfs Tainted: G           O        6.15.0-rc3-custom+ #236 PREEMPT(full)
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 02/02/2022
  RIP: 0010:btrfs_lookup_csums_bitmap+0x49/0x480 [btrfs]
  Call Trace:
   <TASK>
   scrub_find_fill_first_stripe+0x35b/0x3d0 [btrfs]
   scrub_simple_mirror+0x175/0x290 [btrfs]
   scrub_stripe+0x5f7/0x6f0 [btrfs]
   scrub_chunk+0x9a/0x150 [btrfs]
   scrub_enumerate_chunks+0x333/0x660 [btrfs]
   btrfs_scrub_dev+0x23e/0x600 [btrfs]
   btrfs_ioctl+0x1dcf/0x2f80 [btrfs]
   __x64_sys_ioctl+0x97/0xc0
   do_syscall_64+0x4f/0x120
   entry_SYSCALL_64_after_hwframe+0x76/0x7e

[CAUSE]
Mount option "rescue=idatacsums" will completely skip loading the csum
tree, so that any data read will not find any data csum thus we will
ignore data checksum verification.

Normally call sites utilizing csum tree will check the fs state flag
NO_DATA_CSUMS bit, but unfortunately scrub does not check that bit at all.

This results in scrub to call btrfs_search_slot() on a NULL pointer
and triggered above crash.

[FIX]
Check both extent and csum tree root before doing any tree search.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 6e9770de024964b1017f99ee94f71967bd6edaeb)
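
The [FIX] described in this commit message, checking both roots before any tree search, can be sketched in user space as follows. This is a hedged illustration only: the `mock_*` types and functions are hypothetical stand-ins rather than btrfs APIs, and the `-EINVAL` return value is an assumption, since the commit message does not say which error the real fix returns.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a tree root; not the real btrfs type. */
struct mock_root {
    int id;
};

/* Hypothetical stand-in for the filesystem state used by scrub. */
struct mock_fs {
    struct mock_root *extent_root;
    struct mock_root *csum_root; /* NULL when the csum tree was not loaded */
};

/* Models a tree search: it dereferences root, so a NULL root would crash. */
static int mock_search_slot(const struct mock_root *root, unsigned long key)
{
    (void)key;
    return root->id > 0 ? 0 : -ENOENT;
}

/* Guard first, search second: bail out before touching a missing root. */
static int mock_scrub_find_stripe(const struct mock_fs *fs, unsigned long logical)
{
    int ret;

    if (!fs->extent_root || !fs->csum_root)
        return -EINVAL; /* assumed error code, for illustration only */

    ret = mock_search_slot(fs->extent_root, logical);
    if (ret)
        return ret;
    return mock_search_slot(fs->csum_root, logical);
}
```

With `csum_root` set to NULL, as when the csum tree is skipped at mount time, `mock_scrub_find_stripe()` returns the error instead of dereferencing the NULL pointer.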
opsiff pushed a commit to opsiff/UOS-kernel that referenced this pull request Jun 9, 2025
(cherry picked from commit 50d0de59f66cbe6d597481e099bf1c70fd07e0a9)
opsiff pushed a commit to opsiff/UOS-kernel that referenced this pull request Jun 10, 2025
(cherry picked from commit 50d0de59f66cbe6d597481e099bf1c70fd07e0a9)
opsiff pushed a commit to opsiff/UOS-kernel that referenced this pull request Dec 29, 2025
Add suspend and resume support for smmuv3. The smmu is stopped when
suspending and started when resuming.

When the smmu is suspended, it is powered off and its registers are
cleared, so the msi_msg context is saved during MSI interrupt
initialization of the smmu. On resume, arm_smmu_device_reset() is called
to restore the registers.

Closes: https://gitee.com/openeuler/kernel/issues/I4DZ7Q

Signed-off-by: Bixuan Cui <cuibixuan@huawei.com>
Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Wang Xu <wangxu@phytium.com.cn>
Signed-off-by: Li Mingzhe <limingzhe1839@phytium.com.cn>
Signed-off-by: Wang Zhimin <wangzhimin1179@phytium.com.cn>
Link: deepin-community#236
[ remove bypass because it was removed by:
commit 734554f
Author: Robin Murphy <robin.murphy@arm.com>
Date:   Fri Apr 5 17:52:07 2024 +0100

    iommu/arm-smmu-v3: Retire disable_bypass parameter

    The disable_bypass parameter has been mostly meaningless for a long time
    since the introduction of default domains. Its original intent is now
    fulfilled by the controls users have over the default domain type, and
    its remaining effect in the brief window between Stream Table
    initialisation and default domain creation hardly seems worth the
    complication. Furthermore, thanks to 2-level Stream Tables, disabling
    disable_bypass (there's another reason not to like it right there) has
    never guaranteed that any particular StreamID *will* bypass anyway - any
    device which might actually care about that wants RMRs - so there's not
    really much lost by taking away that option (which has already been
    non-default for nearing 6 years now).

    As part of this, also remove the weird behaviour where we "successfully"
    probe and register a non-functional SMMU if the DT "#iommu-cells"
    property is wrong. I have no memory of what possessed me to think that
    was a good idea at the time, and by now I suspect it's likely to break
    things worse than simply failing probe would.

    Signed-off-by: Robin Murphy <robin.murphy@arm.com>
    Reviewed-by: Mostafa Saleh <smostafa@google.com>
    Link: https://lore.kernel.org/r/ea3ac4cd595a81b5511729601b2f7d4668178438.1712335927.git.robin.murphy@arm.com
    Signed-off-by: Will Deacon <will@kernel.org>
]
(cherry picked from commit dbcdbed)
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
lanlanxiyiji pushed a commit that referenced this pull request Jan 4, 2026