[6.6-velinux] Intel: backport KVM Fix for Clearing SGX EDECCSSA to 6.6 by zhiquan1-li · Pull Request #45 · openvelinux/kernel

zhiquan1-li · 2025-03-31T01:59:20Z

Description
When SGX EDECCSSA support was added to KVM in commit 16a7fe3 ("KVM/VMX: Allow exposing EDECCSSA user leaf function to KVM guest"), it forgot to clear the X86_FEATURE_SGX_EDECCSSA bit in KVM CPU caps when KVM SGX is disabled. Fix it.

Fixes: 16a7fe3 ("KVM/VMX: Allow exposing EDECCSSA user leaf function to KVM guest")

About the patches
The total patch number is 1:

7efb4d8a392a KVM: VMX: Also clear SGX EDECCSSA in KVM CPU caps when SGX is disabled

Tests

Build successfully for each commit
Kernel selftest - SGX: PASSED

cd tools/testing/selftests/sgx/
make
./test_sgx

Kernel selftest - SGX in VM: PASSED
Function test

Step 1. Original SGX EDECCSSA status in guest

[root@guest ~]# cpuid -1 -l 0x12
CPU:
   Software Guard Extensions (SGX) capability (0x12/0):
      SGX1 supported                           = true
      SGX2 supported                           = true
      SGX ENCLV E*VIRTCHILD, ESETCONTEXT       = false
      SGX ENCLS ETRACKC, ERDINFO, ELDBC, ELDUC = false
      SGX ENCLU EVERIFYREPORT2                 = false
      SGX ENCLS EUPDATESVN                     = false
      SGX ENCLU EDECCSSA                       = true
      MISCSELECT.EXINFO supported: #PF & #GP   = true
      MISCSELECT.CPINFO supported: #CP         = false
      MaxEnclaveSize_Not64 (log2)              = 0x1f (31)
      MaxEnclaveSize_64 (log2)                 = 0x38 (56)

Step 2. Disable SGX in guest

root@KVM-host:~# rmmod kvm_intel
root@KVM-host:~# modprobe kvm_intel sgx=0

Step 3. The SGX EDECCSSA capability is cleared in KVM, then its status becomes false

[root@guest ~]# cpuid -1 -l 0x12
CPU:
   Software Guard Extensions (SGX) capability (0x12/0):
      SGX1 supported                           = false
      SGX2 supported                           = false
      SGX ENCLV E*VIRTCHILD, ESETCONTEXT       = false
      SGX ENCLS ETRACKC, ERDINFO, ELDBC, ELDUC = false
      SGX ENCLU EVERIFYREPORT2                 = false
      SGX ENCLS EUPDATESVN                     = false
      SGX ENCLU EDECCSSA                       = false
      MISCSELECT.EXINFO supported: #PF & #GP   = false
      MISCSELECT.CPINFO supported: #CP         = false
      MaxEnclaveSize_Not64 (log2)              = 0x0 (0)
      MaxEnclaveSize_64 (log2)                 = 0x0 (0)
[root@TDX-guest ~]#

Known issue:
None

Default config change:
None

commit 7efb4d8 upstream. When SGX EDECCSSA support was added to KVM in commit 16a7fe3 ("KVM/VMX: Allow exposing EDECCSSA user leaf function to KVM guest"), it forgot to clear the X86_FEATURE_SGX_EDECCSSA bit in KVM CPU caps when KVM SGX is disabled. Fix it. Intel-SIG: commit 7efb4d8 KVM: VMX: Also clear SGX EDECCSSA in KVM CPU caps when SGX is disabled Backport a fix for the KVM exposing the SGX EDECCSSA capability. Fixes: 16a7fe3 ("KVM/VMX: Allow exposing EDECCSSA user leaf function to KVM guest") Signed-off-by: Kai Huang <kai.huang@intel.com> Link: https://lore.kernel.org/r/20240905120837.579102-1-kai.huang@intel.com Signed-off-by: Sean Christopherson <seanjc@google.com> [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>

…le nodes commit 2eaa6c2 upstream. The decreasing of hugetlb pages number failed with the following message given: sh: page allocation failure: order:0, mode:0x204cc0(GFP_KERNEL|__GFP_RETRY_MAYFAIL|__GFP_THISNODE) CPU: 1 PID: 112 Comm: sh Not tainted 6.5.0-rc7-... #45 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace.part.6+0x84/0xe4 show_stack+0x18/0x24 dump_stack_lvl+0x48/0x60 dump_stack+0x18/0x24 warn_alloc+0x100/0x1bc __alloc_pages_slowpath.constprop.107+0xa40/0xad8 __alloc_pages+0x244/0x2d0 hugetlb_vmemmap_restore+0x104/0x1e4 __update_and_free_hugetlb_folio+0x44/0x1f4 update_and_free_hugetlb_folio+0x20/0x68 update_and_free_pages_bulk+0x4c/0xac set_max_huge_pages+0x198/0x334 nr_hugepages_store_common+0x118/0x178 nr_hugepages_store+0x18/0x24 kobj_attr_store+0x18/0x2c sysfs_kf_write+0x40/0x54 kernfs_fop_write_iter+0x164/0x1dc vfs_write+0x3a8/0x460 ksys_write+0x6c/0x100 __arm64_sys_write+0x1c/0x28 invoke_syscall+0x44/0x100 el0_svc_common.constprop.1+0x6c/0xe4 do_el0_svc+0x38/0x94 el0_svc+0x28/0x74 el0t_64_sync_handler+0xa0/0xc4 el0t_64_sync+0x174/0x178 Mem-Info: ... The reason is that the hugetlb pages being released are allocated from movable nodes, and with hugetlb_optimize_vmemmap enabled, vmemmap pages need to be allocated from the same node during the hugetlb pages releasing. With GFP_KERNEL and __GFP_THISNODE set, allocating from movable node is always failed. Fix this problem by removing __GFP_THISNODE. Link: https://lkml.kernel.org/r/20230905124503.24899-1-yuancan@huawei.com Fixes: ad2fa37 ("mm: hugetlb: alloc the vmemmap pages associated with each HugeTLB page") Signed-off-by: Yuan Can <yuancan@huawei.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Muchun Song <songmuchun@bytedance.com>

…le nodes commit 2eaa6c2 upstream. The decreasing of hugetlb pages number failed with the following message given: sh: page allocation failure: order:0, mode:0x204cc0(GFP_KERNEL|__GFP_RETRY_MAYFAIL|__GFP_THISNODE) CPU: 1 PID: 112 Comm: sh Not tainted 6.5.0-rc7-... #45 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace.part.6+0x84/0xe4 show_stack+0x18/0x24 dump_stack_lvl+0x48/0x60 dump_stack+0x18/0x24 warn_alloc+0x100/0x1bc __alloc_pages_slowpath.constprop.107+0xa40/0xad8 __alloc_pages+0x244/0x2d0 hugetlb_vmemmap_restore+0x104/0x1e4 __update_and_free_hugetlb_folio+0x44/0x1f4 update_and_free_hugetlb_folio+0x20/0x68 update_and_free_pages_bulk+0x4c/0xac set_max_huge_pages+0x198/0x334 nr_hugepages_store_common+0x118/0x178 nr_hugepages_store+0x18/0x24 kobj_attr_store+0x18/0x2c sysfs_kf_write+0x40/0x54 kernfs_fop_write_iter+0x164/0x1dc vfs_write+0x3a8/0x460 ksys_write+0x6c/0x100 __arm64_sys_write+0x1c/0x28 invoke_syscall+0x44/0x100 el0_svc_common.constprop.1+0x6c/0xe4 do_el0_svc+0x38/0x94 el0_svc+0x28/0x74 el0t_64_sync_handler+0xa0/0xc4 el0t_64_sync+0x174/0x178 Mem-Info: ... The reason is that the hugetlb pages being released are allocated from movable nodes, and with hugetlb_optimize_vmemmap enabled, vmemmap pages need to be allocated from the same node during the hugetlb pages releasing. With GFP_KERNEL and __GFP_THISNODE set, allocating from movable node is always failed. Fix this problem by removing __GFP_THISNODE. Link: https://lkml.kernel.org/r/20230905124503.24899-1-yuancan@huawei.com Fixes: ad2fa37 ("mm: hugetlb: alloc the vmemmap pages associated with each HugeTLB page") Signed-off-by: Yuan Can <yuancan@huawei.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Songtang Liu <liusongtang@bytedance.com>

…x-kvm-sgx-clear-edeccssa' into intel-6.6-velinux == Description When SGX EDECCSSA support was added to KVM in commit 16a7fe3 ("KVM/VMX: Allow exposing EDECCSSA user leaf function to KVM guest"), it forgot to clear the X86_FEATURE_SGX_EDECCSSA bit in KVM CPU caps when KVM SGX is disabled. Fix it. Fixes: 16a7fe3 ("KVM/VMX: Allow exposing EDECCSSA user leaf function to KVM guest") About the patches The total patch number is 1: 7efb4d8 KVM: VMX: Also clear SGX EDECCSSA in KVM CPU caps when SGX is disabled == Tests 1. Build successfully for each commit 2. Kernel selftest - SGX: PASSED cd tools/testing/selftests/sgx/ make ./test_sgx 3. Kernel selftest - SGX in VM: PASSED 4. Function test Step 1. Original SGX EDECCSSA status in guest [root@guest ~]# cpuid -1 -l 0x12 CPU: Software Guard Extensions (SGX) capability (0x12/0): SGX1 supported = true SGX2 supported = true SGX ENCLV E*VIRTCHILD, ESETCONTEXT = false SGX ENCLS ETRACKC, ERDINFO, ELDBC, ELDUC = false SGX ENCLU EVERIFYREPORT2 = false SGX ENCLS EUPDATESVN = false SGX ENCLU EDECCSSA = true MISCSELECT.EXINFO supported: #PF & #GP = true MISCSELECT.CPINFO supported: #CP = false MaxEnclaveSize_Not64 (log2) = 0x1f (31) MaxEnclaveSize_64 (log2) = 0x38 (56) Step 2. Disable SGX in guest root@KVM-host:~# rmmod kvm_intel root@KVM-host:~# modprobe kvm_intel sgx=0 Step 3. The SGX EDECCSSA capability is cleared in KVM, then its status becomes false [root@guest ~]# cpuid -1 -l 0x12 CPU: Software Guard Extensions (SGX) capability (0x12/0): SGX1 supported = false SGX2 supported = false SGX ENCLV E*VIRTCHILD, ESETCONTEXT = false SGX ENCLS ETRACKC, ERDINFO, ELDBC, ELDUC = false SGX ENCLU EVERIFYREPORT2 = false SGX ENCLS EUPDATESVN = false SGX ENCLU EDECCSSA = false MISCSELECT.EXINFO supported: #PF & #GP = false MISCSELECT.CPINFO supported: #CP = false MaxEnclaveSize_Not64 (log2) = 0x0 (0) MaxEnclaveSize_64 (log2) = 0x0 (0) [root@TDX-guest ~]# == Known issue: None == Default config change: None

commit 227cb4ca5a6502164f850d22aec3104d7888b270 upstream. When running the following code on an ext4 filesystem with inline_data feature enabled, it will lead to the bug below. fd = open("file1", O_RDWR | O_CREAT | O_TRUNC, 0666); ftruncate(fd, 30); pwrite(fd, "a", 1, (1UL << 40) + 5UL); That happens because write_begin will succeed as when ext4_generic_write_inline_data calls ext4_prepare_inline_data, pos + len will be truncated, leading to ext4_prepare_inline_data parameter to be 6 instead of 0x10000000006. Then, later when write_end is called, we hit: BUG_ON(pos + len > EXT4_I(inode)->i_inline_size); at ext4_write_inline_data. Fix it by using a loff_t type for the len parameter in ext4_prepare_inline_data instead of an unsigned int. [ 44.545164] ------------[ cut here ]------------ [ 44.545530] kernel BUG at fs/ext4/inline.c:240! [ 44.545834] Oops: invalid opcode: 0000 [openvelinux#1] SMP NOPTI [ 44.546172] CPU: 3 UID: 0 PID: 343 Comm: test Not tainted 6.15.0-rc2-00003-g9080916f4863 openvelinux#45 PREEMPT(full) 112853fcebfdb93254270a7959841d2c6aa2c8bb [ 44.546523] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 44.546523] RIP: 0010:ext4_write_inline_data+0xfe/0x100 [ 44.546523] Code: 3c 0e 48 83 c7 48 48 89 de 5b 41 5c 41 5d 41 5e 41 5f 5d e9 e4 fa 43 01 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc cc 0f 0b <0f> 0b 0f 1f 44 00 00 55 41 57 41 56 41 55 41 54 53 48 83 ec 20 49 [ 44.546523] RSP: 0018:ffffb342008b79a8 EFLAGS: 00010216 [ 44.546523] RAX: 0000000000000001 RBX: ffff9329c579c000 RCX: 0000010000000006 [ 44.546523] RDX: 000000000000003c RSI: ffffb342008b79f0 RDI: ffff9329c158e738 [ 44.546523] RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000 [ 44.546523] R10: 00007ffffffff000 R11: ffffffff9bd0d910 R12: 0000006210000000 [ 44.546523] R13: fffffc7e4015e700 R14: 0000010000000005 R15: ffff9329c158e738 [ 44.546523] FS: 00007f4299934740(0000) GS:ffff932a60179000(0000) knlGS:0000000000000000 [ 44.546523] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 44.546523] CR2: 00007f4299a1ec90 CR3: 0000000002886002 CR4: 0000000000770eb0 [ 44.546523] PKRU: 55555554 [ 44.546523] Call Trace: [ 44.546523] <TASK> [ 44.546523] ext4_write_inline_data_end+0x126/0x2d0 [ 44.546523] generic_perform_write+0x17e/0x270 [ 44.546523] ext4_buffered_write_iter+0xc8/0x170 [ 44.546523] vfs_write+0x2be/0x3e0 [ 44.546523] __x64_sys_pwrite64+0x6d/0xc0 [ 44.546523] do_syscall_64+0x6a/0xf0 [ 44.546523] ? __wake_up+0x89/0xb0 [ 44.546523] ? xas_find+0x72/0x1c0 [ 44.546523] ? next_uptodate_folio+0x317/0x330 [ 44.546523] ? set_pte_range+0x1a6/0x270 [ 44.546523] ? filemap_map_pages+0x6ee/0x840 [ 44.546523] ? ext4_setattr+0x2fa/0x750 [ 44.546523] ? do_pte_missing+0x128/0xf70 [ 44.546523] ? security_inode_post_setattr+0x3e/0xd0 [ 44.546523] ? ___pte_offset_map+0x19/0x100 [ 44.546523] ? handle_mm_fault+0x721/0xa10 [ 44.546523] ? do_user_addr_fault+0x197/0x730 [ 44.546523] ? do_syscall_64+0x76/0xf0 [ 44.546523] ? arch_exit_to_user_mode_prepare+0x1e/0x60 [ 44.546523] ? irqentry_exit_to_user_mode+0x79/0x90 [ 44.546523] entry_SYSCALL_64_after_hwframe+0x55/0x5d [ 44.546523] RIP: 0033:0x7f42999c6687 [ 44.546523] Code: 48 89 fa 4c 89 df e8 58 b3 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff [ 44.546523] RSP: 002b:00007ffeae4a7930 EFLAGS: 00000202 ORIG_RAX: 0000000000000012 [ 44.546523] RAX: ffffffffffffffda RBX: 00007f4299934740 RCX: 00007f42999c6687 [ 44.546523] RDX: 0000000000000001 RSI: 000055ea6149200f RDI: 0000000000000003 [ 44.546523] RBP: 00007ffeae4a79a0 R08: 0000000000000000 R09: 0000000000000000 [ 44.546523] R10: 0000010000000005 R11: 0000000000000202 R12: 0000000000000000 [ 44.546523] R13: 00007ffeae4a7ac8 R14: 00007f4299b86000 R15: 000055ea61493dd8 [ 44.546523] </TASK> [ 44.546523] Modules linked in: [ 44.568501] ---[ end trace 0000000000000000 ]--- [ 44.568889] RIP: 0010:ext4_write_inline_data+0xfe/0x100 [ 44.569328] Code: 3c 0e 48 83 c7 48 48 89 de 5b 41 5c 41 5d 41 5e 41 5f 5d e9 e4 fa 43 01 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc cc 0f 0b <0f> 0b 0f 1f 44 00 00 55 41 57 41 56 41 55 41 54 53 48 83 ec 20 49 [ 44.570931] RSP: 0018:ffffb342008b79a8 EFLAGS: 00010216 [ 44.571356] RAX: 0000000000000001 RBX: ffff9329c579c000 RCX: 0000010000000006 [ 44.571959] RDX: 000000000000003c RSI: ffffb342008b79f0 RDI: ffff9329c158e738 [ 44.572571] RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000 [ 44.573148] R10: 00007ffffffff000 R11: ffffffff9bd0d910 R12: 0000006210000000 [ 44.573748] R13: fffffc7e4015e700 R14: 0000010000000005 R15: ffff9329c158e738 [ 44.574335] FS: 00007f4299934740(0000) GS:ffff932a60179000(0000) knlGS:0000000000000000 [ 44.575027] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 44.575520] CR2: 00007f4299a1ec90 CR3: 0000000002886002 CR4: 0000000000770eb0 [ 44.576112] PKRU: 55555554 [ 44.576338] Kernel panic - not syncing: Fatal exception [ 44.576517] Kernel Offset: 0x1a600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) Reported-by: syzbot+fe2a25dae02a207717a0@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=fe2a25dae02a207717a0 Fixes: f19d587 ("ext4: add normal write support for inline data") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> Cc: stable@vger.kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://patch.msgid.link/20250415-ext4-prepare-inline-overflow-v1-1-f4c13d900967@igalia.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 829e0c9 upstream. There is another found exception that the "timerlat/1" thread was scheduled on CPU0, and lead to timer corruption finally: ``` ODEBUG: init active (active state 0) object: ffff888237c2e108 object type: hrtimer hint: timerlat_irq+0x0/0x220 WARNING: CPU: 0 PID: 426 at lib/debugobjects.c:518 debug_print_object+0x7d/0xb0 Modules linked in: CPU: 0 UID: 0 PID: 426 Comm: timerlat/1 Not tainted 6.11.0-rc7+ #45 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 RIP: 0010:debug_print_object+0x7d/0xb0 ... Call Trace: <TASK> ? __warn+0x7c/0x110 ? debug_print_object+0x7d/0xb0 ? report_bug+0xf1/0x1d0 ? prb_read_valid+0x17/0x20 ? handle_bug+0x3f/0x70 ? exc_invalid_op+0x13/0x60 ? asm_exc_invalid_op+0x16/0x20 ? debug_print_object+0x7d/0xb0 ? debug_print_object+0x7d/0xb0 ? __pfx_timerlat_irq+0x10/0x10 __debug_object_init+0x110/0x150 hrtimer_init+0x1d/0x60 timerlat_main+0xab/0x2d0 ? __pfx_timerlat_main+0x10/0x10 kthread+0xb7/0xe0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2d/0x40 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> ``` After tracing the scheduling event, it was discovered that the migration of the "timerlat/1" thread was performed during thread creation. Further analysis confirmed that it is because the CPU online processing for osnoise is implemented through workers, which is asynchronous with the offline processing. When the worker was scheduled to create a thread, the CPU may has already been removed from the cpu_online_mask during the offline process, resulting in the inability to select the right CPU: T1 | T2 [CPUHP_ONLINE] | cpu_device_down() osnoise_hotplug_workfn() | | cpus_write_lock() | takedown_cpu(1) | cpus_write_unlock() [CPUHP_OFFLINE] | cpus_read_lock() | start_kthread(1) | cpus_read_unlock() | To fix this, skip online processing if the CPU is already offline. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20240924094515.3561410-4-liwei391@huawei.com Fixes: c8895e2 ("trace/osnoise: Support hotplug operations") Signed-off-by: Wei Li <liwei391@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Zhang Yuchen <zhangyuchen.lcr@bytedance.com>

nzq-hw mentioned this pull request Jul 23, 2025

[Backport][6.6] IPIV-BUGFIX Lostwayzxc/kernel#14

Merged

bhe4 mentioned this pull request Oct 29, 2025

NO MERGE: TEST ONLY: 6.6-velinux combine all PRs. #84

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[6.6-velinux] Intel: backport KVM Fix for Clearing SGX EDECCSSA to 6.6#45

[6.6-velinux] Intel: backport KVM Fix for Clearing SGX EDECCSSA to 6.6#45
zhiquan1-li wants to merge 1 commit into6.6-velinuxfrom
6.6-velinux-kvm-sgx-clear-edeccssa

zhiquan1-li commented Mar 31, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

zhiquan1-li commented Mar 31, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants