Virtual NMI for SNP#3511
Conversation
There was a problem hiding this comment.
Pull request overview
This PR enables delivering a host-triggered debug interrupt to AMD SEV-SNP guests by using virtual NMI (V_NMI) when the host CPU supports it, avoiding unsafe repeated “classic” NMI injection behavior on SNP.
Changes:
- Added CPUID bitfield parsing for SVM features EDX (Fn8000_000A_EDX), including the V_NMI bit.
- Detected host V_NMI capability at SNP partition init time and enabled V_NMI delivery for VTL0 VMSAs.
- Updated debug-interrupt/NMI paths to use virtual NMI when available, including waking halted VPs to ensure delivery.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| vm/x86/x86defs/src/cpuid.rs | Adds a bitfield struct for parsing SVM feature bits (EDX), including V_NMI. |
| openhcl/virt_mshv_vtl/src/processor/snp/mod.rs | Detects V_NMI support, enables it in VMSA init, and injects NMIs via V_NMI for VTL0 when supported. |
| openhcl/virt_mshv_vtl/src/lib.rs | Gates SNP debug-interrupt delivery on V_NMI support and warns/returns if unsupported. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
openhcl/virt_mshv_vtl/src/lib.rs:2043
assert_debug_interruptacceptshvdef::Vtl, butpulse_lintwillexpect("higher vtl not configured")when passedVtl::Vtl2, causing a panic. Since this is reachable via the GET debug-interrupt callback, it should defensively reject unsupported VTLs (e.g., convert toGuestVtland return/log on error) rather than panicking.
if self.inner.isolation == IsolationType::Snp {
let vnmi = match &self.inner.backing_shared {
BackingShared::Snp(snp) => snp.vnmi,
_ => false,
};
if !vnmi {
tracing::error!("debug interrupt is not supported on SNP without virtual NMI");
return;
}
}
let bsp_index = VpIndex::new(0);
self.pulse_lint(bsp_index, vtl, LINT_INDEX_1);
| // For SNP CVMs, only deliver the debug NMI when the host CPU | ||
| // supports virtual NMI (CPUID Fn8000_000A_EDX[V_NMI]). Without | ||
| // V_NMI, injecting multiple NMIs can corrupt the NMI stack in the | ||
| // guest. | ||
| if self.inner.isolation == IsolationType::Snp { | ||
| let vnmi = match &self.inner.backing_shared { | ||
| BackingShared::Snp(snp) => snp.vnmi, | ||
| _ => false, | ||
| }; | ||
| if !vnmi { | ||
| tracing::error!("debug interrupt is not supported on SNP without virtual NMI"); | ||
| return; | ||
| } | ||
| } | ||
| let bsp_index = VpIndex::new(0); | ||
| self.pulse_lint(bsp_index, vtl, LINT_INDEX_1); |
There was a problem hiding this comment.
It is kindof a footgun I guess. Would it cause problems if we set_nmi_enable(true) on all VTLs? Could we support injecting an NMI to VTL 1, even if we block it at the GET today?
There was a problem hiding this comment.
I'm mainly thinking about futureproofing. If someday we decide we do want NMI injection to VTL 1 for whatever reason it'd be great if we just had to remove the block at the GET level. As long as there are no potential problems with doing so, I think it'd be great if we didn't have to check for VTL0 anywhere in this new code. Of course if doing that would break something, that's a different story.
There was a problem hiding this comment.
I'm not aware of any issues but haven't validated it. its a bit different as secure avic isn't enabled on vtl 1 iirc
| if let Ok(vtl) = Vtl::try_from(vtl) { | ||
| p.assert_debug_interrupt(vtl) |
| let p = partition.clone(); | ||
| let debug_interrupt_callback = move |vtl: u8| p.assert_debug_interrupt(vtl); | ||
| let debug_interrupt_callback = move |vtl: u8| { | ||
| if let Ok(vtl) = Vtl::try_from(vtl) { |
There was a problem hiding this comment.
Can we push this conversion into the GET and have the callback itself just take a Vtl? IIRC the GET already depends on hvdef (where Vtl is defined). Then we can issue a warning/error there if an invalid number comes in, just like we do for 1 today.
There was a problem hiding this comment.
I don't think this is useful
smalis-msft
left a comment
There was a problem hiding this comment.
Yeah alright, minor issues aside this has gone through enough revisions
This change enables the host to assert a debug interrupt in a SNP guest through virtual NMI.