Fix and improve VTL switching loop and state management #545
sangho2 merged 8 commits into sanghle/lvbs/ktls
Conversation
Force-pushed bd87679 to a298043.
Force-pushed 898c3d5 to 1cbad25.
wdcui left a comment:

It's not clear to me (1) why we have both per_cpu and ktls, and (2) why we load/store the VTL0 state from per_cpu via the stack. I left some other minor comments.
```rust
/// `fn_save_state` is the function to save VTL state stored in the stack.
/// `scratch_off` is the offset of the kernel TLS which holds the scratch space to
/// save/restore the current stack pointer.
macro_rules! SAVE_VTL_STATE_ASM {
```
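For readers unfamiliar with the pattern, here is a minimal sketch of how such a macro might expand; the register list and the save-function calling convention are illustrative assumptions, not the actual implementation:

```rust
// Hypothetical expansion sketch; only the stringify!-based template building
// mirrors the real macro. Everything else is assumed.
macro_rules! SAVE_VTL_STATE_ASM_SKETCH {
    ($fn_save_state:tt, $scratch_off:tt) => {
        concat!(
            // Stash the current rsp in the kernel TLS scratch slot so it can
            // be restored after the save-state call returns.
            "mov gs:[", stringify!($scratch_off), "], rsp\n",
            // Stage the interrupted VTL0 registers on the stack.
            "push rax\n",
            "push rcx\n",
            // ... remaining general-purpose registers ...
            // Hand the staging area (rsp) to the Rust save function.
            "mov rdi, rsp\n",
            "call ", stringify!($fn_save_state), "\n",
            // Recover the original stack pointer from the scratch slot.
            "mov rsp, gs:[", stringify!($scratch_off), "]\n",
        )
    };
}
```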
Code like this should be moved out of mshv at some point?
XSAVE and XRSTOR are handled inside vtl_switch_loop_body because we don't particularly need assembly code for them.
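To make the split concrete, here is a sketch of what that might look like (with_per_cpu_variables_mut, save_extended_states, and HV_VTL_NORMAL appear elsewhere in this PR; restore_extended_states and handle_vtl0_request are assumed names):

```rust
// Sketch: extended state stays in Rust, so the asm macros only have to deal
// with general-purpose registers and rsp.
fn vtl_switch_loop_body() {
    // Conservatively save VTL0's extended state on every entry.
    with_per_cpu_variables_mut(|per_cpu_variables| {
        per_cpu_variables.save_extended_states(HV_VTL_NORMAL);
    });

    handle_vtl0_request(); // intercept / VTL-call handling (assumed name)

    // Restore it just before dropping back to VTL0.
    with_per_cpu_variables_mut(|per_cpu_variables| {
        per_cpu_variables.restore_extended_states(HV_VTL_NORMAL);
    });
}
```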
```rust
stringify!($scratch_off),
"]\n", // anchor
"sub rsp, ",
stringify!($state_size),
```
Is the rsp value after line 159 the same as after the call?

We save and restore rsp in lines 156 and 177, respectively, using the scratchpad.
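For clarity, the pairing looks roughly like this (condensed from the diff; the elided middle is assumed):

```rust
"mov gs:[{scratch_off}], rsp\n", // line 156: stash rsp in the TLS scratch slot
"sub rsp, {state_size}\n",       // line 159: carve out the save area
// ... stage registers and call the save-state function ...
"mov rsp, gs:[{scratch_off}]\n", // line 177: restore rsp to its pre-159 value
```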
| "mov rsp, gs:[{kernel_sp_off}]", // reset kernel stack pointer. Hyper-V saves/restores rsp and rip. | ||
| VTL_RETURN_ASM!({vtl_ret_addr_off}), | ||
| // *** VTL1 resumes here regardless of the entry reason (VTL switch or intercept) *** | ||
| SAVE_VTL_STATE_ASM!({save_vtl0_state}, {scratch_off}), |
Can we save the VTL0 state to the per-CPU storage directly?

As far as I know, using the stack as temporary storage to save the current registers is one of the most reliable and efficient approaches.
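Roughly, the flow is: the asm pushes the live registers onto the stack (which needs no free register and no address computation), then a Rust function copies that staging area into per-CPU storage. A sketch, where the signature and the copy helper are assumptions:

```rust
/// Called from SAVE_VTL_STATE_ASM! with `saved_regs` pointing at the
/// register values the asm just pushed onto the stack.
#[no_mangle]
unsafe extern "C" fn save_vtl0_state(saved_regs: *const u64) {
    with_per_cpu_variables_mut(|per_cpu_variables| {
        // Hypothetical helper: copy the staged registers into per-CPU storage.
        per_cpu_variables.copy_vtl0_gprs_from(saved_regs);
    });
}
```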
```rust
// is due to memory or register access violation, the VTL0 kernel might not have saved
// its states), we conservatively save and restore its extended states on every VTL switch.
with_per_cpu_variables_mut(|per_cpu_variables| {
    per_cpu_variables.save_extended_states(HV_VTL_NORMAL);
```
How do we make sure all the code executed before we get to this point doesn't change the extended state?

It only executes some assembly code and memory copy operations. I agree that some compiler optimizations might be able to use SSE, though. The main reason I introduced …
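As a reference point, a maximally conservative save/restore can be expressed with the XSAVE intrinsics. This is only a sketch (the real save_extended_states may differ), assuming a 64-byte-aligned, sufficiently large save area and XSAVE enabled in CR4:

```rust
use core::arch::x86_64::{_xrstor64, _xsave64};

// Request every state component; the CPU masks this against XCR0, so
// u64::MAX means "everything the OS has enabled".
const XSAVE_ALL: u64 = u64::MAX;

/// # Safety
/// `area` must be 64-byte aligned and large enough for the XSAVE layout
/// reported by CPUID leaf 0Dh.
#[target_feature(enable = "xsave")]
unsafe fn save_all_extended_state(area: *mut u8) {
    _xsave64(area, XSAVE_ALL);
}

/// # Safety
/// Same requirements as `save_all_extended_state`.
#[target_feature(enable = "xsave")]
unsafe fn restore_all_extended_state(area: *const u8) {
    _xrstor64(area, XSAVE_ALL);
}
```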
Is this being dropped because all VTL1 invocations can start with a fresh stack for now? Should this change be revisited once dynamic TAs are enabled?
We shouldn't use per-core storage for saving the VTL1 state because, as discussed, we can't ensure that VTL1 is re-entered on the same core. We need a separate global data structure for this.
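Something along these lines could serve as that global structure (purely a placeholder sketch, keyed by whatever identifier is stable across cores, such as a thread or TA instance id):

```rust
use alloc::collections::BTreeMap;
use spin::Mutex;

/// Placeholder for the saved VTL1 context (stack pointer, callee-saved
/// registers, extended state, ...).
struct Vtl1State {
    rsp: u64,
    // ...
}

// Global, core-agnostic home for VTL1 state: whichever core re-enters VTL1
// for a given thread/TA id can find the state here.
static VTL1_STATES: Mutex<BTreeMap<u64, Vtl1State>> = Mutex::new(BTreeMap::new());
```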
Force-pushed f7e7331 to 9f80306.
Force-pushed 9f80306 to c85aac5.
Let me add XSAVE/XRSTOR when I merge #541; doing this in this PR is a bit tricky due to a complicated rebase.
This PR adds `run_thread` support for the LVBS platform and revises its syscall handling routine based on the Linux userland platform's syscall handling. It suppresses new page table creation for now because that needs to be done by the runner via an upcall (userspace support still exists, but doesn't create a separate page table for each TA instance for now).

Co-authored-by: Sangho Lee <sanghle@microsoft.com>
tgopinath-microsoft left a comment:

Approved with comments.
This PR fixes some bugs in the VTL switching loop and improves its reliability and performance. Specifically, it uses assembly code to carefully switch between VTLs while saving/restoring the VTL0 registers, and it removes some unreliable stack manipulation code we had before. In addition, this PR makes it clear that we no longer store the VTL1 state in per-core storage.

Note: this PR and #541 should be merged together, since #541 suffers from a crash due to bugs that this PR fixes.
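For orientation, the overall shape after this PR is roughly the following (a comment-level sketch assembled from the snippets quoted above, not the actual code):

```rust
fn vtl_switch_loop() -> ! {
    loop {
        // 1. Point rsp at the per-CPU kernel stack; Hyper-V itself
        //    saves/restores rsp and rip across the VTL transition.
        // 2. VTL_RETURN_ASM!: drop to VTL0 and wait for a VTL call or intercept.
        // 3. On resumption, SAVE_VTL_STATE_ASM!: stash rsp in the TLS scratch
        //    slot, stage VTL0's registers on the stack, copy them into per-CPU
        //    storage, then restore rsp.
        // 4. In Rust, conservatively save VTL0's extended state, handle the
        //    request, restore the state, and loop.
    }
}
```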