Skip to content

WIP windows support for kured#460

Closed
marosset wants to merge 6 commits intokubereboot:mainfrom
marosset:windows-support
Closed

WIP windows support for kured#460
marosset wants to merge 6 commits intokubereboot:mainfrom
marosset:windows-support

Conversation

@marosset
Copy link
Copy Markdown

@marosset marosset commented Nov 16, 2021

Signed-off-by: Mark Rossetti marosset@microsoft.com

Here is a prototype that I have working for Windows support.

A few things to note include:

  • This requires WindowsHostProcessContainers feature (introduced in v1.22) to be enabled for kube-apiserver and kubelet
    • This also means that kubeclient/API refs must be v1.22 or greater
    • WindowsHostProcessContainers is on-track for beta in v1.23
  • InClusterConfig does not work with HostProcessContainers today (see not and linked issue in code)
  • An init container is used to register a scheduled task that runs on startup to delete c:\var\run\reboot-required if it exists. This way things outside of the Windows servicing stack can write the sentinel file and expect a reboot to be triggered once.
  • The script to check for pending reboots will write c:\var\run\reboot-required if one of several registry keys controlled by the servicing stack is set. These keys are cleared as servicing operations are processed.
  • Windows escapes "'s in command line args (these interfere with command construction if not trimmed)
  • This PR does not address how the Windows container image should be built. I left some open questions / comments in the dockerfile / makefile for discussion tho

Comment thread cmd/kured/main.go Outdated
$cbsKey = Get-ItemProperty "HKLM:\Software\Microsoft\Windows\CurrentVersion\Component Based Servicing" -ErrorAction Ignore
if ($cbsKey.PSObject.Properties.name -contains 'RebootPending') {
Write-Output "Reboot requried - Component Based Servicing:RebootPending registry value detected"
New-Item -Path '/var/run' -Name 'reboot-required' -ItemType File -Force
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remind me: why do we want to emulate the sentinel filepath pattern here, why not just return 0 (reboot required) if any of the two keys here have the required reboot properties?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure - Both of those registry keys are only cleared on the next boot if Windows servicing operations set those keys.
If a user manually sets either of those keys and then reboots this would most likely result in the node being stuck in a reboot loop.
Adding a sentinel file here seemed like a more generic approach. I can image situations where things outside of the windows servicing stack would want to trigger a reboot once (things like node-problem-detector for example).
I also feel it is important to be able to detect when there is a pending reboot due to things like Windows Update needing to reboot the node for security fixes to be applied. Having a powershell script that can detect both situations seemed a good compromise.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. So basically the idea here is to promote the general idea of "create the sentinel file c:\var\run\reboot-required to indicate that you want a reboot.

And the init container that we're shipping is a POC for how to implement an end-to-end solution (because you need a thing to delete that file at bootup to make this strategy work).

Copy link
Copy Markdown
Collaborator

@evrardjp evrardjp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't fully review, but I feel like we can improve the code readability by implementing interfaces and the right types.

Comment thread cmd/kured/main.go
}

func main() {
os.Stdout.WriteString(fmt.Sprintf("Command line args: %v\n", os.Args))
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this line? Do we need it in main?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, sorry these were to help me figure out an issue related to microsoft/hcsshim#1207.
I'll remove all of these.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, you found it useful to have flags displayed, so we should probably address that in a later patch :)

Comment thread cmd/kured/main.go Outdated
Comment thread cmd/kured/main.go Outdated
Comment thread cmd/kured/main.go Outdated
// On Windows kubed daemonset needs to run in a HostProcess container which
// run all processes in the hosts process namespace.
return command
} else {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we start to build an interface instead, and make sure the linker/user input allows us to infer the type of command to run?

Comment thread cmd/kured/main.go Outdated
var config *rest.Config
var err error

if runtime.GOOS == windows {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we implement interface with windows/linux k8s servers as types, then we could have a method for getting the config based on those. It would look simpler and simpler to test.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll look into building an interface to for windows/linux behaviors so we can remove the GOOS if/else statements in the code.
I wanted to cleanup some other parts of this PR first (like building the container images with make targets) but that is done.

Comment thread cmd/kured/main.go Outdated
// buildSentinelCommand creates the shell command line which will need wrapping to escape
// the container boundaries
func buildSentinelCommand(rebootSentinelFile string, rebootSentinelCommand string) []string {
func buildSentinelCommand(rebootSentinelFile string, rebootSentinelCommand string, GOOS string) []string {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment above.

Comment thread cmd/kured/main.go Outdated

var (
version = "unreleased"
windows = "windows"
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove this

"k8s.io/client-go/tools/clientcmd"
)

type osSpecificBehaviors interface {
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm open to suggestions for better names for any of the objects in this file :-/

Signed-off-by: Mark Rossetti <marosset@microsoft.com>
Copy link
Copy Markdown

@rchaganti rchaganti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a couple of comments around Test-PendingReboot.ps1 to ensure the code written follows the PowerShell best practices.


$autoUpdateKey = Get-ItemProperty "HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\WindowsUpdate\Auto Update" -ErrorAction Ignore
if ($autoUpdateKey.PSObject.Properties.name -contains 'RebootRequired') {
Write-Output "Reboot required - Auto Update:RebootRequired registry value detected"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Write-Output returns objects to the pipeline. We do not use it in a script unless that behavior is expected. Instead, you can use Write-Information.

PS C:\scripts> .\Test-PendingReboot.ps1
Reboot not required


if (Test-Path -Path '/var/run/reboot-required') {
Write-Output "Reboot required"
exit 0
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if your intention is to return 0 or 1, you must use return keyword and not exit. exit kills the PowerShell session where this function runs.

PS C:\scripts> $return = .\Test-PendingReboot.ps1 

PS C:\scripts> $return
Reboot not required

If you see in this, the return value from the script is not 0 or 1. Instead, it returned Reboot not required. If you replace the Write-Output with Write-Information and exit with return, here is what you will see.

PS C:\scripts> $return = .\Test-PendingReboot.ps1 

PS C:\scripts> $return
1

@github-actions
Copy link
Copy Markdown

This PR was automatically considered stale due to lack of activity. Please refresh it and/or join our slack channels to highlight it, before it automatically closes (in 7 days).

@ckotzbauer ckotzbauer added keep This won't be closed by the stale bot. and removed no-pr-activity labels Mar 26, 2022
@github-actions
Copy link
Copy Markdown

This PR was automatically considered stale due to lack of activity. Please refresh it and/or join our slack channels to highlight it, before it automatically closes (in 7 days).

@github-actions
Copy link
Copy Markdown

This PR was automatically considered stale due to lack of activity. Please refresh it and/or join our slack channels to highlight it, before it automatically closes (in 7 days).

@ckotzbauer
Copy link
Copy Markdown
Member

@marosset Are there any news on this? Is this still WIP?

@github-actions
Copy link
Copy Markdown

This PR was automatically considered stale due to lack of activity. Please refresh it and/or join our slack channels to highlight it, before it automatically closes (in 7 days).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

keep This won't be closed by the stale bot. no-pr-activity

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants