add tty support for runc executor by coryb · Pull Request #1731 · moby/buildkit

coryb · 2020-10-12T17:39:06Z

This reuses some of the containerd logic for handling console io. The basic outline is that now all runc calls are made with detach: true and an associated pidfile. We have a SIGCHLD reaper so we can get notified when the detached pid exits (and determine exit status). When a tty is requested we use the console-socket options in runc so that runc creates the pty pair and returns the ptm via the socket.

I tried naively creating a pty pair in buildkit and setting the input streams to the pts, which mostly worked, except the ptm resize had no effect this way, it seems using the console-socket is the only way to allow resize events to propagate.

Unfortunately this is a linux only implementation since the containerd libraries in use are linux only:
github.com/containerd/containerd/sys/reaper
github.com/containerd/containerd/runtime/v2/runc
The non linux implementation will work as before, no tty support for runc.

Also I added a basic tty test with generated pty (again linux only via github.com/containerd/console), which works to test out most of the tty logic, but that is obviously not a fully implemented vt10x terminal so we cannot check the success of resize events.

coryb · 2020-10-12T18:09:43Z

I'm looking into the runc worker test failures, the integration tests were passing, not sure what is wrong with these tests yet.

hinshun · 2020-10-12T17:56:28Z

+// testClientGatewayContainerPID1Tty is testing that we can get a tty via
+// a container pid1, executor.Run
+func testClientGatewayContainerPID1Tty(t *testing.T, sb integration.Sandbox) {
+	requiresLinux(t)


Do we need both requiresLinux and +build linux?

we need the +build linux otherwise the test wont compile on windows due to container/console dep. So, yeah, I think I can remove the requiresLinux

hinshun · 2020-10-12T17:57:57Z

+		// TODO fix this
+		// We get `panic: cannot statfs cgroup root` when running this test
+		// with runc-rootless
+		t.Skip("Skipping runc-rootless for cgroup error")


Is this waiting for: opencontainers/runc#2634?

hinshun · 2020-10-12T18:01:49Z

+	for {
+		select {
+		case <-ctx.Done():
+			return ctx.Err()


Do we need to signal.Stop(signals) and close(signals) here?

yeah, will add. If the runc worker is enabled this loop will effectively run forever though.

hinshun · 2020-10-12T18:03:07Z

+	if err != nil {
+		return containerd.UnknownExitStatus, errors.Wrap(err, "failed to create pidfile")
+	}
+	pidfile.Close()


Defer remove?

It should be unnecessary for pid1 since we os.RemoveAll for the entire bundle directory when (*runcExecutor).Run ends.

hinshun · 2020-10-12T18:10:05Z

+	}
+
+	status := waitpid(ctx, pid, exitCh)
+	defer endedOnce.Do(func() {


Curious why do we need a sync.Once here but not in the non-linux one?

I was originally thinking the timing of closing ended was important and didn't want it delayed by the deferred cleanup for console shutdown. Looking again, I dont think the timing is important, so will just defer close(ended) in the linux impl as well.

hinshun · 2020-10-12T18:11:04Z

+	return status, nil
+}
+
+func (w *runcExecutor) exec(ctx context.Context, id, bundle string, pid0 int, specsProcess *specs.Process, process executor.ProcessInfo) (int, error) {


A lot of this seems duplicated with (*runcExecutor).run.

Yeah, I will take a look to try to dedup.

hinshun · 2020-10-12T18:12:00Z

+	defer wg.Wait()
+
+	status := waitpid(ctx, pid, exitCh)
+	cancel() // stop the parent waitpid if still running


Why do we need this cancel when we already have a defer cancel() on line 124? AFAIK, this call isn't blocking.

We need to cancel so the parent waitpid will exit:

var wg sync.WaitGroup wg.Add(1) go func() { defer wg.Done() // watch for parent process to die, if that happens then cancel the // exec context exitCh := reaper.Default.Subscribe() waitpid(ctx, pid0, exitCh) cancel() }() defer wg.Wait()

The defer wg.Wait() will hang until cancel() is called, or the parent exits. In the case that the parent continues to run we have to manually call cancel() so the defer wg.Wait() will exit. The previous defer cancel() is just to clean up resources leak in the case of error.

hinshun · 2020-10-12T18:19:50Z

+	socket, err := runc.NewTempConsoleSocket()
+	if err != nil {
+		return nil, nil, errors.Wrap(err, "failed to create OCI runtime console socket")
+	}


I think the code may be simpler and without the initErrCh if we just create the console socket and then handle everything below in one of the (*errgroup.Group).Go on line 103.

Then, the goroutines on line 235 and 253 can just be synchronous.

One more thing, the cleaners aren't returned on errors so: e.g. an error from cio.NewFIFOSetInDir means socket.Close won't get run.

I think I see what you are saying, I will try to simplify. Currently the cleaners are not returned on errors, they are run directly via a hacky defer, this will likely be simplified also with the refactoring you suggest.

tonistiigi · 2020-10-12T20:02:23Z

+	"github.com/containerd/console"
+	"github.com/containerd/containerd"
+	"github.com/containerd/containerd/cio"
+	runcV2 "github.com/containerd/containerd/runtime/v2/runc"


Looking to keep these dependencies somewhat under control. iiuc this is what adds most of them while we only use single stdio.Platform helper implementation from there. I wouldn't mind copy/paste for this and eventually refactoring this code so it can be cleanly imported.

There are a bunch of deps coming from the reaper logic as well. I will look into extracting the logic we need for the console and the reaper to see if there is a way to trim what is being pulled in.

coryb · 2020-10-14T17:02:03Z

I spent a while yesterday debugging the CNI Setup error: errors that I randomly see in the tests, it turns out to be caused by a race condition due to the sigchld reaper added to monitor the runc process exits. This monitor conflicts with the process wait logic used by the cni plugins, so that (*os.Process).Wait might return os.SyscallError{"wait", syscall.ECHILD} (ie wait got "no child" error because of dueling wait'ing process).

I dont have any workarounds currently. We have to run runc with the detach flag to get tty support, and there is no built-in mechanism to get exits events from a container after it has been detached. This solution works for containerd because this reaper logic is isolated in the shims that end up calling runc, so it does not conflict with other process management in the containerd daemon.

It seems like we need add our own shim to make this work properly. I will keep looking for alternate solutions. I just noticed there is a --no-subreaper option to runc, not entirely sure what that does yet, maybe could be useful...

coryb · 2020-10-15T16:33:16Z

Okay, reimplemented everything, I think it is much cleaner. I was able to determine what was missing when creating a pty and running runc in the foreground ... the SIGWINCH needs to be sent to the process after resizing. I had tried sending SIGWINCH the first time I tried this but I used runc.Kill(ctx, id, signal.SIGWINCH) which did not seem to work, perhaps it was sending the signal to the wrong process (it will only send to pid1, I maybe have been testing with tty in exec'd process 🤦). Getting the pid to signal is a bit tricky, we have to use the pidfile option in runc, then wait for it to be populated (using polling since there is no available event/channels to use for this) before we can setup the resize/sigwinch routine.

tonistiigi · 2020-10-15T16:48:58Z

+		// test resize, note we cannot verify the console size in the container
+		// because we need a controlling terminal to report size back, and the
+		// simple ptm we are using is not enough.  Ideally we could just use
+		// `/bin/ttysize` but that reports `0 0` with this test.


what's the reason for it reporting 0 0? Afaics we don't need NewPty at all for this test, could just use io.Pipe.

Not sure why ttysize does not work, I thought an actual terminal was necessary for this, but I might be wrong. I will try it again after the updates, also will try with io.Pipe()

updated the tests, removed the NewPty usage. ttysize works now, the problem was I was sending the SIGWINCH to the pid from the pidfile but that was Meta.Args program (ie sh) and the resize only takes effect when we send SIGWINCH to the runc run process.

It was a bit fussy getting the test reliable, there were spurious control characters printed at random times, not sure where they are coming from but I only noticed them after ttysize was added, not sure if is related. To make the test more reliable we should probably use github.com/hinshun/vt10x or github.com/Netflix/go-expect, but those will bring in a few deps. I figure we can try it as is, the short sleeps should keeps things consistent, but if it becomes unreliable we can disable or rework.

One way to clean this up would be to run stty -echo as the first command. Then you don't need to worry about the timings between prompt echo and response strings. This leaves the 500ms timeouts that waits for the resize. I guess these are fine but you could also write a helper that tests the buffer contents for expected contents every 50-100ms and has a longer 5-10sec timeout that produces failure if no content appears.

hinshun

Looks a lot cleaner now. I only have a small nit.

hinshun · 2020-10-15T16:42:17Z

+		if err != nil {
+			logrus.Warningf("error while shutting down tty io: %s", err)
+		}
+	}()


Prefer defer closer to line 78 in case code changes in the future and introduces early exit code paths.

tonistiigi · 2020-10-16T05:58:47Z

+		})
+		require.NoError(t, err)
+
+		err = pid2.Resize(ctx, client.WinSize{


most of this test looks very similar. can we put the shared part in a helper function or use subtest.

updated to add common routine for the terminal input and output

tonistiigi · 2020-10-16T06:02:25Z

+
+	if process.Stdout != nil {
+		eg.Go(func() error {
+			io.Copy(process.Stdout, ptm)


no error handling in here? don't they exit cleanly? add comment with the error if that is the case.

They didnt error cleanly, but I have updated to ignore the expected errors.

tonistiigi · 2020-10-16T06:13:53Z

+
+func (w *runcExecutor) run(ctx context.Context, id, bundle string, process executor.ProcessInfo) (int, error) {
+	return w.callWithIO(ctx, id, bundle, process, func(ctx context.Context, pidfile string, io runc.IO) (int, error) {
+		return w.runc.Run(ctx, id, bundle, &runc.CreateOpts{


Instead of the proc/ppid hack could we make a upstream change so we can get the pid directly from this/exec function. A Started chan int field in runc.CreateOpts should suffice. https://github.com/containerd/go-runc/blob/master/runc.go#L267 would call

if opts.Started != nil { opts.Started <- cmd.Process.Pid }

We can still merge the current and fix later when upstream has updated.

Yeah, we will need to fix upstream. There is another issue upstream which prevents us from discovering the exit status of runc.Exec when there is stdio associated. These lines discard the status code:
https://github.com/containerd/go-runc/blob/master/runc.go#L241-L245
I will try to get these changes in, but I suspect it will be a slow process.

I have created a PR: containerd/go-runc#69 I will remove our local hacks when hopefully the RP is merged eventually.

Signed-off-by: Cory Bennett <cbennett@netflix.com>

hinshun

Looks good!

hinshun · 2020-10-16T19:44:23Z

+			Stdin:  inputR,
+			Stdout: &nopCloser{output},
+			Stderr: &nopCloser{output},
+			Env:    []string{"PS1=% "},


Just curious, what is this for?

Just sets the prompt to constant % to prevent the output from changing if the the defaults change in the future. The default prompt currently echos the directory first in sh on busybox, and echos ansi color codes on alpine, so I was just trying to make it consistent for this test. It was more useful originally when I was trying to parse the output per prompt, but I gave up on that without a vt10x terminal to process the terminal escape sequences, so it is less useful now.

did you try getting rid of the prompt per #1731 (comment) to remove the sleeps per line?

I am working on adding a simple output scanner looking for the prompt instead of the sleeps, it seems to work fine. The expected output is a lot more confusing with the stty -echo so I think it works fine to go without that. We will still have the sleeps to allow the resize signals to go through, 500ms is pessimistic I think, but these tests systems can be slow.

I am now trying to work though an issue with the test hangs (ctrProc.Wait hangs) when the expected prompt does not show up in time (ie context cancelled). It appears to be waiting for the container process Done signal even after ctr.Release() is called. I send another update when I get to the bottom of this.

The expected output is a lot more confusing with the stty -echo so I think it works fine to go without that.

What's the new confusing part in the output? I'd expect it to be same, but without repeating all the commands you send. So you can send the commands together and don't need to time between waiting results from the previous command and sending the next one. Still need to wait for Resize() calls to trigger but size check can happen multiple times until it eventually gets the correct size or times out.

If we leave % the prompt in, with stty -echo we will see something like:

% stty -echo % 80 40 % % /tmp % % foobar % 100 60

vs what we see with the echo on:

% ttysize 80 40 % cd /tmp % pwd /tmp % echo foobar > newfile % cat /tmp/newfile foobar % ttysize 100 60 % exit 99

We need to leave the prompt in so we can scan for it. There is only one prompt we really need to wait for, that is the one after the first ttysize command, otherwise there is a race between the ttysize and the second ctrProc.Resize(ctx, client.WinSize{Rows: 60, Cols: 100}) so that we know the results of the ttysize are from the first resize event.

I thought one way to do this is to just:

for { input.Write([]byte("ttysize\n")) time.Sleep(100 * time.Millisecond) if strings.Contains(output.String(), expectedSize) { break } if timeout { error } }

and not worry about how many times ttysize was called if it eventually switches to correct size before timeout. But that only works if echo is disabled.

Sure something like that should work, will update when I figure out what is causing the hanging on timeout errors.

I refactored the prompting into a simple helper, it seems cleaner to me. I also found the hanging issue where if the context was cancelled used for container Start then the event loop was not finishing so stdin continued to be read from.

Signed-off-by: Cory Bennett <cbennett@netflix.com>

hinshun · 2020-10-19T17:20:03Z

+		err = pid1.Resize(ctx, client.WinSize{Rows: 60, Cols: 100})
+		require.NoError(t, err)
+		prompt.SendExpect("ttysize", "100 60")
+		prompt.SendExit(99)


Nice little go-expect.

tonistiigi · 2020-10-19T21:22:10Z

+			if errors.As(err, &ptmClosedError) {
+				if ptmClosedError.Op == "read" &&
+					ptmClosedError.Path == "/dev/ptmx" &&
+					ptmClosedError.Err == syscall.Errno(0x5) {


syscall.EIO

Signed-off-by: Cory Bennett <cbennett@netflix.com>

coryb linked an issue Oct 12, 2020 that may be closed by this pull request

gateway exec: runc executor needs to be able to create pty #1714

Closed

hinshun reviewed Oct 12, 2020

View reviewed changes

tonistiigi reviewed Oct 12, 2020

View reviewed changes

coryb force-pushed the issue-1714 branch from 8f51d19 to aa038a5 Compare October 12, 2020 22:20

coryb marked this pull request as draft October 14, 2020 16:48

coryb force-pushed the issue-1714 branch 5 times, most recently from ad23f18 to 0297aa8 Compare October 15, 2020 15:32

coryb marked this pull request as ready for review October 15, 2020 16:23

tonistiigi reviewed Oct 15, 2020

View reviewed changes

hinshun reviewed Oct 15, 2020

View reviewed changes

coryb force-pushed the issue-1714 branch 2 times, most recently from a766964 to 1a9fdf3 Compare October 16, 2020 03:36

tonistiigi reviewed Oct 16, 2020

View reviewed changes

coryb force-pushed the issue-1714 branch from 1a9fdf3 to 17c3191 Compare October 16, 2020 07:30

add tty support for runc executor

e308ef7

Signed-off-by: Cory Bennett <cbennett@netflix.com>

coryb force-pushed the issue-1714 branch from 17c3191 to e308ef7 Compare October 16, 2020 17:37

hinshun approved these changes Oct 16, 2020

View reviewed changes

coryb added 2 commits October 17, 2020 06:52

refactor tty output tests to avoid flakey sleeps

4d265f5

Signed-off-by: Cory Bennett <cbennett@netflix.com>

shutdown container process when context is done

cb12a56

Signed-off-by: Cory Bennett <cbennett@netflix.com>

coryb mentioned this pull request Oct 17, 2020

allow for optional started channel to receive runc pid containerd/go-runc#69

Merged

tonistiigi added this to the v0.8.0 milestone Oct 19, 2020

hinshun reviewed Oct 19, 2020

View reviewed changes

tonistiigi reviewed Oct 19, 2020

View reviewed changes

tonistiigi approved these changes Oct 19, 2020

View reviewed changes

syscall.Errno(0x5) == syscall.EIO

74db85d

Signed-off-by: Cory Bennett <cbennett@netflix.com>

hinshun merged commit 5eaecb9 into moby:master Oct 19, 2020

Conversation

coryb commented Oct 12, 2020

Uh oh!

coryb commented Oct 12, 2020

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

coryb commented Oct 14, 2020

Uh oh!

coryb commented Oct 15, 2020

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

hinshun left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

hinshun left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

tonistiigi Oct 16, 2020 •

edited

Loading