campaigns: use a persistent RPC container in volume mode #418

LawnGnome · 2021-01-08T06:20:47Z

One performance issue I called out in #412 was that large diffs could be slower in volume mount mode. My hypothesis was that this is because the diffs are transmitted over the container's stdout, which means they have to pass through Docker, potentially get logged, and so on.

This PR adds an experimental new workspace mode called marionette (although, if we decide to go ahead with this, I would replace volume with it). It uses the same basic technique as volume mode, where the workspace lives on a Docker volume. But instead of running ad hoc containers for the git and unzip commands, a single container runs for the entire set of steps and exposes a service (also called marionette): a gRPC server that closely mimics the Workspace interface.

As a result, the diff (and intermediate changes metadata) comes back over an established gRPC connection, instead of via stdout.
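
To make the idea concrete, here is a minimal sketch of the pattern, not the code in this PR: a long-lived server holds the workspace state and hands the diff back over an already established connection, rather than a short-lived container printing it to stdout. It uses a plain TCP socket with length-prefixed frames as a stand-in for gRPC, and `WorkspaceServer`, `request_diff`, and the `DIFF` command are hypothetical names chosen for illustration.

```python
import socket
import socketserver
import struct
import threading


def send_frame(sock, payload: bytes) -> None:
    # Prefix each message with its length as a 4-byte big-endian integer.
    sock.sendall(struct.pack(">I", len(payload)) + payload)


def recv_exact(sock, n: int) -> bytes:
    # Read exactly n bytes, looping because recv may return short reads.
    chunks = []
    while n > 0:
        chunk = sock.recv(min(n, 65536))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_frame(sock) -> bytes:
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return recv_exact(sock, length)


class WorkspaceHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # Serve commands on one connection until the client hangs up.
        while True:
            try:
                command = recv_frame(self.request)
            except ConnectionError:
                return
            if command == b"DIFF":
                send_frame(self.request, self.server.diff)
            else:
                send_frame(self.request, b"ERR unknown command")


class WorkspaceServer(socketserver.ThreadingTCPServer):
    # Hypothetical stand-in for the persistent container's RPC service.
    allow_reuse_address = True
    daemon_threads = True

    def __init__(self, address, diff: bytes):
        super().__init__(address, WorkspaceHandler)
        self.diff = diff


def request_diff(conn) -> bytes:
    send_frame(conn, b"DIFF")
    return recv_frame(conn)


def demo(diff_size: int):
    # Start the server once, then issue several requests on one connection.
    server = WorkspaceServer(("127.0.0.1", 0), b"+" * diff_size)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        with socket.create_connection(server.server_address) as conn:
            first = request_diff(conn)
            second = request_diff(conn)
        return len(first), len(second)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo(15 * 1024 * 1024))
```

The point of the sketch is only the shape: the connection is set up once, and every later request reuses it, so the payload never travels through a container's stdout.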

I used the same stress test I used in #412 for this: a campaign that essentially does this:

1. Clones sourcegraph/sourcegraph.
2. Runs gunzip -ck /usr/share/cracklib/cracklib-words.gz >> README.md.

This results in a ~15 MB diff. Extreme, but not unreasonable if a user is doing something with binaries (say, running ImageMagick or pngcrush transforms on image assets).

On macOS:

Bind mode: ~13 seconds
Volume mode: ~28 seconds
Marionette mode: ~11 seconds

Therefore, this results in a workspace mode that outperforms both bind and volume mode on macOS. (And probably also on Windows.)
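
For a sense of scale, the three timings quoted above can be turned into ratios. This is only arithmetic on those approximate measurements, so the results are equally approximate; the helper names are made up for the example.

```python
# Approximate wall-clock timings (seconds) quoted above for the ~15 MB diff on macOS.
timings = {"bind": 13.0, "volume": 28.0, "marionette": 11.0}


def speedup(baseline: str, candidate: str = "marionette") -> float:
    # How many times faster the candidate mode is than the baseline mode.
    return timings[baseline] / timings[candidate]


def time_saved_pct(baseline: str, candidate: str = "marionette") -> float:
    # Percentage of the baseline's run time that the candidate saves.
    return 100 * (timings[baseline] - timings[candidate]) / timings[baseline]


for mode in ("bind", "volume"):
    print(f"{mode}: {speedup(mode):.2f}x faster, {time_saved_pct(mode):.0f}% less time")
# bind: 1.18x faster, 15% less time
# volume: 2.55x faster, 61% less time
```

In other words, marionette mode is roughly 2.5x faster than volume mode on this test, but only about 15% ahead of bind mode.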

The big question for me in this PR is: is this a good enough improvement to be worth the extra complexity? The workspace implementation isn't really any more complicated, but the control flow gets a bit more complex because we have to manage the lifecycle of the marionette container, plus we now have another cmd to manage, plus we have to pull in the protobuf and gRPC libraries.

So, @sourcegraph/campaigns, thoughts?

mrnugget

Pretty cool idea for a surprisingly small amount of code.

> The big question for me in this PR is: is this a good enough improvement to be worth the extra complexity? The workspace implementation isn't really any more complicated, but the control flow gets a bit more complex because we have to manage the lifecycle of the marionette container, plus we now have another cmd to manage, plus we have to pull in the protobuf and gRPC libraries.

What's your gut instinct?

I have to say that I'm a bit worried about adding another container for every additional container, and about whether this approach could be a source of debugging headaches when something goes wrong, especially because we're talking about Docker for Mac here, where containers are not as performant/lightweight as they are under Linux. (Side note: it's not a problem if two containers are attached to the same volume?) I'm also thinking of the ideas we had for run-steps-in-subdirectories, which would increase the number of containers per repository. With this approach, we'd have double that.

So, verdict: not sure. If performance were doubled, then I'd say let's give it a shot. But the improvements here don't look as substantial as in your other PR, so my gut tells me to put this on hold (and maybe pull it out in the future, once we have more customer input on debuggability, maintenance, performance, and use cases).

chrispine · 2021-01-08T16:55:59Z

No strong opinion, but if I had to give an answer: I'd say to save this for later.

eseliger · 2021-01-08T17:05:09Z

I like the simplicity of this approach; it's definitely super interesting! Like Chris and Thorsten, I'm not sure we need to make the effort for now, but if you feel like it's almost ready, I wouldn't be opposed.
Regardless, it's very interesting that just streaming the content to stdout takes so much time.

LawnGnome · 2021-01-08T21:11:58Z

OK, that sounds like a consensus to me. Let's keep this on ice for now, and we can pick it up again if we have user feedback that makes this useful.

Just to pick up on one question:

> Side note: it's not a problem if two containers are attached to the same volume?

It's not, but it's the same as any other shared filesystem: if you break the locking, you get to keep all the pieces. In this case, it should be fine: src controls the containers that are attached to the volume, and no commands are issued to marionette when another container is attached.

This becomes an issue in other circumstances: for example, trying to share a package manager cache across containers.
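
The shared-cache case is where explicit locking would matter. As an illustration only (nothing in this PR does this, and `locked_cache`, `try_lock`, and the `.lock` file name are hypothetical), processes sharing a directory can serialize access with an advisory lock. The sketch uses POSIX `flock` through Python's `fcntl` module, so it assumes a Unix-like system and a filesystem that honors `flock`.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def locked_cache(cache_dir: str):
    # Hold an exclusive advisory lock on the cache for the duration of the block.
    os.makedirs(cache_dir, exist_ok=True)
    with open(os.path.join(cache_dir, ".lock"), "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            yield cache_dir
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def try_lock(cache_dir: str) -> bool:
    # Return True if the lock could be taken right now without waiting.
    with open(os.path.join(cache_dir, ".lock"), "w") as probe:
        try:
            fcntl.flock(probe, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return False
        fcntl.flock(probe, fcntl.LOCK_UN)
        return True


def demo():
    with tempfile.TemporaryDirectory() as cache_dir:
        with locked_cache(cache_dir):
            held = try_lock(cache_dir)  # a second opener is refused
        free = try_lock(cache_dir)  # available again after release
        return held, free


if __name__ == "__main__":
    print(demo())
```

Advisory locks only help if every writer cooperates, which is the "if you break the locking" caveat above: a container that ignores the lock file can still corrupt the shared state.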

mrnugget · 2021-11-01T12:34:25Z

Closing because PR seems to be stale. Feel free to re-open if needed.

LawnGnome · 2021-11-01T19:37:15Z

I think we've determined, seven months later, that this is unnecessary. Which I'm a little sad about, because it was nifty, but our lives are simpler without it.

LawnGnome added 4 commits January 7, 2021 20:48

Initial, partial marionette implementation.

b2b39f2

WIP

ea2b236

WIP

20df8eb

WIP

a04a760

LawnGnome added the team/code-search label Jan 8, 2021

LawnGnome added this to the Campaigns Sprint 8 milestone Jan 8, 2021

sourcegraph-bot mentioned this pull request Jan 8, 2021

Campaigns Sprint 8 Tracking issue sourcegraph/sourcegraph-public-snapshot#17030

Closed

18 tasks

mrnugget reviewed Jan 8, 2021

LawnGnome added the on-hold label Jan 8, 2021

chrispine modified the milestones: Campaigns Sprint 8, Backlog Jan 19, 2021

LawnGnome mentioned this pull request Jan 21, 2021

campaigns: handle non-root volume use cases more gracefully #434

Merged

eseliger force-pushed the main branch from 38f79b1 to 1d4a792 Compare February 22, 2021 22:11

mrnugget closed this Nov 1, 2021

keegancsmith deleted the aharvey/in-docker-tool branch November 11, 2025 20:52
