-
Notifications
You must be signed in to change notification settings - Fork 70
campaigns: use a persistent RPC container in volume mode #418
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
mrnugget
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty cool idea for a surprisingly small amount of code.
The big question for me in this PR is: is this a good enough improvement to be worth the extra complexity? The workspace implementation isn't really any more complicated, but the control flow gets a bit more complex because we have to manage the lifecycle of the marionette container, plus we now have another cmd to manage, plus we have to pull in the protobuf and gRPC libraries.
What's your gut instinct?
I have to say that I'm a bit worried about adding another container for every additional container and whether this approach here could be a source of debugging headaches when something goes wrong. Especially because we're talking about Docker for Mac here where containers are not as performant/lightweight as they are under Linux. (Side note: it's not a problem if two containers are attached to the same volume?) I'm also peeking at the ideas we had for run-steps-in-subdirectories, which increases the number of containers per repository. With this approach here, we'd have double.
So, verdict: not sure. If performance would be doubled, then I'd say let's give it a shot. But the improvements here don't look as substantial as in your other PR, so my gut tells me to put this on hold (and maybe pull it out in the future, once we have more customer input (on debuggability, maintenance, performance, use cases)
|
No strong opinion, but if I had to give an answer: I'd say to save this for later. |
|
I like the simplicity of this approach, it's definitely super interesting! As Chris and Thorsten, I'm not sure we need to take the effort for now, but if you feel like it's almost ready, I wouldn't be opposed. |
|
OK, that sounds like a consensus to me. Let's keep this on ice for now, and we can pick it up again if we have user feedback that makes this useful. Just to pick up on one question:
It's not, but it's the same as any other shared filesystem: if you break the locking, you get to keep all the pieces. In this case, it should be fine: This becomes an issue in other circumstances: for example, trying to share a package manager cache across containers. |
|
Closing because PR seems to be stale. Feel free to re-open if needed. |
|
I think we've determined, seven months later, that this is unnecessary. Which I'm a little sad about, because it was nifty, but our lives are simpler without it. |
One performance issue I called out in #412 in volume mount mode was that large diffs could be slower. My hypothesis was that this was caused by the diffs being transmitted over the container's stdout, which means it has to pass through Docker, potentially get logged, and so on.
This PR adds an experimental new workspace mode called
marionette(although, if we decide to go ahead with this, I would replacevolumewith it). This uses the same basic technique as volume mode, where the workspace is on a Docker volume, but instead of running ad hoc containers to rungitandunzipcommands, a single container runs during the entire set of steps that exposes a service (calledmarionette) that provides a gRPC server that closely mimics theWorkspaceinterface.As a result, the diff (and intermediate changes metadata) comes back over an established gRPC connection, instead of via stdout.
I used the same stress test I used in #412 for this: a campaign that essentially does this:
gunzip -ck /usr/share/cracklib/cracklib-words.gz >> README.md.This results in a ~15 MB diff. Extreme, but not unreasonable if a user is doing something with binaries (say, running ImageMagick or
pngcrushtransforms on image assets).On macOS:
Therefore, this results in a workspace mode that outperforms both bind and volume mode on macOS. (And probably also on Windows.)
The big question for me in this PR is: is this a good enough improvement to be worth the extra complexity? The workspace implementation isn't really any more complicated, but the control flow gets a bit more complex because we have to manage the lifecycle of the marionette container, plus we now have another
cmdto manage, plus we have to pull in the protobuf and gRPC libraries.So, @sourcegraph/campaigns, thoughts?