Fix: make Dockerfile self-contained with multi-stage build #24277

Open
Famous077 wants to merge 6 commits into google-gemini:main from Famous077:fix/self-contained-dockerfile

Conversation

@Famous077
Contributor

Summary

Fixes the Dockerfile so that it builds from a clean git clone without requiring artifacts pre-built on the host.

Details

The Dockerfile used COPY packages/cli/dist/*.tgz which failed if the user hadn't
run npm run build locally first. Converted to a multi-stage build so everything
compiles inside the container.

Stage 1 (builder): installs git, runs npm install + npm run build + npm pack
Stage 2 (runtime): copies .tgz artifacts from builder via --from=builder

Added HUSKY=0 to skip git hooks during container build.

Related Issues

Fixes #15859

How to Validate

  1. Fresh clone the repo (no local build)
  2. Run: docker build --build-arg CLI_VERSION_ARG=1.0.0 -t gemini .
  3. Run: docker run --rm gemini --version
  4. Expected: builds successfully and prints 1.0.0

Before this fix, step 2 fails with:
ERROR: lstat /packages/cli/dist: no such file or directory

Pre-Merge Checklist

  • Updated relevant documentation and README (if needed)
  • Added/updated tests (if needed)
  • Noted breaking changes (if any)
  • Validated on required platforms/methods:
    • MacOS
      • Docker
    • Windows
      • Docker
    • Linux
      • Docker ✅ (validated in GitHub Codespaces)

@Famous077 Famous077 requested a review from a team as a code owner March 30, 2026 21:39
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors the Dockerfile to support a self-contained build process. By transitioning to a multi-stage build, the container now compiles the necessary packages internally, ensuring that users can build the image from a fresh repository clone without needing to perform local builds beforehand. This improves the developer experience and ensures consistent build environments.

Highlights

  • Multi-stage Docker build: Implemented a multi-stage Dockerfile to handle the build process entirely within the container, removing the dependency on pre-built artifacts from the host machine.
  • Build environment configuration: Added necessary build dependencies like git and configured the builder stage to install dependencies, run builds, and package the CLI and core modules.
  • Build optimization: Disabled Husky git hooks during the container build process to streamline execution and ensure compatibility.
  • Entrypoint update: Updated the container entrypoint to use the absolute path of the installed binary.

@Famous077 Famous077 changed the title from "fix: make Dockerfile self-contained with multi-stage build" to "Fix: make Dockerfile self-contained with multi-stage build" on Mar 30, 2026
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a multi-stage Docker build to compile and package the Gemini CLI within the container, eliminating the need for host-side pre-building. Feedback focuses on optimizing the Dockerfile for better caching and smaller image sizes, specifically by managing dependency files more efficiently and excluding the .git directory. Other recommendations include using npm ci for consistent builds, combining npm install commands to resolve package dependencies correctly, and reverting the change from CMD to ENTRYPOINT to preserve user flexibility.

Comment thread Dockerfile
Comment thread Dockerfile Outdated
Comment thread Dockerfile Outdated
Comment thread Dockerfile Outdated
Comment thread Dockerfile
@gemini-cli gemini-cli Bot added labels on Mar 30, 2026:
  • priority/p2: Important but can be addressed in a future release.
  • area/platform: Issues related to Build infra, Release mgmt, Testing, Eval infra, Capacity, Quota mgmt
  • help wanted: We will accept PRs from all issues marked as "help wanted". Thanks for your support!
@Famous077
Contributor Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request refactors the Dockerfile into a multi-stage build process, which improves layer caching and allows the entire build and packaging workflow to occur within the container. Feedback indicates that the GIT_COMMIT build argument is currently ineffective because the underlying build script does not check the environment variable and fails to retrieve git information since the .git directory is not present in the builder stage.

Comment thread Dockerfile
@Famous077
Contributor Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request optimizes the Docker build process by introducing a multi-stage build, which improves layer caching and reduces the final image size. It also updates the git commit information generation script to support an environment variable, allowing for consistent builds in environments without a .git directory. I have identified an issue in the Dockerfile where copying the scripts directory prematurely invalidates the cache; removing this redundant instruction will improve build performance.
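
A minimal sketch of the environment-variable fallback described in this review, assuming the Dockerfile exposes the GIT_COMMIT build argument to the build step as an environment variable. The function name `resolveGitCommit`, the injected `runGit` callback, and the `'N/A'` placeholder are illustrative assumptions, not the actual contents of scripts/generate-git-commit-info.js.

```javascript
// Sketch only: names and the 'N/A' placeholder are hypothetical.
// env    - an object shaped like process.env
// runGit - a function that returns the commit hash or throws
function resolveGitCommit(env, runGit) {
  // 1. Prefer an explicitly supplied value, e.g. from a Docker build arg.
  if (env.GIT_COMMIT) {
    return env.GIT_COMMIT;
  }
  // 2. Otherwise ask git. This is expected to throw when no .git
  //    directory is available, as in the builder stage.
  try {
    const hash = String(runGit()).trim();
    if (hash) {
      return hash;
    }
  } catch {
    // Fall through to the placeholder below.
  }
  // 3. Nothing usable was found.
  return 'N/A';
}

// Example: inside a container without .git, the env var wins.
console.log(
  resolveGitCommit({ GIT_COMMIT: 'abc1234' }, () => {
    throw new Error('not a git repository');
  }),
);
```

In a real script, `runGit` would likely wrap a call such as `execSync('git rev-parse --short HEAD')` from `node:child_process`; it is injected here so the sketch runs without a repository.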

Comment thread Dockerfile
@Famous077
Contributor Author

I actually tried removing this line as suggested - the build breaks immediately with a MODULE_NOT_FOUND error for generate-notices.js. Turns out the vscode-ide-companion package has a prepare script that calls node ./scripts/generate-notices.js during npm ci, so the scripts folder needs to be present before the install runs.
I know it's not ideal for caching, but there's no clean way around it without modifying the prepare script itself (which would be a separate concern). Keeping this line intentionally.
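
One way to see which files an install step depends on is to scan each workspace manifest for lifecycle scripts that execute local files. The sketch below is a hypothetical helper: `findInstallTimeScripts` and its input shape are assumptions for illustration, not part of this repository. It only shows why a file referenced by a prepare hook has to exist before npm ci runs.

```javascript
// Hypothetical helper: report lifecycle scripts that reference local paths.
// These hooks run as part of installation, so any file they reference
// must already be in the image when the install command executes.
const LIFECYCLE_HOOKS = ['preinstall', 'install', 'postinstall', 'prepare'];

// Matches whitespace-delimited relative paths such as ./scripts/foo.js
const LOCAL_PATH = /(?:^|\s)(\.{1,2}\/[\w.\/-]+)/g;

// packages: array of { dir, manifest }, where manifest is a parsed package.json
function findInstallTimeScripts(packages) {
  const hits = [];
  for (const { dir, manifest } of packages) {
    const scripts = manifest.scripts || {};
    for (const hook of LIFECYCLE_HOOKS) {
      const command = scripts[hook];
      if (typeof command !== 'string') continue;
      const files = [...command.matchAll(LOCAL_PATH)].map((m) => m[1]);
      if (files.length > 0) {
        hits.push({ dir, hook, files });
      }
    }
  }
  return hits;
}

// Example input mirroring the prepare script described above.
console.log(
  findInstallTimeScripts([
    {
      dir: 'packages/vscode-ide-companion',
      manifest: { scripts: { prepare: 'node ./scripts/generate-notices.js' } },
    },
  ]),
);
```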

@Famous077
Contributor Author

Hey! While I was making this change, I realized the Docker build context was huge, over 620MB, because .git, node_modules, and all the package-level node_modules folders were being sent to Docker with every single build. That's a lot of data to upload each time, and all those dependencies are re-installed from scratch inside the container via npm ci anyway.
So I added a .dockerignore to skip them. The build context is now ~76MB, about 8x smaller. Builds feel noticeably faster, and it also helps limit the disk space that accumulates during repeated builds, such as in Codespaces.
No effect on build output: everything the Dockerfile needs is still present, and only the excess has been trimmed.

Comment thread scripts/generate-git-commit-info.js Outdated
// Check for GIT_COMMIT env var first (e.g. when building inside Docker
// without a .git directory available)
if (process.env.GIT_COMMIT) {
  gitCommitInfo = process.env.GIT_COMMIT;
Contributor

I think the value from process.env.GIT_COMMIT should be validated before assignment.

Here (line 48) it is assigned to gitCommitInfo without any validation.

That variable is later interpolated into a single-quoted string in the generated TypeScript file on line 71:

export const GIT_COMMIT_INFO = '${gitCommitInfo}';

This generated file is imported at runtime by:

  • packages/core/src/telemetry/clearcut-logger/clearcut-logger.ts (line 64) — telemetry
  • packages/cli/src/ui/components/AboutBox.tsx (line 9) — version UI
  • packages/cli/src/ui/commands/bugCommand.ts — bug reports

If someone passes a value containing a single quote (e.g., --build-arg GIT_COMMIT="abc'def"), the generated file becomes:

export const GIT_COMMIT_INFO = 'abc'def';  // ← syntax error

A more adversarial input like --build-arg GIT_COMMIT="'; process.exit(1); //" would produce valid but injected code:

export const GIT_COMMIT_INFO = ''; process.exit(1); //';

Since a real git commit hash is always hexadecimal, a small guard here would prevent both accidental breakage and injection:

const envCommit = process.env.GIT_COMMIT;
if (envCommit && /^[0-9a-f]+$/i.test(envCommit)) {
  gitCommitInfo = envCommit;
}

This way, invalid inputs are silently ignored.
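
The guard above can be combined with escaping at the point where the file is generated, so that even an unexpected value cannot break out of the string literal. The following is a sketch under stated assumptions: `pickCommit` and `renderCommitModule` are hypothetical names, and `JSON.stringify` is swapped in for the single-quoted template shown earlier, which changes the generated literal to double quotes.

```javascript
// Hypothetical helpers illustrating validation plus escaping.
const HEX_COMMIT = /^[0-9a-f]+$/i;

// Accept the env value only if it looks like a hexadecimal commit hash;
// otherwise keep whatever fallback the caller already has.
function pickCommit(envValue, fallback) {
  return envValue && HEX_COMMIT.test(envValue) ? envValue : fallback;
}

// JSON.stringify produces a quoted, escaped string, so quotes in the
// value cannot terminate the literal in the generated source.
function renderCommitModule(commit) {
  return `export const GIT_COMMIT_INFO = ${JSON.stringify(commit)};\n`;
}

// The adversarial input from the comment above is rejected by the guard...
console.log(pickCommit("'; process.exit(1); //", 'N/A'));
// ...and would be rendered as inert string data even without the guard.
console.log(renderCommitModule("'; process.exit(1); //"));
```

Validation alone is enough for real commit hashes; the escaping step is a second layer for any other value that might reach the template, such as the fallback string.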

@Famous077
Contributor Author

Hi @DavidAPierce, could you please take a look when you have time? I'd really value your perspective and any suggestions for improvement.

@DavidAPierce DavidAPierce mentioned this pull request Apr 21, 2026
13 tasks

Labels

  • area/platform: Issues related to Build infra, Release mgmt, Testing, Eval infra, Capacity, Quota mgmt
  • help wanted: We will accept PRs from all issues marked as "help wanted". Thanks for your support!
  • priority/p2: Important but can be addressed in a future release.

Development

Successfully merging this pull request may close these issues.

Unable to build it with Dockerfile

2 participants