@lucas-tucker
Contributor

[Doc] Kubernetes in Docker (kind) tutorial

This is an additional markdown doc with corresponding shell scripts to set up a single node cluster using Kubernetes in Docker. In particular,

  • The kind CLI tool is installed on either x86_64 (64-bit) or ARM64
  • A single-node cluster is configured in a way that is easy to modify
  • A simple test is run to confirm that GPUs are usable from the cluster's Docker containers (nodes)

Contributor

@gemini-code-assist (bot) left a comment

Summary of Changes

Hello @lucas-tucker, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new documentation tutorial along with supporting shell scripts to guide users through setting up a single-node Kubernetes cluster using kind (Kubernetes in Docker) with GPU support. The primary goal is to provide a quick and reproducible environment for deploying GPU-accelerated workloads, specifically for the vLLM Production Stack.

Highlights

  • New Tutorial Document: A comprehensive markdown tutorial (tutorials/00-c-install-kind-kubernetes-env.md) is added, detailing the steps to set up a GPU-enabled Kubernetes environment using kind and helm. This includes instructions for installing necessary tools, configuring the cluster for GPU support, and verifying the setup.
  • kind CLI Installation Script: A new utility script (utils/install-kind.sh) is introduced to automate the installation of the kind command-line interface, supporting both x86_64 and ARM64 architectures and ensuring it's added to the user's PATH.
  • GPU-Enabled kind Cluster Setup Script: A dedicated script (utils/install-kind-cluster.sh) is added to configure Docker for NVIDIA GPU access, create a kind cluster with GPU passthrough, and deploy the NVIDIA gpu-operator Helm chart for GPU resource management within the cluster.
  • GPU Verification Steps: The tutorial includes clear instructions and commands to verify that the GPU configuration is successful and that Kubernetes can utilize the available GPUs, including running a test GPU workload.

lucast2021 and others added 6 commits June 22, 2025 21:32
Signed-off-by: Lucas T <lucastbooms@gmail.com>
Signed-off-by: Lucas T <lucastbooms@gmail.com>
Signed-off-by: Lucas T <lucastbooms@gmail.com>
Signed-off-by: Lucas T <lucastbooms@gmail.com>
Signed-off-by: Lucas T <lucastbooms@gmail.com>
Signed-off-by: Lucas T <lucastbooms@gmail.com>
Contributor

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a new tutorial and accompanying scripts for setting up a Kubernetes cluster using kind with GPU support. The documentation is well-structured, and the scripts are a valuable addition for users looking to get started with kind on GPU-enabled machines. I've provided a few suggestions to enhance the robustness of the shell scripts and address some minor inconsistencies within the documentation. My key feedback points include:

  • Adding set -euo pipefail to the cluster installation script to ensure it exits safely on errors.
  • Correcting the use of sudo in the kind installation script for a user-level installation.
  • Improving the architecture detection logic to be more robust.
  • Fixing a broken link and an incorrect tool name in the tutorial's table of contents to avoid confusion.

Overall, this is a great contribution. Addressing these points will improve the user experience and make the scripts more reliable.

- [Steps](#steps)
- [Step 1: Installing kind](#step-1-installing-kind)
- [Step 2: Installing Helm](#step-2-installing-helm)
- [Step 3: Installing Minikube with GPU Support](#step-3-installing-minikube-with-gpu-support)
Contributor

high

The text and link for Step 3 in the Table of Contents incorrectly refer to "Minikube", but this tutorial is about "kind". This is confusing for the user and the link is broken because the corresponding header is for kind.

Please update the text and the anchor to correctly refer to "kind".

Suggested change
- - [Step 3: Installing Minikube with GPU Support](#step-3-installing-minikube-with-gpu-support)
+ - [Step 3: Installing kind with GPU Support](#step-3-installing-kind-with-gpu-support)
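
A mismatch between a table-of-contents link and its heading can be caught by deriving the anchor from the heading text. The sketch below only approximates GitHub's heading-to-anchor rule for plain ASCII headings (lowercase, drop punctuation, turn spaces into hyphens); the function name `heading_to_anchor` is hypothetical, and cases such as duplicate headings or non-ASCII text are not handled.

```bash
#!/bin/bash
# Approximation of GitHub's anchor generation for an ASCII heading.
# This is an assumption for illustration, not GitHub's exact algorithm:
# lowercase the text, strip punctuation, then replace spaces with hyphens.
heading_to_anchor() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9 _-]//g' -e 's/ /-/g'
}

# The heading text is taken from the suggested change.
anchor="#$(heading_to_anchor 'Step 3: Installing kind with GPU Support')"
echo "$anchor"
```

Comparing the printed anchor against the link target in the table of contents shows whether the two agree.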

@@ -0,0 +1,31 @@
#!/bin/bash
Contributor

high

This script performs several administrative actions using sudo. It's crucial to ensure that the script stops immediately if any command fails to avoid leaving the system in an inconsistent state. Please add set -euo pipefail at the beginning of the script for robustness.

  • set -e: Exit immediately if a command exits with a non-zero status.
  • set -u: Treat unset variables as an error when performing parameter expansion.
  • set -o pipefail: The return value of a pipeline is the status of the last (rightmost) command to exit with a non-zero status, or zero if every command in the pipeline succeeds.
Suggested change
- #!/bin/bash
+ #!/bin/bash
+ set -euo pipefail
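
The effect of each option can be observed in isolation. The following is a minimal sketch, not part of the PR: each case runs in a child `bash` so that a failure does not abort the demo itself, and the variable name `SOME_UNSET_VARIABLE_FOR_DEMO` is a made-up placeholder assumed to be unset.

```bash
#!/bin/bash
# Each case runs in a child shell; "|| status=$?" records its exit status
# without stopping this script.

# set -e: the child exits at the first failing command, so echo never runs.
errexit_status=0
bash -c 'set -e; false; echo "not reached"' || errexit_status=$?

# set -u: expanding an unset variable is treated as an error.
nounset_status=0
bash -c 'set -u; echo "$SOME_UNSET_VARIABLE_FOR_DEMO"' 2>/dev/null || nounset_status=$?

# Default pipeline behavior: the status comes from the last command (true).
default_pipe_status=0
bash -c 'false | true' || default_pipe_status=$?

# set -o pipefail: the failing command earlier in the pipeline is reported.
pipefail_status=0
bash -c 'set -o pipefail; false | true' || pipefail_status=$?

echo "errexit=$errexit_status nounset=$nounset_status" \
     "default_pipe=$default_pipe_status pipefail=$pipefail_status"
```

Only the default pipeline case reports success; the other three report a non-zero status, which is the behavior the review asks the cluster script to rely on.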

# For ARM64
[ $(uname -m) = aarch64 ] && curl -Lo kind https://kind.sigs.k8s.io/dl/v0.29.0/kind-linux-arm64
chmod +x kind
sudo mv kind "$KIND_PATH"
Contributor

high

The script uses sudo to move the kind binary to $HOME/bin/kind. The $HOME/bin directory is user-specific and should be owned by the user. Using sudo here is incorrect, can cause permission issues, and unnecessarily requires elevated privileges for what should be a user-level installation. Please remove sudo from this command.

Suggested change
- sudo mv kind "$KIND_PATH"
+ mv kind "$KIND_PATH"
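
A user-level install along these lines can be sketched as follows. This is an illustration under assumptions, not the PR's script: the helper name `install_user_binary` is hypothetical, a temporary directory stands in for `$HOME`, and a placeholder file stands in for the downloaded `kind` binary.

```bash
#!/bin/bash
set -euo pipefail

# Hypothetical helper: place a binary in a user-owned directory without sudo.
install_user_binary() {
  local src="$1" dest="$2"
  mkdir -p "$(dirname "$dest")"   # create e.g. ~/bin if it does not exist yet
  chmod +x "$src"
  mv "$src" "$dest"               # plain mv: the destination belongs to the user
}

# Demo in a throwaway directory that stands in for $HOME.
demo_home="$(mktemp -d)"
printf 'placeholder for the kind binary\n' > "$demo_home/kind"
KIND_PATH="$demo_home/bin/kind"

install_user_binary "$demo_home/kind" "$KIND_PATH"
[ -x "$KIND_PATH" ] && echo "installed at $KIND_PATH without sudo"
```

Because the destination directory is created and owned by the invoking user, the resulting file keeps that ownership, which avoids the permission issues described above.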

sudo usermod -aG docker $USER && newgrp docker
```

1. Execute the script [`utils/install-kind-cluster.sh`](../utils/install-kind-cluster.sh):
Contributor

medium

The instruction refers to the script as utils/install-kind-cluster.sh. However, the user is expected to be in the utils directory from Step 1. Referring to it as install-kind-cluster.sh would be less confusing and more consistent with the command block below.

Suggested change
- 1. Execute the script [`utils/install-kind-cluster.sh`](../utils/install-kind-cluster.sh):
+ 1. Execute the script [`install-kind-cluster.sh`](../utils/install-kind-cluster.sh):

Comment on lines 22 to 25
# Install kubectl (from tutorial https://kind.sigs.k8s.io/docs/user/quick-start/)
[ $(uname -m) = x86_64 ] && curl -Lo kind https://kind.sigs.k8s.io/dl/v0.29.0/kind-linux-amd64
# For ARM64
[ $(uname -m) = aarch64 ] && curl -Lo kind https://kind.sigs.k8s.io/dl/v0.29.0/kind-linux-arm64
Contributor

medium

The current architecture check works for x86_64 and aarch64, but it's not very robust as it will silently do nothing on other architectures. Using a case statement is cleaner, more readable, and allows for explicit error handling for unsupported architectures.

Additionally, there is a typo in the comment on line 22 (kubectl should be kind). I've corrected this in the suggestion.

Suggested change
- # Install kubectl (from tutorial https://kind.sigs.k8s.io/docs/user/quick-start/)
- [ $(uname -m) = x86_64 ] && curl -Lo kind https://kind.sigs.k8s.io/dl/v0.29.0/kind-linux-amd64
- # For ARM64
- [ $(uname -m) = aarch64 ] && curl -Lo kind https://kind.sigs.k8s.io/dl/v0.29.0/kind-linux-arm64
+ # Install kind (from tutorial https://kind.sigs.k8s.io/docs/user/quick-start/)
+ case "$(uname -m)" in
+   x86_64)
+     curl -Lo kind "https://kind.sigs.k8s.io/dl/v0.29.0/kind-linux-amd64"
+     ;;
+   aarch64)
+     curl -Lo kind "https://kind.sigs.k8s.io/dl/v0.29.0/kind-linux-arm64"
+     ;;
+   *)
+     echo "Unsupported architecture: $(uname -m)" >&2
+     exit 1
+     ;;
+ esac
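
One possible variation, shown here only as a sketch, is to separate the architecture-to-URL mapping from the download so the mapping can be checked without network access. The function name `kind_download_url` is hypothetical; the URLs are the ones quoted in the review, and the `curl` call is left commented out.

```bash
#!/bin/bash
# Hypothetical helper: map a `uname -m` value to the kind v0.29.0 Linux
# download URL used in the PR. Returns 1 for unsupported architectures.
kind_download_url() {
  local base="https://kind.sigs.k8s.io/dl/v0.29.0"
  case "$1" in
    x86_64)  echo "$base/kind-linux-amd64" ;;
    aarch64) echo "$base/kind-linux-arm64" ;;
    *)
      echo "Unsupported architecture: $1" >&2
      return 1
      ;;
  esac
}

# Resolve the URL for the current machine; the download is not performed here.
if url="$(kind_download_url "$(uname -m)")"; then
  echo "Would download: $url"
  # curl -Lo kind "$url"
fi
```

Calling `kind_download_url` with an unrecognized value prints the error to stderr and returns a non-zero status, so a caller running under `set -e` stops instead of continuing without a binary.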

Signed-off-by: Lucas T <lucastbooms@gmail.com>
Signed-off-by: Lucas T <lucastbooms@gmail.com>
Signed-off-by: Lucas T <lucastbooms@gmail.com>
@zerofishnoodles
Collaborator

Hi Lucas, this looks so good, thanks a lot for contributing. Could you take a look at the gemini code review result? Other than that, I think it is good to go.

Signed-off-by: Lucas T <lucastbooms@gmail.com>
Collaborator

@zerofishnoodles left a comment

@YuhanLiu11 merged commit e7dcdd5 into vllm-project:main on Jun 24, 2025
6 checks passed
Senne-Mennes pushed a commit to Senne-Mennes/production-stack that referenced this pull request Oct 22, 2025
* created two files for use in kind portion of 00 tutorial

Signed-off-by: Lucas T <lucastbooms@gmail.com>

* new branch due to commit issue

Signed-off-by: Lucas T <lucastbooms@gmail.com>

* added install-kind script

Signed-off-by: Lucas T <lucastbooms@gmail.com>

* created kind cluster script

Signed-off-by: Lucas T <lucastbooms@gmail.com>

* removed original kind md

Signed-off-by: Lucas T <lucastbooms@gmail.com>

* removed initial sh script

Signed-off-by: Lucas T <lucastbooms@gmail.com>

* whitespace issues

Signed-off-by: Lucas T <lucastbooms@gmail.com>

* additional formatting

Signed-off-by: Lucas T <lucastbooms@gmail.com>

* additional formatting -- will use pre-commit hooks in future

Signed-off-by: Lucas T <lucastbooms@gmail.com>

* gemini suggestions

Signed-off-by: Lucas T <lucastbooms@gmail.com>

---------

Signed-off-by: Lucas T <lucastbooms@gmail.com>
Co-authored-by: lucast2021 <lucast2021@headroyce.org>
Co-authored-by: Yuhan Liu <32589867+YuhanLiu11@users.noreply.github.com>
Signed-off-by: senne.mennes@capgemini.com <senne.mennes@capgemini.com>