
feat: docker gpu image CI builds#3103

Merged
ggerganov merged 1 commit into ggml-org:master from canardleteer:feat/docker-gpu-ci
Sep 14, 2023

Conversation

canardleteer (Contributor) commented Sep 9, 2023

Enables the GPU-enabled container images to be built and pushed alongside the CPU containers, freeing casual experimenters from having to build the GPU container images locally. This also addresses some of the CI concerns I raised in #1461 & #3044.

This doesn't validate the GPU-enabled binary in the container, only that the declarations in place to build the container and binary are functional, so it doesn't need any GPU infrastructure and can run as a GitHub Action. This generally normalizes the delivery of GPU containers to match the CPU-only ones.
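To make the shape of such a job concrete, here is a minimal sketch of a matrix build using docker/build-push-action. This is an illustration, not the exact workflow in this PR: the workflow name, Dockerfile paths, image name, and tag scheme are all assumptions.

```yaml
# Hypothetical sketch of a CI job that builds GPU images without GPU
# hardware; file paths, image names, and tags below are assumptions.
name: docker-gpu

on:
  push:
    branches: [master]

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        config:
          - { tag: "full-cuda", dockerfile: ".devops/full-cuda.Dockerfile" }
          - { tag: "full-rocm", dockerfile: ".devops/full-rocm.Dockerfile" }
    steps:
      - uses: actions/checkout@v3
      - uses: docker/setup-buildx-action@v2
      - uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      - uses: docker/build-push-action@v4
        with:
          file: ${{ matrix.config.dockerfile }}
          # linux/arm64 omitted: the CUDA build is prohibitively slow there
          platforms: linux/amd64
          # only push from master; forks validate with push: false
          push: ${{ github.ref == 'refs/heads/master' }}
          tags: example/llama.cpp:${{ matrix.config.tag }}
```

The key property is that the build step compiles the CUDA/ROCm binary inside the container without ever executing it, which is why no GPU runner is needed.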

As I'm not the maintainer of the primary repository, nor the owner of the DockerHub repository, I cannot run a full validation of these changes: the push only runs for master with @ggerganov's DockerHub credentials. I have a slightly different variation of this Action (with push: false) that confirms the changes generally. You can view that validation in my repository: Branch with similar change, Action Validation.

Not Addressed By This Pull Request:

  • Multiple {CUDA,ROCm} Library Version Support
  • Tailored GPU Architecture Support
  • Pipeline support for validating the binaries in the images work
    • This is true of the current CPU image as well.

Known Issues:

  • The linux/arm64 build for CUDA is very slow, though it hasn't timed out on me (yet). I don't know why, but I don't find the pipeline delay acceptable, so I have it disabled for now.

The value of opening up these builds and pushes:

  1. Making sure the Dockerfiles don't go out of date, and changes don't break builds.
  2. Containers tagged with a version:
    • Can now generally be used from any GPU Cloud provider without a consumer having to build & push their own.
      • This is a huge value proposition for project popularity & adoption by GPU Cloud users.
  3. Containers tagged with a commit hash / branch name from an MR (not done in this MR):
    • Generally opens doors for much more robust CI infrastructure in/on containers, which I'd love to help with, but don't have time to at the moment (but feel free to loop me into conversations).
    • CAN be made available for testing via GPU enabled k8s clusters via an API trigger.
    • Tests COULD be launched in these GPU enabled containers via an API call before a merge.
      • Depending on how the GPU Cluster Access infrastructure & Grants evolve.
    • CAN reduce the need for a VM with "always acquired" GPU infrastructure and/or maintenance of the GPU Docker runtimes to validate GPU builds.

I don't think I will have time to help set up the third item on this list, but that shouldn't stop us from gaining value from the first two. Most of the effort is setting up additional infrastructure for a third party (like myself) to validate the process; the code changes are just a matter of finessing the tags and triggers.
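As an illustration of the second item, once versioned GPU images are published, a GPU Cloud user could pull and run one directly instead of building locally. The image name, tag, and arguments below are hypothetical; the real repository and tag scheme depend on the maintainers' DockerHub setup.

```
# Hypothetical usage only: image name, tag, and model path are
# assumptions, not the project's actual publishing scheme.
docker run --gpus all \
  -v "$PWD/models":/models \
  example/llama.cpp:full-cuda-<version> \
  --run -m /models/<model-file> -p "Hello" -n 64
```

The `--gpus all` flag requires the NVIDIA Container Toolkit on the host, which GPU Cloud providers typically preinstall.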

Enables the GPU enabled container images to be built and pushed
alongside the CPU containers.
canardleteer (Contributor, Author) commented

Here's an example of how slow the Action is with linux/arm64: it takes about 1 hour and 22 minutes, which isn't acceptable (imo), and is why it's disabled.

ggerganov merged commit 980ab41 into ggml-org:master Sep 14, 2023
pkrmf pushed a commit to morlockstudios-com/llama.cpp that referenced this pull request Sep 26, 2023
Enables the GPU enabled container images to be built and pushed
alongside the CPU containers.

Co-authored-by: canardleteer <eris.has.a.dad+github@gmail.com>
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
Enables the GPU enabled container images to be built and pushed
alongside the CPU containers.

Co-authored-by: canardleteer <eris.has.a.dad+github@gmail.com>
phuongncn pushed a commit to phuongncn/llama.cpp-gx10-dgx-sparks-deepseekv4 that referenced this pull request Apr 28, 2026
Enables the GPU enabled container images to be built and pushed
alongside the CPU containers.

Co-authored-by: canardleteer <eris.has.a.dad+github@gmail.com>
