Skip to content

[GLUTEN-9243][VL] Fix cuda docker image#9333

Merged
philo-he merged 6 commits intoapache:mainfrom
zhouyuan:wip_fix_cuda_image
Apr 18, 2025
Merged

[GLUTEN-9243][VL] Fix cuda docker image#9333
philo-he merged 6 commits intoapache:mainfrom
zhouyuan:wip_fix_cuda_image

Conversation

@zhouyuan
Copy link
Copy Markdown
Member

@zhouyuan zhouyuan commented Apr 15, 2025

What changes were proposed in this pull request?

The image ghcr.io/facebookincubator/velox-dev:adapters is not avaiable on ARM
followup on #9229

Signed-off-by: Yuan Zhou yuan.zhou@ibm.com

How was this patch tested?

pass GHA

The image `ghcr.io/facebookincubator/velox-dev:adapters` is not avaiable on ARM

Signed-off-by: Yuan Zhou <yuan.zhou@ibm.com>
Signed-off-by: Yuan Zhou <yuan.zhou@ibm.com>
Signed-off-by: Yuan Zhou <yuan.zhou@ibm.com>
@github-actions github-actions bot added the INFRA label Apr 15, 2025
@github-actions
Copy link
Copy Markdown

#9243

This reverts commit 3966966.
@zhouyuan zhouyuan marked this pull request as ready for review April 15, 2025 14:29
Signed-off-by: Yuan Zhou <yuan.zhou@ibm.com>
@github-actions github-actions bot added the BUILD label Apr 15, 2025
Copy link
Copy Markdown
Member

@philo-he philo-he left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@philo-he philo-he changed the title [GLUTEN-9243][VL] fix cuda image [GLUTEN-9243][VL] Fix cuda docker image Apr 15, 2025
Signed-off-by: Yuan Zhou <yuan.zhou@ibm.com>
@github-actions github-actions bot added the VELOX label Apr 16, 2025
if [ $ENABLE_GPU == "ON" ]; then
COMPILE_OPTION="$COMPILE_OPTION -DVELOX_ENABLE_GPU=ON -DVELOX_ENABLE_CUDF=ON"
# the cuda default options are for Centos9 image from Meta
COMPILE_OPTION="$COMPILE_OPTION -DVELOX_ENABLE_GPU=ON -DVELOX_ENABLE_CUDF=ON -DCMAKE_CUDA_ARCHITECTURES=70 -DCMAKE_CUDA_COMPILER=/usr/local/cuda-12.8/bin/nvcc"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use the ENV here but without CUDA_COMPILER https://github.com/apache/incubator-gluten/pull/9333/files#diff-395a263b6f69bd7ef4991face3666e100acdd60e47f4b28e0737adbbf5fb945aR12,is all the environment CUDA COMPILER this path? Do we need to make it an environment variable in docker image?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my tests it will report cuda identifier is undefined in device code if not setting these two variables. Note the GHA runner does not have NV GPU installed.

CUDA_ARCHITECTURESis able to read from CMAKE_CUDA_ARCHITECTURES
https://cmake.org/cmake/help/v3.28/prop_tgt/CUDA_ARCHITECTURES.html
however it seems there's no similar variable like CUDA_COMPILERS

with:
images: ${{ env.DOCKERHUB_REPO }}
tags: centos-8-jdk8
tags: vcpkg-centos-8
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the tag is changed, some docker images in docker hub maybe duplicated. I don't suggest to change the tag, the user may face compatible issue.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jinchengchenghh
Copy link
Copy Markdown
Contributor

Thanks for your fix.

@philo-he philo-he merged commit 6dd3899 into apache:main Apr 18, 2025
45 checks passed
@zhouyuan
Copy link
Copy Markdown
Member Author

The fix does not work on GHA due to:

  • meta/velox missed one commit to allow skip building tests on GPU enabled
  • GHA runner disk space is not enough on building with meta centos9 adaptors
    Will try to make a fix again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants