This repository was archived by the owner on May 6, 2026. It is now read-only.

Kuberay#162

Merged
aojea merged 1 commit into google:main from aojea:kuberay
Jul 15, 2025

Conversation

@aojea
Contributor

@aojea aojea commented Jul 14, 2025

Failing because of a PyTorch incompatibility:

ebb27178e50d10b97086d35a/virtualenv/lib/python3.9/site-packages/torch/cuda/__init__.py:287: UserWarning:
(Worker pid=4560, ip=10.48.4.6) NVIDIA B200 with CUDA capability sm_100 is not compatible with the current PyTorch installation.

@andrewsykim

cc @chiayi to help with dependency issues when using ARM

Comment thread site/content/docs/user/kuberay.md
Comment thread site/content/docs/user/kuberay.md Outdated
@andrewsykim

Could we also add an example using A3 Ultra? Do we suspect the example to be more or less the same?

@aojea
Contributor Author

aojea commented Jul 14, 2025

> Could we also add an example using A3 Ultra? Do we suspect the example to be more or less the same?

Exactly the same: you just need to define the DeviceClass and the ResourceClaimTemplate and reference the template from the workload; the rest is done by the system.
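
A minimal sketch of that wiring, not the manifests from this PR. It assumes the `resource.k8s.io/v1beta1` DRA API shape; every object name, the CEL selector, and the container image are hypothetical placeholders. It builds the three pieces as plain Python dicts so the references between them are easy to see:

```python
import json

# All names below are hypothetical placeholders, not values from this PR.
DEVICE_CLASS = "example-device-class"
CLAIM_TEMPLATE = "example-claim-template"
CLAIM_NAME = "example-claim"


def device_class(name, cel_expression):
    """DeviceClass: selects devices with a CEL expression (v1beta1 shape assumed)."""
    return {
        "apiVersion": "resource.k8s.io/v1beta1",
        "kind": "DeviceClass",
        "metadata": {"name": name},
        "spec": {"selectors": [{"cel": {"expression": cel_expression}}]},
    }


def resource_claim_template(name, device_class_name):
    """ResourceClaimTemplate: requests one device from the given DeviceClass."""
    return {
        "apiVersion": "resource.k8s.io/v1beta1",
        "kind": "ResourceClaimTemplate",
        "metadata": {"name": name},
        "spec": {
            "spec": {
                "devices": {
                    "requests": [
                        {"name": "req-0", "deviceClassName": device_class_name}
                    ]
                }
            }
        },
    }


def pod_spec_with_claim(claim_name, template_name, image):
    """Pod spec fragment: declares the claim and references it from the container."""
    return {
        "resourceClaims": [
            {"name": claim_name, "resourceClaimTemplateName": template_name}
        ],
        "containers": [
            {
                "name": "worker",
                "image": image,
                "resources": {"claims": [{"name": claim_name}]},
            }
        ],
    }


if __name__ == "__main__":
    manifests = [
        # The selector expression is an assumed example, not a real driver name.
        device_class(DEVICE_CLASS, 'device.driver == "example.driver.io"'),
        resource_claim_template(CLAIM_TEMPLATE, DEVICE_CLASS),
        pod_spec_with_claim(CLAIM_NAME, CLAIM_TEMPLATE, "example.registry/worker:latest"),
    ]
    print(json.dumps(manifests, indent=2))
```

Under these assumptions, switching to a different machine type would only change the DeviceClass selector; the claim template and the workload reference keep the same shape.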

Change-Id: I5a2eccbbbb9b74fc4b3dc4a01bf7c07dab21b1fa
@aojea aojea changed the title from "[WIP] Kuberay" to "Kuberay" Jul 15, 2025
@aojea aojea merged commit 1322cc2 into google:main Jul 15, 2025
5 checks passed
@aojea aojea mentioned this pull request Jul 15, 2025
13 tasks