[Bug] Allow zero replica for workers for Helm #968
kevin85421 merged 2 commits into ray-project:master
kevin85421
left a comment
Thank you for the contribution! For your use case, is there any difference between disabled: true and replicas: 0?
as I understand,
I think setting
@kevin85421 Could you review this?
kevin85421
left a comment
Test this PR manually using this gist.
# Step 0: Replace values.yaml with the gist
# (path: helm-chart/ray-cluster)
helm install ray-cluster .
# Step 1: Try to scale up the cluster
export HEAD_POD=$(kubectl get pods -o custom-columns=POD:metadata.name | grep raycluster-autoscaler-head)
kubectl exec $HEAD_POD -it -c ray-head -- python -c "import ray;ray.init();ray.autoscaler.sdk.request_resources(num_cpus=4)"
# Step 2: The RayCluster will scale from 0 worker to 3 workers.
@kevin85421 is this available on 0.5.2?
I see it's only on 0.6.0. Is it stable or still WIP?
Allow zero replica for workers for Helm
Why are these changes needed?
We currently use Ray for compute-heavy tasks on GKE. On initialization, the cluster spawns one worker in each worker group, which triggers GKE to scale up a node. That costs money even when no work is running.
This happens because of the ternary function in the template file, which treats 0 as unset and falls back to 1:
{{ 0 | 1 }} = 1
kuberay/helm-chart/ray-cluster/templates/raycluster-cluster.yaml, Line 91 in 87dde22
kuberay/helm-chart/ray-cluster/templates/raycluster-cluster.yaml, Line 157 in 87dde22
This PR works around it by setting the default replica count to zero.
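The pitfall described above can be reproduced outside Helm with a minimal Go sketch. It assumes the template line uses a Sprig-style default pipeline (the description only shows {{ 0 | 1 }} = 1, so this is a guess at its shape). The helpers defaultFunc and hasKey are hypothetical stand-ins written for this example, not the chart's code, and the presence-check variant is one possible alternative rather than the change made in this PR.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
	"text/template"
)

// defaultFunc is a hypothetical stand-in for a Sprig-style `default`:
// it returns the fallback when the given value is missing or a zero value.
func defaultFunc(fallback, given any) any {
	if given == nil || reflect.ValueOf(given).IsZero() {
		return fallback
	}
	return given
}

// hasKey is a hypothetical stand-in for a Sprig-style `hasKey`:
// it reports whether the key is present, regardless of its value.
func hasKey(m map[string]any, key string) bool {
	_, ok := m[key]
	return ok
}

// render executes a template body against values and returns the output.
func render(body string, values map[string]any) string {
	funcs := template.FuncMap{"default": defaultFunc, "hasKey": hasKey}
	tmpl := template.Must(template.New("t").Funcs(funcs).Parse(body))
	var out strings.Builder
	if err := tmpl.Execute(&out, values); err != nil {
		panic(err)
	}
	return out.String()
}

const (
	// Falsy-zero default: an explicit 0 is replaced by the fallback.
	withDefault = `replicas: {{ .replicas | default 1 }}`
	// Presence check: only a missing key falls back to 1.
	withPresence = `replicas: {{ if hasKey . "replicas" }}{{ .replicas }}{{ else }}1{{ end }}`
)

func main() {
	zero := map[string]any{"replicas": 0}
	unset := map[string]any{}

	fmt.Println(render(withDefault, zero))   // replicas: 1 (the explicit 0 is lost)
	fmt.Println(render(withPresence, zero))  // replicas: 0
	fmt.Println(render(withDefault, unset))  // replicas: 1
	fmt.Println(render(withPresence, unset)) // replicas: 1
}
```

With a zero-treating default, a user who sets replicas: 0 still gets one worker per group, which matches the behavior reported here. Changing the fallback to 0, as this PR does, avoids that without needing a presence check.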
Related issue number
Open #965
Checks