Open
Labels
enhancement: New feature or request
Description
Describe the desired behavior, what scenario it enables and how it
would be used.
apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: AIServiceBackend
metadata:
  name: qwen2--5-coder
  namespace: default
spec:
  timeouts:
    request: 3m
  schema:
    name: OpenAI
  backendRef:
    name: qwen2--5-coder-lb
    kind: Service
    port: 8080
Here is one service backend defined in our cluster; it serves the model qwen2.5-coder. However, even when the backing Service does not exist or is not ready, the model still appears when we query the model list at v1/models/. The results look like this:
{
  "data": [
    {
      "id": "qwen2-0.5b",
      "created": 1745481246,
      "object": "model",
      "owned_by": "Envoy AI Gateway"
    },
    {
      "id": "qwen2.5-coder",
      "created": 1745481246,
      "object": "model",
      "owned_by": "Envoy AI Gateway"
    }
  ],
  "object": "list"
}
As a result, our chatbot still lists the model and lets users query it, but every request to it fails. My question: can the model be left out of the model list until its service is ready?
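To make the requested behavior concrete, here is a minimal Python sketch of the filtering I have in mind. It is not the gateway's actual implementation. The function name `filter_ready_models` and the `ready_model_ids` argument are hypothetical, and how readiness would be determined (for example, from the endpoints of the referenced Service) is an assumption left to the caller.

```python
from typing import Any, Dict, Iterable


def filter_ready_models(
    models_response: Dict[str, Any], ready_model_ids: Iterable[str]
) -> Dict[str, Any]:
    """Return a copy of a v1/models-style response that keeps only ready models.

    `ready_model_ids` is a hypothetical input: the ids of models whose
    backing service is currently ready. The original response is not mutated.
    """
    ready = set(ready_model_ids)
    return {
        **models_response,
        "data": [m for m in models_response.get("data", []) if m.get("id") in ready],
    }


# Sample response taken from the output shown above.
models_response = {
    "data": [
        {"id": "qwen2-0.5b", "created": 1745481246, "object": "model",
         "owned_by": "Envoy AI Gateway"},
        {"id": "qwen2.5-coder", "created": 1745481246, "object": "model",
         "owned_by": "Envoy AI Gateway"},
    ],
    "object": "list",
}

# Assume (hypothetically) that only the qwen2-0.5b backend is ready.
filtered = filter_ready_models(models_response, {"qwen2-0.5b"})
print([m["id"] for m in filtered["data"]])  # ['qwen2-0.5b']
```

With this kind of filtering, a chatbot that builds its model picker from v1/models/ would only offer models that can actually serve requests.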
Thanks.
[optional Relevant Links:]
Any extra documentation required to understand the issue.