health check for AIServiceBackend #587

@kerthcet

Description:


apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: AIServiceBackend
metadata:
  name: qwen2--5-coder
  namespace: default
spec:
  timeouts:
    request: 3m
  schema:
    name: OpenAI
  backendRef:
    name: qwen2--5-coder-lb
    kind: Service
    port: 8080

Here is an AIServiceBackend we defined in the cluster; it serves the model qwen2.5-coder. However, even when the referenced Service does not exist or is not ready, the model still appears when we list models via v1/models/. The response looks like this:

{
  "data": [
    {
      "id": "qwen2-0.5b",
      "created": 1745481246,
      "object": "model",
      "owned_by": "Envoy AI Gateway"
    },
    {
      "id": "qwen2.5-coder",
      "created": 1745481246,
      "object": "model",
      "owned_by": "Envoy AI Gateway"
    }
  ],
  "object": "list"
}

As a result, our chatbot can still select the model, but every request to it fails. My question: can the model be left out of the model list until its Service is ready?

Thanks.
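
To make the request concrete, here is a rough sketch of the behavior I have in mind, written in Python only for illustration. It is not how the gateway builds the list today. The function name `filter_ready_models`, the model-to-backend mapping, the backend name `qwen2-0--5b`, and the `is_backend_ready` callback are all hypothetical; a real readiness check would probably look at the Service's endpoints.

```python
from typing import Callable, Dict


def filter_ready_models(
    model_list: Dict,
    model_to_backend: Dict[str, str],
    is_backend_ready: Callable[[str], bool],
) -> Dict:
    """Return a copy of an OpenAI-style model list that keeps only the
    models whose backend is known and reported ready."""
    ready = [
        m
        for m in model_list.get("data", [])
        if m["id"] in model_to_backend
        and is_backend_ready(model_to_backend[m["id"]])
    ]
    # Keep every other top-level field (e.g. "object") unchanged.
    return {**model_list, "data": ready}


# The response from the issue, as the gateway returns it today.
models = {
    "data": [
        {"id": "qwen2-0.5b", "created": 1745481246, "object": "model",
         "owned_by": "Envoy AI Gateway"},
        {"id": "qwen2.5-coder", "created": 1745481246, "object": "model",
         "owned_by": "Envoy AI Gateway"},
    ],
    "object": "list",
}

# Hypothetical mapping from model id to AIServiceBackend name.
# "qwen2-0--5b" is an assumed name; only "qwen2--5-coder" appears above.
backends = {"qwen2-0.5b": "qwen2-0--5b", "qwen2.5-coder": "qwen2--5-coder"}

# Assumed readiness state: the Service behind qwen2--5-coder is not ready.
ready_backends = {"qwen2-0--5b"}

filtered = filter_ready_models(models, backends, lambda b: b in ready_backends)
print([m["id"] for m in filtered["data"]])
```

With the assumed readiness state, only qwen2-0.5b would remain in the list, so the chatbot would never offer a model that cannot serve requests.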


Labels: enhancement