
feat: Add support for multi-document YAML in InferenceService creation#169

Open
Prateekbala wants to merge 1 commit into kserve:master from Prateekbala:feat/multi-doc-YAML

Conversation


@Prateekbala Prateekbala commented Mar 13, 2026

Description

This PR implements support for multi-document YAML when creating InferenceServices, enabling users to define an InferenceService alongside its associated TrainedModel resources in a single multi-document YAML file and deploy everything in one go.

Problem

Previously, users could only submit a single InferenceService resource at a time. When using Multi-Model Serving with Triton, this required multiple separate deployments, making it inconvenient to manage related resources together. Users wanted the ability to define everything in a single multi-document YAML file (similar to kubectl apply -f) and deploy all resources in one operation.

Solution

Technical Implementation:

  • Replaced single-document YAML parsing with multi-document parsing using loadAll() to handle multiple K8s resources from a single YAML file
  • Implemented resource type routing to validate and segregate InferenceService and TrainedModel documents during parsing
  • Added batched resource creation using RxJS forkJoin() to deploy all resources in parallel for improved performance
  • Introduced granular validation logic for each resource type with specific error messages and field requirements
  • Added TrainedModel GVK definition in backend to enable TrainedModel resource creation
  • Implemented new API endpoint to handle TrainedModel POST requests
  • Refactored notification system with reusable error/success handlers for flexible user feedback
  • Enforced business logic: exactly one InferenceService (required) with zero or more TrainedModels (optional)
  • Implemented namespace propagation to all resources at deployment time
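The parse-and-route flow described above can be sketched roughly as follows. This is a minimal, dependency-free sketch operating on already-parsed documents (in the PR itself the document array comes from js-yaml's loadAll(), and creation is then batched with RxJS forkJoin()); the type and function names here are illustrative, not the PR's exact code:

```typescript
// Minimal shapes for parsed Kubernetes documents (illustrative only).
interface K8sDoc {
  kind?: string;
  metadata?: { name?: string; namespace?: string };
}

interface RoutedResources {
  inferenceServices: K8sDoc[];
  trainedModels: K8sDoc[];
  errors: string[];
}

// Route each parsed document by kind, rejecting unsupported kinds.
// In the PR, `docs` would come from js-yaml's loadAll(yamlText).
function routeResources(docs: K8sDoc[], namespace: string): RoutedResources {
  const out: RoutedResources = {
    inferenceServices: [],
    trainedModels: [],
    errors: [],
  };
  for (const doc of docs) {
    // Namespace propagation: stamp the target namespace on every resource.
    doc.metadata = { ...doc.metadata, namespace };
    if (doc.kind === 'InferenceService') {
      out.inferenceServices.push(doc);
    } else if (doc.kind === 'TrainedModel') {
      out.trainedModels.push(doc);
    } else {
      out.errors.push(
        `Unsupported resource kind: "${doc.kind ?? 'unknown'}". ` +
          'Only InferenceService and TrainedModel are supported.',
      );
    }
  }
  return out;
}
```

Segregating by kind up front lets later steps apply per-type validation before issuing the batched POST requests.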

Closes #147

Usage Example

Users can now create a multi-document YAML:

---
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: my-model
spec:
  predictor:
    model:
      modelFormat:
        name: triton
      storageUri: gs://bucket/triton-model
---
apiVersion: serving.kserve.io/v1alpha1
kind: TrainedModel
metadata:
  name: model-variant-1
spec:
  inferenceService: my-model
  model:
    framework: triton
    storageUri: gs://bucket/model-variant-1
    memory: "1Gi"
---
apiVersion: serving.kserve.io/v1alpha1
kind: TrainedModel
metadata:
  name: model-variant-2
spec:
  inferenceService: my-model
  model:
    framework: triton
    storageUri: gs://bucket/model-variant-2
    memory: "1Gi"
    

Signed-off-by: Prateek Bala <prateekbala28@gmail.com>
@Prateekbala (Contributor, Author)

I've also added Cypress and Jest tests for this implementation.

@juliusvonkohout (Contributor)

@LogicalGuy77

Comment on lines +81 to +93
if (kind === 'InferenceService') {
  validationErrors.push(...this.validateInferenceService(resource));
  inferenceServices.push(resource as InferenceServiceK8s);
} else if (kind === 'TrainedModel') {
  validationErrors.push(...this.validateTrainedModel(resource));
  trainedModels.push(resource as TrainedModelK8s);
} else {
  validationErrors.push(
    `Unsupported resource kind: "${
      kind || 'unknown'
    }". Only InferenceService and TrainedModel are supported.`,
  );
}

The PR title is a little misleading. This is only allowing support for submitting InferenceServices and TrainedModels; I'm not sure we would want to do just that if we allow multi-document YAML.

Comment on lines +102 to +105
if (inferenceServices.length > 1) {
  validationErrors.push(
    'Only one InferenceService document is allowed per submission.',
  );

Why the limit to only one InferenceService?

@Griffin-Sullivan (Contributor)

Not super familiar with the multiple TrainedModel CRs mapping to a single InferenceService. This is quite an old feature for Kserve, so I'm ok with adding the support here. I'd just like to get more description on this PR of why we are doing this and specifics of what we are supporting (ex: 1 isvc and multiple trainedmodels in one go, detailing that only the two CRDs can be submitted like this, plans for supporting more in the future?, etc)

@Prateekbala (Contributor, Author) commented Mar 16, 2026

> Not super familiar with the multiple TrainedModel CRs mapping to a single InferenceService. This is quite an old feature for Kserve, so I'm ok with adding the support here. I'd just like to get more description on this PR of why we are doing this and specifics of what we are supporting (ex: 1 isvc and multiple trainedmodels in one go, detailing that only the two CRDs can be submitted like this, plans for supporting more in the future?, etc)

Thanks for the review! Let me clarify the scope and design thinking here.

  1. What We're Supporting

- Current scope: exactly 1 InferenceService + 0 or more TrainedModels in a single submission.

The validation is intentional: in the multi-model serving pattern, one Triton backend (1 ISVC) serves multiple models (N TrainedModels). Multiple InferenceServices in one submission would represent independent serving endpoints, which should be deployed separately.

Only InferenceService and TrainedModel CRDs are accepted; other resource types are rejected with a clear error message.
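That "exactly one ISVC, N TrainedModels" rule can be expressed as a small standalone check (a sketch; the function name and the first message are illustrative, though the "Only one InferenceService…" message mirrors the snippet quoted in review):

```typescript
// Enforce the submission shape: exactly one InferenceService is required;
// TrainedModels are optional, so their count needs no check.
function validateInferenceServiceCount(isvcCount: number): string[] {
  if (isvcCount === 0) {
    return ['Exactly one InferenceService document is required.'];
  }
  if (isvcCount > 1) {
    return ['Only one InferenceService document is allowed per submission.'];
  }
  return [];
}
```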

  2. On Future Extensibility

I kept this scoped narrowly to solve the immediate need, but I'm curious about the right approach for the future. A few questions for your input:

  1. Should we expand CRD support as use cases emerge?

  2. What's the best way to handle this without creating tight coupling? Currently, validation is in the frontend. Should we consider:

    • Backend-driven validation (backend defines what CRDs/patterns it supports)?
    • A plugin-style validator pattern?
    • Keep it explicit and add new handlers case-by-case?
  3. Are there other multi-resource deployment patterns you'd want to support that we should design for now?
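For what it's worth, the plugin-style option in point 2 could look roughly like this (purely illustrative; none of these names exist in the PR). Each supported kind registers its own validator, so supporting a new CRD means adding one map entry instead of extending an if/else chain:

```typescript
// Minimal document shape for this sketch (illustrative only).
type K8sDoc = { kind?: string; spec?: Record<string, unknown> };
type Validator = (doc: K8sDoc) => string[];

// Registry of per-kind validators; field requirements here are assumptions.
const validators = new Map<string, Validator>([
  [
    'InferenceService',
    (doc) =>
      doc.spec?.predictor ? [] : ['InferenceService requires spec.predictor.'],
  ],
  [
    'TrainedModel',
    (doc) =>
      doc.spec?.inferenceService
        ? []
        : ['TrainedModel requires spec.inferenceService.'],
  ],
]);

// Look up the validator for a document's kind, rejecting unknown kinds.
function validateDoc(doc: K8sDoc): string[] {
  const validator = validators.get(doc.kind ?? '');
  if (!validator) {
    return [`Unsupported resource kind: "${doc.kind ?? 'unknown'}".`];
  }
  return validator(doc);
}
```

Backend-driven validation would move this registry server-side, with the frontend asking the backend which kinds it accepts.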

Looking forward to your thoughts!

@Griffin-Sullivan (Contributor) commented Mar 18, 2026

I'd encourage a look at the KServe docs https://kserve.github.io/website/docs/intro. I've never used TrainedModel before so I'm not familiar with who is using it and for what use cases especially in the UI.

For context, you'll probably find in the docs that KServe supports a lot of different deployment modes, some of which involve multiple resources at deployment time. I think it might be better to approach this feature from a more general point of view to support how flexible KServe is. This also touches on what this project should do in the long term: should we have opinionated deployments? Should we support everything KServe has? These are good questions to answer in a proposal about multi-document deployments. I'm not against this feature, but it brings tech debt and maintenance burden that we need to weigh against doing a larger, less TrainedModel-specific feature from the start.

@Prateekbala (Contributor, Author)

You're absolutely right. I agree this should have a more general approach rather than being TrainedModel-specific. The current implementation will create technical debt for maintainers. Let me come up with a better design that supports KServe's flexibility more broadly.

Thanks for your time!



Development

Successfully merging this pull request may close these issues.

Support for multi-document YAML when creating an InferenceService

3 participants