Add design proposal for K8s-aware vMCP with dynamic backend discovery #2884

jhrozek · 2025-12-03T22:23:34Z

The vMCP implementation currently has duplicated logic between the operator and vMCP server. The operator discovers backends, resolves ExternalAuthConfigs, and embeds backend configuration into a ConfigMap. Meanwhile, the vMCP server has its own discovery code but only loads config statically from the ConfigMap at startup.

This duplication risks inconsistency, requires pod restarts for backend changes to take effect, and has made the operator increasingly complex, with auth resolution logic scattered across components.

This proposal flips responsibilities: the operator becomes a "dumb" infrastructure manager handling only Deployment/Service/RBAC, while vMCP becomes "smart" by running controller-runtime with informers to watch MCPServer/ExternalAuthConfig/Secret resources directly.
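To illustrate what "dynamic" means here, below is a minimal, standard-library-only sketch of the idea: watch events update an in-memory backend registry, so changes take effect without a pod restart. It does not use controller-runtime, and every name in it (`Registry`, `Event`, `Backend`, `Apply`) is hypothetical, not taken from the vMCP code base. In the real design the events would presumably come from informers watching MCPServer resources rather than from a plain channel.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// EventType mirrors the kinds of notifications a watch would deliver.
// All types in this sketch are hypothetical.
type EventType int

const (
	Upsert EventType = iota // backend was added or modified
	Delete                  // backend was removed
)

// Backend is a simplified stand-in for a discovered backend.
type Backend struct {
	Name string
	URL  string
}

// Event is a single change notification.
type Event struct {
	Type    EventType
	Backend Backend
}

// Registry holds the current set of backends and is safe for concurrent use,
// so request handlers can read while the watcher applies updates.
type Registry struct {
	mu       sync.RWMutex
	backends map[string]Backend
}

func NewRegistry() *Registry {
	return &Registry{backends: make(map[string]Backend)}
}

// Apply updates the registry in response to one event.
func (r *Registry) Apply(ev Event) {
	r.mu.Lock()
	defer r.mu.Unlock()
	switch ev.Type {
	case Upsert:
		r.backends[ev.Backend.Name] = ev.Backend
	case Delete:
		delete(r.backends, ev.Backend.Name)
	}
}

// Names returns the known backend names in sorted order.
func (r *Registry) Names() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	names := make([]string, 0, len(r.backends))
	for name := range r.backends {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	reg := NewRegistry()
	events := make(chan Event)
	done := make(chan struct{})

	// Stand-in for the watch loop: apply events as they arrive.
	go func() {
		defer close(done)
		for ev := range events {
			reg.Apply(ev)
		}
	}()

	events <- Event{Type: Upsert, Backend: Backend{Name: "github", URL: "http://github-mcp:8080"}}
	events <- Event{Type: Upsert, Backend: Backend{Name: "jira", URL: "http://jira-mcp:8080"}}
	events <- Event{Type: Delete, Backend: Backend{Name: "github"}}
	close(events)
	<-done

	fmt.Println(reg.Names()) // prints [jira]
}
```

The read/write lock keeps the request path cheap: lookups take only a read lock, while the comparatively rare watch events take the write lock.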

Relates-to: #2855

codecov · 2025-12-03T22:29:51Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 56.41%. Comparing base (892043f) to head (485eeb9).
⚠️ Report is 12 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2884      +/-   ##
==========================================
- Coverage   56.59%   56.41%   -0.19%     
==========================================
  Files         322      322              
  Lines       31439    31643     +204     
==========================================
+ Hits        17794    17851      +57     
- Misses      12110    12254     +144     
- Partials     1535     1538       +3

JAORMX · 2025-12-04T16:53:29Z

docs/proposals/THV-2884-vmcp-k8s-aware-refactor.md

+changes, the informer notifies vMCP which re-resolves auth for affected
+backends. This provides a fast request path while handling secret rotation
+automatically. The existing code in `pkg/vmcp/workloads/k8s.go` already
+follows this pattern.


I feel shaky about this because the vMCP server is actually exposed to the internet and serves customer traffic. The operator having extra powers is not too bad, given that the operator only talks to the k8s API and neither serves traffic nor is exposed. But this flip can be tricky. I'd rather defer this and keep the vMCP server aware of secrets only via environment variables or files. Files might even be ideal, since the process can leverage fsnotify or a similar strategy to react to changes.
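For illustration, a minimal sketch of the file-based option, under stated assumptions: it uses only the Go standard library and polls the file, comparing content digests, instead of using fsnotify, so it is the "similar strategy" rather than fsnotify itself. The names `watchFile` and `changed` and the file name `token` are hypothetical. Comparing content rather than relying on modification events also tolerates files that are replaced by rename, which is how Kubernetes updates mounted Secret volumes.

```go
package main

import (
	"context"
	"crypto/sha256"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// changed returns the digest of content and whether it differs from prev.
func changed(prev [32]byte, content []byte) ([32]byte, bool) {
	sum := sha256.Sum256(content)
	return sum, sum != prev
}

// watchFile loads path once, then polls it every interval and calls onChange
// with the new content whenever the content differs from the last seen value.
// It returns when ctx is cancelled or the initial read fails.
func watchFile(ctx context.Context, path string, interval time.Duration, onChange func([]byte)) error {
	content, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	digest := sha256.Sum256(content)
	onChange(content) // initial load

	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			content, err := os.ReadFile(path)
			if err != nil {
				continue // transient read error: keep the last good value
			}
			next, differs := changed(digest, content)
			if differs {
				digest = next
				onChange(content)
			}
		}
	}
}

func main() {
	dir, err := os.MkdirTemp("", "secret-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "token")
	if err := os.WriteFile(path, []byte("v1"), 0o600); err != nil {
		panic(err)
	}

	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	updates := make(chan string, 4)
	go watchFile(ctx, path, 20*time.Millisecond, func(b []byte) {
		updates <- string(b)
	})
	fmt.Println("loaded:", <-updates)

	// Rotate the secret by writing a new file and renaming it over the old
	// one, so the watcher never observes a half-written file.
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, []byte("v2"), 0o600); err != nil {
		panic(err)
	}
	if err := os.Rename(tmp, path); err != nil {
		panic(err)
	}
	fmt.Println("reloaded:", <-updates)
}
```

With this approach the vMCP process needs no Kubernetes API permissions to pick up rotated secrets; the trade-off is a reload delay of up to one polling interval.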

JAORMX · 2025-12-04T16:54:33Z

docs/proposals/THV-2884-vmcp-k8s-aware-refactor.md

+CLI mode should remain static with one-time discovery at startup. Local
+development sessions are typically short-lived, and adding dynamic discovery
+to CLI would add complexity for limited benefit. This can be revisited as a
+separate work item if there's demand.


Is this section necessary? If the refactor of vMCP leads to a dynamic CLI mode, that would be a win. Let's not add constraints where they're not needed and work on a best-effort approach instead.
