feat: dynamic model support via MODELS_JSON ConfigMap #612

jeremyeder · 2026-02-10T22:00:19Z

Summary

Add MODELS_JSON key to the existing operator-config ConfigMap (all overlays: kind, e2e, production, local-dev, minikube)
Backend parses MODELS_JSON env var and includes models array in GET /api/cluster-info response
Frontend reads models from useClusterInfo() hook and populates the session creation dropdown dynamically (falls back to hardcoded list if empty)
Operator passes MODELS_JSON through to runner pods as an env var
Runner builds VERTEX_MODEL_MAP from MODELS_JSON at startup (falls back to hardcoded map)

Adding or removing a model is now: edit ConfigMap → kubectl rollout restart → done. No code changes or image rebuilds needed.

Test plan

GET /api/cluster-info returns models array populated from ConfigMap
Frontend model dropdown shows models from API response
Default model selection respects default: true flag
Frontend falls back to hardcoded list when models is empty
Runner resolves Vertex model IDs from MODELS_JSON env var
Runner falls back to hardcoded map when MODELS_JSON is unset
Edit ConfigMap + rollout restart updates models without image rebuilds
go vet, gofmt, tsc --noEmit, next build all pass

🤖 Generated with Claude Code

Models were hardcoded in the frontend dropdown and runner's Vertex mapping, requiring code changes + image rebuilds to add a model. Now models are defined in the existing operator-config ConfigMap and flow through the stack: backend exposes them via /api/cluster-info, frontend populates the dropdown dynamically, operator passes them to runner pods, and the runner builds VERTEX_MODEL_MAP from the env var. Edit ConfigMap + rollout restart = models updated, no image rebuilds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

codecov · 2026-02-10T22:01:50Z

Codecov Report

✅ All modified and coverable lines are covered by tests.


github-actions · 2026-02-10T22:02:03Z

Claude Code Review

Summary

This PR introduces dynamic model configuration via a MODELS_JSON ConfigMap key, allowing operators to add/remove models without code changes or image rebuilds. The implementation spans backend (Go), frontend (TypeScript/React), and runner (Python), with graceful fallback to hardcoded defaults.

Overall Assessment: Strong implementation with good fallback patterns and type safety. A few areas need attention before merge.

Issues by Severity

🔴 Critical Issues

1. Backend: Untyped JSON parsing exposes type safety risks

Location: components/backend/handlers/projects.go:141-149
Issue: Parsing MODELS_JSON into []map[string]interface{} without validation
Risk: Malformed ConfigMap data could cause frontend crashes or unexpected behavior
Impact: Medium - graceful degradation exists, but could lead to confusing UX
Fix: Add struct type and validation:

type ModelInfo struct {
    Name        string `json:"name"`
    DisplayName string `json:"displayName"`
    VertexID    string `json:"vertexId,omitempty"`
    Default     bool   `json:"default,omitempty"`
}

var models []ModelInfo
if raw := os.Getenv("MODELS_JSON"); raw != "" {
    if err := json.Unmarshal([]byte(raw), &models); err != nil {
        log.Printf("Warning: failed to parse MODELS_JSON: %v", err)
        models = []ModelInfo{}
    }
}
// Validate required fields
for _, m := range models {
    if m.Name == "" || m.DisplayName == "" {
        log.Printf("Warning: invalid model entry (missing name or displayName)")
    }
}

2. Frontend: Defaulting model value may cause form inconsistency

Location: components/frontend/src/components/create-session-dialog.tsx:79-92
Issue: defaultModel is computed but not reactive to clusterModels changes
Risk: If useClusterInfo() loads after form initialization, default stays claude-sonnet-4-5 even if cluster default differs
Fix: Use useEffect to reset form when clusterModels loads:

useEffect(() => {
  if (clusterModels.length > 0) {
    const defaultModel = clusterModels.find((m) => m.default)?.name ?? "claude-sonnet-4-5";
    form.setValue("model", defaultModel);
  }
}, [clusterModels, form]);

🟡 Major Issues

3. Python: Broad exception catching masks errors

Location: components/runners/claude-code-runner/auth.py:65
Issue: except Exception: catches all errors, including KeyError, TypeError
Best Practice: Catch specific exceptions for better debugging
Fix:

try:
    models = _json.loads(raw)
    return {m["name"]: m["vertexId"] for m in models if m.get("vertexId")}
# AttributeError covers entries that are not objects (m.get fails on a str or int)
except (_json.JSONDecodeError, KeyError, TypeError, AttributeError) as e:
    logger.warning(f"Failed to parse MODELS_JSON: {e}")
    return dict(_HARDCODED_VERTEX_MAP)

4. Missing validation: Empty string handling

Location: Backend projects.go:142, Runner auth.py:59
Issue: Both check if raw != "" but do not guard against a whitespace-only value
Risk: Whitespace-only env vars pass the check but fail parsing
Fix: Add strings.TrimSpace() in Go, raw.strip() in Python
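A small Python sketch of the trimming fix, under the assumption that the caller treats `None` as "use the fallback". The helper name `parse_models` is hypothetical and does not appear in the PR.

```python
import json


def parse_models(raw):
    """Parse MODELS_JSON text; return None when the value is effectively empty or invalid."""
    # Strip first: a whitespace-only string is not equal to "" and would
    # otherwise reach json.loads, which raises JSONDecodeError on it.
    text = (raw or "").strip()
    if not text:
        return None
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return None
    # Only a JSON array is a usable model list.
    return parsed if isinstance(parsed, list) else None
```

With this shape, unset, empty, and whitespace-only values all take the same path, so the fallback logic only needs to handle one "no data" case.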

5. No default model validation in frontend

Location: create-session-dialog.tsx:79
Issue: If no model has default: true, hardcoded fallback may not exist in cluster
Risk: Form defaults to a model not available in the cluster
Fix:

const defaultModel = clusterModels.find((m) => m.default)?.name 
  ?? clusterModels[0]?.name  // Use first model if no default
  ?? "claude-sonnet-4-5";   // Final fallback

🔵 Minor Issues

6. Python: Dictionary comprehension could filter models more explicitly

Location: auth.py:64
Issue: if m.get("vertexId") silently skips models without vertexId
Suggestion: Log skipped models for operator visibility:

vertex_map = {}
for m in models:
    if "vertexId" not in m or not m["vertexId"]:
        logger.info(f"Skipping model without vertexId: {m.get('name', 'unknown')}")
    else:
        vertex_map[m["name"]] = m["vertexId"]
return vertex_map

7. ConfigMap comments could be more descriptive

Location: All operator-config.yaml files
Current: # Available models for the platform (consumed by backend, operator, runner)
Suggestion: Add example showing required/optional fields:

# Available models for the platform (consumed by backend, operator, runner)
# Format: [{"name":"...","displayName":"...","vertexId":"...","default":true}]
# Required: name, displayName | Optional: vertexId (for Vertex AI), default (boolean)
MODELS_JSON: '...'
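The required/optional split described in that comment could be checked with a small validator before a ConfigMap edit is applied. This is a sketch only: `validate_models_json` and `REQUIRED_FIELDS` are hypothetical names, and the type rules (strings for `name`, `displayName`, and `vertexId`, boolean for `default`) are inferred from the suggested comment rather than from a published schema.

```python
import json

REQUIRED_FIELDS = ("name", "displayName")


def validate_models_json(raw):
    """Return a list of problems found in a MODELS_JSON value; an empty list means none."""
    try:
        models = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(models, list):
        return ["top-level value must be a JSON array"]
    problems = []
    for i, m in enumerate(models):
        if not isinstance(m, dict):
            problems.append(f"entry {i}: must be an object")
            continue
        # Required fields must be present and non-empty strings.
        for field in REQUIRED_FIELDS:
            if not isinstance(m.get(field), str) or not m[field]:
                problems.append(f"entry {i}: missing or empty '{field}'")
        # Optional fields are only type-checked when present.
        if "vertexId" in m and not isinstance(m["vertexId"], str):
            problems.append(f"entry {i}: 'vertexId' must be a string")
        if "default" in m and not isinstance(m["default"], bool):
            problems.append(f"entry {i}: 'default' must be a boolean")
    return problems
```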

8. No error boundary for model dropdown rendering

Location: create-session-dialog.tsx:75-77
Issue: If clusterModels has malformed data, .map() could throw
Suggestion: Add validation before mapping:

const models = clusterModels.length > 0 && clusterModels.every(m => m.name && m.displayName)
  ? clusterModels.map((m) => ({ value: m.name, label: m.displayName }))
  : fallbackModels;

Positive Highlights

✅ Excellent fallback patterns - All three components gracefully degrade to hardcoded defaults
✅ Type safety in frontend - ModelInfo type properly defined in cluster.ts
✅ Consistent implementation - Backend, frontend, and runner all follow similar patterns
✅ Deployment coverage - ConfigMap updated across all overlays (kind, e2e, prod, minikube, local-dev)
✅ Non-breaking change - Existing deployments continue working with empty/missing MODELS_JSON
✅ Security conscious - ConfigMap marked optional: true in deployments
✅ Good separation of concerns - API layer (cluster.ts) separated from hook (use-cluster-info.ts)

Recommendations

Pre-Merge (Priority Order)

Add backend validation - Structured type for ModelInfo in Go (Critical Issue #1)
Fix frontend default model - useEffect to handle async cluster info load (Critical Issue #2)
Narrow Python exception handling - Catch specific exceptions (Major Issue #3)
Add whitespace validation - Trim env vars before parsing (Major Issue #4)
Improve default model logic - Use first model if no default flag (Major Issue #5)

Post-Merge Enhancements

Add E2E test validating model dropdown populates from ConfigMap
Add backend endpoint for validating MODELS_JSON schema (for operator tooling)
Document MODELS_JSON schema in components/manifests/README.md
Consider adding Prometheus metric for model selection distribution

Testing Checklist Progress

Based on PR description test plan:

⚠️ Partially addressed: Frontend default model (see Critical Issue #2)
⚠️ Missing: Validation that selected model exists in cluster after ConfigMap change
✅ Covered: Fallback behavior (code review confirms pattern)
⚠️ Recommend: Add test for malformed JSON in ConfigMap
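The missing check for a selected model that no longer exists after a ConfigMap change could look roughly like the sketch below. It is written in Python for illustration even though the dropdown lives in the TypeScript frontend; `pick_default` and `resolve_selection` are hypothetical names, and the default chain (flagged default, then first model, then `claude-sonnet-4-5`) follows the one proposed in Major Issue #5.

```python
def pick_default(models, fallback="claude-sonnet-4-5"):
    """Prefer the entry flagged default, then the first entry, then the hardcoded fallback."""
    for m in models:
        if m.get("default"):
            return m["name"]
    return models[0]["name"] if models else fallback


def resolve_selection(selected, models, fallback="claude-sonnet-4-5"):
    """Keep the current selection if the cluster still offers it, else pick a default."""
    if not models:
        # No cluster list available: the hardcoded list applies, so leave the choice alone.
        return selected or fallback
    names = {m["name"] for m in models}
    return selected if selected in names else pick_default(models, fallback)
```

A test for this path would edit the model list, re-run the resolution, and assert that a removed model is replaced by the cluster default rather than submitted as-is.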

Adherence to Project Standards

Backend (Go):

✅ Error handling patterns followed
⚠️ Type safety needs improvement (use structs, not map[string]interface{})
✅ No panic in production code
✅ Proper logging with context

Frontend (TypeScript/React):

✅ React Query patterns followed correctly
✅ Shadcn components used
✅ Type safety with ModelInfo type
✅ No any types
⚠️ Missing useEffect for reactive default handling

Security:

✅ No token leakage concerns
✅ Public endpoint (/cluster-info) appropriate for this data
✅ No RBAC bypass risks

Overall: 8/10 - Solid implementation, needs type safety and reactive default handling improvements before merge.


What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

CLAUDE.md - Master project instructions, development standards
backend-development.md - Go backend, K8s integration patterns
frontend-development.md - NextJS, Shadcn UI, React Query patterns
security-standards.md - Auth, RBAC, token handling
k8s-client-usage.md - User token vs service account patterns
error-handling.md - Consistent error patterns
react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.
