fix(router): enforce session-to-route binding validation#324
fix(router): enforce session-to-route binding validation#324KinshukSS2 wants to merge 7 commits into
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Code Review
This pull request implements session target validation to prevent incorrect session reuse across different workloads. It adds WorkloadKind and WorkloadName metadata to the sandbox information and introduces a check to ensure the retrieved sandbox matches the requested target. Critical feedback was provided regarding the createSandbox implementation, where overwriting the ephemeral sandbox name and resource kind causes observability regressions and breaks JWT authentication. Additionally, it is recommended to use warning-level logging for session mismatches to improve security visibility.
| klog.V(2).InfoS("Session target metadata missing", "sessionID", sessionID, "requestedNamespace", namespace, "requestedName", name, "requestedKind", kind, "missingFields", strings.Join(missingFields, ",")) | ||
| return false | ||
| } | ||
| if sandbox.WorkloadKind != kind || sandbox.WorkloadName != name || sandbox.SandboxNamespace != namespace { | ||
| klog.V(2).InfoS("Session target mismatch", "sessionID", sessionID, "requestedNamespace", namespace, "requestedName", name, "requestedKind", kind, "actualNamespace", sandbox.SandboxNamespace, "actualWorkloadName", sandbox.WorkloadName, "actualWorkloadKind", sandbox.WorkloadKind) |
There was a problem hiding this comment.
There was a problem hiding this comment.
Pull request overview
This PR hardens the Router session reuse flow by binding a session ID to the specific route identity (namespace/name/kind) it was originally created for, preventing replay of a valid session against a different workload route.
Changes:
- Persisted route identity on sessions by adding
WorkloadKind/WorkloadNametotypes.SandboxInfoand populating them from Workload Manager. - Enforced strict session-to-route validation in
SessionManager.GetSandboxBySession, returning a 409 Conflict on mismatch. - Added a dedicated API error/predicate for mismatch and surfaced
SESSION_TARGET_MISMATCHin Router JSON error responses; expanded unit tests accordingly.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/workloadmanager/workload_builder.go | Populates workload identity fields on sandboxEntry during sandbox creation. |
| pkg/workloadmanager/k8s_client.go | Extends sandboxEntry to carry workload identity for later propagation. |
| pkg/workloadmanager/sandbox_helper.go | Persists WorkloadKind/WorkloadName into stored SandboxInfo for session reuse validation. |
| pkg/workloadmanager/sandbox_helper_test.go | Updates tests to assert workload identity propagation. |
| pkg/common/types/sandbox.go | Adds WorkloadKind/WorkloadName fields to the persisted session/sandbox metadata model. |
| pkg/router/session_manager.go | Adds strict session target validation and constructs session metadata on create path. |
| pkg/router/session_manager_test.go | Adds mismatch/match/fail-closed unit tests; refactors mock handler to reduce complexity. |
| pkg/api/errors.go | Adds mismatch conflict error constructor and predicate helper. |
| pkg/api/errors_test.go | Adds coverage for the new mismatch predicate. |
| pkg/router/handlers.go | Emits SESSION_TARGET_MISMATCH error code in JSON responses for the new conflict type. |
| pkg/router/handlers_test.go | Asserts SESSION_TARGET_MISMATCH is present in the JSON response body. |
| // Construct Sandbox Info from response, storing the requested target metadata | ||
| // to enable proper session reuse validation. The ephemeral sandbox name (res.SandboxName) | ||
| // is not stored here to avoid the "fail-open" vulnerability where a session could be | ||
| // incorrectly reused for a different workload. | ||
| sandbox := &types.SandboxInfo{ | ||
| SandboxID: res.SandboxID, | ||
| Name: res.SandboxName, | ||
| SessionID: res.SessionID, | ||
| EntryPoints: res.EntryPoints, | ||
| SandboxID: res.SandboxID, | ||
| Name: res.SandboxName, | ||
| SandboxNamespace: namespace, | ||
| Kind: types.SandboxKind, | ||
| WorkloadKind: kind, | ||
| WorkloadName: name, | ||
| SessionID: res.SessionID, | ||
| EntryPoints: res.EntryPoints, |
| if len(missingFields) > 0 { | ||
| klog.V(2).InfoS("Session target metadata missing", "sessionID", sessionID, "requestedNamespace", namespace, "requestedName", name, "requestedKind", kind, "missingFields", strings.Join(missingFields, ",")) | ||
| return false | ||
| } | ||
| if sandbox.WorkloadKind != kind || sandbox.WorkloadName != name || sandbox.SandboxNamespace != namespace { | ||
| klog.V(2).InfoS("Session target mismatch", "sessionID", sessionID, "requestedNamespace", namespace, "requestedName", name, "requestedKind", kind, "actualNamespace", sandbox.SandboxNamespace, "actualWorkloadName", sandbox.WorkloadName, "actualWorkloadKind", sandbox.WorkloadKind) | ||
| return false |
| func IsSessionTargetMismatch(err error) bool { | ||
| if !apierrors.IsConflict(err) { | ||
| return false | ||
| } | ||
| statusErr, ok := err.(apierrors.APIStatus) | ||
| if !ok { | ||
| return false | ||
| } | ||
| status := statusErr.Status() | ||
| if status.Details == nil { | ||
| return false | ||
| } | ||
| if status.Details.Group != sessionResource.Group || status.Details.Kind != sessionResource.Resource { | ||
| return false | ||
| } | ||
| return strings.Contains(status.Message, "session target mismatch") | ||
| } |
eff4b20 to
a6ae2b8
Compare
Signed-off-by: KinshukSS2 <kinshuk380@gmail.com>
Signed-off-by: KinshukSS2 <kinshuk380@gmail.com>
- Fix fail-open vulnerability: store requested metadata (namespace, name, kind) in SandboxInfo instead of ephemeral sandbox name for proper session reuse validation - Fix ephemeral name mismatch: sessionTargetMatches now compares against stored requested metadata instead of generated sandbox names - Add structured logging: log session target mismatches at V(2) level for better observability - Add comprehensive table-driven tests: cover perfect matches, mismatches, nil sandboxes, empty fields (wildcard matching), and edge cases Signed-off-by: KinshukSS2 <kinshuk380@gmail.com>
Reject reused session IDs that don't match the requested namespace/name/kind route tuple before forwarding. Changes: - pkg/common/types: add WorkloadKind and WorkloadName fields to SandboxInfo to persist the route-identity of each session at creation time, enabling deterministic comparison on reuse. - pkg/api/errors: add IsSessionTargetMismatch helper and NewSessionTargetMismatchError (409 Conflict) used by the session manager on mismatch. - pkg/router/session_manager: replace wildcard-permissive matching with strict three-field comparison (WorkloadKind, WorkloadName, SandboxNamespace). Fail-closed when any field is empty. Emit structured klog warning on mismatch. - pkg/router/handlers: surface SESSION_TARGET_MISMATCH error code in the JSON response body via IsSessionTargetMismatch check. - pkg/workloadmanager: propagate WorkloadKind/WorkloadName through sandboxEntry -> buildSandboxPlaceHolder/buildSandboxInfo so that sandboxes created by the workload manager carry the metadata needed for reuse validation. Resolves: session reuse isolation bug where a valid session from Runtime A could be replayed against Route B, bypassing tenant isolation and causing cross-runtime execution. Signed-off-by: KinshukSS2 <kinshuk380@gmail.com>
- handlers_test: add expectedBodyCode assertion to TestHandleInvoke_ErrorPaths/session-target-mismatch to verify the SESSION_TARGET_MISMATCH code is present in the JSON response. - session_manager_test: migrate all sessionTargetMatches fixtures to use WorkloadKind/WorkloadName fields; fix TestGetSandboxBySession_ TargetMatch_EmptyFields to assert conflict error (fail-closed); extract agentRuntimeCreateHandler helper to bring TestGetSandboxBySession_CreateSandbox_AgentRuntime_Success cyclomatic complexity below gocyclo threshold (was 17, now <=15). - sandbox_helper_test: populate WorkloadKind/WorkloadName in sandboxEntry fixtures and assert they are propagated to SandboxInfo. Signed-off-by: KinshukSS2 <kinshuk380@gmail.com>
Security-relevant events (missing metadata, session target mismatch) must be visible in production logs at all times. Promote both log lines from klog.V(2).InfoS (verbose/debug) to klog.Warningf so they are captured without enabling verbose logging. Addresses code review feedback from gemini-code-assist. Signed-off-by: KinshukSS2 <kinshuk380@gmail.com>
a6ae2b8 to
bd8b9a3
Compare
| // Construct Sandbox Info from response, storing the requested target metadata | ||
| // to enable proper session reuse validation. The ephemeral sandbox name (res.SandboxName) | ||
| // is not stored here to avoid the "fail-open" vulnerability where a session could be | ||
| // incorrectly reused for a different workload. | ||
| sandbox := &types.SandboxInfo{ | ||
| SandboxID: res.SandboxID, | ||
| Name: res.SandboxName, | ||
| SessionID: res.SessionID, | ||
| EntryPoints: res.EntryPoints, | ||
| SandboxID: res.SandboxID, | ||
| Name: name, | ||
| SandboxNamespace: namespace, | ||
| Kind: kind, | ||
| WorkloadKind: kind, | ||
| WorkloadName: name, | ||
| SessionID: res.SessionID, | ||
| EntryPoints: res.EntryPoints, |
| func IsSessionTargetMismatch(err error) bool { | ||
| if !apierrors.IsConflict(err) { | ||
| return false | ||
| } | ||
| statusErr, ok := err.(apierrors.APIStatus) | ||
| if !ok { | ||
| return false | ||
| } | ||
| status := statusErr.Status() | ||
| if status.Details == nil { | ||
| return false | ||
| } | ||
| if status.Details.Group != sessionResource.Group || status.Details.Kind != sessionResource.Resource { | ||
| return false | ||
| } | ||
| return strings.Contains(status.Message, "session target mismatch") | ||
| } |
| if sandbox.Name != "test-runtime" { | ||
| t.Errorf("expected Name test-runtime (the requested workload name), got %s", sandbox.Name) | ||
| } | ||
| if sandbox.SandboxNamespace != "default" { | ||
| t.Errorf("expected SandboxNamespace default, got %s", sandbox.SandboxNamespace) | ||
| } | ||
| if sandbox.Kind != types.AgentRuntimeKind { | ||
| t.Errorf("expected Kind %s, got %s", types.AgentRuntimeKind, sandbox.Kind) |
The Docker build on CI uses --mount=type=cache for the Go build cache. Stale cache entries from a prior broken push were causing a phantom 'duplicate field name workloadName' error even though the source file was correct. Adding explanatory comments to the sandboxEntry literals in buildSandboxByAgentRuntime and buildSandboxByCodeInterpreter changes the file hash, forcing the Go toolchain to recompile the package from scratch and bypass the poisoned cache. The comments also document why WorkloadKind/WorkloadName are set here. Signed-off-by: KinshukSS2 <kinshuk380@gmail.com>
2d5bf71 to
b6d9492
Compare
Problem
When a client sends
x-agentcube-session-id, the router retrieves the sandbox from the store and forwards the request without checking whether the session belongs to the namespace/name/kind in the URL. A valid session from Runtime A can be replayed against Route B, bypassing tenant and workload isolation.Solution
Added strict three-field validation
(namespace, name, kind)between the route and the session metadata stored at creation time. Mismatched sessions are rejected before forwarding with a409 Conflictresponse and aSESSION_TARGET_MISMATCHerror code.Changes
pkg/common/typesWorkloadKind/WorkloadNametoSandboxInfoto persist route-identity at session creationpkg/api/errorsNewSessionTargetMismatchError(409) andIsSessionTargetMismatchpredicatepkg/router/session_managersessionTargetMatches: fail-closed on empty metadata, strict 3-field comparison,klog.Warningon mismatchpkg/router/handlersSESSION_TARGET_MISMATCHcode in JSON response bodypkg/workloadmanagerWorkloadKind/WorkloadNamethroughsandboxEntry→buildSandboxPlaceHolder/buildSandboxInfoTesting
SESSION_TARGET_MISMATCHin JSON body, field propagation through workload manager builders.gocyclolint violation fixed by extractingagentRuntimeCreateHandlerhelper.