Skip to content

feat: Add machine identity REST support#481

Open
parmani-nv wants to merge 3 commits intoNVIDIA:mainfrom
parmani-nv:parmani/spiffe
Open

feat: Add machine identity REST support#481
parmani-nv wants to merge 3 commits intoNVIDIA:mainfrom
parmani-nv:parmani/spiffe

Conversation

@parmani-nv
Copy link
Copy Markdown
Contributor

Adds a REST surface that proxies carbide-core Forge so Tenant Admins can manage per-tenant JWT-SVID signing

Under /v2/org/{org}/carbide/site/{siteID}:

  • Authenticated (FORGE_TENANT_ADMIN, org from URL): PUT|GET|DELETE /identity/config, PUT|GET|DELETE /identity/token-delegation.
  • Public (no auth, mounted before versioned auth middleware so verifiers without credentials can still fetch metadata): GET /.well-known/jwks.json, GET /.well-known/openid-configuration, GET /.well-known/spiffe/jwks.json.

Implementation spans:

  • API: handlers, models, routes, and config in api/ (notably api/pkg/api/handler/identity.go, api/pkg/api/model/identity.go, api/pkg/api/routes.go, api/internal/config/).

  • Site Workflow: identity workflow and activities that talk to Forge gRPC from activities (site-workflow/pkg/workflow/identity.go, site-workflow/pkg/activity/identity.go).

  • Forge gRPC mocks for tests: new in-memory Forge test server and client test helpers (site-workflow/pkg/grpc/server/forge_test_server.go, site-workflow/pkg/grpc/client/testing.go) so identity activities and workflows can be exercised without a real carbide-core.

  • Site Agent: registers the new MachineIdentity manager on the Temporal worker (site-agent/pkg/components/managers/identity/, managerapi/machineidentity_api.go, manager.go, orchestrator wiring).

  • OpenAPI + CarbideCLI: new Machine Identity tag in openapi/spec.yaml (with regenerated docs/index.html) and matching CLI/TUI command coverage in cli/tui/commands.go, cli/tui/repl.go, plus CLI tests (cli/pkg/commands_test.go, cli/pkg/spec_test.go) so the generated CLI exposes the new endpoints out of the box.

  • Docs/scripts: supporting material under docs/machine-identity/ and optional shell harnesses scripts/machine-identity-e2e.sh and scripts/machine-identity-test-matrix.sh.

  • Feature - New feature or functionality (feat:)

  • Fix - Bug fixes (fix:)

  • Chore - Modification or removal of existing functionality (chore:)

  • Refactor - Refactoring of existing functionality (refactor:)

  • Docs - Changes in documentation or OpenAPI schema (docs:)

  • CI - Changes in GitHub workflows. Requires additional scrutiny (ci:)

  • Version - Issuing a new release version (version:)

  • API - API models or endpoints updated

  • Workflow - Workflow service updated

  • DB - DB DAOs or migrations updated

  • Site Manager - Site Manager updated

  • Cert Manager - Cert Manager updated

  • Site Agent - Site Agent updated

  • RLA - RLA service updated

  • Powershelf Manager - Powershelf Manager updated

  • NVSwitch Manager - NVSwitch Manager updated

  • This PR contains breaking changes

  • Unit tests added/updated

  • Integration tests added/updated

  • Manual testing performed

  • No testing required (docs, internal refactor, etc.)

New/expanded unit coverage:

  • api/pkg/api/handler/identity_test.go, api/pkg/api/model/identity_test.go, api/internal/config/config_test.go for handler/model/config behavior.

  • api/pkg/api/routes_test.go updated to count the six new /identity/config and /identity/token-delegation routes in the authenticated tier, and a new TestNewWellKnownRoutes enumerates the public .well-known/{jwks.json,openid-configuration,spiffe/jwks.json} routes by exact (path, method) so any accidental promotion of an unauthenticated route into the auth-protected group fails loudly.

  • site-workflow/pkg/activity/identity_test.go for the Forge-backed activities, driven by the new in-memory Forge mock (site-workflow/pkg/grpc/server/forge_test_server.go, site-workflow/pkg/grpc/client/testing.go).

  • cli/pkg/commands_test.go / cli/pkg/spec_test.go to verify the generated CLI surface.

  • Auth uses the tenant org from the URL; the site controller remains authoritative for config presence (NOT_FOUND when missing).

  • Per-site defaults (machineIdentity.defaults.<siteID>) can supply issuer (with {org} placeholder), tokenTtlSec, and subjectPrefix; missing required fields return 400.

  • Identity config PUT uses full-resolution CreatedAt/UpdatedAt comparison to decide 201 vs 200.

  • Token-delegation GET exposes only a SHA-256 hash of the client secret — the secret is never returned after creation.

  • Handlers use the same synchronous workflow pattern as other handlers in this package (ExecuteWorkflow + typed Get, timeout termination, UnwrapWorkflowError) rather than ExecuteSyncWorkflow, so responses and runaway-workflow cleanup stay consistent with the rest of the API.

  • The well-known routes live in a separate NewWellKnownRoutes function (api/pkg/api/routes.go) and are mounted on the root echo with the version prefix, before the versioned routeGroup's auth middleware in api/internal/server/server.go. Keeping them out of NewAPIRoutes is what allows JWT verifiers without credentials to fetch JWKS/OIDC metadata, and the dedicated TestNewWellKnownRoutes guards that boundary.

  • The db/cmd/migrations/migrations binary in the diff is a build artifact and should be removed before merge (no schema changes belong to this PR; leave DB unchecked).

  • CarbideCLI: the new Machine Identity OpenAPI tag means the generated CLI/TUI picks up identity/config, identity/token-delegation, and the .well-known reads automatically; new TUI command wiring lives in cli/tui/commands.go / cli/tui/repl.go.

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented May 4, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 4, 2026

Review Change Stack

Summary by CodeRabbit

  • New Features

    • Machine identity management: create/get/delete SPIFFE JWT‑SVID config and RFC 8693 token delegation
    • Public discovery endpoints: .well-known/jwks.json, .well-known/openid-configuration, .well-known/spiffe/jwks.json
    • CLI/TUI commands for machine-identity and well‑known endpoints
  • Behavior

    • PUTs return 201 on initial create and 200 on subsequent updates
    • JWKS returns {"keys":[]} when material is absent; OpenID discovery returns 404 when absent
  • Tests

    • Extensive handler, model, activity, workflow, CLI and integration tests added/updated

Walkthrough

This PR implements machine-identity support: REST endpoints (tenant-admin CRUD for identity config and token delegation), public .well-known JWKS/OpenID discovery, deterministic workflow IDs and PUT status logic, Temporal workflows and site activities, site-agent manager wiring, mock NICo test infra, CLI/TUI commands, and comprehensive tests.

Changes

Machine Identity Feature Implementation

Layer / File(s) Summary
API Data Models & OpenAPI Contract
api/pkg/api/model/identity.go, openapi/spec.yaml
Defines REST request/response models with validation and protobuf conversion; OpenAPI adds Machine Identity tag, management endpoints, public .well-known/* discovery routes, and schema components for identity config, token delegation, JWKS, and OIDC discovery.
API Route Registration
api/pkg/api/routes.go, api/internal/server/server.go
Registers authenticated site-scoped identity/token-delegation endpoints (PUT/GET/DELETE) and exposes unauthenticated .well-known/* discovery routes; mounts public routes before versioned auth middleware.
HTTP Handler Implementation
api/pkg/api/handler/identity.go, api/pkg/api/handler/util/common/common.go
Implements authenticated handlers for identity config and token delegation CRUD and public JWKS/OpenID endpoints; adds deterministic payloadHash() for Temporal workflow IDs, putStatusFromTimestamps() for 201/200 selection on PUT, JWKS JSON validation/normalization (empty → {"keys":[]}), and lowercases org in handler setup.
Temporal Workflow Proxies
site-workflow/pkg/workflow/identity.go
Adds workflows that proxy to site activities with shared activity options (30s start-to-close timeout and retry policy).
NICo Activity Wrappers
site-workflow/pkg/activity/identity.go
Adds activity wrappers that call NICo RPCs for Set/Get/Delete identity config and token delegation, and for JWKS/OpenID retrieval; each validates input, checks client connectivity, logs, and wraps errors.
Site Agent Manager Integration
site-agent/pkg/components/managers/identity/*, site-agent/pkg/components/managers/manager.go, site-agent/pkg/components/managers/managerapi/machineidentity_api.go
Introduces MachineIdentity manager API, implements Init/RegisterSubscriber/GetState, wires into Manager lifecycle and accessor, and registers workflows/activities on Temporal worker.
Mock NICo Server & gRPC Testing
site-workflow/pkg/grpc/server/nico_test_server.go, site-workflow/pkg/grpc/client/testing.go
Extends mock NICo with per-org identity state, ES256 key generation, JWKS/OpenID endpoints, SignMachineIdentity, MockNICoClient methods, and adds test seeding via LoadTestIdentity().
Test Coverage
api/pkg/api/model/identity_test.go, api/pkg/api/handler/identity_test.go, api/pkg/api/routes_test.go, site-workflow/pkg/activity/identity_test.go
Adds unit and integration tests covering model validation and proto mapping, handler behaviors (timeout→500 + workflow termination, JWKS normalization/validation, deterministic workflow IDs with payload hash, PUT 201/200 selection), activity contract enforcement, and route/CLI expectations.
CLI/TUI Commands & Autocomplete
cli/tui/commands.go, cli/tui/repl.go
Adds machine-identity CLI/TUI subcommands for config {get,update,delete}, token-delegation {get,update,delete}, well-known {jwks,spiffe-jwks,openid} with interactive prompts and REPL autocomplete updates.
CLI Integration Tests
cli/pkg/commands_test.go, cli/pkg/spec_test.go
Extends tests to include new machine-identity operation IDs and embedded spec command expectations.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 74.16% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and concisely summarizes the primary change: adding REST API support for machine identity operations, which is the central feature of this substantial PR.
Description check ✅ Passed The description comprehensively explains the feature scope, implementation architecture, API endpoints, constraints, and testing coverage. It clearly relates to the changeset across multiple components.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Comment @coderabbitai help to get the list of available commands and usage tips.

@parmani-nv parmani-nv changed the title feat: Add machine identity REST surface (SPIFFE JWT-SVID + RFC 8693 t… feat: Add machine identity REST support May 7, 2026
@parmani-nv parmani-nv force-pushed the parmani/spiffe branch 2 times, most recently from 8a1bdf2 to e113d08 Compare May 7, 2026 22:30
…oken delegation)

Adds a REST surface that proxies carbide-core Forge so Tenant Admins can
manage per-tenant JWT-SVID signing configuration and RFC 8693 token
delegation without direct site access. Runtime workloads still obtain
tokens from the DPU IMDS over mTLS — these endpoints are not on the hot
path for token issuance.

Under `/v2/org/{org}/carbide/site/{siteID}`:

- Authenticated (`FORGE_TENANT_ADMIN`, org from URL):
  `PUT|GET|DELETE /identity/config`,
  `PUT|GET|DELETE /identity/token-delegation`.
- Public (no auth, mounted before the versioned auth middleware so
  verifiers without credentials can fetch metadata):
  `GET /.well-known/jwks.json`,
  `GET /.well-known/openid-configuration`,
  `GET /.well-known/spiffe/jwks.json`.

The site controller is authoritative for config presence and issuance
policy; REST proxies via Temporal workflows on the site task queue.
Token-delegation responses expose only a SHA-256 hash of the client
secret — the raw secret is never returned after creation.

Signed-off-by: Parham Armani <parmani@nvidia.com>
@parmani-nv parmani-nv marked this pull request as ready for review May 7, 2026 23:12
@parmani-nv parmani-nv requested a review from a team as a code owner May 7, 2026 23:12
@parmani-nv
Copy link
Copy Markdown
Contributor Author

/ok to test 9e428ea

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 7, 2026

🔐 TruffleHog Secret Scan

No secrets or credentials found!

Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉

🔗 View scan details

🕐 Last updated: 2026-05-07 23:14:48 UTC | Commit: 9e428ea

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/pkg/api/handler/identity.go`:
- Around line 50-58: The payloadHash function currently truncates the SHA-256 to
8 bytes causing high collision risk; update payloadHash (the function that calls
proto.MarshalOptions{Deterministic: true}.Marshal and computes sha256.Sum256) to
return a longer digest — use at least 16 bytes (sum[:16]) or the full sum before
hex-encoding (hex.EncodeToString) so workflow IDs derived from payloads have
≥128 bits of entropy and avoid collisions when
WORKFLOW_ID_CONFLICT_POLICY_USE_EXISTING is used.
- Around line 771-778: The current check only ensures the presence of probe.Keys
but not that it is a JSON array; update the unmarshalling so that the local
probe variable validates that "keys" is an array (e.g., change probe.Keys to a
slice type like []json.RawMessage or unmarshal probe.Keys into
[]json.RawMessage) and return the same BadGateway error when the unmarshal into
an array fails or the array is empty; keep the same logging path
(logger.Error().Err(err).Str("orgName", org).Msg(...)) and error response via
cutil.NewAPIErrorResponse so malformed non-array JWKS payloads are rejected
before being forwarded.

In `@cli/tui/commands.go`:
- Line 2628: The echoed LogCmd calls currently print an incorrect command
sequence (missing the "config" and "well-known" components and with arguments
out of order), so update each LogCmd invocation that echoes machine-identity
commands (e.g., the call LogCmd(s, "machine-identity", "get", site.ID)) to
reflect the real CLI order and names — e.g., call LogCmd(s, "config",
"well-known", "machine-identity", "get", site.ID) (and similarly for other
actions like list/rotate) for every occurrence referenced (the calls at the
other sites noted in the comment), ensuring the strings and argument order match
the actual invokable command hierarchy.
- Around line 2683-2689: The code currently parses ttlStr into u64 and accepts
any uint32; enforce the documented bounds 300–86400 client-side by validating
the parsed value before setting body["tokenTtlSec"]. After
strconv.ParseUint(ttlStr, 10, 32) in the same block, check that u64 >= 300 &&
u64 <= 86400 and return a descriptive fmt.Errorf (e.g., "token TTL must be
between 300 and 86400 seconds") if it falls outside that range; only then set
body["tokenTtlSec"] = uint32(u64).

In `@openapi/spec.yaml`:
- Around line 12604-12611: The SPIFFE JWKS response documentation is missing a
Cache-Control response header like the OIDC JWKS endpoint; update the responses
for the SPIFFE JWKS path (the 200 response that references the
components/schemas/JWKS) to include a Cache-Control header entry (same semantics
as the /.well-known/jwks.json documentation) so clients receive explicit caching
guidance; modify the OpenAPI operation that returns the SPIFFE JWKS to add a
headers: Cache-Control with description and schema type string matching the
existing OIDC JWKS header documentation.
- Around line 21517-21520: The response schema IdentityConfig.tokenTtlSec
currently hardcodes minimum: 300 and maximum: 86400 which conflicts with
IdentityConfigUpdateRequest.tokenTtlSec (minimum: 1) — remove the fixed
minimum/maximum from the IdentityConfig.tokenTtlSec schema so the response does
not advertise server-enforced bounds, or alternatively, if those bounds are true
system-wide invariants, update IdentityConfigUpdateRequest.tokenTtlSec to match
and add a short description documenting the rationale; locate and edit the
tokenTtlSec entry in the IdentityConfig and IdentityConfigUpdateRequest schemas
to keep them consistent.

In `@site-workflow/pkg/activity/identity_test.go`:
- Around line 141-168: The test should assert the exact redaction and hashing
contract: after calling m.SetTokenDelegationOnSite verify
resp.GetClientSecretBasic() is non-nil, assert that basic.GetClientSecret() (the
raw field) is empty/nil/zero, and assert that basic.GetClientSecretHash() equals
the expected SHA-256 hex of "super-secret" (compute SHA-256 and compare to
basic.GetClientSecretHash()), while still checking it does not contain the raw
secret; this locks down the use of SHA-256 and prevents regressions in
SetTokenDelegationOnSite, resp.GetClientSecretBasic(), basic.GetClientSecret()
and basic.GetClientSecretHash().

In `@site-workflow/pkg/grpc/server/nico_test_server.go`:
- Around line 1530-1539: The mock server path handling in the branch where
in.GetClientSecretBasic() != nil validates client_id but fails to reject an
empty client_secret; add a validation that checks
strings.TrimSpace(basic.GetClientSecret()) and if empty return
status.Errorf(codes.InvalidArgument, "client_secret is required for
client_secret_basic"); keep the existing response construction
(cwssaws.TokenDelegationResponse_ClientSecretBasic and
cwssaws.ClientSecretBasicResponse) and continue to use
clientSecretDisplayHash(basic.GetClientSecret()) only after the new non-empty
check passes.
- Around line 1423-1473: SetIdentityConfiguration currently accepts empty issuer
or zero TokenTtlSec and persists the issuer template verbatim; add validation
and normalization before persisting: reject empty issuer and TokenTtlSec <= 0
(return InvalidArgument), and normalize the issuer template (e.g., via a new
helper normalizeIssuerTemplate or validateIssuerTemplate) to ensure it contains
the required "{org}" placeholder and does not leak a literal "{org}" into stored
Issuer; set resp.Issuer to the normalized value (instead of in.GetIssuer()) and
keep the rest of the flow (calls to normalizeAllowedAudiences,
generateES256KeyMaterial, and updates to f.identityConfigs/f.identityKeys)
unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b2e282d8-4af0-4f01-9c76-df52a7d3334d

📥 Commits

Reviewing files that changed from the base of the PR and between f02eabc and 9e428ea.

📒 Files selected for processing (28)
  • api/internal/server/server.go
  • api/pkg/api/handler/identity.go
  • api/pkg/api/handler/identity_test.go
  • api/pkg/api/handler/util/common/common.go
  • api/pkg/api/model/identity.go
  • api/pkg/api/model/identity_test.go
  • api/pkg/api/routes.go
  • api/pkg/api/routes_test.go
  • cli/pkg/commands_test.go
  • cli/pkg/spec_test.go
  • cli/tui/commands.go
  • cli/tui/repl.go
  • docs/index.html
  • openapi/spec.yaml
  • site-agent/pkg/components/managers/identity/access.go
  • site-agent/pkg/components/managers/identity/init.go
  • site-agent/pkg/components/managers/identity/subscriber.go
  • site-agent/pkg/components/managers/manager.go
  • site-agent/pkg/components/managers/manageraccess.go
  • site-agent/pkg/components/managers/managerapi/machineidentity_api.go
  • site-agent/pkg/components/managers/managerapi/managerapi.go
  • site-agent/pkg/components/managers/workflow/orchestrator.go
  • site-agent/pkg/datatypes/managertypes/workflow/workflowtypes.go
  • site-workflow/pkg/activity/identity.go
  • site-workflow/pkg/activity/identity_test.go
  • site-workflow/pkg/grpc/client/testing.go
  • site-workflow/pkg/grpc/server/nico_test_server.go
  • site-workflow/pkg/workflow/identity.go

Comment on lines +50 to +58
// payloadHash returns a deterministic short hex digest of the proto message.
func payloadHash(m proto.Message) (string, error) {
b, err := proto.MarshalOptions{Deterministic: true}.Marshal(m)
if err != nil {
return "", err
}
sum := sha256.Sum256(b)
return hex.EncodeToString(sum[:8]), nil
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use a longer digest for payload-derived workflow IDs.

Only 8 bytes of the SHA-256 are kept here. Because the PUT handlers use WORKFLOW_ID_CONFLICT_POLICY_USE_EXISTING, a collision aliases two different request bodies to the same workflow execution and can return the wrong result. Keep at least 128 bits, or just use the full digest.

Suggested fix
 func payloadHash(m proto.Message) (string, error) {
 	b, err := proto.MarshalOptions{Deterministic: true}.Marshal(m)
 	if err != nil {
 		return "", err
 	}
 	sum := sha256.Sum256(b)
-	return hex.EncodeToString(sum[:8]), nil
+	return hex.EncodeToString(sum[:16]), nil
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// payloadHash returns a deterministic short hex digest of the proto message.
func payloadHash(m proto.Message) (string, error) {
b, err := proto.MarshalOptions{Deterministic: true}.Marshal(m)
if err != nil {
return "", err
}
sum := sha256.Sum256(b)
return hex.EncodeToString(sum[:8]), nil
}
// payloadHash returns a deterministic short hex digest of the proto message.
func payloadHash(m proto.Message) (string, error) {
b, err := proto.MarshalOptions{Deterministic: true}.Marshal(m)
if err != nil {
return "", err
}
sum := sha256.Sum256(b)
return hex.EncodeToString(sum[:16]), nil
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/pkg/api/handler/identity.go` around lines 50 - 58, The payloadHash
function currently truncates the SHA-256 to 8 bytes causing high collision risk;
update payloadHash (the function that calls proto.MarshalOptions{Deterministic:
true}.Marshal and computes sha256.Sum256) to return a longer digest — use at
least 16 bytes (sum[:16]) or the full sum before hex-encoding
(hex.EncodeToString) so workflow IDs derived from payloads have ≥128 bits of
entropy and avoid collisions when WORKFLOW_ID_CONFLICT_POLICY_USE_EXISTING is
used.

Comment thread api/pkg/api/handler/identity.go
Comment thread cli/tui/commands.go
if err != nil {
return err
}
LogCmd(s, "machine-identity", "get", site.ID)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix machine-identity LogCmd output to match real command names.

Several echoed commands are not invokable as printed (missing config or well-known, and order mismatch), which hurts copy/paste usability.

Suggested fix
- LogCmd(s, "machine-identity", "get", site.ID)
+ LogCmd(s, "machine-identity", "config", "get", site.ID)

- LogCmd(s, "machine-identity", "update", site.ID)
+ LogCmd(s, "machine-identity", "config", "update", site.ID)

- LogCmd(s, "machine-identity", "delete", site.ID)
+ LogCmd(s, "machine-identity", "config", "delete", site.ID)

- LogCmd(s, "machine-identity", "jwks", "get", site.ID)
+ LogCmd(s, "machine-identity", "well-known", "jwks", site.ID)

- LogCmd(s, "machine-identity", "spiffe-jwks", "get", site.ID)
+ LogCmd(s, "machine-identity", "well-known", "spiffe-jwks", site.ID)

- LogCmd(s, "machine-identity", "openid-configuration", "get", site.ID)
+ LogCmd(s, "machine-identity", "well-known", "openid", site.ID)

Also applies to: 2701-2701, 2724-2724, 2826-2826, 2840-2840, 2854-2854

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cli/tui/commands.go` at line 2628, The echoed LogCmd calls currently print an
incorrect command sequence (missing the "config" and "well-known" components and
with arguments out of order), so update each LogCmd invocation that echoes
machine-identity commands (e.g., the call LogCmd(s, "machine-identity", "get",
site.ID)) to reflect the real CLI order and names — e.g., call LogCmd(s,
"config", "well-known", "machine-identity", "get", site.ID) (and similarly for
other actions like list/rotate) for every occurrence referenced (the calls at
the other sites noted in the comment), ensuring the strings and argument order
match the actual invokable command hierarchy.

Comment thread cli/tui/commands.go
Comment on lines +2683 to +2689
if strings.TrimSpace(ttlStr) != "" {
u64, perr := strconv.ParseUint(strings.TrimSpace(ttlStr), 10, 32)
if perr != nil {
return fmt.Errorf("invalid token TTL: %w", perr)
}
body["tokenTtlSec"] = uint32(u64)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Enforce the documented tokenTtlSec bounds client-side.

The prompt states 300-86400, but current parsing accepts any uint32, pushing avoidable validation failures to the server.

Suggested fix
 if strings.TrimSpace(ttlStr) != "" {
 	u64, perr := strconv.ParseUint(strings.TrimSpace(ttlStr), 10, 32)
 	if perr != nil {
 		return fmt.Errorf("invalid token TTL: %w", perr)
 	}
+	if u64 < 300 || u64 > 86400 {
+		return fmt.Errorf("token TTL must be between 300 and 86400 seconds")
+	}
 	body["tokenTtlSec"] = uint32(u64)
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if strings.TrimSpace(ttlStr) != "" {
u64, perr := strconv.ParseUint(strings.TrimSpace(ttlStr), 10, 32)
if perr != nil {
return fmt.Errorf("invalid token TTL: %w", perr)
}
body["tokenTtlSec"] = uint32(u64)
}
if strings.TrimSpace(ttlStr) != "" {
u64, perr := strconv.ParseUint(strings.TrimSpace(ttlStr), 10, 32)
if perr != nil {
return fmt.Errorf("invalid token TTL: %w", perr)
}
if u64 < 300 || u64 > 86400 {
return fmt.Errorf("token TTL must be between 300 and 86400 seconds")
}
body["tokenTtlSec"] = uint32(u64)
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cli/tui/commands.go` around lines 2683 - 2689, The code currently parses
ttlStr into u64 and accepts any uint32; enforce the documented bounds 300–86400
client-side by validating the parsed value before setting body["tokenTtlSec"].
After strconv.ParseUint(ttlStr, 10, 32) in the same block, check that u64 >= 300
&& u64 <= 86400 and return a descriptive fmt.Errorf (e.g., "token TTL must be
between 300 and 86400 seconds") if it falls outside that range; only then set
body["tokenTtlSec"] = uint32(u64).

Comment thread openapi/spec.yaml
Comment thread openapi/spec.yaml Outdated
Comment on lines +141 to +168
t.Run("success with client_secret_basic (hash returned, raw never)", func(t *testing.T) {
req := &cwssaws.TokenDelegationRequest{
OrganizationId: "acme-corp",
Config: &cwssaws.TokenDelegation{
TokenEndpoint: "https://auth.acme.com/oauth2/token",
SubjectTokenAudience: "acme-exchange",
AuthMethodConfig: &cwssaws.TokenDelegation_ClientSecretBasic{
ClientSecretBasic: &cwssaws.ClientSecretBasic{
ClientId: "client-123",
ClientSecret: "super-secret",
},
},
},
}
resp, err := m.SetTokenDelegationOnSite(ctx, req)
require.NoError(t, err)
require.NotNil(t, resp)
assert.Equal(t, "acme-corp", resp.GetOrganizationId())
assert.Equal(t, "https://auth.acme.com/oauth2/token", resp.GetTokenEndpoint())
basic := resp.GetClientSecretBasic()
require.NotNil(t, basic, "response oneof should carry hashed client_secret")
assert.Equal(t, "client-123", basic.GetClientId())
assert.NotEmpty(t, basic.GetClientSecretHash())

// Critical security invariant: the raw secret must never appear anywhere
// in the response proto.
assert.NotContains(t, basic.GetClientSecretHash(), "super-secret")
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Assert the exact secret-redaction contract.

This subtest only proves that the returned value is non-empty and does not literally contain the raw secret. A regression to a different hash algorithm, or a response that still populates client_secret, would still pass. Since this API promises “secret never returned” and a SHA-256 hash specifically, lock that down here with exact assertions.

Proposed test hardening
 import (
 	"context"
+	"crypto/sha256"
+	"encoding/hex"
 	"testing"
@@
 		basic := resp.GetClientSecretBasic()
 		require.NotNil(t, basic, "response oneof should carry hashed client_secret")
 		assert.Equal(t, "client-123", basic.GetClientId())
-		assert.NotEmpty(t, basic.GetClientSecretHash())
+		assert.Empty(t, basic.GetClientSecret())
+		expected := sha256.Sum256([]byte("super-secret"))
+		assert.Equal(t, hex.EncodeToString(expected[:]), basic.GetClientSecretHash())
 
 		// Critical security invariant: the raw secret must never appear anywhere
 		// in the response proto.
 		assert.NotContains(t, basic.GetClientSecretHash(), "super-secret")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@site-workflow/pkg/activity/identity_test.go` around lines 141 - 168, The test
should assert the exact redaction and hashing contract: after calling
m.SetTokenDelegationOnSite verify resp.GetClientSecretBasic() is non-nil, assert
that basic.GetClientSecret() (the raw field) is empty/nil/zero, and assert that
basic.GetClientSecretHash() equals the expected SHA-256 hex of "super-secret"
(compute SHA-256 and compare to basic.GetClientSecretHash()), while still
checking it does not contain the raw secret; this locks down the use of SHA-256
and prevents regressions in SetTokenDelegationOnSite,
resp.GetClientSecretBasic(), basic.GetClientSecret() and
basic.GetClientSecretHash().

Comment on lines +1423 to +1473
func (f *NICoServerImpl) SetIdentityConfiguration(ctx context.Context, req *cwssaws.IdentityConfigRequest) (*cwssaws.IdentityConfigResponse, error) {
if req == nil || req.GetOrganizationId() == "" || req.GetConfig() == nil {
return nil, status.Errorf(codes.InvalidArgument, "Invalid request argument")
}
in := req.GetConfig()
if strings.TrimSpace(in.GetDefaultAudience()) == "" {
return nil, status.Errorf(codes.InvalidArgument, "default_audience is required")
}
allowed, err := normalizeAllowedAudiences(in.GetDefaultAudience(), in.GetAllowedAudiences())
if err != nil {
return nil, status.Errorf(codes.InvalidArgument, "%s", err.Error())
}

orgID := req.GetOrganizationId()
now := timestamppb.Now()

existing, isUpdate := f.identityConfigs[orgID]
resp := &cwssaws.IdentityConfigResponse{
OrganizationId: orgID,
Config: &cwssaws.IdentityConfig{
Enabled: in.GetEnabled(),
Issuer: in.GetIssuer(),
DefaultAudience: in.GetDefaultAudience(),
AllowedAudiences: allowed,
TokenTtlSec: in.GetTokenTtlSec(),
SubjectPrefix: in.SubjectPrefix,
RotateKey: false,
},
UpdatedAt: now,
}

if isUpdate && !in.GetRotateKey() {
// Update path: keep the existing key + created-at.
resp.KeyId = existing.GetKeyId()
resp.CreatedAt = existing.GetCreatedAt()
} else {
// First-create or rotate-key: generate a fresh ES256 keypair.
newKey, err := generateES256KeyMaterial()
if err != nil {
return nil, status.Errorf(codes.Internal, "failed to generate signing key: %v", err)
}
f.identityKeys[orgID] = newKey
resp.KeyId = newKey.kid
if isUpdate {
resp.CreatedAt = existing.GetCreatedAt()
} else {
resp.CreatedAt = now
}
}
f.identityConfigs[orgID] = resp
return resp, nil
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep the mock identity-config contract aligned with the controller.

SetIdentityConfiguration currently accepts an empty issuer / zero TTL and stores the issuer template verbatim. That lets mock-backed tests succeed on input the real controller is supposed to reject, and it can leak a literal {org} into OIDC discovery and JWT iss output instead of the resolved org-specific issuer. Normalize and validate those fields here before persisting the config.

Suggested fix
 func (f *NICoServerImpl) SetIdentityConfiguration(ctx context.Context, req *cwssaws.IdentityConfigRequest) (*cwssaws.IdentityConfigResponse, error) {
 	if req == nil || req.GetOrganizationId() == "" || req.GetConfig() == nil {
 		return nil, status.Errorf(codes.InvalidArgument, "Invalid request argument")
 	}
 	in := req.GetConfig()
 	if strings.TrimSpace(in.GetDefaultAudience()) == "" {
 		return nil, status.Errorf(codes.InvalidArgument, "default_audience is required")
 	}
+	issuer := strings.ReplaceAll(strings.TrimSpace(in.GetIssuer()), "{org}", req.GetOrganizationId())
+	if issuer == "" {
+		return nil, status.Errorf(codes.InvalidArgument, "issuer is required")
+	}
+	if in.GetTokenTtlSec() == 0 {
+		return nil, status.Errorf(codes.InvalidArgument, "token_ttl_sec is required")
+	}
 	allowed, err := normalizeAllowedAudiences(in.GetDefaultAudience(), in.GetAllowedAudiences())
 	if err != nil {
 		return nil, status.Errorf(codes.InvalidArgument, "%s", err.Error())
 	}
 
 	orgID := req.GetOrganizationId()
 	now := timestamppb.Now()
@@
 		OrganizationId: orgID,
 		Config: &cwssaws.IdentityConfig{
 			Enabled:          in.GetEnabled(),
-			Issuer:           in.GetIssuer(),
+			Issuer:           issuer,
 			DefaultAudience:  in.GetDefaultAudience(),
 			AllowedAudiences: allowed,
 			TokenTtlSec:      in.GetTokenTtlSec(),
 			SubjectPrefix:    in.SubjectPrefix,
 			RotateKey:        false,
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func (f *NICoServerImpl) SetIdentityConfiguration(ctx context.Context, req *cwssaws.IdentityConfigRequest) (*cwssaws.IdentityConfigResponse, error) {
if req == nil || req.GetOrganizationId() == "" || req.GetConfig() == nil {
return nil, status.Errorf(codes.InvalidArgument, "Invalid request argument")
}
in := req.GetConfig()
if strings.TrimSpace(in.GetDefaultAudience()) == "" {
return nil, status.Errorf(codes.InvalidArgument, "default_audience is required")
}
allowed, err := normalizeAllowedAudiences(in.GetDefaultAudience(), in.GetAllowedAudiences())
if err != nil {
return nil, status.Errorf(codes.InvalidArgument, "%s", err.Error())
}
orgID := req.GetOrganizationId()
now := timestamppb.Now()
existing, isUpdate := f.identityConfigs[orgID]
resp := &cwssaws.IdentityConfigResponse{
OrganizationId: orgID,
Config: &cwssaws.IdentityConfig{
Enabled: in.GetEnabled(),
Issuer: in.GetIssuer(),
DefaultAudience: in.GetDefaultAudience(),
AllowedAudiences: allowed,
TokenTtlSec: in.GetTokenTtlSec(),
SubjectPrefix: in.SubjectPrefix,
RotateKey: false,
},
UpdatedAt: now,
}
if isUpdate && !in.GetRotateKey() {
// Update path: keep the existing key + created-at.
resp.KeyId = existing.GetKeyId()
resp.CreatedAt = existing.GetCreatedAt()
} else {
// First-create or rotate-key: generate a fresh ES256 keypair.
newKey, err := generateES256KeyMaterial()
if err != nil {
return nil, status.Errorf(codes.Internal, "failed to generate signing key: %v", err)
}
f.identityKeys[orgID] = newKey
resp.KeyId = newKey.kid
if isUpdate {
resp.CreatedAt = existing.GetCreatedAt()
} else {
resp.CreatedAt = now
}
}
f.identityConfigs[orgID] = resp
return resp, nil
func (f *NICoServerImpl) SetIdentityConfiguration(ctx context.Context, req *cwssaws.IdentityConfigRequest) (*cwssaws.IdentityConfigResponse, error) {
if req == nil || req.GetOrganizationId() == "" || req.GetConfig() == nil {
return nil, status.Errorf(codes.InvalidArgument, "Invalid request argument")
}
in := req.GetConfig()
if strings.TrimSpace(in.GetDefaultAudience()) == "" {
return nil, status.Errorf(codes.InvalidArgument, "default_audience is required")
}
issuer := strings.ReplaceAll(strings.TrimSpace(in.GetIssuer()), "{org}", req.GetOrganizationId())
if issuer == "" {
return nil, status.Errorf(codes.InvalidArgument, "issuer is required")
}
if in.GetTokenTtlSec() == 0 {
return nil, status.Errorf(codes.InvalidArgument, "token_ttl_sec is required")
}
allowed, err := normalizeAllowedAudiences(in.GetDefaultAudience(), in.GetAllowedAudiences())
if err != nil {
return nil, status.Errorf(codes.InvalidArgument, "%s", err.Error())
}
orgID := req.GetOrganizationId()
now := timestamppb.Now()
existing, isUpdate := f.identityConfigs[orgID]
resp := &cwssaws.IdentityConfigResponse{
OrganizationId: orgID,
Config: &cwssaws.IdentityConfig{
Enabled: in.GetEnabled(),
Issuer: issuer,
DefaultAudience: in.GetDefaultAudience(),
AllowedAudiences: allowed,
TokenTtlSec: in.GetTokenTtlSec(),
SubjectPrefix: in.SubjectPrefix,
RotateKey: false,
},
UpdatedAt: now,
}
if isUpdate && !in.GetRotateKey() {
// Update path: keep the existing key + created-at.
resp.KeyId = existing.GetKeyId()
resp.CreatedAt = existing.GetCreatedAt()
} else {
// First-create or rotate-key: generate a fresh ES256 keypair.
newKey, err := generateES256KeyMaterial()
if err != nil {
return nil, status.Errorf(codes.Internal, "failed to generate signing key: %v", err)
}
f.identityKeys[orgID] = newKey
resp.KeyId = newKey.kid
if isUpdate {
resp.CreatedAt = existing.GetCreatedAt()
} else {
resp.CreatedAt = now
}
}
f.identityConfigs[orgID] = resp
return resp, nil
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@site-workflow/pkg/grpc/server/nico_test_server.go` around lines 1423 - 1473,
SetIdentityConfiguration currently accepts empty issuer or zero TokenTtlSec and
persists the issuer template verbatim; add validation and normalization before
persisting: reject empty issuer and TokenTtlSec <= 0 (return InvalidArgument),
and normalize the issuer template (e.g., via a new helper
normalizeIssuerTemplate or validateIssuerTemplate) to ensure it contains the
required "{org}" placeholder and does not leak a literal "{org}" into stored
Issuer; set resp.Issuer to the normalized value (instead of in.GetIssuer()) and
keep the rest of the flow (calls to normalizeAllowedAudiences,
generateES256KeyMaterial, and updates to f.identityConfigs/f.identityKeys)
unchanged.

Comment on lines +1530 to +1539
if basic := in.GetClientSecretBasic(); basic != nil {
if strings.TrimSpace(basic.GetClientId()) == "" {
return nil, status.Errorf(codes.InvalidArgument, "client_id is required for client_secret_basic")
}
resp.AuthMethodConfig = &cwssaws.TokenDelegationResponse_ClientSecretBasic{
ClientSecretBasic: &cwssaws.ClientSecretBasicResponse{
ClientId: basic.GetClientId(),
ClientSecretHash: clientSecretDisplayHash(basic.GetClientSecret()),
},
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Reject blank client_secret_basic.clientSecret in the mock server.

When client_secret_basic is present, this path validates client_id but not client_secret. A blank secret is currently accepted and hashed, so workflow/activity tests can go green on input that production should reject as INVALID_ARGUMENT.

Suggested fix
 	if basic := in.GetClientSecretBasic(); basic != nil {
 		if strings.TrimSpace(basic.GetClientId()) == "" {
 			return nil, status.Errorf(codes.InvalidArgument, "client_id is required for client_secret_basic")
 		}
+		if strings.TrimSpace(basic.GetClientSecret()) == "" {
+			return nil, status.Errorf(codes.InvalidArgument, "client_secret is required for client_secret_basic")
+		}
 		resp.AuthMethodConfig = &cwssaws.TokenDelegationResponse_ClientSecretBasic{
 			ClientSecretBasic: &cwssaws.ClientSecretBasicResponse{
 				ClientId:         basic.GetClientId(),
 				ClientSecretHash: clientSecretDisplayHash(basic.GetClientSecret()),
 			},
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if basic := in.GetClientSecretBasic(); basic != nil {
if strings.TrimSpace(basic.GetClientId()) == "" {
return nil, status.Errorf(codes.InvalidArgument, "client_id is required for client_secret_basic")
}
resp.AuthMethodConfig = &cwssaws.TokenDelegationResponse_ClientSecretBasic{
ClientSecretBasic: &cwssaws.ClientSecretBasicResponse{
ClientId: basic.GetClientId(),
ClientSecretHash: clientSecretDisplayHash(basic.GetClientSecret()),
},
}
if basic := in.GetClientSecretBasic(); basic != nil {
if strings.TrimSpace(basic.GetClientId()) == "" {
return nil, status.Errorf(codes.InvalidArgument, "client_id is required for client_secret_basic")
}
if strings.TrimSpace(basic.GetClientSecret()) == "" {
return nil, status.Errorf(codes.InvalidArgument, "client_secret is required for client_secret_basic")
}
resp.AuthMethodConfig = &cwssaws.TokenDelegationResponse_ClientSecretBasic{
ClientSecretBasic: &cwssaws.ClientSecretBasicResponse{
ClientId: basic.GetClientId(),
ClientSecretHash: clientSecretDisplayHash(basic.GetClientSecret()),
},
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@site-workflow/pkg/grpc/server/nico_test_server.go` around lines 1530 - 1539,
The mock server path handling in the branch where in.GetClientSecretBasic() !=
nil validates client_id but fails to reject an empty client_secret; add a
validation that checks strings.TrimSpace(basic.GetClientSecret()) and if empty
return status.Errorf(codes.InvalidArgument, "client_secret is required for
client_secret_basic"); keep the existing response construction
(cwssaws.TokenDelegationResponse_ClientSecretBasic and
cwssaws.ClientSecretBasicResponse) and continue to use
clientSecretDisplayHash(basic.GetClientSecret()) only after the new non-empty
check passes.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 7, 2026

🔍 Container Scan Summary

Service Total Critical High Medium Low Other
nico-nsm 64 2 20 33 9 0
nico-psm 56 4 29 13 2 8
nico-rest-api 57 4 30 13 2 8
nico-rest-cert-manager 54 4 28 13 1 8
nico-rest-db 55 4 28 13 2 8
nico-rest-site-agent 54 4 28 13 1 8
nico-rest-site-manager 54 4 28 13 1 8
nico-rest-workflow 56 4 29 13 2 8
nico-rla 55 4 28 13 2 8
TOTAL 505 34 248 137 22 64

Per-CVE detail lives in the per-service grype-* artifacts (JSON + SARIF). Severity counts only — no CVE IDs published here.

…wasn't a requirement by SDD nor required by the standards

Signed-off-by: Parham Armani <parmani@nvidia.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@openapi/spec.yaml`:
- Around line 21592-21599: The JWKS schema currently defines the keys property
but does not mark it required; update the JWKS schema (symbol: JWKS) to require
the keys property so that generated clients/validators get {"keys": []} when
empty—add a required: ["keys"] entry under the JWKS object and ensure the
existing keys property remains an array (items as currently defined) so the
contract stays consistent.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ab7d0f91-f688-411a-9141-1c1aa44184f2

📥 Commits

Reviewing files that changed from the base of the PR and between 9e428ea and 6fe35e5.

📒 Files selected for processing (2)
  • docs/index.html
  • openapi/spec.yaml

Comment thread openapi/spec.yaml
Comment on lines +21592 to +21599
JWKS:
title: JWKS
description: RFC 7517 JSON Web Key Set. Pass-through from the site controller.
type: object
properties:
keys:
type: array
items:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Require JWKS.keys to match the documented constant-shape response.

At Line [21597], keys is defined but not required. The path docs promise {"keys":[]} when empty, so the schema should require keys to keep generated clients and validators aligned with the contract.

Proposed fix
     JWKS:
       title: JWKS
       description: RFC 7517 JSON Web Key Set. Pass-through from the site controller.
       type: object
+      required:
+        - keys
       properties:
         keys:
           type: array

As per coding guidelines openapi/spec.yaml: Review the OpenAPI specification, check for consistency and correctness.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
JWKS:
title: JWKS
description: RFC 7517 JSON Web Key Set. Pass-through from the site controller.
type: object
properties:
keys:
type: array
items:
JWKS:
title: JWKS
description: RFC 7517 JSON Web Key Set. Pass-through from the site controller.
type: object
required:
- keys
properties:
keys:
type: array
items:
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openapi/spec.yaml` around lines 21592 - 21599, The JWKS schema currently
defines the keys property but does not mark it required; update the JWKS schema
(symbol: JWKS) to require the keys property so that generated clients/validators
get {"keys": []} when empty—add a required: ["keys"] entry under the JWKS object
and ensure the existing keys property remains an array (items as currently
defined) so the contract stays consistent.

Signed-off-by: Parham Armani <parmani@nvidia.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/pkg/api/handler/identity.go`:
- Around line 127-131: The code currently treats any error from
common.GetSiteFromIDString as a client/absent condition; change it to check for
the repository sentinel cdb.ErrDoesNotExist (using direct equality) and preserve
the existing 400/404-style response for that case, but for any other non-nil err
return a 500-style response (use http.StatusInternalServerError via
cutil.NewAPIErrorResponse), and ensure the logger records the full error and
context; apply the same change to both call sites that use GetSiteFromIDString
(the blocks that set site, err and return with logger, span, ctx,
cutil.NewAPIErrorResponse).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 6b4a87aa-4907-40b1-9557-4450a9bd6f18

📥 Commits

Reviewing files that changed from the base of the PR and between 6fe35e5 and 5ef3852.

📒 Files selected for processing (2)
  • api/pkg/api/handler/identity.go
  • api/pkg/api/handler/identity_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • api/pkg/api/handler/identity_test.go

Comment on lines +127 to +131
site, err = common.GetSiteFromIDString(ctx, nil, siteID, dbSession)
if err != nil {
logger.Warn().Err(err).Str("Site ID", siteID).Msg("error getting site from request")
return "", nil, nil, span, logger, ctx, cutil.NewAPIErrorResponse(c, http.StatusBadRequest, "Error retrieving Site in request", nil)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Differentiate “site absent” from real site-lookup failures.

Line 127-131 and Line 170-174 treat every GetSiteFromIDString failure as client/absent state. That can mask DB/backend errors as 400, 404, or empty JWKS, which is misleading during outages.

💡 Suggested fix
@@
-	site, err = common.GetSiteFromIDString(ctx, nil, siteID, dbSession)
-	if err != nil {
-		logger.Warn().Err(err).Str("Site ID", siteID).Msg("error getting site from request")
-		return "", nil, nil, span, logger, ctx, cutil.NewAPIErrorResponse(c, http.StatusBadRequest, "Error retrieving Site in request", nil)
-	}
+	site, err = common.GetSiteFromIDString(ctx, nil, siteID, dbSession)
+	if err != nil {
+		if err == cdb.ErrDoesNotExist {
+			logger.Warn().Err(err).Str("Site ID", siteID).Msg("site not found")
+			return "", nil, nil, span, logger, ctx, cutil.NewAPIErrorResponse(c, http.StatusBadRequest, "Error retrieving Site in request", nil)
+		}
+		logger.Error().Err(err).Str("Site ID", siteID).Msg("failed to retrieve Site")
+		return "", nil, nil, span, logger, ctx, cutil.NewAPIErrorResponse(c, http.StatusInternalServerError, "Failed to retrieve Site", nil)
+	}
@@
-	site, err := common.GetSiteFromIDString(ctx, nil, siteID, dbSession)
-	if err != nil {
-		logger.Warn().Err(err).Msg("error getting Site from request")
-		return "", nil, nil, logger, ctx, errPublicIdentityAbsent
-	}
+	site, err := common.GetSiteFromIDString(ctx, nil, siteID, dbSession)
+	if err != nil {
+		if err == cdb.ErrDoesNotExist {
+			logger.Warn().Err(err).Msg("site not found in public identity lookup")
+			return "", nil, nil, logger, ctx, errPublicIdentityAbsent
+		}
+		logger.Error().Err(err).Msg("failed to retrieve Site for public identity lookup")
+		return "", nil, nil, logger, ctx, cutil.NewAPIErrorResponse(c, http.StatusInternalServerError, "Failed to retrieve Site", nil)
+	}

Based on learnings: cdb.ErrDoesNotExist is a non-wrapped DB sentinel in this repository, so direct equality checks are intentional and safe.

Also applies to: 170-174

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/pkg/api/handler/identity.go` around lines 127 - 131, The code currently
treats any error from common.GetSiteFromIDString as a client/absent condition;
change it to check for the repository sentinel cdb.ErrDoesNotExist (using direct
equality) and preserve the existing 400/404-style response for that case, but
for any other non-nil err return a 500-style response (use
http.StatusInternalServerError via cutil.NewAPIErrorResponse), and ensure the
logger records the full error and context; apply the same change to both call
sites that use GetSiteFromIDString (the blocks that set site, err and return
with logger, span, ctx, cutil.NewAPIErrorResponse).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant