
refactor: Migrate SSH Key Group API handlers to WithTx#499

Open
chet wants to merge 1 commit into NVIDIA:main from chet:with-tx-sshkeygroup

@chet chet commented May 6, 2026

Description

Apply the WithTx pattern from #462 to the SSH Key Group API handlers.

Takes into consideration a few previous code reviews for integrating these, ensuring:

  • We split assignment from error condition checking (thanks @thossain-nv).
  • We use the TerminateWorkflowOnTimeOut helper and not duplicate code (thanks @thossain-nv).
  • We make sure we're consistently using outer-scope vars with WithTx and not a weird mix of WithTx and WithTxResult.
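
As a rough illustration of the points above, here is a minimal, self-contained sketch. It is not the repository's code: `withTx`, `fakeTx`, and `fakeSession` are hypothetical in-memory stand-ins for `cdb.WithTx` and its session/transaction types, whose real signatures are not shown in this description. It reads "split assignment from error condition checking" as keeping the call and the `if err != nil` check on separate statements, which is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeSession and fakeTx are hypothetical in-memory stand-ins for a DB
// session and transaction; they exist only to make the sketch runnable.
type fakeSession struct {
	committed []string
}

type fakeTx struct {
	pending []string
}

// insert stages a row in the transaction and returns its ID.
func (tx *fakeTx) insert(name string) (string, error) {
	if name == "" {
		return "", errors.New("name must not be empty")
	}
	tx.pending = append(tx.pending, name)
	return name, nil
}

// withTx runs fn inside a "transaction": staged writes are committed only if
// fn returns nil and are dropped (rolled back) otherwise.
func withTx(s *fakeSession, fn func(tx *fakeTx) error) error {
	tx := &fakeTx{}
	err := fn(tx)
	if err != nil {
		return err // rollback: pending writes are discarded
	}
	s.committed = append(s.committed, tx.pending...)
	return nil
}

// createGroup captures the result in an outer-scope variable and keeps each
// assignment separate from its error check.
func createGroup(s *fakeSession, name string) (string, error) {
	var created string // outer-scope result, set inside the callback

	err := withTx(s, func(tx *fakeTx) error {
		id, ierr := tx.insert(name)
		if ierr != nil {
			return ierr
		}
		created = id
		return nil
	})
	if err != nil {
		return "", err
	}
	return created, nil
}

func main() {
	s := &fakeSession{}

	id, err := createGroup(s, "ops-keys")
	fmt.Println(id, err, s.committed)

	_, err = createGroup(s, "")
	fmt.Println(err, s.committed)
}
```

The callback only returns an error; everything the caller needs after commit travels through the outer-scope variable, so there is a single calling convention instead of a mix of result-returning and non-result-returning helpers.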

Signed-off-by: Chet Nichols III <chetn@nvidia.com>

Type of Change

  • Feature - New feature or functionality (feat:)
  • Fix - Bug fixes (fix:)
  • Chore - Modification or removal of existing functionality (chore:)
  • Refactor - Refactoring of existing functionality (refactor:)
  • Docs - Changes in documentation or OpenAPI schema (docs:)
  • CI - Changes in GitHub workflows. Requires additional scrutiny (ci:)
  • Version - Issuing a new release version (version:)

Services Affected

  • API - API models or endpoints updated
  • Workflow - Workflow service updated
  • DB - DB DAOs or migrations updated
  • Site Manager - Site Manager updated
  • Cert Manager - Cert Manager updated
  • Site Agent - Site Agent updated
  • RLA - RLA service updated
  • Powershelf Manager - Powershelf Manager updated
  • NVSwitch Manager - NVSwitch Manager updated

Related Issues (Optional)

Breaking Changes

  • This PR contains breaking changes

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing performed
  • No testing required (docs, internal refactor, etc.)

Additional Notes

@chet chet requested a review from a team as a code owner May 6, 2026 20:59

coderabbitai Bot commented May 6, 2026

Review Change Stack

Summary by CodeRabbit

  • Refactor
    • SSH Key Group create/update/delete flows rewritten to use atomic transaction handling and advisory locking for safer concurrency.
  • Bug Fixes
    • Fixed race conditions and stale-version conflicts; improved validation of tenant-scoped site and SSH key associations; corrected association state transitions.
  • Improvements
    • Responses now include richer, up-to-date status details and version updates; create/update/delete reliably trigger post-commit sync/delete workflows and immediate removal when no site associations exist.
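
The advisory locking mentioned above needs a numeric key, since PostgreSQL advisory lock functions take a 64-bit integer. The handler derives it with `cdb.GetAdvisoryLockIDFromString`, whose implementation is not shown in this PR. The sketch below is only one plausible way to do such a mapping; `lockIDFromString` and the use of FNV-1a are assumptions for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// lockIDFromString maps an arbitrary string (for example a UUID) to a signed
// 64-bit key. Hypothetical helper: hashing with FNV-1a is an assumption, not
// necessarily what the repository does.
func lockIDFromString(s string) int64 {
	h := fnv.New64a()
	h.Write([]byte(s)) // Write on a hash.Hash never returns an error
	return int64(h.Sum64())
}

func main() {
	groupID := "5b0e2b9a-7f1c-4c3e-9d55-0a1b2c3d4e5f" // made-up ID
	fmt.Println(lockIDFromString(groupID))
}
```

Any deterministic mapping works as long as every writer derives the key the same way; distinct strings can in principle hash to the same key, which only causes extra serialization, not incorrect results.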

Walkthrough

Replaces manual database/sql transaction handling with cdb.WithTx across Create, Update, and Delete SSH Key Group handlers; adds advisory locking inside transactions, richer validation and association state updates, status detail rows and version management inside TX, and moves workflow triggers for sync/delete to post-commit execution.
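
The post-commit part of this walkthrough can be sketched as follows. This is a simplified model under stated assumptions: `runInTx` is a hypothetical stand-in for the transactional helper, and `trigger` stands in for whatever starts a sync or delete workflow. The point it shows is that targets are collected inside the transaction but nothing is triggered unless the commit succeeded.

```go
package main

import (
	"errors"
	"fmt"
)

// runInTx is a hypothetical stand-in for a transactional helper: a nil
// return from fn means "committed", an error means "rolled back".
func runInTx(fn func() error) error {
	return fn()
}

// updateAndNotify collects workflow targets inside the transaction and fires
// the triggers only after the transaction has committed.
func updateAndNotify(siteIDs []string, failTx bool, trigger func(siteID string)) error {
	var targets []string // outer-scope list filled inside the transaction

	err := runInTx(func() error {
		for _, id := range siteIDs {
			// A real handler would write association rows here.
			targets = append(targets, id)
		}
		if failTx {
			return errors.New("simulated DB failure")
		}
		return nil
	})
	if err != nil {
		return err // rolled back: no workflow is triggered
	}

	for _, id := range targets {
		trigger(id)
	}
	return nil
}

func main() {
	trigger := func(siteID string) { fmt.Println("trigger workflow for", siteID) }

	fmt.Println(updateAndNotify([]string{"site-a", "site-b"}, false, trigger))
	fmt.Println(updateAndNotify([]string{"site-c"}, true, trigger))
}
```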

Changes

SSH Key Group Handler Refactor

| Layer | File(s) | Summary |
|---|---|---|
| Transaction Pattern | api/pkg/api/handler/sshkeygroup.go (imports, ~line 20) | Removed direct database/sql transaction use; handlers now use cdb.WithTx transactional callbacks. |
| Create (In-TX creation & status) | api/pkg/api/handler/sshkeygroup.go (Create: ~171–357) | Tenant-scoped site and SSH key validations, insert SSH Key Group, create associations and status-detail rows, generate version, and conditionally mark Synced when no site associations exist, all inside the transaction. |
| Create (Post-commit) | api/pkg/api/handler/sshkeygroup.go (post-Tx ~360) | Trigger sync workflows for created/loaded site associations and build API response from committed state. |
| Update (Locked TX & patch diffing) | api/pkg/api/handler/sshkeygroup.go (Update: ~525–880) | Run update mutations inside cdb.WithTx with advisory lock, re-read/version-check, compute added/removed/updated sets for site and SSH-key associations, update statuses/status-details, regenerate version when needed, and compute per-association workflow targets for post-commit execution. |
| Delete (Locked TX & conditional immediate delete) | api/pkg/api/handler/sshkeygroup.go (Delete: ~1555–1644) | Set group status to Deleting and write deleting status details inside TX; load associations and if none remain, delete group and SSH-key associations immediately; otherwise collect targets for post-commit deletion workflows. |
| Delete (Post-commit) | api/pkg/api/handler/sshkeygroup.go (post-Tx ~1646) | After commit, trigger delete workflows for collected association targets and return accepted response reflecting committed state. |
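
The Delete rows above describe a two-way decision made inside the transaction. A minimal sketch of that decision, using hypothetical types (`association`, `deletePlan`) rather than the handler's real models:

```go
package main

import "fmt"

// association is a hypothetical, trimmed-down site association.
type association struct {
	SiteID string
}

// deletePlan describes what the Delete handler should do after inspecting
// the remaining site associations.
type deletePlan struct {
	DeleteImmediately bool          // no sites left: remove the group in the transaction
	WorkflowTargets   []association // otherwise: delete workflows to start after commit
}

// planDelete decides between immediate removal and deferred, per-site
// delete workflows.
func planDelete(assocs []association) deletePlan {
	if len(assocs) == 0 {
		return deletePlan{DeleteImmediately: true}
	}
	targets := make([]association, len(assocs))
	copy(targets, assocs)
	return deletePlan{WorkflowTargets: targets}
}

func main() {
	fmt.Printf("%+v\n", planDelete(nil))
	fmt.Printf("%+v\n", planDelete([]association{{SiteID: "site-a"}}))
}
```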

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the main refactoring effort: migrating SSH Key Group handlers to use the WithTx pattern for transaction management. |
| Description check | ✅ Passed | The description is directly related to the changeset, clearly explaining the WithTx pattern application, prior review feedback incorporation, and technical considerations addressed. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |



github-actions Bot commented May 6, 2026

🔐 TruffleHog Secret Scan

No secrets or credentials found!

Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉

🔗 View scan details

🕐 Last updated: 2026-05-06 21:01:29 UTC | Commit: c403074


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
api/pkg/api/handler/sshkeygroup.go (1)

498-501: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Move the version check under the advisory lock.

The optimistic concurrency guard is currently evaluated on a pre-transaction snapshot. Two concurrent PATCHes can both pass Line 498, then serialize on the lock and still apply stale writes. Reload the SSH Key Group after Line 543 and compare Version inside the locked transaction before mutating anything.

Suggested fix
-	// Verify version with current one
-	if *skg.Version != *apiRequest.Version {
-		return cutil.NewAPIErrorResponse(c, http.StatusForbidden, "Version for SSH Key Group in request does not match with current SSH Key Group. Please fetch latest object before updating.", nil)
-	}
-
 	skgDAO := cdbm.NewSSHKeyGroupDAO(uskgh.dbSession)
@@
 	err = cdb.WithTx(ctx, uskgh.dbSession, func(tx *cdb.Tx) error {
 		// Acquire an advisory lock on the SSH Key Group on which there could be contention
 		// this lock is released when the transaction commits or rollsback
 		derr := tx.TryAcquireAdvisoryLock(ctx, cdb.GetAdvisoryLockIDFromString(skg.ID.String()), nil)
 		if derr != nil {
 			logger.Error().Err(derr).Msg("Failed to acquire advisory lock on SSH Key Group")
 			return cutil.NewAPIError(http.StatusInternalServerError, "Failed to update SSH Key Group, could not acquire DB lock", nil)
 		}
+
+		lockedSKG, derr := common.GetSSHKeyGroupFromIDString(ctx, tx, sshKeyGroupStrID, uskgh.dbSession, nil)
+		if derr != nil {
+			logger.Error().Err(derr).Msg("error retrieving SSH Key Group from DB by ID")
+			return cutil.NewAPIError(http.StatusInternalServerError, "Failed to update SSH Key Group, DB error", nil)
+		}
+		if *lockedSKG.Version != *apiRequest.Version {
+			return cutil.NewAPIError(http.StatusForbidden, "Version for SSH Key Group in request does not match with current SSH Key Group. Please fetch latest object before updating.", nil)
+		}
+		skg = lockedSKG

Also applies to: 540-547
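
To see why the check has to sit under the lock, here is a small in-memory model. It is a sketch under stated assumptions: `store` is a hypothetical stand-in for the SSH Key Group row, and a `sync.Mutex` plays the role of the advisory lock. Two concurrent updates carry the same request version; because the version is compared after the lock is taken, exactly one of them is accepted.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errStaleVersion = errors.New("version in request does not match current version")

// store is a hypothetical in-memory stand-in for the SSH Key Group row; the
// mutex plays the role of the advisory lock.
type store struct {
	mu      sync.Mutex
	version int
	name    string
}

// update compares the request version with the current one only after the
// lock is held, then mutates and bumps the version.
func (s *store) update(requestVersion int, name string) error {
	s.mu.Lock()
	defer s.mu.Unlock()

	if s.version != requestVersion { // checked against the locked, current state
		return errStaleVersion
	}
	s.name = name
	s.version++
	return nil
}

// raceTwoUpdates sends two concurrent updates that carry the same version
// and returns how many of them were accepted.
func raceTwoUpdates() int {
	s := &store{version: 1}
	results := make(chan error, 2)

	var wg sync.WaitGroup
	for _, name := range []string{"first", "second"} {
		wg.Add(1)
		go func(n string) {
			defer wg.Done()
			results <- s.update(1, n)
		}(name)
	}
	wg.Wait()
	close(results)

	accepted := 0
	for err := range results {
		if err == nil {
			accepted++
		}
	}
	return accepted
}

func main() {
	fmt.Println("accepted updates:", raceTwoUpdates())
}
```

If the comparison were done before `Lock()`, both goroutines could observe version 1, and the second would overwrite the first after waiting for the lock, which is the stale write the review describes.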

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/pkg/api/handler/sshkeygroup.go` around lines 498 - 501, Remove the
optimistic version check currently performed on skg before the advisory lock and
instead, after you acquire the advisory lock and are inside the transaction
(i.e., immediately after the lock acquisition), re-load the SSH Key Group from
the DB (using the same loader used earlier that fetched skg) and compare the
reloaded entity's Version with apiRequest.Version; if they differ return the
same cutil.NewAPIErrorResponse(..., http.StatusForbidden, ...) and abort the
update. Apply this same change to the other update path noted (the code region
currently around the other version check), so all version checks happen inside
the locked transaction against a freshly reloaded SSH Key Group before any
mutation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/pkg/api/handler/sshkeygroup.go`:
- Around line 785-827: The response is built from dbskgsd which is fetched
before the syncRequired branch inserts a new status-detail; after creating the
new SSHKeyGroupStatusSyncing entry (sdDAO.CreateFromParams) you must re-query
status details into dbskgsd (call sdDAO.GetAllByEntityID with the same params
used earlier) so the response includes the newly created status-detail; place
this refresh immediately after the CreateFromParams success path (and keep
existing error handling).
- Around line 574-576: The code only loads existing associations when
apiRequest.SiteIDs or apiRequest.SSHKeyIDs are present, causing partial PATCHes
to miss the other side's current associations; always fetch both sets up front:
call skgsaDAO.GetAll(...) to populate existingSiteAssociationIDMap and
skgkaDAO.GetAll(...) to populate existingKeyAssociationIDMap before any
apiRequest field checks, then use apiRequest.SiteIDs / apiRequest.SSHKeyIDs only
to decide adds/removals and to set Syncing and syncRequired accordingly (leave
the rest of the sync/state logic unchanged).
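
The second comment boils down to: load the existing associations unconditionally, then let the request fields decide only what is added or removed. A minimal sketch of that diffing step, with a hypothetical helper `diffIDs` and plain string IDs in place of the handler's real types; a nil request slice is treated as "field not present in the PATCH":

```go
package main

import (
	"fmt"
	"sort"
)

// diffIDs compares the currently stored IDs with the requested IDs.
// A nil requested slice means the field was absent from the PATCH, so
// nothing is added or removed for that side.
func diffIDs(existing map[string]bool, requested []string) (added, removed []string) {
	if requested == nil {
		return nil, nil
	}

	want := make(map[string]bool, len(requested))
	for _, id := range requested {
		if want[id] {
			continue // ignore duplicates in the request
		}
		want[id] = true
		if !existing[id] {
			added = append(added, id)
		}
	}
	for id := range existing {
		if !want[id] {
			removed = append(removed, id)
		}
	}

	// Sort for deterministic output; map iteration order is random.
	sort.Strings(added)
	sort.Strings(removed)
	return added, removed
}

func main() {
	existing := map[string]bool{"site-a": true, "site-b": true}

	added, removed := diffIDs(existing, []string{"site-b", "site-c"})
	fmt.Println(added, removed)

	added, removed = diffIDs(existing, nil) // key-only PATCH: sites untouched
	fmt.Println(added, removed)
}
```

Because `existing` is always populated, a key-only PATCH still knows which site associations exist and can mark them for syncing.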

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 6b34d897-2de1-4c0d-9ab4-b97df6202781

📥 Commits

Reviewing files that changed from the base of the PR and between ea1ce0f and c403074.

📒 Files selected for processing (1)
  • api/pkg/api/handler/sshkeygroup.go

Comment thread api/pkg/api/handler/sshkeygroup.go Outdated
Comment thread api/pkg/api/handler/sshkeygroup.go

github-actions Bot commented May 6, 2026

🔍 Container Scan Summary

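| Service | Total | Critical | High | Medium | Low | Other |
|---|---|---|---|---|---|---|
| nico-nsm | 64 | 2 | 20 | 33 | 9 | 0 |
| nico-psm | 56 | 4 | 29 | 13 | 2 | 8 |
| nico-rest-api | 57 | 4 | 30 | 13 | 2 | 8 |
| nico-rest-cert-manager | 54 | 4 | 28 | 13 | 1 | 8 |
| nico-rest-db | 55 | 4 | 28 | 13 | 2 | 8 |
| nico-rest-site-agent | 54 | 4 | 28 | 13 | 1 | 8 |
| nico-rest-site-manager | 54 | 4 | 28 | 13 | 1 | 8 |
| nico-rest-workflow | 56 | 4 | 29 | 13 | 2 | 8 |
| nico-rla | 55 | 4 | 28 | 13 | 2 | 8 |
| TOTAL | 505 | 34 | 248 | 137 | 22 | 64 |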

Per-CVE detail lives in the per-service grype-* artifacts (JSON + SARIF). Severity counts only — no CVE IDs published here.

@chet chet force-pushed the with-tx-sshkeygroup branch from c403074 to 0856106 Compare May 6, 2026 21:18

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/pkg/api/handler/sshkeygroup.go`:
- Around line 540-547: The optimistic version check must be repeated inside the
transaction after acquiring the advisory lock to avoid lost updates: after
tx.TryAcquireAdvisoryLock (inside the cdb.WithTx callback) re-read the current
SSH Key Group row/version (use the same store/DB accessor used elsewhere in this
handler) and compare its version to the incoming request's expected version; if
they differ return an API error (e.g. via cutil.NewAPIError) and abort the
transaction so stale updates are rejected. Ensure you reference the same skg.ID
used for the lock and return the same status as the existing version-mismatch
check (http.StatusForbidden in the earlier suggested fix, rather than HTTP 409).
- Around line 737-749: The loop that flips existingSiteAssociationIDMap entries
to Syncing must skip associations already in the deleting state: inside the for
stID, sga := range existingSiteAssociationIDMap loop (before calling
skgsaDAO.UpdateFromParams), add a guard that if sga.Status ==
cdbm.SSHKeyGroupSiteAssociationStatusDeleting then continue (do not call
UpdateFromParams); this prevents re-activating entries during key-only PATCHes
(when apiRequest.SiteIDs == nil) and ensures only non-deleting associations are
transitioned to cdbm.SSHKeyGroupSiteAssociationStatusSyncing.
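
The guard described in the second comment can be sketched like this. The types and status constants below are hypothetical stand-ins for `cdbm.SSHKeyGroupSiteAssociation` and its status values, and the in-place status change stands in for the `skgsaDAO.UpdateFromParams` call.

```go
package main

import "fmt"

type status string

// Hypothetical status values mirroring the ones named in the review.
const (
	statusSynced   status = "Synced"
	statusSyncing  status = "Syncing"
	statusDeleting status = "Deleting"
)

// siteAssociation is a trimmed-down stand-in for the real association model.
type siteAssociation struct {
	SiteID string
	Status status
}

// markSyncing flips every association to Syncing except those already in
// Deleting, and returns how many were updated.
func markSyncing(assocs map[string]*siteAssociation) int {
	updated := 0
	for _, a := range assocs {
		if a.Status == statusDeleting {
			continue // never re-activate an association that is being removed
		}
		a.Status = statusSyncing
		updated++
	}
	return updated
}

func main() {
	assocs := map[string]*siteAssociation{
		"site-a": {SiteID: "site-a", Status: statusSynced},
		"site-b": {SiteID: "site-b", Status: statusDeleting},
	}
	fmt.Println("updated:", markSyncing(assocs))
	fmt.Println(assocs["site-a"].Status, assocs["site-b"].Status)
}
```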

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e6479dc1-e771-488e-9bb6-5b5e31534286

📥 Commits

Reviewing files that changed from the base of the PR and between c403074 and 0856106.

📒 Files selected for processing (1)
  • api/pkg/api/handler/sshkeygroup.go

Comment thread api/pkg/api/handler/sshkeygroup.go
Comment thread api/pkg/api/handler/sshkeygroup.go
@chet chet force-pushed the with-tx-sshkeygroup branch from 0856106 to dc75c14 Compare May 7, 2026 04:46

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
api/pkg/api/handler/sshkeygroup.go (2)

662-662: 💤 Low value

Omit unnecessary blank identifier in range expression.

The blank identifier _ is unnecessary when only the key is used; idiomatic Go omits it.

♻️ Suggested refinement
-		for sgaID, _ := range deletingSiteAssociationIDMap {
+		for sgaID := range deletingSiteAssociationIDMap {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/pkg/api/handler/sshkeygroup.go` at line 662, Replace the range loop that
uses an unnecessary blank identifier over deletingSiteAssociationIDMap
(currently written as "for sgaID, _ := range deletingSiteAssociationIDMap") with
the idiomatic form "for sgaID := range deletingSiteAssociationIDMap" in the SSH
key group deletion logic (refer to the loop using deletingSiteAssociationIDMap
in sshkeygroup.go) so the blank identifier is removed while preserving sgaID
usage.

612-612: 💤 Low value

Omit unnecessary blank identifier in range expression.

The blank identifier _ is superfluous when only the key is required; idiomatic Go uses for stID := range.

♻️ Suggested refinement
-		for stID, _ := range newSiteAssociationIDMap {
+		for stID := range newSiteAssociationIDMap {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/pkg/api/handler/sshkeygroup.go` at line 612, The range loop over
newSiteAssociationIDMap uses an unnecessary blank identifier; update the loop in
sshkeygroup.go to use the idiomatic form by iterating with "for stID := range
newSiteAssociationIDMap" (replace "for stID, _ := range
newSiteAssociationIDMap") so only the key is captured; ensure any code inside
the loop still references stID as before.
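
For reference, a tiny standalone example of the key-only range form that both nitpicks ask for. The map name and contents are made up; the two spellings behave identically, and `gofmt -s` rewrites the first into the second.

```go
package main

import (
	"fmt"
	"sort"
)

// keysOf returns the map's keys in sorted order, using the key-only form
// of range ("for k := range m" rather than "for k, _ := range m").
func keysOf(m map[string]bool) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map iteration order is not deterministic
	return keys
}

func main() {
	newSites := map[string]bool{"site-b": true, "site-a": true}
	fmt.Println(keysOf(newSites))
}
```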
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/pkg/api/handler/sshkeygroup.go`:
- Around line 1562-1569: The error message inside the transaction advisory lock
failure is incorrect for the Delete flow; update the cutil.NewAPIError call in
the cdb.WithTx block (where tx.TryAcquireAdvisoryLock is called) to reference
"delete" instead of "update" (e.g., "Failed to delete SSH Key Group, could not
acquire data store lock on Group") so the API error and logs accurately reflect
the Delete handler; keep the rest of the logic (logger.Error() and error return)
unchanged.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b99a8fe4-9cb8-4035-b12d-d221c55f60e0

📥 Commits

Reviewing files that changed from the base of the PR and between 0856106 and dc75c14.

📒 Files selected for processing (1)
  • api/pkg/api/handler/sshkeygroup.go

Comment thread api/pkg/api/handler/sshkeygroup.go
Apply the WithTx pattern from NVIDIA#462 to the SSH Key Group API handlers.

Takes into consideration a few previous code reviews for integrating these, ensuring:
- We split assignment from error condition checking (thanks @thossain-nv).
- We use the `TerminateWorkflowOnTimeOut` helper and not duplicate code (thanks @thossain-nv).
- We make sure we're consistently using outer-scope vars with `WithTx` and not a weird mix of `WithTx` and `WithTxResult`.

Signed-off-by: Chet Nichols III <chetn@nvidia.com>
@chet chet force-pushed the with-tx-sshkeygroup branch from dc75c14 to bee9493 Compare May 7, 2026 06:06

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
api/pkg/api/handler/sshkeygroup.go (1)

1560-1560: 💤 Low value

Variable name skgsasToSync is misleading in the Delete handler context.

This variable holds associations that will trigger ExecuteDeleteSSHKeyGroupWorkflow (line 1649), not sync workflows. Renaming to skgsasToDelete would improve clarity and align with the variable naming in the Update handler.

♻️ Suggested rename for clarity
-	var skgsasToSync []cdbm.SSHKeyGroupSiteAssociation
+	var skgsasToDelete []cdbm.SSHKeyGroupSiteAssociation

Then update references at lines 1591, 1598, 1617, and 1647.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/pkg/api/handler/sshkeygroup.go` at line 1560, Rename the misleading
variable skgsasToSync in the Delete handler to skgsasToDelete to reflect that
these associations will be deleted (used to trigger
ExecuteDeleteSSHKeyGroupWorkflow), and update every reference to that variable
in the Delete handler (all places where it’s declared, appended to, iterated
over, and passed into ExecuteDeleteSSHKeyGroupWorkflow). Ensure the new name is
used consistently and mirrors the naming used in the Update handler to improve
clarity.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/pkg/api/handler/sshkeygroup.go`:
- Around line 854-874: Add a clarifying comment in the block that branches on
newSSHKeyIDMap/deletingKeyAssociationIDMap (the code iterating dbskgsas and
populating skgsasToSync/skgsasToDelete) explaining that when keys are
added/removed (newSSHKeyIDMap or deletingKeyAssociationIDMap non-empty) delete
workflows are intentionally deferred to the inventory reconciliation process;
note that associations with Status == SSHKeyGroupSiteAssociationStatusDeleting
are persisted earlier but skipped here, so immediate skgsasToDelete population
is skipped and cleanup will occur asynchronously via inventory reconciliation
rather than in this commit path.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 74189f44-aaf1-4a9d-bdfd-57693d0d74bf

📥 Commits

Reviewing files that changed from the base of the PR and between dc75c14 and bee9493.

📒 Files selected for processing (1)
  • api/pkg/api/handler/sshkeygroup.go

Comment on lines +854 to 874
	// Determine workflow targets to trigger after commit
	// If keys are added or removed, trigger workflow to sync SSH Key Group across all Sites, except for the ones that are deleted
	if len(newSSHKeyIDMap) > 0 || len(deletingKeyAssociationIDMap) > 0 {
		for _, skgsa := range dbskgsas {
			if skgsa.Status == cdbm.SSHKeyGroupSiteAssociationStatusDeleting {
				continue
			}

			skgsasToSync = append(skgsasToSync, skgsa)
		}
	} else {
		for _, skgsa := range dbskgsas {
			if newSiteAssociationIDMap[skgsa.SiteID.String()] {
				skgsasToSync = append(skgsasToSync, skgsa)
			}

			if deletingSiteAssociationIDMap[skgsa.ID] {
				skgsasToDelete = append(skgsasToDelete, skgsa)
			}
		}
	}

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Check if similar handlers trigger delete workflows in mixed-change scenarios
# Expected: Understand the pattern used elsewhere for delete workflow triggering

rg -nP --type=go -C5 'ExecuteDeleteSSHKeyGroupWorkflow|skgsasToDelete' api/pkg/api/handler/

Repository: NVIDIA/infra-controller-rest

Length of output: 3536


🏁 Script executed:

# Check the context around line 663 where associations are marked as Deleting
sed -n '650,680p' api/pkg/api/handler/sshkeygroup.go

Repository: NVIDIA/infra-controller-rest

Length of output: 1467


🏁 Script executed:

# Search for inventory reconciliation or eventual cleanup mechanism
rg -n 'inventory|reconciliation|eventual' api/pkg/api/handler/sshkeygroup.go

Repository: NVIDIA/infra-controller-rest

Length of output: 400


🏁 Script executed:

# Check for any comments or TODOs about delete workflow or deleting associations
rg -n -B2 -A2 'TODO|FIXME|eventually' api/pkg/api/handler/sshkeygroup.go

Repository: NVIDIA/infra-controller-rest

Length of output: 54


🏁 Script executed:

# Get broader context of the PATCH handler to understand the full flow
sed -n '500,550p' api/pkg/api/handler/sshkeygroup.go

Repository: NVIDIA/infra-controller-rest

Length of output: 2378


Add clarifying comment: Delete workflows deferred to inventory reconciliation when keys change and sites are removed simultaneously.

The current control flow defers delete workflow execution to the inventory reconciliation mechanism when newSSHKeyIDMap or deletingKeyAssociationIDMap is non-empty. Associations marked as SSHKeyGroupSiteAssociationStatusDeleting are explicitly persisted to the database (lines 663–680) but skipped from skgsasToDelete population, preventing delete workflow execution in this code path.

While this behavior appears intentional (as evidenced by "unsynced groups will be triggered by inventory" comments elsewhere in the codebase), the deferral strategy should be explicitly documented at lines 854–874 to clarify that cleanup occurs asynchronously via inventory reconciliation rather than immediately. Add a comment explaining this design decision to prevent future maintainers from misinterpreting the skipped delete workflow execution as a bug.
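
A standalone model of the branch under discussion, with the kind of comment the review asks for. This is a sketch, not the handler's code: `assoc` and `planWorkflows` are hypothetical, IDs are plain strings, and the statement that cleanup happens through inventory reconciliation repeats the review's reading of the code rather than something verified here.

```go
package main

import "fmt"

// assoc is a trimmed-down stand-in for a site association row.
type assoc struct {
	ID     string
	SiteID string
	Status string
}

// planWorkflows decides which associations get a sync workflow and which get
// a delete workflow after the transaction commits.
func planWorkflows(all []assoc, keysChanged bool, newSites, deletingAssocs map[string]bool) (toSync, toDelete []assoc) {
	if keysChanged {
		// Keys were added or removed: sync every Site that is not being removed.
		// Associations already marked Deleting are skipped and no delete workflow
		// is collected in this path; per the review, their cleanup is expected to
		// happen later through inventory reconciliation.
		for _, a := range all {
			if a.Status == "Deleting" {
				continue
			}
			toSync = append(toSync, a)
		}
		return toSync, toDelete
	}

	// Only Sites changed: sync the new ones and delete the removed ones.
	for _, a := range all {
		if newSites[a.SiteID] {
			toSync = append(toSync, a)
		}
		if deletingAssocs[a.ID] {
			toDelete = append(toDelete, a)
		}
	}
	return toSync, toDelete
}

func main() {
	all := []assoc{
		{ID: "1", SiteID: "site-a", Status: "Syncing"},
		{ID: "2", SiteID: "site-b", Status: "Deleting"},
	}
	newSites := map[string]bool{"site-a": true}
	deleting := map[string]bool{"2": true}

	s, d := planWorkflows(all, true, newSites, deleting)
	fmt.Println("keys changed:", len(s), "to sync,", len(d), "to delete")

	s, d = planWorkflows(all, false, newSites, deleting)
	fmt.Println("sites only:", len(s), "to sync,", len(d), "to delete")
}
```

With keys changed, the Deleting association produces neither a sync nor a delete target, which is exactly the case the comment is meant to explain to future readers.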

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/pkg/api/handler/sshkeygroup.go` around lines 854 - 874, Add a clarifying
comment in the block that branches on newSSHKeyIDMap/deletingKeyAssociationIDMap
(the code iterating dbskgsas and populating skgsasToSync/skgsasToDelete)
explaining that when keys are added/removed (newSSHKeyIDMap or
deletingKeyAssociationIDMap non-empty) delete workflows are intentionally
deferred to the inventory reconciliation process; note that associations with
Status == SSHKeyGroupSiteAssociationStatusDeleting are persisted earlier but
skipped here, so immediate skgsasToDelete population is skipped and cleanup will
occur asynchronously via inventory reconciliation rather than in this commit
path.

1 participant