Skip to content

refactor: Migrate Infiniband Partition API handlers to WithTx#497

Merged
chet merged 3 commits into
NVIDIA:mainfrom
chet:with-tx-infinibandpartition
May 7, 2026
Merged

refactor: Migrate Infiniband Partition API handlers to WithTx#497
chet merged 3 commits into
NVIDIA:mainfrom
chet:with-tx-infinibandpartition

Conversation

@chet
Copy link
Copy Markdown
Contributor

@chet chet commented May 6, 2026

Description

Apply WithTx from #462 to the Infiniband Partiton handlers, ensuring the split assign/condition is [re]factored correctly, as well as using the workflow termination helper (TerminateWorkflowOnTimeOut).

Coderabbit feedback addressed locally:

  • Also propagates StatusDetail creation failure as an API error in the Delete handler (which matches the Create handler) instead of just logging.

Signed-off-by: Chet Nichols III chetn@nvidia.com

Type of Change

  • Feature - New feature or functionality (feat:)
  • Fix - Bug fixes (fix:)
  • Chore - Modification or removal of existing functionality (chore:)
  • Refactor - Refactoring of existing functionality (refactor:)
  • Docs - Changes in documentation or OpenAPI schema (docs:)
  • CI - Changes in GitHub workflows. Requires additional scrutiny (ci:)
  • Version - Issuing a new release version (version:)

Services Affected

  • API - API models or endpoints updated
  • Workflow - Workflow service updated
  • DB - DB DAOs or migrations updated
  • Site Manager - Site Manager updated
  • Cert Manager - Cert Manager updated
  • Site Agent - Site Agent updated
  • RLA - RLA service updated
  • Powershelf Manager - Powershelf Manager updated
  • NVSwitch Manager - NVSwitch Manager updated

Related Issues (Optional)

Breaking Changes

  • This PR contains breaking changes

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing performed
  • No testing required (docs, internal refactor, etc.)

Additional Notes

Apply `WithTx` from NVIDIA#462 to the Infiniband Partiton handlers, ensuring the split assign/condition is [re]factored correctly, as well as using the workflow termination helper!

Coderabbit feedback addressed locally:
- Propagates `StatusDetail` creation failure as an API error in the Delete handler (which matches the Create handler) instead of just logging.

Signed-off-by: Chet Nichols III <chetn@nvidia.com>
@chet chet requested a review from a team as a code owner May 6, 2026 20:39
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 6, 2026

Walkthrough

The InfiniBandPartition handlers (Create, Update, Delete) are refactored from direct database transactions to a unified workflow-driven, transaction-wrapped pattern. Each handler now uses cdb.WithTx closures, introduces StatusDetail DAO for tracking, orchestrates Temporal site workflows, and manages timeouts through deferred termination logic.

Changes

Workflow-Driven Handler Refactor

Layer / File(s) Summary
Data Access Layer
api/pkg/api/handler/infinibandpartition.go (lines 213–335, 824–915, 1029–1126)
StatusDetail DAO (sdDAO) is initialized within transaction scope to track and persist status transitions alongside InfiniBandPartition record mutations.
Transaction Orchestration
api/pkg/api/handler/infinibandpartition.go (lines 213–335, 824–915, 1029–1126)
cdb.WithTx closures replace raw database/sql transactions. Create, Update, and Delete handlers now bind record mutations and DAO operations within transactional boundaries.
Workflow Initiation & Waiting
api/pkg/api/handler/infinibandpartition.go (lines 213–335, 824–915, 1029–1126)
Temporal site client is retrieved and workflows (CreateInfiniBandPartitionV2, UpdateInfiniBandPartition, DeleteInfiniBandPartitionV2) are started and awaited within transaction context. Status Detail entries are created prior to workflow submission.
Timeout & Termination Handling
api/pkg/api/handler/infinibandpartition.go (lines 213–335, 824–915, 1029–1126)
Timeout responses trigger deferred workflow termination via TerminateWorkflowOnTimeOut, executed outside transaction scope. Error paths distinguish between transient timeouts and operational failures.
Status Aggregation & Response Assembly
api/pkg/api/handler/infinibandpartition.go (lines 213–335, 824–915, 1029–1126)
Post-workflow completion, current status details are collected and aggregated into the API response. Create returns 201 with entity; Update and Delete return 202 (Accepted) pending asynchronous completion.
Import Cleanup
api/pkg/api/handler/infinibandpartition.go (line 21)
Removed database/sql import, reflecting elimination of manual SQL transaction patterns.

Sequence Diagram

sequenceDiagram
    participant Handler as CreateInfiniBandPartition<br/>Handler
    participant TxMgr as Transaction<br/>Manager (WithTx)
    participant DAO as InfiniBandPartition<br/>& StatusDetail DAO
    participant Temporal as Temporal<br/>Workflow Engine
    participant Site as Site<br/>Service

    Handler->>TxMgr: Begin transaction scope
    TxMgr->>DAO: Insert InfiniBandPartition record
    DAO-->>TxMgr: Record created with ID
    TxMgr->>DAO: Insert Status Detail (Pending)
    DAO-->>TxMgr: Status tracked
    TxMgr->>Temporal: Get site client
    Temporal-->>TxMgr: Client ready
    TxMgr->>Temporal: Start CreateInfiniBandPartitionV2 workflow
    Temporal-->>TxMgr: Workflow ID, awaiting completion
    TxMgr->>Temporal: Wait for workflow result (with timeout)
    par Workflow Execution
        Temporal->>Site: Execute provisioning operations
        Site-->>Temporal: Provisioning complete
    end
    alt Workflow Completes
        Temporal-->>TxMgr: Success result
        TxMgr->>DAO: Collect final Status Details
        DAO-->>TxMgr: Status details retrieved
        TxMgr-->>TxMgr: Commit transaction
        TxMgr-->>Handler: Success with created entity & status
        Handler-->>Handler: Return 201 Created
    else Timeout Occurs
        Temporal-->>TxMgr: Timeout
        TxMgr-->>TxMgr: Commit & prepare termination
        TxMgr->>Temporal: Schedule TerminateWorkflow (deferred)
        TxMgr-->>Handler: Accepted (202)
        Handler-->>Handler: Return 202 Accepted
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~70 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main change: migrating InfiniBand Partition API handlers to use the WithTx pattern, which aligns with the refactor's core objective.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The pull request description accurately describes the refactoring scope: applying WithTx pattern to InfiniBand Partition handlers, addressing CodeRabbit feedback regarding StatusDetail error handling in the Delete handler, and integrating workflow termination helpers.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 6, 2026

🔐 TruffleHog Secret Scan

No secrets or credentials found!

Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉

🔗 View scan details

🕐 Last updated: 2026-05-06 20:41:16 UTC | Commit: 90cc9ad

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
api/pkg/api/handler/infinibandpartition.go (2)

244-245: ⚡ Quick win

Redundant pointer allocation and dereference.

*cdb.GetStrPtr(cdbm.InfiniBandPartitionStatusPending) allocates a string pointer on the heap only to immediately dereference it. Use the constant directly, consistent with the established pattern in other handlers (e.g., vpc.go).

♻️ Proposed simplification
-		newSSD, derr := sdDAO.CreateFromParams(ctx, tx, ibp.ID.String(), *cdb.GetStrPtr(cdbm.InfiniBandPartitionStatusPending),
+		newSSD, derr := sdDAO.CreateFromParams(ctx, tx, ibp.ID.String(), cdbm.InfiniBandPartitionStatusPending,
 			cdb.GetStrPtr("received InfiniBand Partition creation request, pending"))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/pkg/api/handler/infinibandpartition.go` around lines 244 - 245, Replace
the redundant allocate-and-dereference pattern used for the status argument in
the sdDAO.CreateFromParams call: remove the unnecessary *cdb.GetStrPtr(...) and
pass the constant cdbm.InfiniBandPartitionStatusPending directly (keep the other
parameter using cdb.GetStrPtr("received InfiniBand Partition creation request,
pending") unchanged); update the call that assigns newSSD from
sdDAO.CreateFromParams so it uses cdbm.InfiniBandPartitionStatusPending instead
of *cdb.GetStrPtr(cdbm.InfiniBandPartitionStatusPending).

1050-1055: ⚡ Quick win

Same redundant pointer pattern as Create handler.

Apply the same simplification here for consistency.

♻️ Proposed simplification
 		// Create status detail
-		if _, derr := sdDAO.CreateFromParams(ctx, tx, ibp.ID.String(), *cdb.GetStrPtr(cdbm.InfiniBandPartitionStatusDeleting),
+		if _, derr := sdDAO.CreateFromParams(ctx, tx, ibp.ID.String(), cdbm.InfiniBandPartitionStatusDeleting,
 			cdb.GetStrPtr("Received request for deletion, pending processing")); derr != nil {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/pkg/api/handler/infinibandpartition.go` around lines 1050 - 1055, The
inline cdb.GetStrPtr calls used when creating the Status Detail are redundant;
mirror the Create handler simplification by creating local variables (e.g.,
statusPtr and msgPtr) using cdb.GetStrPtr for
cdbm.InfiniBandPartitionStatusDeleting and the deletion message, then pass those
locals to sdDAO.CreateFromParams instead of calling cdb.GetStrPtr inline; update
the call around sdDAO.CreateFromParams, keep the same error handling
(logger.Error().Err(derr).Msg...) and return cutil.NewAPIError as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@api/pkg/api/handler/infinibandpartition.go`:
- Around line 244-245: Replace the redundant allocate-and-dereference pattern
used for the status argument in the sdDAO.CreateFromParams call: remove the
unnecessary *cdb.GetStrPtr(...) and pass the constant
cdbm.InfiniBandPartitionStatusPending directly (keep the other parameter using
cdb.GetStrPtr("received InfiniBand Partition creation request, pending")
unchanged); update the call that assigns newSSD from sdDAO.CreateFromParams so
it uses cdbm.InfiniBandPartitionStatusPending instead of
*cdb.GetStrPtr(cdbm.InfiniBandPartitionStatusPending).
- Around line 1050-1055: The inline cdb.GetStrPtr calls used when creating the
Status Detail are redundant; mirror the Create handler simplification by
creating local variables (e.g., statusPtr and msgPtr) using cdb.GetStrPtr for
cdbm.InfiniBandPartitionStatusDeleting and the deletion message, then pass those
locals to sdDAO.CreateFromParams instead of calling cdb.GetStrPtr inline; update
the call around sdDAO.CreateFromParams, keep the same error handling
(logger.Error().Err(derr).Msg...) and return cutil.NewAPIError as before.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 265be689-7865-42cd-9317-0a1076d1f583

📥 Commits

Reviewing files that changed from the base of the PR and between ea1ce0f and 90cc9ad.

📒 Files selected for processing (1)
  • api/pkg/api/handler/infinibandpartition.go

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 6, 2026

🔍 Container Scan Summary

Service Total Critical High Medium Low Other
nico-nsm 64 2 20 33 9 0
nico-psm 56 4 29 13 2 8
nico-rest-api 57 4 30 13 2 8
nico-rest-cert-manager 54 4 28 13 1 8
nico-rest-db 55 4 28 13 2 8
nico-rest-site-agent 54 4 28 13 1 8
nico-rest-site-manager 54 4 28 13 1 8
nico-rest-workflow 56 4 29 13 2 8
nico-rla 55 4 28 13 2 8
TOTAL 505 34 248 137 22 64

Per-CVE detail lives in the per-service grype-* artifacts (JSON + SARIF). Severity counts only — no CVE IDs published here.

Copy link
Copy Markdown
Contributor

@thossain-nv thossain-nv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thank you @chet

@chet chet merged commit 56f1638 into NVIDIA:main May 7, 2026
50 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants