refactor: Migrate NVLink Logical Partition API handlers to WithTx #494
chet wants to merge 1 commit into NVIDIA:main
Walkthrough
Replace manual
Changes
NVLink Logical Partition Transaction & Workflow Migration
Sequence Diagram(s)
sequenceDiagram
participant Client
participant API as API Handler
participant DB
participant Temporal as Site Temporal
participant WFUtil as wfutil
Client->>API: Create/Update/Delete request
API->>DB: cdb.WithTx / cdb.WithTxResult (start tx)
DB-->>API: tx context
API->>DB: insert/update NVLinkLogicalPartition + StatusDetail (within tx)
API->>Temporal: Schedule & wait Site workflow (within tx)
Temporal-->>API: workflow result or timeout
API->>DB: tx commit/rollback (via cdb helper)
alt workflow timed out
API->>Temporal: TerminateWorkflow (deferred, post-rollback)
end
API->>WFUtil: reconcile status from workflow result (post-tx)
WFUtil-->>DB: update partition status/details
API->>Client: return entity + aggregated status details
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 5 passed
🔐 TruffleHog Secret Scan
✅ No secrets or credentials found! Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉
🕐 Last updated: 2026-05-06 19:00:34 UTC | Commit: 38d169e
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@api/pkg/api/handler/nvlinklogicalpartition.go`:
- Around line 1251-1256: The delete path currently only checks the error from
sdDAO.CreateFromParams and may allow a nil result to pass, so update the block
using sdDAO.CreateFromParams (the call that passes nvllp.ID.String(),
cdb.GetStrPtr(cdbm.NVLinkLogicalPartitionStatusDeleting), and the message
"Received request for deletion, pending processing") to also validate the
returned result is non-nil; if the returned status-detail is nil (even with nil
error) log an error with logger.Error().Msg and return the same
cutil.NewAPIError(http.StatusInternalServerError, "Failed to create Status
Detail for NVLink Logical Partition deletion", nil) to abort the transaction.
Ensure you reference the same variables (ctx, tx) and preserve existing error
logging for non-nil errors.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: efc04cfb-ab90-4db9-9cbe-f7f294d9de00
📒 Files selected for processing (1)
api/pkg/api/handler/nvlinklogicalpartition.go
…ain-nv
Match the convention in the rest of the codebase by splitting the combined `if X := Y(); X != nil { ... }` patterns introduced by the WithTx migration into two statements. Called out by CodeRabbit on PR NVIDIA#494.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Chet Nichols III <chetn@nvidia.com>
38d169e to 10c660c
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@api/pkg/api/handler/nvlinklogicalpartition.go`:
- Around line 210-215: Handlers currently call timeoutResp() and return its
result before checking the transaction helper error from cdb.WithTx /
cdb.WithTxResult, which masks DB/tx failures; change each handler (the
timeoutResp variable usage and the cdb.WithTx / cdb.WithTxResult call sites) to
first check the tx helper error (err) and if err != nil return that error (or
wrap/preserve it), otherwise if err == nil then call and return timeoutResp();
do the same for all listed locations so transaction rollback/unwind failures
surface before returning the timeout/TerminateWorkflow response.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 1dd3754b-0861-4494-a6e4-503efdd207c5
📒 Files selected for processing (1)
api/pkg/api/handler/nvlinklogicalpartition.go
1457877 to 5094e9a
🧹 Nitpick comments (2)
api/pkg/api/handler/nvlinklogicalpartition.go (2)
237-238: 💤 Low value
Minor inefficiency: unnecessary pointer allocation and dereference.
The expression `*cdb.GetStrPtr(cdbm.NVLinkLogicalPartitionStatusPending)` allocates a pointer to the status constant and immediately dereferences it. If `CreateFromParams` accepts a plain `string` for the status parameter, pass the constant directly.
Proposed simplification
- ssd, derr = sdDAO.CreateFromParams(ctx, tx, nvllp.ID.String(), *cdb.GetStrPtr(cdbm.NVLinkLogicalPartitionStatusPending),
+ ssd, derr = sdDAO.CreateFromParams(ctx, tx, nvllp.ID.String(), cdbm.NVLinkLogicalPartitionStatusPending,
  cdb.GetStrPtr("received NVLink Logical Partition creation request, pending"))
1276-1277: 💤 Low value
Minor inefficiency: same unnecessary pointer allocation pattern.
Same observation as in the Create handler: `*cdb.GetStrPtr(cdbm.NVLinkLogicalPartitionStatusDeleting)` can be simplified if the parameter accepts a plain string.
Proposed simplification
- ssd, derr := sdDAO.CreateFromParams(ctx, tx, nvllp.ID.String(), *cdb.GetStrPtr(cdbm.NVLinkLogicalPartitionStatusDeleting),
+ ssd, derr := sdDAO.CreateFromParams(ctx, tx, nvllp.ID.String(), cdbm.NVLinkLogicalPartitionStatusDeleting,
  cdb.GetStrPtr("Received request for deletion, pending processing"))
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@api/pkg/api/handler/nvlinklogicalpartition.go`:
- Around line 237-238: The call to sdDAO.CreateFromParams is needlessly
allocating and immediately dereferencing a pointer for the status argument using
*cdb.GetStrPtr(cdbm.NVLinkLogicalPartitionStatusPending); change that argument
to pass the plain constant cdbm.NVLinkLogicalPartitionStatusPending directly
(leave other params like nvllp.ID.String() and the description as-is) to avoid
the unnecessary pointer allocation and dereference.
- Around line 1276-1277: In sdDAO.CreateFromParams call in
nvlinklogicalpartition.go (the line creating `ssd`), remove the unnecessary
pointer allocation `*cdb.GetStrPtr(cdbm.NVLinkLogicalPartitionStatusDeleting)`
and pass the plain string constant `cdbm.NVLinkLogicalPartitionStatusDeleting`
(or a local string variable) directly; likewise stop calling `cdb.GetStrPtr` for
that status value—update the `ssd, derr := sdDAO.CreateFromParams(ctx, tx,
nvllp.ID.String(), ...)` invocation to use the plain string status and keep the
message parameter as-is.
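A small sketch of why the two forms are equivalent, assuming the status parameter is a plain `string`. Everything here is hypothetical: `getStrPtr` stands in for `cdb.GetStrPtr`, `createFromParams` for `sdDAO.CreateFromParams`, and the `"Pending"` value is made up for illustration.

```go
package main

import "fmt"

// getStrPtr is a hypothetical stand-in for cdb.GetStrPtr:
// it returns a pointer to a copy of s.
func getStrPtr(s string) *string { return &s }

// statusPending is an assumed value, used only for illustration.
const statusPending = "Pending"

// createFromParams is a hypothetical stand-in that takes the status as a
// plain string and the message as a pointer, like the call in the review.
func createFromParams(id, status string, message *string) string {
	return fmt.Sprintf("%s/%s/%s", id, status, *message)
}

func main() {
	msg := getStrPtr("received request, pending")

	// Before: build a pointer, then immediately dereference it.
	before := createFromParams("id-1", *getStrPtr(statusPending), msg)
	// After: pass the constant directly.
	after := createFromParams("id-1", statusPending, msg)

	fmt.Println(before == after)
}
```

The message argument still goes through the pointer helper because that parameter is a `*string` in this sketch; only the status argument changes.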
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 93d546a6-7e9b-4aa8-b0a8-22b24020fb78
📒 Files selected for processing (1)
api/pkg/api/handler/nvlinklogicalpartition.go
Applies the `WithTx` pattern from NVIDIA#462 to the NVLink Logical Partition handlers. Integrates `TerminateWorkflowOnTimeout` as well for extra squeaky clean-ness.
CodeRabbit feedback addressed:
- Propagate `StatusDetail` creation failure as an API error in the Delete handler (matches the Create handler) instead of just logging.
Signed-off-by: Chet Nichols III <chetn@nvidia.com>
5094e9a to 95359e6
🧹 Nitpick comments (1)
api/pkg/api/handler/nvlinklogicalpartition.go (1)
205-317: 🏗️ Heavy lift
Add regression coverage for the new `WithTx` write paths.
These branches now depend on transaction-helper rollback semantics, delayed timeout termination, and special-case workflow error handling. Please add focused coverage for at least: workflow timeout in create/update/delete, description-only update payload generation, delete `StatusDetail` creation failure, and delete NICo object-not-found handling. This refactor is otherwise easy to regress silently.
Also applies to: 1011-1102, 1234-1334
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@api/pkg/api/handler/nvlinklogicalpartition.go`:
- Around line 205-317: Add regression tests exercising the new cdb.WithTx write
paths: create focused tests that (1) simulate Temporal client (scp.GetClientByID
/ stc.ExecuteWorkflow) timeouts when running "CreateNVLinkLogicalPartition" so
the wferr path that sets timeoutResp and calls common.TerminateWorkflowOnTimeOut
after tx rollback is exercised, (2) verify update flows that only change
Description produce the correct payload and DB updates, (3) simulate
sdDAO.CreateFromParams failing during delete to assert proper error
handling/reporting, and (4) simulate delete NICo object-not-found responses to
ensure that path is handled as expected; use mocks for nvllpDAO, sdDAO, scp/stc,
and common.UnwrapWorkflowError to drive the error branches and assert
timeoutResp is non-nil and TerminateWorkflowOnTimeOut is invoked after WithTx
returns.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: dfc2b363-d7b4-456c-be6d-bca30021c4dd
📒 Files selected for processing (1)
api/pkg/api/handler/nvlinklogicalpartition.go
Description
Applies the `WithTx` pattern from #462 to the NVLink Logical Partition handlers. Integrates `TerminateWorkflowOnTimeout` as well for extra squeaky clean-ness.
CodeRabbit feedback addressed:
- Propagate `StatusDetail` creation failure as an API error in the Delete handler (matches the Create handler) instead of just logging.
Signed-off-by: Chet Nichols III <chetn@nvidia.com>
Type of Change
Services Affected
Related Issues (Optional)
Breaking Changes
Testing
Additional Notes