refactor: Migrate VPC Prefix API handlers to WithTx #498
Summary by CodeRabbit
Walkthrough
VPC prefix Create/Update/Delete handlers were refactored to use cdb.WithTx closure transactions, remove direct database/sql usage, centralize advisory locking and status-detail writes inside the closure, and execute synchronous Temporal workflows with deferred timeout termination after the transaction unwinds.
Changes
VPC Prefix Handler Refactoring with Temporal Workflow Integration
Sequence Diagram
sequenceDiagram
participant Handler as Handler
participant WithTx as WithTx\n(Transaction)
participant DB as Database
participant Status as Status\nDetails
participant Temporal as Temporal\nClient
participant Workflow as Workflow\nExecutor
Handler->>WithTx: Begin closure with TX
WithTx->>DB: Acquire advisory lock (if needed)
Note over DB: Lock held for duration
rect rgba(100, 150, 200, 0.5)
Note over WithTx,DB: Core Operation (Create/Update/Delete)
WithTx->>DB: Execute entity operation (insert/update/delete), create/read status detail
WithTx->>Status: create/read status detail
DB-->>WithTx: Operation result
end
WithTx->>Temporal: Retrieve Temporal client
WithTx->>Workflow: Construct workflow request
WithTx->>Workflow: Execute synchronously (with timeout)
Workflow-->>WithTx: Workflow status/result
WithTx->>Handler: Closure returns (tx unwinds)
alt timeout
Handler->>Temporal: TerminateWorkflowOnTimeOut deferred
end
Handler-->>Client: API response with entity + status detail
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 5 passed
🔐 TruffleHog Secret Scan
✅ No secrets or credentials found! Your code has been scanned for 700+ types of secrets and credentials.
🕐 Last updated: 2026-05-06 20:51:26 UTC | Commit: 49fedcb
🔍 Container Scan Summary
Per-CVE detail lives in the per-service |
thossain-nv
left a comment
Looks good, thanks @chet
Applies `WithTx` from NVIDIA#462 to the VPC Prefix API handlers.

Takes into consideration a few previous code reviews for integrating these, ensuring:
- We split assignment from error condition checking (thanks @thossain-nv).
- We use the `TerminateWorkflowOnTimeOut` helper and not duplicate code (thanks @thossain-nv).
- We make sure we're consistently using outer-scope vars with `WithTx` and not a weird mix of `WithTx` and `WithTxResult`.

Signed-off-by: Chet Nichols III <chetn@nvidia.com>
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@api/pkg/api/handler/vpcprefix.go`:
- Around line 1038-1077: The delete handler uses the unbounded ctx and omits
WorkflowExecutionTimeout, which can leave DB transactions/locks open; wrap the
call to stc.ExecuteWorkflow in a bounded context via ctxBound, cancel :=
context.WithTimeout(ctx, cutil.WorkflowContextTimeout) with defer cancel(), pass
ctxBound into stc.ExecuteWorkflow and we.Get (replace uses of ctx), and add
workflowOptions.WorkflowExecutionTimeout = cutil.WorkflowExecutionTimeout so the
ExecuteWorkflow/WorkflowExecution has the same timeouts as create/update
handlers (refer to symbols workflowOptions, stc.ExecuteWorkflow, we.Get,
cutil.WorkflowContextTimeout, cutil.WorkflowExecutionTimeout).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 67b69b5f-2121-4e68-ba12-4f2e0c5b7143
📒 Files selected for processing (1)
api/pkg/api/handler/vpcprefix.go
+ workflowOptions := temporalClient.StartWorkflowOptions{
+     ID:        "vpc-prefix-delete-" + vpcPrefix.ID.String(),
+     TaskQueue: queue.SiteTaskQueue,
+ }

- logger.Error().Err(err).Msg("failed to delete VPC Prefix, timeout occurred executing workflow on Site.")
+ logger.Info().Msg("triggering VPC prefix delete workflow")

- // Create a new context deadlines
- newctx, newcancel := context.WithTimeout(context.Background(), cutil.WorkflowContextNewAfterTimeout)
- defer newcancel()
+ // Trigger Site workflow to delete VPC prefix VPC prefix
+ we, derr := stc.ExecuteWorkflow(ctx, workflowOptions, "DeleteVpcPrefix", deleteVpcPrefixRequest)
+ if derr != nil {
+     logger.Error().Err(derr).Msg("failed to synchronously start Temporal workflow to delete VPC prefix")
+     return cutil.NewAPIError(http.StatusInternalServerError, fmt.Sprintf("Failed to start sync workflow to delete VPC prefix on Site: %s", derr), nil)
+ }

- // Initiate termination workflow
- serr := stc.TerminateWorkflow(newctx, wid, "", "timeout occurred executing delete VPC prefix workflow")
- if serr != nil {
-     logger.Error().Err(serr).Msg("failed to terminate Temporal workflow for deleting VPC prefix")
-     return cutil.NewAPIErrorResponse(c, http.StatusInternalServerError, fmt.Sprintf("Failed to terminate synchronous VPC prefix deletion workflow after timeout, Cloud and Site data may be de-synced: %s", serr), nil)
+ wid := we.GetID()
+ logger.Info().Str("Workflow ID", wid).Msg("executed synchronous delete VPC prefix workflow")

+ // Execute the workflow synchronously
+ wferr := we.Get(ctx, nil)
+ // Handle skippable errors
+ if wferr != nil {
+     // If this was a 404 back from NICo, we can treat the object as already having been deleted and allow things to proceed.
+     var applicationErr *tp.ApplicationError
+     if errors.As(wferr, &applicationErr) && slices.Contains(swe.ObjectNotFoundErrTypes(), applicationErr.Type()) {
+         logger.Warn().Msg(swe.ErrTypeNICoObjectNotFound + " received from Site")
+         // Reset error to nil
+         wferr = nil
+     }
+ }

- logger.Info().Str("Workflow ID", wid).Msg("initiated terminate synchronous delete VPC prefix workflow successfully")
+ // Check if wferr is still nil now that we've handled any skippable errors.
+ if wferr != nil {
+     var timeoutErr *tp.TimeoutError
+     if errors.As(wferr, &timeoutErr) || ctx.Err() != nil {
+         logger.Error().Err(wferr).Msg("failed to delete VPC Prefix, timeout occurred executing workflow on Site.")
+         timeoutResp = func() error {
+             return common.TerminateWorkflowOnTimeOut(c, logger, stc, wid, wferr, "VPCPrefix", "Delete")
+         }
+         return cutil.NewAPIError(http.StatusInternalServerError, "VPC Prefix delete workflow timed out", nil)
+     }
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
sed -n '277,321p;818,865p;1038,1082p' api/pkg/api/handler/vpcprefix.go
Repository: NVIDIA/infra-controller-rest
Length of output: 6458
Align delete workflow timeout handling with create and update handlers.
The delete handler omits WorkflowExecutionTimeout and executes the workflow against the unbounded handler context, whereas both create and update handlers establish a bounded workflow context via context.WithTimeout(ctx, cutil.WorkflowContextTimeout) and set WorkflowExecutionTimeout: cutil.WorkflowExecutionTimeout. A stalled Site workflow will hold the DB transaction and advisory lock open until the handler context is canceled, degrading system reliability and blocking other operations.
Suggested fix
  workflowOptions := temporalClient.StartWorkflowOptions{
-     ID:        "vpc-prefix-delete-" + vpcPrefix.ID.String(),
-     TaskQueue: queue.SiteTaskQueue,
+     ID:                       "vpc-prefix-delete-" + vpcPrefix.ID.String(),
+     WorkflowExecutionTimeout: cutil.WorkflowExecutionTimeout,
+     TaskQueue:                queue.SiteTaskQueue,
  }
  logger.Info().Msg("triggering VPC prefix delete workflow")
+ workflowCtx, cancel := context.WithTimeout(ctx, cutil.WorkflowContextTimeout)
+ defer cancel()
+
  // Trigger Site workflow to delete VPC prefix VPC prefix
- we, derr := stc.ExecuteWorkflow(ctx, workflowOptions, "DeleteVpcPrefix", deleteVpcPrefixRequest)
+ we, derr := stc.ExecuteWorkflow(workflowCtx, workflowOptions, "DeleteVpcPrefix", deleteVpcPrefixRequest)
  if derr != nil {
      logger.Error().Err(derr).Msg("failed to synchronously start Temporal workflow to delete VPC prefix")
      return cutil.NewAPIError(http.StatusInternalServerError, fmt.Sprintf("Failed to start sync workflow to delete VPC prefix on Site: %s", derr), nil)
  }
  wid := we.GetID()
  logger.Info().Str("Workflow ID", wid).Msg("executed synchronous delete VPC prefix workflow")
  // Execute the workflow synchronously
- wferr := we.Get(ctx, nil)
+ wferr := we.Get(workflowCtx, nil)
  // Handle skippable errors
  if wferr != nil {
      // If this was a 404 back from NICo, we can treat the object as already having been deleted and allow things to proceed.
      var applicationErr *tp.ApplicationError
      if errors.As(wferr, &applicationErr) && slices.Contains(swe.ObjectNotFoundErrTypes(), applicationErr.Type()) {
          logger.Warn().Msg(swe.ErrTypeNICoObjectNotFound + " received from Site")
          // Reset error to nil
          wferr = nil
      }
  }
  // Check if wferr is still nil now that we've handled any skippable errors.
  if wferr != nil {
      var timeoutErr *tp.TimeoutError
-     if errors.As(wferr, &timeoutErr) || ctx.Err() != nil {
+     if errors.As(wferr, &timeoutErr) || wferr == context.DeadlineExceeded || workflowCtx.Err() != nil {
          logger.Error().Err(wferr).Msg("failed to delete VPC Prefix, timeout occurred executing workflow on Site.")
          timeoutResp = func() error {
              return common.TerminateWorkflowOnTimeOut(c, logger, stc, wid, wferr, "VPCPrefix", "Delete")
          }
Description
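Applies `WithTx` from #462 to the VPC Prefix API handlers.

Takes into consideration a few previous code reviews for integrating these, ensuring:
- We split assignment from error condition checking (thanks @thossain-nv).
- We use the `TerminateWorkflowOnTimeOut` helper and not duplicate code (thanks @thossain-nv).
- We make sure we're consistently using outer-scope vars with `WithTx` and not a weird mix of `WithTx` and `WithTxResult`.

Signed-off-by: Chet Nichols III <chetn@nvidia.com>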
Type of Change
Services Affected
Related Issues (Optional)
Breaking Changes
Testing
Additional Notes