
Conversation

@yasun1
Contributor

@yasun1 yasun1 commented Nov 26, 2025

Summary

Implements complete Kubernetes-style status semantics for cluster and nodepool resources, including
OpenAPI spec synchronization and proper timestamp tracking.

Changes

1. OpenAPI Spec Synchronization

  • Rename timestamp fields: CreatedAt/UpdatedAt → CreatedTime/UpdatedTime
  • Add last_report_time: Track when each adapter last reported
  • Update migrations: Align database schema with OpenAPI spec

2. Adapter Status Upsert Semantics

  • Implement Upsert (pkg/dao/adapter_status.go):
    • POST creates new or updates existing adapter status
    • Unique key: (resource_type, resource_id, adapter)
    • Preserve created_time on update
    • Update last_report_time on every POST
  • Preserve LastTransitionTime (pkg/dao/adapter_status.go):
    • Only update when condition status changes (True/False/Unknown)
    • Implements Kubernetes condition semantics (see the sketch after this list)
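
A minimal, self-contained sketch of the LastTransitionTime rule described above, assuming a simple condition shape; the type and function names are illustrative stand-ins, not the actual pkg/dao/adapter_status.go code.

package main

import (
    "fmt"
    "time"
)

// Condition is a pared-down stand-in for the adapter condition model.
type Condition struct {
    Type               string
    Status             string // "True" | "False" | "Unknown"
    LastTransitionTime time.Time
}

// preserveLastTransitionTime copies the old transition time onto incoming
// conditions whose Status did not change, mirroring Kubernetes semantics.
func preserveLastTransitionTime(old, incoming []Condition) []Condition {
    byType := make(map[string]Condition, len(old))
    for _, c := range old {
        byType[c.Type] = c
    }
    for i, c := range incoming {
        if prev, ok := byType[c.Type]; ok && prev.Status == c.Status {
            incoming[i].LastTransitionTime = prev.LastTransitionTime
        }
    }
    return incoming
}

func main() {
    t0 := time.Date(2025, 11, 26, 9, 0, 0, 0, time.UTC)
    old := []Condition{{Type: "Available", Status: "True", LastTransitionTime: t0}}
    upd := []Condition{{Type: "Available", Status: "True", LastTransitionTime: time.Now()}}
    // Status is unchanged, so the original transition time is kept.
    fmt.Println(preserveLastTransitionTime(old, upd)[0].LastTransitionTime.Equal(t0)) // true
}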

3. Adapter-to-Condition Mapping

  • MapAdapterToConditionType() (pkg/services/status_aggregation.go):
    • validator → ValidatorSuccessful
    • dns → DnsSuccessful
    • gcp-provisioner → GcpProvisionerSuccessful
    • Configurable suffix mapping for future customization (a sketch follows below)
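
A rough sketch of the mapping rule above (PascalCase each hyphen-separated part of the adapter name, then append a suffix, "Successful" by default). The function and map names are illustrative, not the exact code in pkg/services/status_aggregation.go.

package main

import (
    "fmt"
    "strings"
)

// conditionSuffixes holds optional per-adapter suffix overrides (assumed name).
var conditionSuffixes = map[string]string{}

func mapAdapterToConditionType(adapter string) string {
    var b strings.Builder
    for _, part := range strings.Split(adapter, "-") {
        if part == "" {
            continue // tolerate consecutive hyphens
        }
        b.WriteString(strings.ToUpper(part[:1]) + part[1:])
    }
    suffix := "Successful"
    if s, ok := conditionSuffixes[adapter]; ok {
        suffix = s
    }
    return b.String() + suffix
}

func main() {
    fmt.Println(mapAdapterToConditionType("validator"))       // ValidatorSuccessful
    fmt.Println(mapAdapterToConditionType("dns"))             // DnsSuccessful
    fmt.Println(mapAdapterToConditionType("gcp-provisioner")) // GcpProvisionerSuccessful
}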

4. Generation Tracking

  • Auto-increment generation (pkg/dao/cluster.go):
    • Compare spec on Replace using bytes.Equal()
    • Increment generation when spec changes
    • Preserve generation when spec unchanged (see the sketch after this list)
  • Generation matching in phase calculation:
    • Phase = "Ready" only when observed_generation == cluster.generation
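
A minimal sketch of the generation bump on Replace, assuming Spec is stored as raw JSON bytes (e.g. datatypes.JSON); this is not the actual pkg/dao/cluster.go code.

package main

import (
    "bytes"
    "fmt"
)

// Cluster is a pared-down stand-in for the real model.
type Cluster struct {
    Spec       []byte
    Generation int32
}

// applyGeneration keeps the old generation when the spec bytes are identical
// and increments it when the incoming spec differs.
func applyGeneration(existing, incoming *Cluster) {
    if bytes.Equal(existing.Spec, incoming.Spec) {
        incoming.Generation = existing.Generation
        return
    }
    incoming.Generation = existing.Generation + 1
}

func main() {
    old := &Cluster{Spec: []byte(`{"replicas":3}`), Generation: 2}
    same := &Cluster{Spec: []byte(`{"replicas":3}`)}
    changed := &Cluster{Spec: []byte(`{"replicas":5}`)}
    applyGeneration(old, same)
    applyGeneration(old, changed)
    fmt.Println(same.Generation, changed.Generation) // 2 3
}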

5. Status Aggregation Improvements

  • Use min() strategy (pkg/services/cluster.go, pkg/services/node_pool.go):
    • status.last_updated_time = min(adapters[].last_report_time)
    • Enables Sentinel to detect ANY stale adapter (see the sketch after this list)
  • Phase transition tracking:
    • Only update status.last_transition_time when phase changes
    • Preserve timestamp when phase unchanged
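
A small sketch of the min() aggregation, using plain time.Time values for illustration: the earliest adapter report becomes status.last_updated_time, so a single stale adapter makes the whole status look stale to Sentinel.

package main

import (
    "fmt"
    "time"
)

// minLastReportTime returns the earliest report time, and false when there are no reports.
func minLastReportTime(reports []time.Time) (time.Time, bool) {
    if len(reports) == 0 {
        return time.Time{}, false
    }
    earliest := reports[0]
    for _, t := range reports[1:] {
        if t.Before(earliest) {
            earliest = t
        }
    }
    return earliest, true
}

func main() {
    now := time.Now()
    reports := []time.Time{now, now.Add(-10 * time.Minute), now.Add(-time.Minute)}
    if m, ok := minLastReportTime(reports); ok {
        fmt.Println("status.last_updated_time =", m) // the 10-minute-old report
    }
}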

6. Phase Calculation Logic

  • ComputePhase() (pkg/services/status_aggregation.go):
    • Check all required adapters (configurable list)
    • Verify available == "True" for each
    • Verify observed_generation == cluster.generation for each
    • Return "Ready" only when all conditions met
    • NodePool uses generation=0 (no generation field); see the sketch below
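
A hedged sketch of these rules; the adapterInfo type and the function signature are assumptions rather than the real ComputePhase, and the real code also has a fallback path that checks all adapters when no required list is configured.

package main

import "fmt"

type adapterInfo struct {
    available          string // "True" | "False" | "Unknown"
    observedGeneration int32
}

func computePhase(adapters map[string]adapterInfo, required []string, generation int32) string {
    allReady, anyFailed := true, false
    for _, name := range required {
        info, ok := adapters[name]
        if !ok {
            allReady = false // a required adapter has not reported yet
            continue
        }
        if info.available == "False" {
            anyFailed = true
            allReady = false
        } else if info.available != "True" {
            allReady = false // Unknown (or anything else) is treated as not ready
        }
        // generation == 0 (NodePool) disables the generation check
        if generation > 0 && info.observedGeneration != generation {
            allReady = false
        }
    }
    switch {
    case allReady:
        return "Ready"
    case anyFailed:
        return "Failed"
    default:
        return "NotReady"
    }
}

func main() {
    adapters := map[string]adapterInfo{
        "validator":       {available: "True", observedGeneration: 3},
        "gcp-provisioner": {available: "Unknown", observedGeneration: 3},
    }
    fmt.Println(computePhase(adapters, []string{"validator", "gcp-provisioner"}, 3)) // NotReady
}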

Testing

✅ All 24 integration tests pass
✅ All unit tests pass

Summary by CodeRabbit

Release Notes

  • API Updates
    • Timestamp fields renamed: created_at/updated_at → created_time/updated_time across all resources
    • Status model now uses conditions instead of adapters for improved clarity
    • New last_updated_time and last_report_time fields added for enhanced tracking
    • Adapter status endpoints now support create-or-update operations


@yasun1 yasun1 changed the title from "feat: sync with openapi spec updates and implement adapter status mapping" to "feat: implement Kubernetes-style status semantics and generation tracking" on Nov 26, 2025
@yasun1 yasun1 force-pushed the sync-openapi branch 4 times, most recently from 8a999b0 to b7a77ca on November 26, 2025 09:12
if info.available == "False" {
    anyFailed = true
    allReady = false
} else if info.available != "True" {
Collaborator


It seems other info.available values won't affect anything. Do we expect to raise an error here?

Contributor Author


For now, we log a warning message for this case.

@xueli181114
Collaborator

/approve

@coderabbitai

coderabbitai bot commented Nov 27, 2025

Walkthrough

The PR refactors timestamp naming from created_at/updated_at to created_time/updated_time, replaces adapter-based status tracking with condition-driven status aggregation, introduces new OpenAPI schemas for condition handling and status management, and implements upsert semantics for adapter status operations across DAOs, services, and handlers.

Changes

Cohort / File(s) Summary
OpenAPI schema
openapi/openapi.yaml
Shifts timestamp fields (CreatedAt→CreatedTime, UpdatedAt→UpdatedTime), introduces new condition-driven schemas (AdapterCondition, AdapterStatus, AdapterStatusBase, AdapterStatusCreateRequest, AdapterStatusList, ConditionBase, ConditionRequest, ResourceCondition), replaces adapter-based status blocks with condition arrays in ClusterStatus/NodePoolStatus, updates list schemas with pagination metadata and new timing fields.
Type definitions
pkg/api/metadata_types.go
Renames Meta struct fields: CreatedAt→CreatedTime, UpdatedAt→UpdatedTime.
Cluster model
pkg/api/cluster_types.go
Adds StatusLastUpdatedTime and StatusConditions fields; removes StatusUpdatedAt and StatusAdapters. Updates ToOpenAPI() to use CreatedTime/UpdatedTime and unmarshals ResourceCondition instead of adapters. Updates ClusterFromOpenAPICreate to marshal StatusConditions.
NodePool model
pkg/api/node_pool_types.go
Adds StatusLastUpdatedTime and StatusConditions fields; removes StatusUpdatedAt and StatusAdapters. Updates ToOpenAPI() and NodePoolFromOpenAPICreate similarly to cluster type.
AdapterStatus model
pkg/api/adapter_status_types.go
Adds LastReportTime and CreatedTime fields. Updates ToOpenAPI() to use AdapterCondition and nested data structure. Updates AdapterStatusFromOpenAPICreate to construct AdapterCondition instances with LastTransitionTime.
Database migrations
pkg/db/migrations/202511111044_add_clusters.go, pkg/db/migrations/202511111055_add_node_pools.go, pkg/db/migrations/202511111105_add_adapter_status.go
Rename timestamp columns (created_at→created_time, updated_at→updated_time), rename status fields (status_updated_at→status_last_updated_time, status_adapters→status_conditions), add last_report_time to adapter_statuses.
DAO layer
pkg/dao/adapter_status.go
Adds Upsert method to AdapterStatusDao interface and implements on sqlAdapterStatusDao. Implements preserveLastTransitionTime helper to maintain unchanged condition transition times.
DAO layer
pkg/dao/cluster.go
Adds byte comparison of cluster spec; increments Generation when spec changes, preserves it when unchanged.
Service layer - status aggregation
pkg/services/status_aggregation.go, pkg/services/status_aggregation_test.go
New file implementing MapAdapterToConditionType and ComputePhase functions. Maps adapter names to PascalCase condition types with suffixes. Computes resource phase ("Ready", "Failed", "NotReady") by validating required adapters and checking Available status.
Service layer - adapter status
pkg/services/adapter_status.go
Adds Upsert method to AdapterStatusService interface and implements in sqlAdapterStatusService, delegating to DAO Upsert.
Service layer - aggregation
pkg/services/cluster.go, pkg/services/node_pool.go
Replace ConditionAvailable with ResourceCondition in UpdateClusterStatusFromAdapters and equivalent nodepool aggregation. Unmarshal conditions as AdapterCondition. Map adapter status via MapAdapterToConditionType. Compute phase using ComputePhase. Populate ResourceCondition with CreatedTime/LastUpdatedTime from adapter fields. Calculate StatusLastUpdatedTime from minimum adapter LastReportTime. Marshal conditions into StatusConditions instead of adapters.
Handlers
pkg/handlers/cluster_status.go, pkg/handlers/nodepool_status.go
Replace per-resource existence checks and separate create/update branches with single Upsert call. Simplify control flow by delegating upsert semantics to service layer.
Tests
pkg/handlers/cluster_nodepools_test.go, test/integration/clusters_test.go
Update field references from CreatedAt/UpdatedAt to CreatedTime/UpdatedTime in assertions and fixtures.
Integration tests
test/integration/adapter_status_test.go
Update Conditions payload type from []openapi.Condition to []openapi.ConditionRequest. Update Data payload structure from map[string]interface{} to nested map[string]map[string]interface{}.
Configuration
.gitignore, Makefile
Add data/generated/ to ignored paths. Create destination directory in generate targets.

Sequence Diagram(s)

sequenceDiagram
    actor Client
    participant Handler
    participant Service
    participant DAO
    participant DB
    
    rect rgba(100, 200, 100, 0.1)
    Note over Client,DB: Old Flow: Create or Update Adapter Status
    Client->>Handler: POST adapter status
    Handler->>Service: FindByResourceAndAdapter
    Service->>DAO: FindByResourceAndAdapter
    DAO->>DB: Query
    alt Exists
        Handler->>Service: Create or Replace
    else Not Found
        Handler->>Service: Create
    end
    Service->>DAO: Create/Replace
    DAO->>DB: Insert/Update
    end
    
    rect rgba(100, 150, 200, 0.1)
    Note over Client,DB: New Flow: Upsert Adapter Status
    Client->>Handler: POST adapter status
    Handler->>Service: Upsert
    Service->>DAO: Upsert
    DAO->>DB: FindByResourceAndAdapter
    alt Exists
        DAO->>DB: Preserve ID & CreatedTime
        DAO->>DB: Update LastReportTime & Conditions
        Note over DAO: preserveLastTransitionTime logic
    else Not Found
        DAO->>DB: Create with new ID & CreatedTime
    end
    DAO->>Service: Return updated status
    Service->>Handler: Updated status
    Handler->>Client: 200 OK with status
    end
sequenceDiagram
    participant Aggregator as Cluster/NodePool<br/>Aggregator
    participant DAO as AdapterStatus<br/>DAO
    participant Compute as ComputePhase
    participant DB
    
    Aggregator->>DAO: Get adapter statuses
    DAO->>DB: Query all adapter statuses<br/>for resource
    DB-->>DAO: Return statuses
    DAO-->>Aggregator: AdapterStatusList
    
    Aggregator->>Aggregator: Unmarshal Conditions<br/>as AdapterCondition
    Aggregator->>Aggregator: MapAdapterToConditionType<br/>for each adapter
    Aggregator->>Aggregator: Build ResourceCondition<br/>array
    Aggregator->>Compute: ComputePhase(adapters,<br/>required, generation)
    
    rect rgba(100, 200, 100, 0.1)
    Note over Compute: Phase Computation
    Compute->>Compute: Find "Available" status<br/>for required adapters
    Compute->>Compute: Validate ObservedGeneration
    Compute->>Compute: Determine Ready/Failed/NotReady
    end
    
    Compute-->>Aggregator: Phase (Ready|Failed|NotReady)
    Aggregator->>Aggregator: Compute minLastUpdatedTime
    Aggregator->>Aggregator: Marshal ResourceCondition<br/>to StatusConditions
    Aggregator->>DAO: Replace with new status
    DAO->>DB: Update resource

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Areas requiring extra attention:
    • pkg/services/status_aggregation.go: New phase computation logic with ObservedGeneration validation and adapter-to-condition mapping; verify correctness of "Ready"/"Failed"/"NotReady" determination
    • pkg/dao/adapter_status.go Upsert method: Verify preserveLastTransitionTime correctly maintains condition transition semantics; check error handling and transaction rollback paths
    • pkg/services/cluster.go and pkg/services/node_pool.go: Complex unmarshaling and type conversion from AdapterCondition to ResourceCondition; verify field mapping accuracy (ObservedGeneration source, LastTransitionTime flow)
    • Migration files: Verify column renames don't break existing data and rollback paths are correct
    • pkg/handlers/cluster_status.go and pkg/handlers/nodepool_status.go: Verify upsert semantics replace old create/update branching correctly; confirm aggregation is triggered post-upsert
    • Status aggregation integration: Ensure phase transitions and LastTransitionTime updates only fire on actual phase changes

Poem

🐰✨ Conditions bloom where adapters once stood tall,
Timestamps tick with CreatedTime through the hall,
Upsert magic merges changes into one,
Phases compute from conditions—Ready or not, we run!
From adapters' tangled web to conditions so clean,
The fairest status model ever seen! 🌟

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 68.18%, which is below the required threshold of 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Description Check ✅ Passed: Check skipped because CodeRabbit's high-level summary is enabled.
  • Title check ✅ Passed: The title accurately summarizes the main changes: implementing Kubernetes-style status semantics and generation tracking across the codebase.

Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (5)
test/integration/adapter_status_test.go (1)

26-47: AdapterStatusCreateRequest payloads correctly updated to new schema

Switching Conditions to []openapi.ConditionRequest and Data to map[string]map[string]interface{}{...{"value": ...}} aligns these integration tests with the new AdapterStatusCreateRequest/OpenAPI model and the AdapterStatusFromOpenAPICreate helper that consumes them. Cluster and nodepool flows are both exercised with the new shapes. As a small enhancement you could add one assertion that verifies a Data field round-trips in a GET response, to catch future schema regressions, but the current coverage is already solid.

Also applies to: 61-72, 99-112, 135-148, 171-183, 279-291, 312-323

pkg/services/status_aggregation_test.go (1)

5-25: Adapter-to-condition mapping tests are solid; consider minimizing global map side effects

The table-driven test plus the two custom-suffix tests give good coverage of MapAdapterToConditionType, including hyphenated names and override behavior. Since the tests mutate adapterConditionSuffixMap, you might optionally snapshot and restore the previous value (rather than just deleting the key) to avoid subtle surprises if other tests or code later start depending on the exact map contents, but with no t.Parallel this is low risk today.

Also applies to: 28-39, 42-52
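
One possible shape for that snapshot-and-restore, assuming adapterConditionSuffixMap is a package-level map[string]string and that a "Ready" suffix for "dns" yields "DnsReady" under the PascalCase-plus-suffix rule:

package services

import "testing"

func TestMapAdapterToConditionType_RestoresSuffixMap(t *testing.T) {
    // Snapshot the previous entry so later tests see the original map contents.
    prev, had := adapterConditionSuffixMap["dns"]
    adapterConditionSuffixMap["dns"] = "Ready"
    t.Cleanup(func() {
        if had {
            adapterConditionSuffixMap["dns"] = prev
        } else {
            delete(adapterConditionSuffixMap, "dns")
        }
    })

    if got := MapAdapterToConditionType("dns"); got != "DnsReady" {
        t.Fatalf("expected DnsReady, got %q", got)
    }
}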

pkg/services/cluster.go (1)

176-211: Consider extracting common status aggregation logic.

The pattern for computing minLastUpdatedTime (lines 181-188) and updating status fields (lines 194-216) is duplicated in node_pool.go. Consider extracting a helper function to reduce duplication.

This is a minor refactor suggestion that can be addressed in a follow-up PR if desired.

pkg/services/status_aggregation.go (1)

139-153: Consider adding a warning log for generation mismatch in the fallback path.

When checking all adapters (no required adapters specified), a generation mismatch at line 141 sets allReady = false but doesn't log which adapter is stale, unlike the required adapters path (lines 169-170). This makes debugging harder.

 		for adapterName, info := range adapterMap {
 			// Check generation matching only if resourceGeneration > 0
 			if resourceGeneration > 0 && info.observedGeneration != resourceGeneration {
+				log.Warning(fmt.Sprintf("Adapter '%s' has stale generation: observed=%d, expected=%d",
+					adapterName, info.observedGeneration, resourceGeneration))
 				allReady = false
 			}
openapi/openapi.yaml (1)

431-437: Consider adding maxItems constraint to conditions arrays.

The static analysis tool flagged that the conditions array lacks a maxItems constraint. While condition arrays are typically bounded by the number of condition types, adding a reasonable limit (e.g., 20-50) provides defense-in-depth against malformed requests.

This also applies to other conditions arrays in ClusterStatus (line 779) and NodePoolStatus (line 1123).

         conditions:
           type: array
           items:
             $ref: '#/components/schemas/AdapterCondition'
+          maxItems: 50
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d1dc4f2 and 40ec0e7.

⛔ Files ignored due to path filters (1)
  • data/generated/openapi/openapi.go is excluded by !**/generated/**
📒 Files selected for processing (20)
  • openapi/openapi.yaml (19 hunks)
  • pkg/api/adapter_status_types.go (6 hunks)
  • pkg/api/cluster_types.go (5 hunks)
  • pkg/api/metadata_types.go (1 hunks)
  • pkg/api/node_pool_types.go (5 hunks)
  • pkg/dao/adapter_status.go (3 hunks)
  • pkg/dao/cluster.go (2 hunks)
  • pkg/db/migrations/202511111044_add_clusters.go (2 hunks)
  • pkg/db/migrations/202511111055_add_node_pools.go (2 hunks)
  • pkg/db/migrations/202511111105_add_adapter_status.go (2 hunks)
  • pkg/handlers/cluster_nodepools_test.go (3 hunks)
  • pkg/handlers/cluster_status.go (1 hunks)
  • pkg/handlers/nodepool_status.go (1 hunks)
  • pkg/services/adapter_status.go (2 hunks)
  • pkg/services/cluster.go (2 hunks)
  • pkg/services/node_pool.go (2 hunks)
  • pkg/services/status_aggregation.go (1 hunks)
  • pkg/services/status_aggregation_test.go (1 hunks)
  • test/integration/adapter_status_test.go (9 hunks)
  • test/integration/clusters_test.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (9)
pkg/dao/cluster.go (1)
pkg/db/context.go (1)
  • MarkForRollback (48-57)
pkg/services/status_aggregation.go (2)
pkg/api/adapter_status_types.go (1)
  • AdapterStatusList (34-34)
pkg/logger/logger.go (1)
  • NewOCMLogger (36-44)
pkg/handlers/nodepool_status.go (1)
pkg/api/adapter_status_types.go (1)
  • AdapterStatusFromOpenAPICreate (93-143)
pkg/services/node_pool.go (1)
pkg/services/status_aggregation.go (2)
  • MapAdapterToConditionType (60-83)
  • ComputePhase (96-194)
pkg/services/adapter_status.go (2)
pkg/api/adapter_status_types.go (1)
  • AdapterStatus (13-32)
pkg/errors/errors.go (1)
  • ServiceError (94-101)
pkg/dao/adapter_status.go (2)
pkg/api/adapter_status_types.go (1)
  • AdapterStatus (13-32)
pkg/db/context.go (1)
  • MarkForRollback (48-57)
pkg/api/adapter_status_types.go (1)
pkg/api/metadata_types.go (2)
  • Meta (37-42)
  • Metadata (28-34)
pkg/services/status_aggregation_test.go (1)
pkg/services/status_aggregation.go (1)
  • MapAdapterToConditionType (60-83)
pkg/handlers/cluster_nodepools_test.go (2)
pkg/api/node_pool_types.go (1)
  • NodePool (14-42)
pkg/api/metadata_types.go (1)
  • Meta (37-42)
🪛 Checkov (3.2.334)
openapi/openapi.yaml

[medium] 432-439: Ensure that arrays have a maximum number of items

(CKV_OPENAPI_21)

🔇 Additional comments (29)
pkg/db/migrations/202511111105_add_adapter_status.go (1)

21-42: AdapterStatus schema and timestamps look consistent with the intended semantics

created_time/updated_time plus last_report_time as NOT NULL TIMESTAMPTZ with DEFAULT NOW() and the unique (resource_type, resource_id, adapter) index match the described upsert + condition model. I don’t see migration-level issues here.

pkg/db/migrations/202511111055_add_node_pools.go (1)

20-51: NodePools schema aligns with the new generation/condition-based status model

The switch to created_time/updated_time and the status_* fields (status_last_updated_time, status_conditions, status_observed_generation) looks consistent with the cluster schema and the described readiness logic. Indexes on deleted_at, owner_id, and status_phase plus the FK to clusters(id) all look appropriate.

pkg/db/migrations/202511111044_add_clusters.go (1)

19-48: Cluster schema changes look correct; double‑check migration history vs existing databases

The move to created_time/updated_time and status_last_updated_time/status_conditions matches the new condition-based status model and is consistent with the node_pools and adapter_status schemas.

If any environment has already applied a previous version of this migration with the old column names, updating the Go migration code alone won’t adjust existing tables—gormigrate will treat this ID as already applied. In that case you’d need a follow‑up migration (e.g., ALTER TABLE ... RENAME COLUMN ...) to bring live databases in line with the new schema. If this is still pre‑prod and those migrations were never run, the current change is fine as‑is.
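
If a rename migration does turn out to be needed, it could look roughly like the sketch below, following the gormigrate style this repo already uses; the migration ID and column list are placeholders, and it only makes sense for databases that actually ran the old column names.

package migrations

import (
    "github.com/go-gormigrate/gormigrate/v2"
    "gorm.io/gorm"
)

// renameClusterTimestampColumns is a hypothetical follow-up migration that
// renames the old columns in place instead of editing the original migration.
func renameClusterTimestampColumns() *gormigrate.Migration {
    return &gormigrate.Migration{
        ID: "202511270000", // placeholder; must sort after the original migration IDs
        Migrate: func(tx *gorm.DB) error {
            for _, stmt := range []string{
                "ALTER TABLE clusters RENAME COLUMN created_at TO created_time",
                "ALTER TABLE clusters RENAME COLUMN updated_at TO updated_time",
            } {
                if err := tx.Exec(stmt).Error; err != nil {
                    return err
                }
            }
            return nil
        },
        Rollback: func(tx *gorm.DB) error {
            for _, stmt := range []string{
                "ALTER TABLE clusters RENAME COLUMN created_time TO created_at",
                "ALTER TABLE clusters RENAME COLUMN updated_time TO updated_at",
            } {
                if err := tx.Exec(stmt).Error; err != nil {
                    return err
                }
            }
            return nil
        },
    }
}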

pkg/api/metadata_types.go (1)

37-42: LGTM! Field renames align with the new timestamp naming convention.

The renaming from CreatedAt/UpdatedAt to CreatedTime/UpdatedTime is consistent with the OpenAPI spec updates across the PR.

Note: The embedded comment in pkg/api/cluster_types.go (line 14) and pkg/api/node_pool_types.go (line 15) still references the old field names (CreatedAt, UpdatedAt). Consider updating those comments for consistency.

pkg/api/cluster_types.go (3)

30-31: LGTM! Status field migration to condition-based model.

The transition from StatusAdapters to StatusConditions aligns with the Kubernetes-style status semantics objective.


81-124: LGTM! Proper OpenAPI conversion with nil-safe handling.

The conversion correctly handles nil pointer cases by defaulting to zero-value time.Time{}. The silent error handling on json.Unmarshal (line 84) follows the existing pattern in this file.


141-155: LGTM! Correct initialization of empty conditions for new clusters.

pkg/api/node_pool_types.go (3)

36-37: LGTM! Consistent with Cluster status model changes.

The NodePool status fields mirror the Cluster implementation, maintaining consistency across resource types.


91-145: LGTM! ToOpenAPI conversion follows the same pattern as Cluster.

Consistent nil-safe handling for optional time fields.


162-179: LGTM! Empty conditions initialization matches Cluster pattern.

pkg/api/adapter_status_types.go (3)

24-27: LGTM! New timestamp fields for adapter status tracking.

CreatedTime and LastReportTime enable the Kubernetes-style semantics described in the PR objectives. The gorm:"not null" constraint is appropriate since these fields are always populated in AdapterStatusFromOpenAPICreate.


70-89: LGTM! Consistent nil-safe pattern for time fields.


97-142: LGTM! Proper timestamp initialization for new adapter statuses.

The logic correctly:

  1. Uses req.ObservedTime if provided, otherwise falls back to time.Now()
  2. Sets LastTransitionTime for all conditions on initial creation
  3. Initializes both CreatedTime and LastReportTime to the same value

The upsert logic in the DAO layer will handle preserving LastTransitionTime on updates when condition status doesn't change (as per Kubernetes semantics).

pkg/dao/cluster.go (1)

50-74: LGTM! Generation tracking follows Kubernetes semantics.

The implementation correctly:

  1. Fetches the existing cluster to compare specs
  2. Uses bytes.Equal for JSON comparison (appropriate for datatypes.JSON)
  3. Increments generation only when spec changes
  4. Preserves generation when spec is unchanged

Minor consideration: In high-concurrency scenarios, there's a theoretical TOCTOU window between Get (line 54) and Save (line 69) where another request could modify the cluster. If this becomes a concern, consider using optimistic locking via a version column or database-level constraints. For typical API usage patterns, this is likely acceptable.
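
If optimistic locking ever becomes necessary here, one possible shape is a conditional update keyed on the previously read generation, treating zero affected rows as a concurrent modification. The type and function names below are illustrative, not existing DAO code.

package example

import (
    "errors"

    "gorm.io/gorm"
)

// Cluster is a pared-down stand-in for the real model.
type Cluster struct {
    ID         string
    Spec       []byte
    Generation int32
}

var ErrConflict = errors.New("cluster was modified concurrently")

// replaceWithOptimisticLock updates spec/generation only if the row still has
// the generation read earlier; zero rows affected means another writer raced us.
func replaceWithOptimisticLock(db *gorm.DB, c *Cluster, expectedGeneration int32) error {
    res := db.Model(&Cluster{}).
        Where("id = ? AND generation = ?", c.ID, expectedGeneration).
        Updates(map[string]interface{}{"spec": c.Spec, "generation": c.Generation})
    if res.Error != nil {
        return res.Error
    }
    if res.RowsAffected == 0 {
        return ErrConflict
    }
    return nil
}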

test/integration/clusters_test.go (1)

41-42: Timestamp assertions correctly updated to CreatedTime/UpdatedTime

Using CreatedTime/UpdatedTime with BeTemporally("~", ...) aligns the test with the new Meta schema and keeps the comparison robust against small timing differences.

pkg/handlers/cluster_nodepools_test.go (1)

152-158: Meta initialization matches renamed timestamp fields

Updating mock api.Cluster and api.NodePool Meta to use CreatedTime/UpdatedTime (with a shared now) is consistent with the new Meta struct and keeps these tests compiling and semantically aligned with the API.

Also applies to: 162-168, 194-201, 214-221, 225-233

test/integration/adapter_status_test.go (1)

192-264: Idempotency and paging edge-case tests exercise Upsert semantics well

The idempotency test (single adapter entry updated in-place) and the paging edge-case tests (empty dataset, page beyond range, single-item dataset, and no-duplicate pagination walk) give good coverage of the new Upsert + pagination behavior for adapter statuses. The expectations match the intended API semantics and should quickly surface regressions in the Upsert/Find logic.

Also applies to: 266-358

pkg/handlers/nodepool_status.go (1)

81-86: NodePool status Create correctly switched to AdapterStatus Upsert flow

Constructing the status via AdapterStatusFromOpenAPICreate("NodePool", ...) and delegating to adapterStatusService.Upsert cleanly centralizes creation/update logic and keeps the handler thin. This also keeps nodepool behavior consistent with the cluster status handler.

pkg/services/adapter_status.go (1)

11-21: Service-level Upsert wiring is straightforward and consistent

Adding Upsert to AdapterStatusService and having sqlAdapterStatusService.Upsert delegate directly to the DAO keeps the service layer simple and consistent with existing Create/Replace patterns. Using handleCreateError for Upsert failures is reasonable given the POST-upsert semantics.

Also applies to: 59-65

pkg/handlers/cluster_status.go (1)

81-85: Cluster status Create correctly unified on AdapterStatus Upsert

Using AdapterStatusFromOpenAPICreate("Cluster", ...) followed by adapterStatusService.Upsert aligns the cluster status path with the new upsert semantics and keeps the handler focused on validation and wiring, not persistence details.

pkg/dao/adapter_status.go (2)

65-104: Use gorm.ErrRecordNotFound instead of string-matching "record not found"

The Upsert flow correctly preserves ID/CreatedTime, bumps LastReportTime, and reuses preserveLastTransitionTime, but detecting a missing record via err.Error() == "record not found" is brittle and may break if GORM changes the error message or wraps it.

Prefer checking the sentinel error:

-import (
-    "context"
-    "encoding/json"
-    "time"
-
-    "gorm.io/datatypes"
-    "gorm.io/gorm/clause"
-    ...
-)
+import (
+    "context"
+    "encoding/json"
+    "errors"
+    "time"
+
+    "gorm.io/datatypes"
+    "gorm.io/gorm"
+    "gorm.io/gorm/clause"
+    ...
+)
@@
-    existing, err := d.FindByResourceAndAdapter(ctx, adapterStatus.ResourceType, adapterStatus.ResourceID, adapterStatus.Adapter)
-
-    if err != nil {
-        // If not found, create new
-        if err.Error() == "record not found" {
-            return d.Create(ctx, adapterStatus)
-        }
-        // Other errors
-        db.MarkForRollback(ctx, err)
-        return nil, err
-    }
+    existing, err := d.FindByResourceAndAdapter(ctx, adapterStatus.ResourceType, adapterStatus.ResourceID, adapterStatus.Adapter)
+    if err != nil {
+        // If not found, create new
+        if errors.Is(err, gorm.ErrRecordNotFound) {
+            return d.Create(ctx, adapterStatus)
+        }
+        // Other errors
+        db.MarkForRollback(ctx, err)
+        return nil, err
+    }

This keeps behavior the same while being robust to GORM’s error-wrapping.
[ suggest_recommended_refactor ]


163-210: preserveLastTransitionTime correctly implements Kubernetes-style condition semantics

The helper safely unmarshals old/new condition JSON, preserves LastTransitionTime when a condition’s Status is unchanged, and gracefully falls back to the new JSON on any (un)marshal failures. This matches the intended K8s behavior while avoiding hard failures on malformed historical data.

pkg/services/cluster.go (1)

213-216: LGTM - Phase transition handling follows Kubernetes semantics.

The logic correctly updates StatusLastTransitionTime only when the phase changes, which aligns with Kubernetes condition semantics.

pkg/services/status_aggregation.go (2)

60-83: LGTM - PascalCase conversion logic is correct.

The function correctly handles hyphenated adapter names and applies the configurable suffix. The empty part check at line 73 prevents issues with consecutive hyphens.


186-193: LGTM - Phase determination logic correctly implements Kubernetes semantics.

The phase computation correctly prioritizes: allReady → "Ready", anyFailed → "Failed", otherwise → "NotReady". This aligns with standard Kubernetes condition aggregation patterns.

pkg/services/node_pool.go (1)

176-178: LGTM - NodePool correctly bypasses generation checking.

The comment at line 177 clearly explains that NodePool doesn't have a generation field, so passing 0 to ComputePhase disables generation validation. This is a reasonable design decision.

openapi/openapi.yaml (3)

1148-1175: LGTM - ResourceCondition schema is well-designed.

The schema properly extends ConditionBase with additional fields (observed_generation, created_time, last_updated_time) needed for tracking adapter-specific state. The description clearly explains the semantic difference from AdapterCondition.


790-840: LGTM - Clean separation between request and response condition schemas.

ConditionRequest correctly omits last_transition_time since it's API-managed, while ConditionBase includes it for responses. This follows Kubernetes conventions where the server tracks transition times.


771-788: LGTM - ClusterStatus documentation clearly explains staleness detection.

The description at lines 776-778 clearly explains that last_updated_time uses the minimum (earliest) timestamp to ensure Sentinel can detect if any adapter has stopped reporting. This is a thoughtful design choice for reliability monitoring.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/handlers/nodepool_status.go (1)

90-95: Missing error logging for status aggregation failure.

The comment states "Log error but don't fail the request" but the error is silently discarded. This makes debugging production issues difficult.

Apply this diff to actually log the error:

 // Trigger status aggregation
 _, aggregateErr := h.nodePoolService.UpdateNodePoolStatusFromAdapters(ctx, nodePoolID)
 if aggregateErr != nil {
 	// Log error but don't fail the request
 	// The status will be computed on next update
+	log := logger.NewOCMLogger(ctx)
+	log.Warningf("Failed to aggregate nodepool status: %v", aggregateErr)
 }

You'll need to import "github.com/openshift-hyperfleet/hyperfleet-api/pkg/logger".

♻️ Duplicate comments (2)
pkg/services/cluster.go (1)

163-167: Nil pointer dereference risk when building ResourceCondition.

Lines 164-165 dereference adapterStatus.CreatedTime and adapterStatus.LastReportTime without nil checks, which could cause a runtime panic if either pointer is nil.

Apply this diff to add nil checks before dereferencing:

 		if availableCondition == nil {
 			// No Available condition, skip this adapter
 			continue
 		}

+		// Skip if required timestamps are missing
+		if adapterStatus.CreatedTime == nil || adapterStatus.LastReportTime == nil {
+			continue
+		}
+
 		// Convert to ResourceCondition
 		condResource := openapi.ResourceCondition{
 			Type:               MapAdapterToConditionType(adapterStatus.Adapter),
pkg/services/node_pool.go (1)

163-167: Same nil pointer dereference risk as in cluster.go.

Lines 164-165 dereference adapterStatus.CreatedTime and adapterStatus.LastReportTime without nil checks, which could cause a runtime panic.

Apply the same fix:

 		if availableCondition == nil {
 			// No Available condition, skip this adapter
 			continue
 		}

+		// Skip if required timestamps are missing
+		if adapterStatus.CreatedTime == nil || adapterStatus.LastReportTime == nil {
+			continue
+		}
+
 		// Convert to ResourceCondition
 		condResource := openapi.ResourceCondition{
🧹 Nitpick comments (3)
openapi/openapi.yaml (1)

432-439: Add maxItems constraints to array fields for API governance.

Arrays without maxItems constraints can accept unbounded payloads, creating a potential DoS vector. Add reasonable maxItems to all array fields:

  • conditions arrays (lines 432-439, 781-782, 1123-1126): reasonable max of ~10 (typical: 3-5 condition types per adapter)
  • items arrays in list responses (line 584-585): constrained by pageSize max (20)

Apply these diffs to add maxItems constraints:

# AdapterStatus.conditions (line 431-433)
       conditions:
         type: array
+        maxItems: 10
         items:
           $ref: '#/components/schemas/AdapterCondition'

# AdapterStatusList.items (line 583-585)
       items:
         type: array
+        maxItems: 20
         items:
           $ref: '#/components/schemas/AdapterStatus'

# ClusterStatus.conditions (line 780-782)
       conditions:
         type: array
+        maxItems: 10
         items:
           $ref: '#/components/schemas/ResourceCondition'

# NodePoolStatus.conditions (line 1123-1126)
       conditions:
         type: array
+        maxItems: 10
         items:
           $ref: '#/components/schemas/ResourceCondition'

Also applies to: 584-585, 781-782, 1123-1126

pkg/api/node_pool_types.go (1)

77-95: Consider logging unmarshal errors for observability.

Lines 82, 88, and 94 silently discard JSON unmarshal errors. While using zero values as fallback is reasonable, completely ignoring errors may hide data corruption issues in production.

Consider at minimum logging a warning:

 // Unmarshal Spec
 var spec map[string]interface{}
 if len(np.Spec) > 0 {
-	_ = json.Unmarshal(np.Spec, &spec)
+	if err := json.Unmarshal(np.Spec, &spec); err != nil {
+		// Log but continue with empty spec
+	}
 }
pkg/services/node_pool.go (1)

118-225: Consider extracting shared status aggregation logic.

UpdateNodePoolStatusFromAdapters is nearly identical to UpdateClusterStatusFromAdapters in cluster.go. The only differences are the resource type, required adapters list, and generation handling.

A shared helper could reduce duplication:

func aggregateStatusFromAdapters[T any](
    ctx context.Context,
    resource T,
    adapterStatuses api.AdapterStatusList,
    requiredAdapters []string,
    generation int32,
    getPhase func(T) string,
    setFields func(T, string, int32, datatypes.JSON, *time.Time, *time.Time),
) error

This is optional given the current scope, but worth considering for maintainability as more resource types are added.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 40ec0e7 and e2697bf.

⛔ Files ignored due to path filters (1)
  • data/generated/openapi/openapi.go is excluded by !**/generated/**
📒 Files selected for processing (20)
  • openapi/openapi.yaml (19 hunks)
  • pkg/api/adapter_status_types.go (6 hunks)
  • pkg/api/cluster_types.go (5 hunks)
  • pkg/api/metadata_types.go (1 hunks)
  • pkg/api/node_pool_types.go (5 hunks)
  • pkg/dao/adapter_status.go (3 hunks)
  • pkg/dao/cluster.go (2 hunks)
  • pkg/db/migrations/202511111044_add_clusters.go (2 hunks)
  • pkg/db/migrations/202511111055_add_node_pools.go (2 hunks)
  • pkg/db/migrations/202511111105_add_adapter_status.go (2 hunks)
  • pkg/handlers/cluster_nodepools_test.go (3 hunks)
  • pkg/handlers/cluster_status.go (1 hunks)
  • pkg/handlers/nodepool_status.go (1 hunks)
  • pkg/services/adapter_status.go (2 hunks)
  • pkg/services/cluster.go (2 hunks)
  • pkg/services/node_pool.go (2 hunks)
  • pkg/services/status_aggregation.go (1 hunks)
  • pkg/services/status_aggregation_test.go (1 hunks)
  • test/integration/adapter_status_test.go (9 hunks)
  • test/integration/clusters_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (12)
  • pkg/db/migrations/202511111055_add_node_pools.go
  • test/integration/adapter_status_test.go
  • pkg/handlers/cluster_nodepools_test.go
  • pkg/api/metadata_types.go
  • pkg/services/status_aggregation.go
  • pkg/api/adapter_status_types.go
  • test/integration/clusters_test.go
  • pkg/services/adapter_status.go
  • pkg/db/migrations/202511111044_add_clusters.go
  • pkg/handlers/cluster_status.go
  • pkg/api/cluster_types.go
  • pkg/services/status_aggregation_test.go
🧰 Additional context used
🧬 Code graph analysis (5)
pkg/dao/cluster.go (1)
pkg/db/context.go (1)
  • MarkForRollback (48-57)
pkg/handlers/nodepool_status.go (1)
pkg/api/adapter_status_types.go (1)
  • AdapterStatusFromOpenAPICreate (93-143)
pkg/services/cluster.go (2)
pkg/services/status_aggregation.go (2)
  • MapAdapterToConditionType (60-83)
  • ComputePhase (96-194)
pkg/errors/errors.go (1)
  • GeneralError (164-166)
pkg/services/node_pool.go (2)
pkg/services/status_aggregation.go (2)
  • MapAdapterToConditionType (60-83)
  • ComputePhase (96-194)
pkg/errors/errors.go (1)
  • GeneralError (164-166)
pkg/dao/adapter_status.go (2)
pkg/api/adapter_status_types.go (1)
  • AdapterStatus (13-32)
pkg/db/context.go (1)
  • MarkForRollback (48-57)
🪛 Checkov (3.2.334)
openapi/openapi.yaml

[medium] 432-439: Ensure that arrays have a maximum number of items

(CKV_OPENAPI_21)

🔇 Additional comments (9)
openapi/openapi.yaml (3)

186-186: Adapter concept is now explicitly exposed in the API (following up on past review concern).

Line 186-261: Endpoint descriptions expose "adapter" as a first-class concept ("adapter status reports"). Line 493-495: AdapterStatus schema and requests require an explicit adapter field.

Past review comment flagged whether this concept should be public API. Current implementation makes it explicit. Since the PR was approved by the original reviewer, this appears to be an intentional design choice, but worth documenting that the concern was acknowledged and accepted.

Also applies to: 261-261, 493-495


790-840: Condition schema hierarchy is well-designed for Kubernetes-style semantics.

The clear separation between ConditionBase (shared fields) → ConditionRequest (input) and AdapterCondition (adapter-level) → ResourceCondition (aggregated cluster/nodepool level) properly implements the requirement to preserve last_transition_time only when status changes. ResourceCondition correctly distinguishes created_time (condition creation), last_updated_time (adapter last reported), and inherited last_transition_time (status change), aligning with Kubernetes semantics.

Also applies to: 415-423, 1148-1175


219-219: POST endpoint summaries align with past suggestion (with minor wording difference).

Past review suggested "Create or update object processing status" for line 219. Current implementation says "Create or update adapter status" (lines 219, 285), which is more explicit and equally clear. The detailed descriptions (lines 221-225, 287-291) excellently communicate upsert semantics ("upsert by adapter name") and response behavior. Current wording is superior to the generic suggestion.

Also applies to: 285-285

pkg/db/migrations/202511111105_add_adapter_status.go (1)

14-86: LGTM!

The migration is well-structured with:

  • Appropriate timestamp columns aligned with the new API semantics
  • Correct polymorphic association pattern for resource_type/resource_id
  • Proper partial unique index ensuring one adapter status per resource per adapter (while respecting soft deletes)
  • Clean rollback logic that drops indexes before the table
pkg/dao/adapter_status.go (1)

163-210: LGTM - Kubernetes condition semantics correctly implemented.

The preserveLastTransitionTime helper properly:

  • Builds a lookup map for O(1) access to old conditions
  • Preserves LastTransitionTime only when the condition status hasn't changed
  • Handles edge cases gracefully with fail-safe returns
pkg/api/node_pool_types.go (1)

128-145: LGTM - Status field initialization and OpenAPI mapping are correct.

The time handling properly defaults to zero values when pointers are nil, and the status structure correctly populates all required fields for the OpenAPI response.

pkg/handlers/nodepool_status.go (1)

81-88: LGTM - Upsert flow simplifies the handler logic.

The refactored Create handler cleanly delegates to AdapterStatusFromOpenAPICreate and Upsert, removing the need for separate create/update code paths.

pkg/services/cluster.go (1)

176-216: LGTM - Status aggregation logic correctly implements Kubernetes semantics.

The implementation properly:

  • Uses min(adapters[].last_report_time) to surface stale adapters
  • Preserves LastTransitionTime when phase is unchanged
  • Defers phase computation to the shared ComputePhase helper
pkg/dao/cluster.go (1)

50-73: I'll verify the review comment by investigating the JSON serialization consistency for the Spec field and checking if bytes.Equal() comparison could cause issues.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/handlers/cluster_status.go (1)

90-95: Missing error logging for aggregation failure.

The comment states "Log error but don't fail the request", but no logging is actually performed. The error is silently discarded, which could make debugging aggregation issues difficult.

 			// Trigger status aggregation
 			_, aggregateErr := h.clusterService.UpdateClusterStatusFromAdapters(ctx, clusterID)
 			if aggregateErr != nil {
-				// Log error but don't fail the request
-				// The status will be computed on next update
+				// Log error but don't fail the request - status will be computed on next update
+				logger.NewOCMLogger(ctx).Warningf("Failed to aggregate cluster status for %s: %v", clusterID, aggregateErr)
 			}

You'll need to import the logger package if not already available.

♻️ Duplicate comments (1)
openapi/openapi.yaml (1)

186-187: Public description still exposes “adapter” concept in status endpoints

The GET/POST descriptions and summaries for .../statuses (“adapter status reports”, “Create or update adapter status”) explicitly expose the adapter abstraction. If the intent is to keep adapters internal, consider rephrasing toward “status reports” / “object processing status”, as previously suggested.

Also applies to: 219-226, 261-262, 285-292

🧹 Nitpick comments (10)
openapi/openapi.yaml (4)

239-250: 201 on upserted status is slightly misleading but acceptable

The POST endpoints always return 201 even when an existing adapter status is updated. If you want stricter REST semantics you could return 200/204 on updates, but given this is an internal-ish control plane, documenting “create or update” as you’ve done is probably sufficient.

Also applies to: 300-311


372-377: Confirm backend supports orderBy=created_time as the default

Switching the default orderBy to created_time is consistent with the new field names, but it requires all list handlers to accept created_time as a valid sort key. Please double-check the query builders were updated from created_at to avoid runtime errors or silent ignore.


430-447: Optional: add maxItems on condition arrays for stronger OpenAPI contracts

Static analysis is flagging unbounded arrays (e.g., conditions on AdapterStatus, AdapterStatusCreateRequest, ClusterStatus, NodePoolStatus). If you expect a small bounded set of conditions per resource, adding a maxItems (and possibly minItems) would tighten the schema and help tooling/validation, though it’s not strictly required.

Also applies to: 536-540, 779-783, 1123-1127


882-931: NodePool and ResourceCondition schemas match the new status semantics

NodePool’s created_time/updated_time, NodePoolStatus’s last_updated_time/conditions, and ResourceCondition’s observed_generation + created_time + last_updated_time all line up with the described aggregation logic and adapter mapping. One thing to watch: DB status_observed_generation defaults to 0 while NodePoolStatus observed_generation has minimum: 1; ensure the aggregation layer never surfaces 0 to clients.

Also applies to: 1015-1066, 1084-1123, 1148-1175

pkg/db/migrations/202511111055_add_node_pools.go (1)

22-24: NodePool timestamps/status columns consistent, but check observed_generation semantics

The move to created_time/updated_time and status_last_updated_time/status_conditions mirrors clusters and matches the OpenAPI model. The DB default status_observed_generation INTEGER NOT NULL DEFAULT 0 is fine internally, but since NodePoolStatus.observed_generation is documented with minimum: 1, please confirm the aggregation logic never returns 0 in responses.

Also applies to: 41-47

pkg/handlers/nodepool_status.go (1)

64-65: Upsert-based nodepool status flow looks correct; consider logging aggregation errors

Switching to AdapterStatusFromOpenAPICreate + AdapterStatusService.Upsert implements the intended “resource_type + resource_id + adapter” upsert semantics cleanly. The follow-up call to UpdateNodePoolStatusFromAdapters is appropriate, but aggregateErr is currently ignored; adding a structured log or metric here would make debugging aggregation failures much easier without impacting API behavior.

Also applies to: 71-86, 90-97

test/integration/adapter_status_test.go (1)

26-39: Integration tests correctly follow new AdapterStatusCreateRequest shape; consider adding timestamp semantics checks

All the updated tests now use []openapi.ConditionRequest and the nested map[string]map[string]interface{} Data shape, which matches the new API contracts. You also have good coverage for upsert idempotency and paging. To fully exercise the new semantics, you might add a test that:

  • Sets a non-zero ObservedTime and asserts created_time/last_report_time behavior, and/or
  • Verifies last_transition_time is preserved across upserts when status is unchanged.

Also applies to: 60-71, 98-112, 135-144, 171-180, 204-217, 226-239, 279-288, 312-321

pkg/services/adapter_status.go (1)

11-21: Upsert wiring is straightforward; confirm error mapping via handleCreateError is desired

The Upsert method is correctly threaded through the service and DAO and returns the updated *api.AdapterStatus. Using handleCreateError keeps behavior in line with the existing Create path, but for errors clearly originating from updates (e.g., row not found, generic DB failure) you might prefer handleUpdateError or a dedicated upsert handler for more precise HTTP codes. Worth a quick check against your error-handling conventions.

Also applies to: 59-65

pkg/services/status_aggregation_test.go (1)

5-25: Good coverage for MapAdapterToConditionType; minor improvement to suffix-map cleanup

The tests nicely cover default mappings, the generic “Successful” fallback, and custom suffix behavior. In TestMapAdapterToConditionType_DefaultAfterCustom, you currently overwrite and then delete the "dns" entry without restoring its original value. Given tests run sequentially this is harmless today, but you could make the tests more future-proof by saving the prior value and restoring it after the assertion.

Also applies to: 28-39, 42-53

pkg/services/cluster.go (1)

137-141: Consider logging unmarshal failures for debugging.

Silently continuing on unmarshal failure is a reasonable defensive choice to prevent one malformed adapter from blocking aggregation. However, logging a warning would help diagnose data corruption issues.

 		var conditions []openapi.AdapterCondition
 		if err := json.Unmarshal(adapterStatus.Conditions, &conditions); err != nil {
+			logger.NewOCMLogger(ctx).Warningf("Failed to unmarshal conditions for adapter %s: %v", adapterStatus.Adapter, err)
 			continue // Skip if can't unmarshal
 		}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e2697bf and 50c4a80.

⛔ Files ignored due to path filters (1)
  • data/generated/openapi/openapi.go is excluded by !**/generated/**
📒 Files selected for processing (21)
  • .gitignore (1 hunks)
  • openapi/openapi.yaml (19 hunks)
  • pkg/api/adapter_status_types.go (6 hunks)
  • pkg/api/cluster_types.go (5 hunks)
  • pkg/api/metadata_types.go (1 hunks)
  • pkg/api/node_pool_types.go (5 hunks)
  • pkg/dao/adapter_status.go (3 hunks)
  • pkg/dao/cluster.go (2 hunks)
  • pkg/db/migrations/202511111044_add_clusters.go (2 hunks)
  • pkg/db/migrations/202511111055_add_node_pools.go (2 hunks)
  • pkg/db/migrations/202511111105_add_adapter_status.go (2 hunks)
  • pkg/handlers/cluster_nodepools_test.go (3 hunks)
  • pkg/handlers/cluster_status.go (1 hunks)
  • pkg/handlers/nodepool_status.go (1 hunks)
  • pkg/services/adapter_status.go (2 hunks)
  • pkg/services/cluster.go (2 hunks)
  • pkg/services/node_pool.go (2 hunks)
  • pkg/services/status_aggregation.go (1 hunks)
  • pkg/services/status_aggregation_test.go (1 hunks)
  • test/integration/adapter_status_test.go (9 hunks)
  • test/integration/clusters_test.go (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • .gitignore
🚧 Files skipped from review as they are similar to previous changes (5)
  • pkg/db/migrations/202511111105_add_adapter_status.go
  • test/integration/clusters_test.go
  • pkg/handlers/cluster_nodepools_test.go
  • pkg/services/status_aggregation.go
  • pkg/dao/adapter_status.go
🧰 Additional context used
🧬 Code graph analysis (8)
pkg/handlers/nodepool_status.go (1)
pkg/api/adapter_status_types.go (1)
  • AdapterStatusFromOpenAPICreate (93-143)
pkg/services/adapter_status.go (2)
pkg/api/adapter_status_types.go (1)
  • AdapterStatus (13-32)
pkg/errors/errors.go (1)
  • ServiceError (94-101)
pkg/dao/cluster.go (1)
pkg/db/context.go (1)
  • MarkForRollback (48-57)
pkg/services/status_aggregation_test.go (1)
pkg/services/status_aggregation.go (1)
  • MapAdapterToConditionType (60-83)
pkg/api/adapter_status_types.go (1)
pkg/api/metadata_types.go (2)
  • Meta (37-42)
  • Metadata (28-34)
pkg/handlers/cluster_status.go (1)
pkg/api/adapter_status_types.go (1)
  • AdapterStatusFromOpenAPICreate (93-143)
pkg/services/cluster.go (1)
pkg/services/status_aggregation.go (2)
  • MapAdapterToConditionType (60-83)
  • ComputePhase (96-194)
pkg/services/node_pool.go (2)
pkg/services/status_aggregation.go (2)
  • MapAdapterToConditionType (60-83)
  • ComputePhase (96-194)
pkg/errors/errors.go (1)
  • GeneralError (164-166)
🪛 Checkov (3.2.334)
openapi/openapi.yaml

[medium] 432-439: Ensure that arrays have a maximum number of items

(CKV_OPENAPI_21)

🔇 Additional comments (26)
pkg/dao/cluster.go (4)

4-4: LGTM!

The bytes import is necessary for the spec comparison logic in the Replace method.


69-72: LGTM!

The save operation correctly persists the cluster with the updated generation while omitting associations and handling errors appropriately.


53-58: I'll verify the transaction isolation concerns for the concurrent update scenario in the Replace method. Let me search for transaction isolation mechanisms, locking patterns, and how the generation field is handled.


60-66: Byte-level comparison appears semantically correct for tracking spec changes.

The Replace logic is straightforward: increment the generation when the spec bytes differ and preserve it when they are equal, which matches the intended Kubernetes-style generation tracking. bytes.Equal is only appropriate if Spec is persisted in a canonical byte form; the concern about JSON key ordering is valid in principle, but since all tests pass, the comparison evidently handles the serialization format actually used in the codebase. Direct inspection of the Cluster struct and the Spec field type was not possible here, so this assessment rests on the passing tests rather than on the code itself.
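For readers following along, a minimal Go sketch of the bump-on-change pattern being reviewed, assuming Spec is persisted as raw JSON bytes; the struct and function names here are illustrative, not the project's actual types:

```go
package dao

import "bytes"

// clusterRecord is a hypothetical, simplified stand-in for the real Cluster model.
type clusterRecord struct {
	Generation int32
	Spec       []byte // spec persisted as serialized JSON
}

// bumpGeneration increments Generation only when the stored spec bytes differ
// from the incoming spec; otherwise the existing generation is preserved.
func bumpGeneration(existing, incoming *clusterRecord) {
	if bytes.Equal(existing.Spec, incoming.Spec) {
		incoming.Generation = existing.Generation // spec unchanged
		return
	}
	incoming.Generation = existing.Generation + 1 // spec changed
}
```

Because the comparison is byte-level, two semantically identical specs serialized with different key order would still bump the generation, as noted in the nitpick later in this review.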

pkg/db/migrations/202511111044_add_clusters.go (1)

19-25: Cluster timestamps/status columns align with new status model

Renaming to created_time/updated_time and introducing status_last_updated_time/status_conditions matches the API schemas and keeps status data queryable via flattened columns plus JSONB. No issues spotted here.

Also applies to: 42-43

openapi/openapi.yaml (3)

415-487: Adapter status/condition schemas look consistent with DAO logic

AdapterCondition, AdapterStatus, AdapterStatusBase, and AdapterStatusCreateRequest match the Go-side shape (conditions array, observed_time driving created_time/last_report_time, nested metadata/data). This lines up with AdapterStatusFromOpenAPICreate and the upsert semantics.

Also applies to: 488-543, 572-595


618-685: Cluster/ClusterStatus fields and example align with condition-based model

The required fields (phase, last_transition_time, observed_generation, last_updated_time, conditions) and the example payload are coherent with the described aggregation semantics (phase + ResourceCondition list, earliest last_updated_time for staleness). Nothing obviously inconsistent here.

Also applies to: 739-783


790-819: ConditionBase/ConditionRequest separation correctly models server-managed timestamps

Making last_transition_time part of ConditionBase (response) while omitting it from ConditionRequest matches the “API-managed, only changes on status transitions” semantics you document. This is a clean separation for adapters vs. server responsibility.

Also applies to: 820-841

pkg/api/metadata_types.go (1)

38-42: Verify that structs embedding Meta do not have duplicate CreatedTime/UpdatedTime fields.

Moving the timestamps directly into the Meta struct could cause compile errors or field shadowing if any struct that embeds Meta also declares its own CreatedTime or UpdatedTime fields (for example, a pointer variant in AdapterStatus). Automated inspection of the embedding structs was not possible here, so confirm that each of them either relies solely on the Meta timestamps or has had its local timestamp fields removed.

pkg/handlers/cluster_status.go (1)

81-88: LGTM! Clean simplification using upsert semantics.

The refactor from conditional create/replace to a single Upsert call is cleaner and aligns with the Kubernetes-style upsert pattern. The AdapterStatusFromOpenAPICreate properly constructs the status, and error handling is preserved.

pkg/services/cluster.go (3)

167-177: Good fix for the nil pointer dereference issue.

The nil checks for CreatedTime and LastReportTime properly address the previous review comment. The ResourceCondition fields are only set when the pointers are non-nil.


188-197: Correct implementation of min(LastReportTime) for staleness detection.

The logic properly identifies the oldest adapter timestamp, which ensures Sentinel can detect when any adapter becomes stale. The nil-safe iteration and comparison are correctly implemented.
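A self-contained sketch of that min-timestamp strategy, with the adapter row type simplified to just the field that matters here (names are assumptions, not the real model):

```go
package services

import "time"

// adapterReport is a simplified stand-in for a stored adapter status row.
type adapterReport struct {
	LastReportTime *time.Time
}

// oldestReportTime returns the earliest non-nil LastReportTime across all
// adapters, so a single stale adapter pulls status.last_updated_time back far
// enough for a watcher to notice.
func oldestReportTime(reports []adapterReport) *time.Time {
	var oldest *time.Time
	for _, r := range reports {
		if r.LastReportTime == nil {
			continue
		}
		if oldest == nil || r.LastReportTime.Before(*oldest) {
			oldest = r.LastReportTime
		}
	}
	return oldest // nil when no adapter has reported yet
}
```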


199-225: Correct Kubernetes-style phase transition tracking.

The logic correctly updates StatusLastTransitionTime only when the phase actually changes or on first initialization, preserving transition timestamps across updates as expected in Kubernetes condition semantics.
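And a matching sketch of the transition-timestamp rule, again with simplified, hypothetical signatures:

```go
package services

import "time"

// applyPhase returns the new phase and the transition timestamp to store: the
// timestamp is replaced only when the phase changes or was never set, mirroring
// Kubernetes condition semantics.
func applyPhase(currentPhase string, lastTransition *time.Time, newPhase string, now time.Time) (string, *time.Time) {
	if lastTransition == nil || currentPhase != newPhase {
		return newPhase, &now
	}
	return newPhase, lastTransition // phase unchanged: keep the old timestamp
}
```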

pkg/api/cluster_types.go (3)

30-31: Field rename aligns with Kubernetes-style status semantics.

The rename from StatusUpdatedAt/StatusAdapters to StatusLastUpdatedTime/StatusConditions properly reflects the shift to condition-based status tracking.


141-156: LGTM! Proper initialization of StatusConditions.

Empty conditions array is the correct initial state for a newly created cluster before any adapters report status.


107-124: Handling of optional status timestamps could not be verified automatically.

The code defaults the optional status timestamps to zero values when the stored pointers are nil, mirroring the NodePool implementation; this could not be checked against the OpenAPI schema here, so see the zero-time discussion on this file in the later review pass.

pkg/api/node_pool_types.go (3)

`36-37`: **Consistent field structure with Cluster type.**

NodePool now mirrors the Cluster status field structure with `StatusLastUpdatedTime` and `StatusConditions`, ensuring API consistency.

---

`128-145`: **LGTM! Consistent status building pattern.**

The nil-check and default-time pattern mirrors the Cluster implementation, maintaining API consistency. The slight duplication between Cluster and NodePool is acceptable given they're distinct resource types.

---

`162-183`: **LGTM! Proper initialization for new NodePool.**

Consistent with Cluster creation pattern - empty conditions array represents the correct initial state.

pkg/api/adapter_status_types.go (4)

`70-89`: **Consistent nil-safe timestamp handling.**

The pattern of defaulting nil pointers to zero time matches the Cluster and NodePool implementations, maintaining API consistency.

---

`12-32`: **Well-structured AdapterStatus model with clear field documentation.**

The model correctly separates API-managed timestamps (`LastReportTime`, `CreatedTime`) from GORM-managed timestamps (in `Meta`). The inline comments clarify the purpose of each timestamp field.

---

`24-27`: **Pointer timestamps combined with `gorm:"not null"` need a consistency check.**

These fields are declared as `*time.Time` with a `gorm:"not null"` constraint; automated verification was not possible, so confirm that other timestamp fields in the codebase follow the same pointer-plus-not-null pattern, or that this combination is intentional here.

---

`104-114`: Standard Kubernetes condition semantics require that LastTransitionTime be preserved across updates unless the condition's status actually changes, rather than being set to now on every upsert. Whether the DAO's Upsert method implements this comparison could not be verified directly, so it is worth confirming.

**Verify whether the DAO's Upsert method preserves LastTransitionTime when condition status is unchanged.**

The code at lines 104-114 correctly sets `LastTransitionTime` to `now` for new conditions during creation. However, per Kubernetes semantics, when upserting existing adapters, the DAO layer must compare the old condition status with the new status and preserve the existing `LastTransitionTime` if the status hasn't changed. This requires:

1. Reading the existing condition entry
2. Comparing the new status to the existing status
3. Only updating `LastTransitionTime` when the status actually changes

Search for the DAO's Upsert method implementation and confirm it includes this comparison logic. If this logic is missing or incomplete, it will cause false transition timestamps on no-op updates.
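For context, this is roughly the comparison the DAO is expected to perform; a hedged sketch with simplified condition types, not the actual preserveLastTransitionTime implementation:

```go
package dao

import "time"

// condition is a simplified stand-in for an adapter condition.
type condition struct {
	Type               string
	Status             string // "True", "False", or "Unknown"
	LastTransitionTime time.Time
}

// mergeTransitionTimes keeps the previous LastTransitionTime for any condition
// whose Status is unchanged, and leaves the incoming timestamp in place when
// the status actually transitioned.
func mergeTransitionTimes(oldConds, newConds []condition) []condition {
	prev := make(map[string]condition, len(oldConds))
	for _, c := range oldConds {
		prev[c.Type] = c
	}
	for i, c := range newConds {
		if p, ok := prev[c.Type]; ok && p.Status == c.Status {
			newConds[i].LastTransitionTime = p.LastTransitionTime
		}
	}
	return newConds
}
```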

pkg/services/node_pool.go (3)

`168-175`: **Nil pointer dereference addressed.**

The nil checks for `CreatedTime` and `LastReportTime` resolve the issue raised in the previous review. The current approach includes the adapter even when timestamps are nil (leaving those fields at their zero value) rather than skipping the adapter entirely. This is reasonable since the Available condition is still valuable status information.

Note: If either timestamp is nil, the corresponding ResourceCondition field will have a zero time value (`0001-01-01`), which could be confusing in API responses but is unlikely in normal operation.

---

`189-226`: **Status timestamp and phase transition logic looks correct.**

The implementation correctly:
- Uses `min(adapters[].last_report_time)` for `StatusLastUpdatedTime` to detect stale adapters (lines 189-198, 217-218)
- Falls back to current time when no adapter timestamps are available (lines 220-221)
- Updates `StatusLastTransitionTime` only when the phase changes (lines 224-226)

This aligns with the Kubernetes-style semantics described in the PR objectives.

---

`187-187`: **`requiredNodePoolAdapters` definition and accessibility requires manual verification.**

The code references `requiredNodePoolAdapters`, but its definition could not be confirmed automatically. Verify that it is defined at package level in `pkg/services/node_pool.go`, imported from another package, or otherwise in scope for the function containing line 187.



allReady := true
anyFailed := false

if len(requiredAdapters) == 0 {
Contributor

Do we need this requiredAdapters?
I thought it was only an idea

Contributor

What we need for now is just a "number of required adapters" to compute the phase

numRequiredAdapters == numAdaptersAvailable  => Ready

It could be an array as you are proposing here instead of an integer.
My concern comes also from the code repetition in the if/else branch
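As a rough illustration of the count-based alternative described in this comment (the function name and shape are assumptions, not the merged code):

```go
package services

// computePhaseByCount sketches the integer-based readiness check proposed in
// this thread: Ready once every required adapter reports as available.
func computePhaseByCount(numRequiredAdapters, numAdaptersAvailable int) string {
	if numRequiredAdapters > 0 && numAdaptersAvailable >= numRequiredAdapters {
		return "Ready"
	}
	return "NotReady"
}
```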

Contributor Author

My bad memory. I forgot what we discussed before. Let me fix it.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (7)
pkg/api/adapter_status_types.go (1)

51-90: Conversion logic matches new semantics; JSON error handling is silently ignored

The updated ToOpenAPI and AdapterStatusFromOpenAPICreate correctly implement the new semantics:

  • Conditions are represented as []openapi.AdapterCondition and marshaled/unmarshaled accordingly.
  • ObservedTime from the request is used to set CreatedTime/LastReportTime and initial LastTransitionTime for all conditions.
  • Data is treated as a nested map, and metadata uses AdapterStatusBaseMetadata, matching the OpenAPI shapes.
  • Nil timestamps fall back to zero time.Time, which is acceptable as a defensive fallback.

The only nit is that all JSON marshal/unmarshal errors are ignored (_ = json.Marshal/Unmarshal). If malformed persisted data ever appears, it will be silently dropped rather than surfaced. Consider at least logging these errors or returning a service error on marshal failures so bad data is detectable.

Also applies to: 97-142

openapi/openapi.yaml (1)

415-487: AdapterStatus/AdapterCondition schemas align with service/model behavior; the conditions array could optionally be bounded in size

The new AdapterCondition, AdapterStatus, AdapterStatusBase, AdapterStatusCreateRequest, and AdapterStatusList schemas match the Go model and service semantics:

  • AdapterStatus requires conditions, created_time, and last_report_time, matching the API‑managed fields.
  • AdapterStatusBase captures adapter, observed_generation, metadata, and data as used by the DAO and services.
  • AdapterStatusCreateRequest puts observed_time and observed_generation at the adapter‑status level, with per‑condition details in ConditionRequest, consistent with AdapterStatusFromOpenAPICreate.
  • AdapterStatusList’s kind and example align with how the nodepool/cluster status handlers build the list response.

If you care about satisfying Checkov’s CKV_OPENAPI_21 warning, you could optionally add a reasonable maxItems to conditions (and similar arrays) to bound list sizes, but from a functional/API perspective the current unbounded arrays are acceptable.

Also applies to: 487-523, 525-572, 572-618

pkg/dao/cluster.go (1)

3-5: Generation bump-on-spec-change logic in Replace is correct with minor JSON-equality caveat

The updated Replace implementation:

  • Fetches the existing cluster,
  • Compares existing.Spec vs cluster.Spec with bytes.Equal, and
  • Increments Generation only when the spec bytes differ, otherwise preserves it,

which matches the intended “generation tracks spec changes” behavior while keeping the rest of the save path unchanged.

Note that because the comparison is byte‑level on JSON, spec updates that are semantically identical but serialized differently (e.g., key order changes) will still bump Generation. That’s safe and may be acceptable, just something to be aware of if generation is used for fine‑grained change detection.

Also applies to: 50-74

pkg/handlers/nodepool_status.go (1)

62-88: NodePool status POST correctly uses centralized upsert; consider logging aggregation failures

The Create handler now:

  • Verifies the nodepool exists,
  • Builds an AdapterStatus via AdapterStatusFromOpenAPICreate("NodePool", nodePoolID, &req), and
  • Persists it through adapterStatusService.Upsert,

which matches the new upsert semantics and OpenAPI contract.

Afterward it calls UpdateNodePoolStatusFromAdapters but silently ignores any error despite the comment about logging. It would be helpful to at least log aggregateErr so aggregation failures are visible operationally while still not failing the client request.

Also applies to: 90-97

pkg/handlers/cluster_status.go (1)

90-95: Consider logging the aggregation error for observability.

The empty if block swallows the error silently. While not failing the request is correct, logging the error would help with debugging and monitoring.

 			// Trigger status aggregation
 			_, aggregateErr := h.clusterService.UpdateClusterStatusFromAdapters(ctx, clusterID)
 			if aggregateErr != nil {
-				// Log error but don't fail the request
-				// The status will be computed on next update
+				// Log error but don't fail the request - status will be computed on next update
+				log := logger.NewOCMLogger(ctx)
+				log.Warning(fmt.Sprintf("Failed to aggregate cluster status for %s: %v", clusterID, aggregateErr))
 			}

This would require adding imports for logger and fmt.

pkg/api/node_pool_types.go (1)

91-95: Consider logging unmarshal errors for StatusConditions.

The error from unmarshaling StatusConditions is silently ignored (line 94), which is consistent with how Spec (line 82) and Labels (line 88) are handled. However, this pattern could hide data corruption issues in production.

Consider logging unmarshal errors to aid debugging:

 	// Unmarshal StatusConditions
 	var statusConditions []openapi.ResourceCondition
 	if len(np.StatusConditions) > 0 {
-		_ = json.Unmarshal(np.StatusConditions, &statusConditions)
+		if err := json.Unmarshal(np.StatusConditions, &statusConditions); err != nil {
+			// Log error but continue with empty slice
+			// logger could be passed as context or use a package-level logger
+		}
 	}
pkg/services/node_pool.go (1)

118-235: Consider extracting common status aggregation logic.

The UpdateNodePoolStatusFromAdapters function has significant code duplication with UpdateClusterStatusFromAdapters in cluster.go. Both functions follow the same pattern: build ResourceCondition list, compute phase, calculate min LastReportTime, marshal conditions, and update timestamps.

Consider extracting the common logic into a generic helper function:

func aggregateStatusFromAdapters(
    ctx context.Context,
    resourceID string,
    resourceType string,
    resourceGeneration int32,
    requiredAdapters []string,
    adapterStatusDao dao.AdapterStatusDao,
) (
    conditions []openapi.ResourceCondition,
    phase string,
    minLastReportTime *time.Time,
    maxObservedGeneration int32,
    err error,
) {
    // Common aggregation logic
}

This would improve maintainability and reduce the risk of inconsistencies when updating status aggregation logic.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 50c4a80 and 2184d59.

⛔ Files ignored due to path filters (1)
  • data/generated/openapi/openapi.go is excluded by !**/generated/**
📒 Files selected for processing (22)
  • .gitignore (1 hunks)
  • Makefile (2 hunks)
  • openapi/openapi.yaml (19 hunks)
  • pkg/api/adapter_status_types.go (6 hunks)
  • pkg/api/cluster_types.go (5 hunks)
  • pkg/api/metadata_types.go (1 hunks)
  • pkg/api/node_pool_types.go (5 hunks)
  • pkg/dao/adapter_status.go (3 hunks)
  • pkg/dao/cluster.go (2 hunks)
  • pkg/db/migrations/202511111044_add_clusters.go (2 hunks)
  • pkg/db/migrations/202511111055_add_node_pools.go (2 hunks)
  • pkg/db/migrations/202511111105_add_adapter_status.go (2 hunks)
  • pkg/handlers/cluster_nodepools_test.go (3 hunks)
  • pkg/handlers/cluster_status.go (1 hunks)
  • pkg/handlers/nodepool_status.go (1 hunks)
  • pkg/services/adapter_status.go (2 hunks)
  • pkg/services/cluster.go (2 hunks)
  • pkg/services/node_pool.go (2 hunks)
  • pkg/services/status_aggregation.go (1 hunks)
  • pkg/services/status_aggregation_test.go (1 hunks)
  • test/integration/adapter_status_test.go (9 hunks)
  • test/integration/clusters_test.go (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • .gitignore
🚧 Files skipped from review as they are similar to previous changes (4)
  • pkg/db/migrations/202511111105_add_adapter_status.go
  • pkg/services/status_aggregation_test.go
  • pkg/api/cluster_types.go
  • pkg/api/metadata_types.go
🧰 Additional context used
🧬 Code graph analysis (10)
pkg/handlers/cluster_nodepools_test.go (2)
pkg/api/node_pool_types.go (1)
  • NodePool (14-42)
pkg/api/metadata_types.go (1)
  • Meta (37-42)
pkg/api/adapter_status_types.go (1)
pkg/api/metadata_types.go (2)
  • Meta (37-42)
  • Metadata (28-34)
pkg/handlers/cluster_status.go (1)
pkg/api/adapter_status_types.go (1)
  • AdapterStatusFromOpenAPICreate (93-143)
pkg/dao/cluster.go (1)
pkg/db/context.go (1)
  • MarkForRollback (48-57)
pkg/handlers/nodepool_status.go (1)
pkg/api/adapter_status_types.go (1)
  • AdapterStatusFromOpenAPICreate (93-143)
pkg/dao/adapter_status.go (3)
pkg/api/adapter_status_types.go (1)
  • AdapterStatus (13-32)
pkg/errors/errors.go (1)
  • New (104-119)
pkg/db/context.go (1)
  • MarkForRollback (48-57)
pkg/services/status_aggregation.go (2)
pkg/api/adapter_status_types.go (1)
  • AdapterStatusList (34-34)
pkg/logger/logger.go (1)
  • NewOCMLogger (36-44)
pkg/services/adapter_status.go (2)
pkg/api/adapter_status_types.go (1)
  • AdapterStatus (13-32)
pkg/errors/errors.go (1)
  • ServiceError (94-101)
pkg/services/cluster.go (1)
pkg/services/status_aggregation.go (2)
  • MapAdapterToConditionType (60-83)
  • ComputePhase (96-194)
pkg/services/node_pool.go (2)
pkg/services/status_aggregation.go (2)
  • MapAdapterToConditionType (60-83)
  • ComputePhase (96-194)
pkg/errors/errors.go (1)
  • GeneralError (164-166)
🪛 Checkov (3.2.334)
openapi/openapi.yaml

[medium] 432-439: Ensure that arrays have a maximum number of items

(CKV_OPENAPI_21)

🔇 Additional comments (38)
Makefile (1)

214-214: Ensure destination directory exists before copying generated files.

The addition of mkdir -p data/generated/openapi in both generate and generate-vendor targets correctly ensures the destination directory exists before the subsequent Docker cp commands copy files into it. The -p flag makes the command idempotent, and the placement is logical—immediately before the cp operations that depend on the directory.

Also applies to: 224-224

pkg/services/adapter_status.go (1)

15-21: AdapterStatus Upsert wiring looks consistent with existing service patterns

The new Upsert method is cleanly added to the interface and implementation, delegates to adapterStatusDao.Upsert, and reuses the existing create-style error handling which matches the POST upsert semantics.

Also applies to: 59-65

openapi/openapi.yaml (3)

44-47: Status endpoints’ upsert semantics are clearly documented and match handlers

The updated descriptions/summaries for the cluster/nodepool /statuses endpoints now clearly state that POST is a create‑or‑update (upsert by adapter) and that the full adapter status with all conditions is returned, which matches the handler and service behavior.

Also applies to: 183-226, 258-292


372-378: orderBy default switched to created_time and matches schema/migrations

Setting QueryParams.orderBy.default to created_time aligns with the renamed timestamp columns in the migrations and API types. This keeps list ordering behavior consistent with the rest of the codebase.


618-675: Both ClusterStatus and NodePoolStatus have identical observed_generation constraints; the issue is not unique to NodePool

Verification reveals that both ClusterStatus and NodePoolStatus define observed_generation with identical schema constraints: type: integer, format: int32, and minimum: 1. The Cluster and NodePool examples both show observed_generation: 1, consistent with the schema.

However, the review comment identifies only NodePoolStatus as problematic, missing that ClusterStatus has the same constraint. If the underlying concern about DB defaults and service initialization is valid, it would apply equally to both schemas, not NodePool alone. Additionally, ResourceCondition correctly omits the minimum: 1 constraint on its observed_generation field (it has no required constraint), which aligns with the schema's flexibility for condition-level generation tracking.

Without access to migration files or service initialization code in this repository, the actual conflict between schema constraints and DB defaults cannot be verified. Consider:

  • Clarifying whether the minimum: 1 constraint also applies to ClusterStatus (it does) and whether Cluster creation has the same initialization concern
  • Verifying service code ensures both Cluster and NodePool status values respect the schema before returning responses, or adjusting the minimum constraint consistently across both schemas
test/integration/clusters_test.go (1)

38-43: Cluster timestamp assertions correctly updated to CreatedTime/UpdatedTime

The test now compares clusterOutput.CreatedTime and UpdatedTime with the model’s CreatedTime/UpdatedTime using BeTemporally("~", ...), which is consistent with the renamed fields and tolerant of small timing differences.
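For readers unfamiliar with the matcher, a minimal standalone example of the tolerant comparison; the timestamps and threshold here are made up for illustration:

```go
package integration_test

import (
	"testing"
	"time"

	. "github.com/onsi/gomega"
)

func TestTimestampsAreClose(t *testing.T) {
	g := NewWithT(t)

	created := time.Now()
	stored := created.Add(200 * time.Millisecond)

	// The "~" comparator tolerates a difference up to the given threshold.
	g.Expect(stored).To(BeTemporally("~", created, time.Second))
}
```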

pkg/db/migrations/202511111055_add_node_pools.go (1)

20-24: NodePool migration schema matches new timestamp and condition-based status model

The node_pools table now uses created_time/updated_time and introduces status_last_updated_time and status_conditions to store aggregated status, which is consistent with the new API/status structures. Indexes and rollback logic remain correct and symmetrical.

Also applies to: 39-47

pkg/db/migrations/202511111044_add_clusters.go (1)

19-21: Cluster migration aligns with condition-based status and timestamp renames

The clusters table definition now uses created_time/updated_time and status_last_updated_time/status_conditions, which matches the updated Cluster/ClusterStatus API and the move to JSONB‑backed status conditions. Index creation and rollback remain correct and consistent.

Also applies to: 22-26, 38-44

pkg/api/adapter_status_types.go (1)

14-32: The concern is technically well-founded: per GORM's documentation, embedded structs whose field names collide with the outer struct's fields produce ambiguous column mappings unless handled via `embeddedPrefix`, a named embedded field with `gorm:"embedded"`, or explicit `gorm:"column:..."` tags. Whether Meta actually contains a CreatedTime field, how it is embedded in AdapterStatus, and whether column overrides and matching migrations are in place could not be verified directly.
Verify GORM column mapping for Meta.CreatedTime and AdapterStatus.CreatedTime

AdapterStatus embeds Meta and also declares CreatedTime *time.Time with GORM tag gorm:"not null". Per GORM's handling of embedded structs, if Meta contains a CreatedTime field, both fields will attempt to map to the created_time column.

Unless Meta is explicitly embedded with gorm:"embedded;embeddedPrefix:..." or the duplicate fields have explicit gorm:"column:..." tags, GORM will produce ambiguous or conflicting column mappings.

Confirm:

  • How Meta is embedded and whether it contains CreatedTime
  • Whether GORM column tag overrides are in place
  • That migrations/schema reflect the intended mapping
test/integration/adapter_status_test.go (7)

29-38: LGTM! Test payload correctly updated for new API types.

The test correctly uses []openapi.ConditionRequest and the nested map[string]map[string]interface{} structure for Data, aligning with the updated OpenAPI schema.


65-70: LGTM!

The loop correctly creates adapter statuses using the updated ConditionRequest type.


102-111: LGTM!

NodePool status creation test correctly uses the updated API types.


138-143: LGTM!

Correctly uses ConditionRequest for the nodepool status retrieval test.


174-179: LGTM!

Pagination test correctly validates page size constraints.


192-264: Excellent test coverage for upsert semantics.

This test properly validates the core idempotency requirement: posting the same adapter twice should update rather than duplicate. The verification at line 262 ensures exactly one entry exists, confirming the upsert behavior.


331-357: LGTM! Good pagination consistency verification.

The test properly validates no duplicates or missing items across pages, with appropriate safety limits to prevent infinite loops.

pkg/handlers/cluster_nodepools_test.go (2)

154-157: LGTM! Field names correctly updated.

The test literals now use CreatedTime and UpdatedTime to match the updated Meta struct definition in pkg/api/metadata_types.go.


163-171: LGTM!

NodePool test data correctly uses the updated Meta field names.

pkg/handlers/cluster_status.go (1)

81-88: LGTM! Clean upsert flow.

The simplified approach using AdapterStatusFromOpenAPICreate followed by Upsert is cleaner than the previous find-then-create/replace pattern. The upsert handles the create-or-update logic internally.

pkg/dao/adapter_status.go (3)

67-83: LGTM! Proper error handling using errors.Is.

The upsert correctly uses errors.Is(err, gorm.ErrRecordNotFound) to detect not-found conditions, addressing the previous review feedback. The flow is clear: find existing → create if not found → update if exists.


85-105: LGTM! Upsert update logic correctly preserves timestamps.

The implementation correctly:

  1. Preserves the original ID and CreatedTime from the existing record
  2. Updates LastReportTime to current time on every POST
  3. Applies Kubernetes condition semantics via preserveLastTransitionTime
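
A condensed sketch of those upsert timestamp rules, with the persisted row reduced to the fields involved; names are assumptions rather than the actual DAO types:

```go
package dao

import "time"

// adapterStatusRow is a simplified stand-in for the persisted adapter status.
type adapterStatusRow struct {
	ID             string
	CreatedTime    *time.Time
	LastReportTime *time.Time
}

// applyUpsertTimestamps carries identity and first-report time over from the
// existing row and stamps the latest report time on every POST.
func applyUpsertTimestamps(existing, incoming *adapterStatusRow, now time.Time) {
	incoming.ID = existing.ID
	incoming.CreatedTime = existing.CreatedTime // preserved across updates
	incoming.LastReportTime = &now              // refreshed on every report
}
```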

165-212: LGTM! Correct implementation of Kubernetes condition semantics.

The preserveLastTransitionTime helper correctly:

  1. Preserves LastTransitionTime when a condition's Status is unchanged
  2. Uses the new LastTransitionTime when Status changes
  3. Handles edge cases gracefully (empty conditions, unmarshal errors return new conditions as-is)
pkg/services/status_aggregation.go (6)

25-39: LGTM! Required adapters clearly defined.

The required adapter lists are well-documented with comments explaining each adapter's purpose. This aligns with the HyperFleet MVP scope.


60-83: LGTM! Clean PascalCase conversion.

The function correctly handles hyphenated adapter names and the len(part) > 0 check at line 73 handles edge cases like multiple consecutive hyphens.
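A minimal sketch of that hyphen-to-PascalCase mapping, assuming a fixed "Successful" suffix; the real MapAdapterToConditionType may differ in details:

```go
package services

import "strings"

// conditionTypeFor converts an adapter name such as "gcp-provisioner" into a
// PascalCase condition type like "GcpProvisionerSuccessful".
func conditionTypeFor(adapter string) string {
	var b strings.Builder
	for _, part := range strings.Split(adapter, "-") {
		if len(part) == 0 {
			continue // tolerate consecutive hyphens
		}
		b.WriteString(strings.ToUpper(part[:1]))
		b.WriteString(part[1:])
	}
	b.WriteString("Successful")
	return b.String()
}
```

With this shape, "validator" maps to "ValidatorSuccessful" and "gcp-provisioner" to "GcpProvisionerSuccessful", matching the mapping described in the PR.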


96-100: LGTM! Good early return for empty statuses.

The function correctly returns "NotReady" immediately when there are no adapter statuses, avoiding unnecessary processing.


136-184: LGTM! Comprehensive phase computation logic.

The dual-branch structure (with/without required adapters) is clear and handles:

  • Missing required adapters
  • Stale generation detection
  • Failed adapters (status == "False")
  • Unknown status values

The warning logs provide good observability for debugging.


186-193: LGTM! Clear phase determination.

The phase priority (Ready → Failed → NotReady) correctly implements the documented semantics.
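That precedence, as a hedged sketch; the two booleans are assumed to come from the adapter scan described above:

```go
package services

// resolvePhase applies the documented precedence: Ready only when every
// required adapter is available and current, Failed when any adapter reports
// False, otherwise NotReady.
func resolvePhase(allReady, anyFailed bool) string {
	switch {
	case allReady:
		return "Ready"
	case anyFailed:
		return "Failed"
	default:
		return "NotReady"
	}
}
```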


25-39: Usage of these lists could not be verified automatically, so a manual check is needed.
Verify that required adapter lists are used by calling code.

The requiredClusterAdapters and requiredNodePoolAdapters variables are defined but not used within this file. Verify they are properly referenced by the cluster and nodepool services to ensure they are not dead code.

pkg/api/node_pool_types.go (4)

36-37: LGTM - Field renaming aligns with Kubernetes conventions.

The renaming of status fields (StatusLastUpdatedTime, StatusConditions) improves consistency with Kubernetes-style status semantics.


122-126: LGTM - Timestamp field mappings updated correctly.

The field mappings now use CreatedTime/UpdatedTime and include audit fields (CreatedBy/UpdatedBy), aligning with the PR objectives.


128-145: LGTM - Status building with proper nil handling.

The nil checks for StatusLastTransitionTime and StatusLastUpdatedTime (lines 130-137) properly default to zero-value timestamps, which is appropriate for Kubernetes-style status fields.


162-179: LGTM - Creation logic correctly initializes StatusConditions.

The initialization of StatusConditions with an empty ResourceCondition slice (line 163) is appropriate for newly created NodePools that haven't received any adapter status updates yet.

pkg/services/cluster.go (3)

167-175: LGTM - Nil checks properly address previous review feedback.

The nil checks for CreatedTime (lines 168-170) and LastReportTime (lines 173-175) correctly prevent the panic identified in past reviews. Zero-value timestamps are appropriate defaults when the source pointers are nil.


188-220: LGTM - Min LastReportTime strategy correctly surfaces stale adapters.

The calculation of min(adapters[].last_report_time) for StatusLastUpdatedTime (lines 188-197, 214-220) is well-designed. By using the oldest adapter timestamp, Sentinel can detect when any single adapter becomes stale, which is the correct behavior for monitoring adapter health.


222-225: LGTM - Phase transition tracking follows Kubernetes semantics.

The LastTransitionTime is correctly updated only when the phase changes or when it's uninitialized (line 223), preserving the timestamp when the phase remains stable.

pkg/services/node_pool.go (3)

167-175: LGTM - Nil checks properly implemented.

The nil checks for CreatedTime and LastReportTime (lines 168-175) correctly prevent potential panics, consistent with the fix in cluster.go.


185-187: LGTM - Correct generation handling for NodePools.

Passing 0 for resourceGeneration (line 187) correctly skips generation matching for NodePools, as they don't implement generation tracking. The comment clearly explains this design decision.


189-226: LGTM - Status aggregation logic is correct.

The status aggregation logic (min LastReportTime calculation, phase updates, condition marshaling) is correct and consistent with cluster.go implementation.

…pping

  - Rename timestamp fields: CreatedAt/UpdatedAt → CreatedTime/UpdatedTime
  - Add last_report_time field to adapter_status table
  - Implement Upsert for adapter status (POST creates or updates)
  - Add adapter-to-condition mapping: validator→ValidatorSuccessful, dns→DnsSuccessful
  - Convert ConditionRequest to AdapterCondition with LastTransitionTime
  - Status Aggregation
  - Update all tests for new field names and types

  All unit tests and integration tests passing.

update

update

update

update

update

update

update

update

update

update
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
pkg/handlers/nodepool_status.go (1)

90-95: Add logging for aggregation errors.

The comment states "Log error but don't fail the request", but the error is silently discarded without any actual logging. This makes debugging aggregation issues difficult.

 			// Trigger status aggregation
 			_, aggregateErr := h.nodePoolService.UpdateNodePoolStatusFromAdapters(ctx, nodePoolID)
 			if aggregateErr != nil {
 				// Log error but don't fail the request
-				// The status will be computed on next update
+				// The status will be computed on next update
+				log.Printf("Warning: failed to aggregate nodepool status for %s: %v", nodePoolID, aggregateErr)
 			}

You'll need to import the log package, or use your project's structured logging if available.

pkg/api/adapter_status_types.go (1)

14-27: Clarify dual CreatedTime semantics on AdapterStatus (Meta vs. adapter-level).

AdapterStatus now embeds Meta (which, based on its usage in Cluster, presumably includes a CreatedTime/UpdatedTime) and also declares its own CreatedTime *time.Time for API-managed timestamps. That gives you two CreatedTime fields on the same model:

  • Meta.CreatedTime (record-level)
  • AdapterStatus.CreatedTime (adapter “first report” time)

This is easy to misread and, unless Meta.CreatedTime is mapped to a different DB column via tags, risks GORM trying to map both fields to a created_time column. I’d strongly consider either reusing Meta.CreatedTime for the API’s created_time, or renaming/retagging the adapter-specific field (e.g., FirstReportTime with gorm:"column:first_report_time") so the intent and column mapping are unambiguous.
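If the retagging route is taken, one possible shape; field and column names below are assumptions, not the project's actual model:

```go
package api

import "time"

// Meta is a minimal stand-in for the project's embedded metadata struct.
type Meta struct {
	ID          string
	CreatedTime *time.Time
	UpdatedTime *time.Time
}

// adapterStatusSketch shows one way to keep the record-level and adapter-level
// timestamps on distinct columns so GORM's mapping stays unambiguous.
type adapterStatusSketch struct {
	Meta // assumed to own the created_time / updated_time columns

	// Adapter-level "first report" time stored under its own column name.
	FirstReportTime *time.Time `gorm:"column:first_report_time"`
	LastReportTime  *time.Time `gorm:"column:last_report_time"`
}
```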

🧹 Nitpick comments (14)
openapi/openapi.yaml (2)

431-437: Consider adding maxItems constraint to conditions array.

The static analysis tool flagged that the conditions array lacks a maximum item limit. While not strictly necessary for internal APIs, adding a reasonable maxItems (e.g., 50 or 100) can provide defense-in-depth against unexpectedly large payloads.

         conditions:
           type: array
+          maxItems: 50
           items:
             $ref: '#/components/schemas/AdapterCondition'

779-782: Same maxItems consideration applies to status conditions.

The conditions arrays in ClusterStatus (line 779) and NodePoolStatus (line 1123) could also benefit from a maxItems constraint for consistency with the suggestion above.

Also applies to: 1123-1126

pkg/handlers/cluster_status.go (1)

90-95: Empty error handling block should log the error.

The comment says "Log error but don't fail the request," yet the block is empty and doesn't actually log anything. Consider adding a log statement.

 			// Trigger status aggregation
 			_, aggregateErr := h.clusterService.UpdateClusterStatusFromAdapters(ctx, clusterID)
 			if aggregateErr != nil {
-				// Log error but don't fail the request
-				// The status will be computed on next update
+				// Log error but don't fail the request - status will be computed on next update
+				// TODO: Add proper logging here
 			}

Alternatively, inject a logger and call log.Warning(...) to surface this for observability.

pkg/dao/adapter_status.go (1)

165-212: Consider adding observability for silent error fallbacks.

The function gracefully handles unmarshal/marshal errors by returning the input as-is, which is reasonable for condition preservation. However, these silent fallbacks could mask data corruption or schema mismatches.

Consider adding debug-level logging when fallbacks are triggered to aid troubleshooting:

 	if len(oldConditionsJSON) > 0 {
 		if err := json.Unmarshal(oldConditionsJSON, &oldConditions); err != nil {
-			// If we can't unmarshal old conditions, return new conditions as-is
+			// If we can't unmarshal old conditions, return new conditions as-is
+			// Consider logging at debug level for observability
 			return newConditionsJSON
 		}
 	}
pkg/services/status_aggregation.go (1)

114-129: Silent unmarshal failure could hide data issues.

If json.Unmarshal fails on line 115, the adapter is silently excluded from the map, which could cause unexpected "NotReady" phases. Consider logging a warning similar to other paths in this function.

 		if len(adapterStatus.Conditions) > 0 {
-			if err := json.Unmarshal(adapterStatus.Conditions, &conditions); err == nil {
+			if err := json.Unmarshal(adapterStatus.Conditions, &conditions); err != nil {
+				log.Warning(fmt.Sprintf("Failed to unmarshal conditions for adapter '%s': %v", adapterStatus.Adapter, err))
+				continue
+			}
+			{
 				for _, cond := range conditions {
 					if cond.Type == "Available" {

Or simply add a warning in the else branch of the existing structure.

pkg/api/node_pool_types.go (1)

91-95: Silent unmarshal failure for StatusConditions.

The _ = json.Unmarshal(...) pattern silently discards errors. If StatusConditions contains malformed JSON, statusConditions will be an empty slice, which may be misleading in the API response.

Consider logging a warning or returning an error for debugging purposes, consistent with how critical data corruption should be surfaced.

pkg/services/cluster.go (1)

139-141: Consider logging unmarshal failures for debugging.

When json.Unmarshal fails on adapter conditions (line 139), the adapter is silently skipped. This could make it difficult to diagnose why certain adapters aren't being aggregated.

 		var conditions []openapi.AdapterCondition
 		if err := json.Unmarshal(adapterStatus.Conditions, &conditions); err != nil {
+			log := logger.NewOCMLogger(ctx)
+			log.Warning(fmt.Sprintf("Failed to unmarshal conditions for adapter '%s': %v", adapterStatus.Adapter, err))
 			continue // Skip if can't unmarshal
 		}
pkg/api/cluster_types.go (2)

24-36: Status field additions look correct; minor doc inconsistency with Meta.

StatusLastUpdatedTime and StatusConditions align with the new condition/status model and give you room to store both timing and condition JSON cleanly. One nit: the embedded Meta comment still says “Contains ID, CreatedAt, UpdatedAt, DeletedAt”, while downstream code (e.g., ToOpenAPI using c.CreatedTime/c.UpdatedTime) suggests those fields have been renamed to CreatedTime/UpdatedTime. Consider updating the comment so future readers aren’t confused about which timestamp fields exist.


81-85: Status conditions unmarshalling and zero-time defaults are consistent but hide bad data.

Unmarshalling StatusConditions into []openapi.ResourceCondition and defaulting LastTransitionTime/LastUpdatedTime to time.Time{} when the DB fields are nil is consistent with how Spec and Labels are handled and ensures the OpenAPI ClusterStatus always carries non-nil timestamps.

The trade-off is that corrupted status_conditions JSON or missing times silently turn into empty slices / zero timestamps, which can make debugging harder and makes it impossible for clients to distinguish “never set” from “zero time”. If that distinction ever matters, consider at least logging the unmarshal errors and/or making the OpenAPI fields pointers so nil can mean “not set”.

Also applies to: 107-124

pkg/services/node_pool.go (3)

132-180: AdapterStatus → ResourceCondition mapping and nil-safe timestamps look solid.

The flow of:

  • unmarshalling adapterStatus.Conditions as []openapi.AdapterCondition,
  • selecting the "Available" condition,
  • mapping to a ResourceCondition type via MapAdapterToConditionType, and
  • populating CreatedTime/LastUpdatedTime only when the source pointers are non-nil

matches the new condition model and avoids the nil dereference risk called out in earlier reviews. One small improvement would be to log when unmarshalling fails or no "Available" condition is found, so missing/bad adapter data is visible during debugging instead of being silently skipped.


185-199: Using min(LastReportTime) for StatusLastUpdatedTime matches the “stale adapter” design.

Computing minLastUpdatedTime over all adapterStatuses[].LastReportTime (with a now fallback when none are set) correctly surfaces the oldest adapter report as the nodepool’s status.last_updated_time, which is exactly what you describe for Sentinel’s staleness detection. Reusing the adapter’s timestamp pointer in nodePool.StatusLastUpdatedTime is fine in Go (the underlying time.Time escapes to the heap), but if you ever start mutating those timestamps in-place, consider copying the value instead of the pointer to avoid surprising aliasing.
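A compact sketch of that aggregation, including the value copy suggested above, could be the following; the field names are assumed from the review text, and time comes from the standard library.

now := time.Now().UTC()
var minLastUpdated *time.Time
for _, as := range adapterStatuses {
	if as.LastReportTime == nil {
		continue
	}
	t := *as.LastReportTime // copy the value rather than aliasing the adapter's pointer
	if minLastUpdated == nil || t.Before(*minLastUpdated) {
		minLastUpdated = &t
	}
}
if minLastUpdated == nil {
	minLastUpdated = &now // no adapter has reported yet
}
nodePool.StatusLastUpdatedTime = minLastUpdated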


200-226: Phase and transition update logic is coherent with generation semantics.

Capturing oldPhase before recomputing, setting StatusPhase/StatusObservedGeneration, and only bumping StatusLastTransitionTime when oldPhase != newPhase (or when it was previously nil) gives nice Kubernetes-style semantics. Using resourceGeneration=0 for nodepools when calling ComputePhase also matches the design where only clusters are gated on spec generation. If you standardize on UTC elsewhere, you might want to use time.Now().UTC() here for consistency.
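The transition rule itself reduces to a few lines; the ComputePhase signature and the field names in this sketch are assumptions for illustration only.

oldPhase := nodePool.StatusPhase
newPhase := ComputePhase(adapterStatuses, requiredNodePoolAdapters, 0) // nodepools carry no generation
nodePool.StatusPhase = newPhase
if nodePool.StatusLastTransitionTime == nil || oldPhase != newPhase {
	now := time.Now().UTC() // UTC, per the consistency note above
	nodePool.StatusLastTransitionTime = &now
}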

pkg/api/adapter_status_types.go (2)

53-56: ToOpenAPI defaulting keeps timestamps non-nil; just be aware of zero-time semantics.

Switching to []openapi.AdapterCondition, map[string]map[string]interface{}, and *openapi.AdapterStatusBaseMetadata and then normalizing CreatedTime/LastReportTime to zero time when the DB fields are nil ensures the OpenAPI AdapterStatus always has concrete timestamp values. That’s convenient for clients but makes “never set” indistinguishable from “0001-01-01T00:00:00Z”. If clients ever need to know “no report yet”, it might be worth exposing these as pointer fields in the OpenAPI type or at least documenting that zero time == unset.

Also applies to: 59-69, 70-89


97-115: Create path’s use of ObservedTime is consistent; consider standardizing on UTC.

Using req.ObservedTime (when non-zero) as the basis for both CreatedTime and LastReportTime, and as LastTransitionTime for each initial AdapterCondition, lines up well with the “adapter-reported observation” semantics and with the later upsert rules described in the DAO. The JSON marshaling for conditions/data/metadata also looks straightforward.

One small improvement: you may want to call time.Now().UTC() instead of time.Now() so server-generated timestamps are consistently in UTC, which tends to play nicer with cross-region consumers and K8s-style tooling.

Also applies to: 117-125, 140-142
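Concretely, the observed-time handling with the suggested UTC default could look like this sketch, where req is the OpenAPI create payload and the field names are assumptions rather than the verified code.

observed := req.ObservedTime
if observed.IsZero() {
	observed = time.Now().UTC() // server-generated timestamps stay in UTC
}
status.CreatedTime = &observed
status.LastReportTime = &observed
for i := range conditions {
	conditions[i].LastTransitionTime = observed
}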

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2184d59 and e9a487c.

⛔ Files ignored due to path filters (1)
  • data/generated/openapi/openapi.go is excluded by !**/generated/**
📒 Files selected for processing (22)
  • .gitignore (1 hunks)
  • Makefile (2 hunks)
  • openapi/openapi.yaml (19 hunks)
  • pkg/api/adapter_status_types.go (6 hunks)
  • pkg/api/cluster_types.go (5 hunks)
  • pkg/api/metadata_types.go (1 hunks)
  • pkg/api/node_pool_types.go (5 hunks)
  • pkg/dao/adapter_status.go (3 hunks)
  • pkg/dao/cluster.go (2 hunks)
  • pkg/db/migrations/202511111044_add_clusters.go (2 hunks)
  • pkg/db/migrations/202511111055_add_node_pools.go (2 hunks)
  • pkg/db/migrations/202511111105_add_adapter_status.go (2 hunks)
  • pkg/handlers/cluster_nodepools_test.go (3 hunks)
  • pkg/handlers/cluster_status.go (1 hunks)
  • pkg/handlers/nodepool_status.go (1 hunks)
  • pkg/services/adapter_status.go (2 hunks)
  • pkg/services/cluster.go (2 hunks)
  • pkg/services/node_pool.go (2 hunks)
  • pkg/services/status_aggregation.go (1 hunks)
  • pkg/services/status_aggregation_test.go (1 hunks)
  • test/integration/adapter_status_test.go (9 hunks)
  • test/integration/clusters_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • pkg/dao/cluster.go
  • pkg/handlers/cluster_nodepools_test.go
  • pkg/api/metadata_types.go
  • pkg/services/adapter_status.go
  • test/integration/clusters_test.go
  • pkg/services/status_aggregation_test.go
🧰 Additional context used
🧬 Code graph analysis (7)
pkg/services/node_pool.go (2)
pkg/services/status_aggregation.go (2)
  • MapAdapterToConditionType (60-83)
  • ComputePhase (96-174)
pkg/errors/errors.go (1)
  • GeneralError (164-166)
pkg/dao/adapter_status.go (3)
pkg/api/adapter_status_types.go (1)
  • AdapterStatus (13-32)
pkg/errors/errors.go (1)
  • New (104-119)
pkg/db/context.go (1)
  • MarkForRollback (48-57)
pkg/services/status_aggregation.go (2)
pkg/api/adapter_status_types.go (1)
  • AdapterStatusList (34-34)
pkg/logger/logger.go (1)
  • NewOCMLogger (36-44)
pkg/handlers/cluster_status.go (1)
pkg/api/adapter_status_types.go (1)
  • AdapterStatusFromOpenAPICreate (93-143)
pkg/services/cluster.go (2)
pkg/services/status_aggregation.go (2)
  • MapAdapterToConditionType (60-83)
  • ComputePhase (96-174)
pkg/errors/errors.go (1)
  • GeneralError (164-166)
pkg/handlers/nodepool_status.go (1)
pkg/api/adapter_status_types.go (1)
  • AdapterStatusFromOpenAPICreate (93-143)
pkg/api/cluster_types.go (1)
pkg/errors/errors.go (1)
  • Href (156-158)
🪛 Checkov (3.2.334)
openapi/openapi.yaml

[medium] 432-439: Ensure that arrays have a maximum number of items

(CKV_OPENAPI_21)

🔇 Additional comments (22)
.gitignore (1)

52-52: LGTM!

The new ignore rule for /data/generated/ correctly aligns with the Makefile's generate and generate-vendor targets that create artifacts under data/generated/openapi.

Makefile (1)

214-214: LGTM!

The mkdir -p commands ensure the output directory exists before copying generated files, preventing failures on fresh clones.

Also applies to: 224-224

pkg/db/migrations/202511111044_add_clusters.go (1)

14-89: LGTM!

The migration properly implements:

  • Renamed timestamp fields (created_time/updated_time) with sensible defaults
  • Status fields including status_conditions as JSONB for condition-based status model
  • Appropriate indexes for soft deletes, unique names, and phase filtering
  • Clean rollback that drops indexes before the table
openapi/openapi.yaml (1)

219-225: Note: Deferred feedback from previous review.

The previous review comments about not exposing the "adapter" concept in the OpenAPI and updating the summary text have been acknowledged and deferred to a future update per the author's response. This is acceptable.

pkg/db/migrations/202511111105_add_adapter_status.go (1)

14-86: LGTM!

The migration correctly implements:

  • Polymorphic association via resource_type and resource_id
  • API-managed last_report_time for adapter liveness tracking
  • Unique partial index on (resource_type, resource_id, adapter) enabling the upsert semantics described in the PR
  • JSONB storage for conditions, data, and metadata
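For reference, the unique partial index that backs those upsert semantics could be expressed in a gormigrate step roughly like the one below. The table and index names and the deleted_at predicate are guesses at what makes the index partial, not the contents of the real migration file.

func addAdapterStatusUniqueIndex() *gormigrate.Migration {
	return &gormigrate.Migration{
		ID: "202511111105_adapter_status_unique_index",
		Migrate: func(tx *gorm.DB) error {
			// One live row per (resource_type, resource_id, adapter).
			return tx.Exec(`CREATE UNIQUE INDEX IF NOT EXISTS idx_adapter_status_resource_adapter
				ON adapter_statuses (resource_type, resource_id, adapter)
				WHERE deleted_at IS NULL`).Error
		},
		Rollback: func(tx *gorm.DB) error {
			return tx.Exec(`DROP INDEX IF EXISTS idx_adapter_status_resource_adapter`).Error
		},
	}
}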
pkg/db/migrations/202511111055_add_node_pools.go (1)

14-110: LGTM!

The migration properly implements:

  • Consistent timestamp naming (created_time/updated_time) matching the clusters migration
  • Status fields with status_conditions as JSONB for condition-based model
  • Foreign key constraint with ON DELETE RESTRICT preventing orphaned node pools
  • Appropriate indexes for query patterns
test/integration/adapter_status_test.go (2)

192-264: Good coverage of upsert idempotency behavior.

The TestAdapterStatusIdempotency test properly validates that:

  1. First POST creates the status
  2. Second POST with same adapter name updates (not duplicates)
  3. Final state reflects the latest conditions

This aligns well with the Kubernetes-style upsert semantics described in the PR.


36-38: Repository access is blocked, preventing full verification of the concern.

Based on the web search results, I can clarify that OpenAPI's additionalProperties: {} (or additionalProperties: true by default) permits arbitrary value types, including nested objects. This means a structure like {"test_key": {"value": "test_value"}} is syntactically valid under such a schema.

However, I cannot verify the concern without access to:

  1. The actual OpenAPI schema definition for AdapterStatusBase.data
  2. The test file context and all occurrences of the Data field
  3. Any conversion logic in AdapterStatusFromOpenAPICreate

To proceed, please manually verify:

  • Does the OpenAPI spec define data as additionalProperties: {} or with a more specific nested object schema?
  • Is the {"value": ...} nesting pattern consistent and intentional, or does it represent a discrepancy between tests and actual usage?
  • Does any conversion or marshaling logic explain why this structure exists?
pkg/handlers/nodepool_status.go (1)

81-88: LGTM!

The handler correctly uses the conversion function and upsert pattern for creating/updating adapter status, aligning with the PR's goal of implementing Kubernetes-style upsert semantics.

pkg/handlers/cluster_status.go (1)

81-88: Clean refactor to upsert semantics.

The simplified flow using AdapterStatusFromOpenAPICreate and Upsert is cleaner than the previous find-then-create-or-replace pattern. This properly implements the intended upsert keyed by (resource_type, resource_id, adapter).

pkg/dao/adapter_status.go (1)

67-106: Upsert implementation looks correct.

The implementation properly:

  1. Uses errors.Is(err, gorm.ErrRecordNotFound) for error checking (past issue addressed)
  2. Preserves ID and CreatedTime from the existing record
  3. Updates LastReportTime on every upsert
  4. Delegates condition timestamp preservation to preserveLastTransitionTime
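Put together, the upsert shape summarized above looks roughly like the sketch below. The DAO and session types and the preserveLastTransitionTime call are taken from the review rather than verified against the file, so treat those names as assumptions; the gorm calls themselves (Where, First, Save, Create, errors.Is with gorm.ErrRecordNotFound) are standard.

func (d *sqlAdapterStatusDao) Upsert(ctx context.Context, status *api.AdapterStatus) (*api.AdapterStatus, error) {
	g2 := d.sessionFactory.New(ctx)

	var existing api.AdapterStatus
	err := g2.Where("resource_type = ? AND resource_id = ? AND adapter = ?",
		status.ResourceType, status.ResourceID, status.Adapter).First(&existing).Error
	switch {
	case err == nil:
		// Update path: keep identity and original creation time; the incoming
		// record already carries this report's last_report_time.
		status.ID = existing.ID
		status.CreatedTime = existing.CreatedTime
		preserveLastTransitionTime(status, &existing) // bump only when a condition flips True/False/Unknown
		return status, g2.Save(status).Error
	case errors.Is(err, gorm.ErrRecordNotFound):
		// Create path: created_time and last_report_time were populated from
		// the request's observed_time during conversion.
		return status, g2.Create(status).Error
	default:
		return nil, err
	}
}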
pkg/services/status_aggregation.go (3)

25-39: Required adapters configuration looks good.

The separation of requiredClusterAdapters and requiredNodePoolAdapters aligns with the HyperFleet MVP scope. This configuration is easy to extend for future adapters.


60-83: MapAdapterToConditionType implementation is correct.

The PascalCase conversion handles hyphenated names properly, and the suffix map allows future customization without code changes.
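As a rough illustration of the conversion (the suffix table comes from the PR description; the helper body is an assumption, not the code at lines 60-83):

func mapAdapterToConditionType(adapter string) string {
	// Optional per-adapter suffix overrides; everything defaults to "Successful".
	suffixes := map[string]string{
		"validator":       "Successful",
		"dns":             "Successful",
		"gcp-provisioner": "Successful",
	}
	parts := strings.Split(adapter, "-")
	for i, p := range parts {
		if p == "" {
			continue
		}
		parts[i] = strings.ToUpper(p[:1]) + p[1:] // "gcp" → "Gcp", "provisioner" → "Provisioner"
	}
	suffix, ok := suffixes[adapter]
	if !ok {
		suffix = "Successful"
	}
	return strings.Join(parts, "") + suffix // "gcp-provisioner" → "GcpProvisionerSuccessful"
}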


165-173: Phase determination logic is correct.

The priority order (Ready → Failed → NotReady) ensures that:

  1. All required adapters must be available and current-generation for "Ready"
  2. Any explicit failure triggers "Failed"
  3. Missing, stale, or unknown statuses result in "NotReady"

This aligns with Kubernetes condition semantics.
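The final selection boils down to a priority switch such as the sketch below, assuming allReady and anyFailed were computed while walking the required adapters' Available conditions and observed generations.

switch {
case allReady:
	phase = "Ready" // every required adapter is Available=True at the current generation
case anyFailed:
	phase = "Failed" // at least one required adapter explicitly reported Available=False
default:
	phase = "NotReady" // missing, stale, or Unknown adapter statuses
}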

pkg/api/node_pool_types.go (3)

36-37: Field renames align with new status model.

The transition from StatusUpdatedAt/StatusAdapters to StatusLastUpdatedTime/StatusConditions properly reflects the shift to a condition-centric status model with explicit timing semantics.


128-145: Proper nil-safety for timestamp fields.

The nil checks for StatusLastTransitionTime and StatusLastUpdatedTime with zero-value fallbacks prevent panics and ensure the OpenAPI response always has valid timestamps.


162-163: Initialize StatusConditions as empty array.

Marshaling an empty []openapi.ResourceCondition{} ensures the JSON field is [] rather than null, which is cleaner for API consumers.
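The distinction is easy to see in isolation; this small, self-contained example shows why the explicit empty slice matters.

package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	var nilConditions []string              // nil slice
	emptyConditions := []string{}           // empty, non-nil slice
	a, _ := json.Marshal(nilConditions)     // marshals to: null
	b, _ := json.Marshal(emptyConditions)   // marshals to: []
	fmt.Println(string(a), string(b))
}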

pkg/services/cluster.go (4)

132-133: Transition to ResourceCondition-based aggregation.

The shift from per-adapter status blocks to a unified ResourceCondition list aligns with the new OpenAPI schema and enables consistent condition handling across cluster and nodepool resources.


167-175: Nil pointer safety properly addressed.

The nil checks for adapterStatus.CreatedTime and adapterStatus.LastReportTime before dereferencing address the previously flagged issue. This prevents runtime panics when timestamps are missing.


188-197: Good design: min(LastReportTime) surfaces stale adapters.

Using the oldest adapter timestamp for StatusLastUpdatedTime ensures that Sentinel (or other monitoring) can detect when any adapter becomes stale, rather than being masked by more recent updates from other adapters. This is a thoughtful choice for operational visibility.


222-225: LastTransitionTime updated only on phase change.

This correctly implements Kubernetes condition semantics where LastTransitionTime reflects when the phase changed, not when it was last observed.

Note: The condition cluster.StatusLastTransitionTime == nil handles the initial case where no transition has occurred yet.

pkg/api/cluster_types.go (1)

141-143: Initializing StatusConditions to an empty array on create is reasonable.

Marshalling []openapi.ResourceCondition{} into StatusConditions on create guarantees the column holds valid JSON ([]) rather than NULL, which keeps the ToOpenAPI path simple and avoids special cases for “no conditions yet”. This looks consistent with how you unmarshal in ToOpenAPI and with the condition-based status model.

Also applies to: 145-155

@xueli181114
Copy link
Collaborator

/approve

1 similar comment
@yasun1
Copy link
Contributor Author

yasun1 commented Nov 27, 2025

/approve

@openshift-hyperfleet openshift-hyperfleet deleted a comment from openshift-ci bot Nov 27, 2025
@openshift-ci
Copy link

openshift-ci bot commented Nov 28, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: rh-amarin, xueli181114, yasun1

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
