-
Notifications
You must be signed in to change notification settings - Fork 11
HYPERFLEET-386 - feat: use ready/available conditions instead of phase #39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HYPERFLEET-386 - feat: use ready/available conditions instead of phase #39
Conversation
|
Skipping CI for Draft Pull Request. |
WalkthroughReplaces the flat phase/timestamp/generation status model with a condition-centric model. Introduces synthesized top-level Kubernetes-style conditions (Available and Ready) and per-condition metadata, migrates Cluster/NodePool statuses to a single JSONB Sequence Diagram(s)sequenceDiagram
participant Adapter
participant API as API Server
participant Svc as Cluster/NodePool Service
participant Agg as Status Aggregation
participant DB as Database
Adapter->>API: POST /resources/{id}/status (adapter conditions JSON)
API->>Svc: ProcessAdapterStatus(resourceID, adapterStatus)
Svc->>DB: Read existing adapter status
Svc->>DB: Upsert adapter status (if applicable)
Svc->>Agg: ComputeAvailableCondition(adapterStatuses, requiredAdapters)
Agg-->>Svc: (isAvailable, minObservedGeneration)
Svc->>Agg: ComputeReadyCondition(adapterStatuses, requiredAdapters, resourceGeneration)
Agg-->>Svc: isReady
Svc->>Svc: BuildSyntheticConditions(Available, Ready, adapter conditions)
Svc->>DB: Update resource.status_conditions (JSONB)
API-->>Adapter: 201 Created or 204 No Content
sequenceDiagram
participant Client
participant API as API Server
participant GenericSvc as Generic List Service
participant SQLHelper
participant DB as Database
Client->>API: GET /clusters?filter=status.conditions.Ready='True'
API->>GenericSvc: BuildListQuery(TSL)
GenericSvc->>SQLHelper: ExtractConditionQueries(TSL, tableName)
SQLHelper->>SQLHelper: Validate condition type/status, build jsonb_path_query_first expr
SQLHelper-->>GenericSvc: (modifiedTSL, conditionExpressions)
GenericSvc->>DB: Execute SELECT with appended conditionExpressions (indexed)
DB->>DB: Use idx_*_ready_status indexes for lookup
DB-->>API: Return filtered rows
API-->>Client: Filtered results
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
pkg/services/generic_test.go (1)
59-71: Add a real status.conditions SQL-translation test.The comment signals coverage for condition queries, but the test table doesn’t include a status.conditions case. Adding a small test that exercises
buildSearchValueswith a condition query will protect the new path.🧪 Proposed test addition
+func TestSQLTranslationStatusConditions(t *testing.T) { + RegisterTestingT(t) + var dbFactory db.SessionFactory = dbmocks.NewMockSessionFactory() + defer dbFactory.Close() //nolint:errcheck + + g := dao.NewGenericDao(&dbFactory) + genericService := sqlGenericService{genericDao: g} + + var list []api.Cluster + listCtx, model, serviceErr := genericService.newListContext( + context.Background(), + "", + &ListArguments{Search: "status.conditions.Ready='True'"}, + &list, + ) + Expect(serviceErr).ToNot(HaveOccurred()) + + d := g.GetInstanceDao(context.Background(), model) + sql, values, serviceErr := genericService.buildSearchValues(listCtx, &d) + Expect(serviceErr).ToNot(HaveOccurred()) + Expect(sql).To(ContainSubstring("jsonb_path_query_first")) + Expect(values).To(ConsistOf(`$[*] ? (@.type == "Ready")`, "True")) +}pkg/services/generic.go (1)
169-223: Condition extraction + global AND breaks OR/NOT semantics.Replacing each condition node with
1=1and then AND‑ing all extracted condition expressions forces every condition to be true, even when the user wroteORor negated logic. This changes query results (e.g.,A OR status.conditions.Ready='True'becomesA AND Ready). Consider preserving the boolean structure by embedding condition predicates into the TSL tree (or extending the SQL walker) instead of post‑ANDing, and add an OR/parentheses test to lock the behavior.test/integration/search_field_mapping_test.go (1)
88-153: TestSearchCombinedQuery no longer tests a combined query.The test name/comments say “combined query,” but the search string is label-only. Either rename/adjust comments or reintroduce a combined status+label query.
🧪 Suggested rename/comment fix (label-only)
-// TestSearchCombinedQuery verifies that combined queries (AND/OR) -// work correctly with field mapping -func TestSearchCombinedQuery(t *testing.T) { +// TestSearchLabelRegionQuery verifies label-only queries work correctly +func TestSearchLabelRegionQuery(t *testing.T) { @@ - // Query using combined AND condition with labels (labels search still works) + // Query using region label searchStr := "labels.region='us-east'" @@ - // Should return matchCluster and wrongStatusCluster but not wrongRegionCluster + // Should return the us-east cluster but not wrongRegionCluster
🤖 Fix all issues with AI agents
In `@docs/api-resources.md`:
- Around line 340-355: The NodePool GET example's conditions array omits
required ResourceCondition timestamp fields; update the example (or add a note)
to include created_time, last_updated_time, and last_transition_time for each
condition, referencing the ResourceCondition type and the NodePool "conditions"
field so readers see the expected schema; ensure the example JSON includes those
three timestamp properties for each condition (or add a brief parenthetical like
"timestamps omitted for brevity" if you intentionally leave them out).
In `@openapi/openapi.yaml`:
- Around line 1-4: The openapi/openapi.yaml file is generated and must not be
edited directly; instead, update the TypeSpec source in the hyperfleet-api-spec
repository (the API spec that generates the OpenAPI document) to make your
changes and then run the generation/regeneration process so that the top-level
OpenAPI document (openapi: 3.0.0, info.title: HyperFleet API, info.version) is
rebuilt from the canonical TypeSpec source.
In `@pkg/db/migrations/202601210001_add_conditions_gin_index.go`:
- Around line 8-10: The function addConditionsGinIndex is misnamed because it
creates BTREE expression indexes; rename the function to addConditionsBtreeIndex
and update the migration description/comment accordingly (and any exported
references) so the name matches the index type, or alternatively change the
index creation SQL to use GIN if you intended GIN; locate addConditionsGinIndex
in the migration file and update its name, the migration ID/description inside
the gormigrate.Migration, and any callers to keep identifiers consistent.
In `@pkg/db/sql_helpers_test.go`:
- Around line 99-113: The test's error assertion incorrectly treats any
non-empty error as a match; in the table-driven test block that checks
tt.expectError / tt.errorContains (variables tt, expectError, errorContains,
err), replace the bogus found logic with a proper substring check using
strings.Contains(err.Error(), tt.errorContains) and call t.Errorf (or t.Fatalf)
when the expected substring is not present; ensure you import "strings" if
needed and only perform the contains check when tt.errorContains is non-empty
and err is non-nil.
In `@pkg/db/sql_helpers.go`:
- Around line 148-235: conditionsNodeConverter currently emits an unqualified
status_conditions column which causes "column reference is ambiguous" in
multi-table queries; change the signature of conditionsNodeConverter to accept a
tableName string and use it to qualify the column (e.g., tableName +
".status_conditions") when building the jsonb_path_query_first expression, then
propagate the new parameter through extractConditionsWalk and
ExtractConditionQueries (and all call sites) and update unit tests to pass the
tableName; ensure the jsonPath and rightStr handling remains unchanged and the
returned sq.Expr uses the qualified column reference.
In `@pkg/services/cluster.go`:
- Around line 181-211: The Available/Ready conditions are recreating CreatedTime
and LastTransitionTime as now on every aggregation; preserve historical
timestamps by unmarshaling cluster.StatusConditions to find previous "Available"
and "Ready" api.ResourceCondition entries, copy their CreatedTime into the new
availableCondition/readyCondition, and only update LastTransitionTime when the
Status actually changes (otherwise keep the previous LastTransitionTime); leave
LastUpdatedTime as now and set ObservedGeneration as currently done
(minObservedGeneration and cluster.Generation respectively).
🧹 Nitpick comments (7)
pkg/api/presenters/node_pool.go (1)
30-53: Consider extracting common condition initialization logic.This initialization pattern for Available and Ready conditions is likely duplicated in the cluster presenter. Consider extracting a helper function like
NewInitialConditions()to reduce duplication across presenters.♻️ Suggested helper function
// In a shared location (e.g., pkg/api/status_types.go or pkg/api/presenters/common.go) func NewInitialConditions() []api.ResourceCondition { now := time.Now() return []api.ResourceCondition{ { Type: "Available", Status: api.ConditionUnknown, ObservedGeneration: 0, LastTransitionTime: now, CreatedTime: now, LastUpdatedTime: now, }, { Type: "Ready", Status: api.ConditionUnknown, ObservedGeneration: 0, LastTransitionTime: now, CreatedTime: now, LastUpdatedTime: now, }, } }pkg/services/node_pool.go (2)
188-195: Consider preservingCreatedTimefrom existing conditions.Setting
CreatedTime: nowon every status update means the synthetic conditions' creation timestamp will be overwritten each time. Kubernetes convention is to preserveCreatedTimeand only updateLastUpdatedTimeandLastTransitionTime(when status changes).Consider fetching the existing conditions first and preserving their
CreatedTime:♻️ Suggested approach
+ // Unmarshal existing conditions to preserve CreatedTime + var existingConditions []api.ResourceCondition + if len(nodePool.StatusConditions) > 0 { + _ = json.Unmarshal(nodePool.StatusConditions, &existingConditions) + } + existingAvailableCreatedTime := now + existingReadyCreatedTime := now + for _, c := range existingConditions { + if c.Type == "Available" { + existingAvailableCreatedTime = c.CreatedTime + } else if c.Type == "Ready" { + existingReadyCreatedTime = c.CreatedTime + } + } + availableCondition := api.ResourceCondition{ Type: "Available", Status: availableStatus, ObservedGeneration: minObservedGeneration, LastTransitionTime: now, - CreatedTime: now, + CreatedTime: existingAvailableCreatedTime, LastUpdatedTime: now, }
202-209: Same consideration for Ready condition'sCreatedTime.Apply the same preservation logic for the Ready condition's
CreatedTimeto maintain Kubernetes-style condition semantics.pkg/services/status_aggregation.go (2)
94-122: Consider extracting adapter map building to reduce duplication.Both
ComputeAvailableConditionandComputeReadyConditioncontain identical logic for building theadapterMap. Extracting this to a helper would improve maintainability.♻️ Suggested refactor
// adapterInfo holds parsed condition data for an adapter type adapterInfo struct { available string observedGeneration int32 } // buildAdapterMap extracts Available condition status and generation from adapter statuses func buildAdapterMap(adapterStatuses api.AdapterStatusList) map[string]adapterInfo { adapterMap := make(map[string]adapterInfo) for _, adapterStatus := range adapterStatuses { var conditions []struct { Type string `json:"type"` Status string `json:"status"` } if len(adapterStatus.Conditions) > 0 { if err := json.Unmarshal(adapterStatus.Conditions, &conditions); err == nil { for _, cond := range conditions { if cond.Type == "Available" { adapterMap[adapterStatus.Adapter] = adapterInfo{ available: cond.Status, observedGeneration: adapterStatus.ObservedGeneration, } break } } } } } return adapterMap }Also applies to: 163-191
204-208: TheresourceGeneration > 0check is always true and unnecessary.Generation is always initialized to 1 in BeforeCreate hooks (cluster_types.go:49, node_pool_types.go:58). Since
ComputeReadyConditionis always called withcluster.GenerationornodePool.Generation(see cluster.go:183, node_pool.go:181),resourceGenerationis guaranteed to be >= 1. The defensive check makes the condition always true; consider removing it to simplify the logic, or documenting why this guard exists if intentionally kept for safety.test/factories/node_pools.go (1)
125-127: Prefer context-bound DB sessions in factories.Both updates create a new session with
context.Background(). Consider threading a callercontext.Contextthrough the factory helpers and retrieving the session viadb.NewContext(ctx)to align with transaction lifecycle and cancellation expectations. As per coding guidelines, please preferdb.NewContext(ctx)for DB sessions.Also applies to: 178-179
test/factories/clusters.go (1)
105-107: Prefer context-bound DB sessions in factories.Both updates spin up sessions with
context.Background(). Consider threading a callercontext.Contextand usingdb.NewContext(ctx)for DB access to preserve transaction lifecycle and cancellation semantics. As per coding guidelines, please preferdb.NewContext(ctx)for DB sessions.Also applies to: 158-160
| // addConditionsGinIndex adds expression indexes on the Ready condition | ||
| // within status_conditions JSONB columns for efficient lookups. | ||
| func addConditionsGinIndex() *gormigrate.Migration { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function name inconsistent with index type.
The function is named addConditionsGinIndex but creates BTREE indexes, not GIN indexes. This could cause confusion for maintainers.
🔧 Suggested fix
-// addConditionsGinIndex adds expression indexes on the Ready condition
+// addConditionsReadyIndex adds BTREE expression indexes on the Ready condition
// within status_conditions JSONB columns for efficient lookups.
-func addConditionsGinIndex() *gormigrate.Migration {
+func addConditionsReadyIndex() *gormigrate.Migration {🤖 Prompt for AI Agents
In `@pkg/db/migrations/202601210001_add_conditions_gin_index.go` around lines 8 -
10, The function addConditionsGinIndex is misnamed because it creates BTREE
expression indexes; rename the function to addConditionsBtreeIndex and update
the migration description/comment accordingly (and any exported references) so
the name matches the index type, or alternatively change the index creation SQL
to use GIN if you intended GIN; locate addConditionsGinIndex in the migration
file and update its name, the migration ID/description inside the
gormigrate.Migration, and any callers to keep identifiers consistent.
| if tt.expectError { | ||
| if err == nil { | ||
| t.Errorf("Expected error containing %q, but got nil", tt.errorContains) | ||
| return | ||
| } | ||
| if tt.errorContains != "" && err.Error() != "" { | ||
| // Check if error message contains expected string | ||
| found := false | ||
| if errMsg := err.Error(); errMsg != "" { | ||
| found = true | ||
| } | ||
| if !found { | ||
| t.Logf("Error message: %s", err.Error()) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error assertions don’t validate the expected message.
The current logic marks any non-empty error as “found,” so errorContains never fails. Use strings.Contains and fail when it doesn’t match.
✅ Fix the assertion
import (
+ "strings"
"testing"
@@
- if tt.errorContains != "" && err.Error() != "" {
- // Check if error message contains expected string
- found := false
- if errMsg := err.Error(); errMsg != "" {
- found = true
- }
- if !found {
- t.Logf("Error message: %s", err.Error())
- }
- }
+ if tt.errorContains != "" && !strings.Contains(err.Error(), tt.errorContains) {
+ t.Errorf("Expected error containing %q, got %q", tt.errorContains, err.Error())
+ }
return
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if tt.expectError { | |
| if err == nil { | |
| t.Errorf("Expected error containing %q, but got nil", tt.errorContains) | |
| return | |
| } | |
| if tt.errorContains != "" && err.Error() != "" { | |
| // Check if error message contains expected string | |
| found := false | |
| if errMsg := err.Error(); errMsg != "" { | |
| found = true | |
| } | |
| if !found { | |
| t.Logf("Error message: %s", err.Error()) | |
| } | |
| } | |
| if tt.expectError { | |
| if err == nil { | |
| t.Errorf("Expected error containing %q, but got nil", tt.errorContains) | |
| return | |
| } | |
| if tt.errorContains != "" && !strings.Contains(err.Error(), tt.errorContains) { | |
| t.Errorf("Expected error containing %q, got %q", tt.errorContains, err.Error()) | |
| } |
🤖 Prompt for AI Agents
In `@pkg/db/sql_helpers_test.go` around lines 99 - 113, The test's error assertion
incorrectly treats any non-empty error as a match; in the table-driven test
block that checks tt.expectError / tt.errorContains (variables tt, expectError,
errorContains, err), replace the bogus found logic with a proper substring check
using strings.Contains(err.Error(), tt.errorContains) and call t.Errorf (or
t.Fatalf) when the expected substring is not present; ensure you import
"strings" if needed and only perform the contains check when tt.errorContains is
non-empty and err is non-nil.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/services/node_pool.go (1)
161-168: Guard againstUnknownleaking into ResourceCondition status.
CastingAdapterConditionStatustoResourceConditionStatuscan emit"Unknown", which isn’t a valid resource condition status per the new model.✅ Suggested fix
- condResource := api.ResourceCondition{ + if availableCondition.Status == api.AdapterConditionUnknown { + continue + } + status := api.ConditionFalse + if availableCondition.Status == api.AdapterConditionTrue { + status = api.ConditionTrue + } + condResource := api.ResourceCondition{ Type: MapAdapterToConditionType(adapterStatus.Adapter), - Status: api.ResourceConditionStatus(availableCondition.Status), + Status: status, Reason: availableCondition.Reason, Message: availableCondition.Message, ObservedGeneration: adapterStatus.ObservedGeneration, LastTransitionTime: availableCondition.LastTransitionTime, }
🤖 Fix all issues with AI agents
In `@pkg/api/presenters/cluster_test.go`:
- Around line 355-360: Update the misleading test comment to reflect the actual
assertions: change the comment that says "Verify Status defaults - Available and
Ready conditions initialized as Unknown" to indicate that the defaults are False
(ResourceConditionStatusFalse) for the "Available" and "Ready" conditions; this
refers to the assertions that check result.Status.Conditions,
result.Status.Conditions[0].Type == "Available",
result.Status.Conditions[0].Status == openapi.ResourceConditionStatusFalse and
result.Status.Conditions[1].Type == "Ready", result.Status.Conditions[1].Status
== openapi.ResourceConditionStatusFalse.
In `@pkg/services/node_pool.go`:
- Around line 184-214: When building the synthetic Available and Ready
conditions after ComputeAvailableCondition/ComputeReadyCondition, preserve the
previous CreatedTime and LastTransitionTime when the condition Status did not
change: look up the existing conditions in nodePool.Status.Conditions for Types
"Available" and "Ready", compare their Status to the newly computed
availableStatus/readyStatus, and if unchanged reuse the previous CreatedTime and
LastTransitionTime instead of always using now; still update ObservedGeneration
(use minObservedGeneration for Available and nodePool.Generation for Ready) and
set LastTransitionTime to now only when Status changes.
♻️ Duplicate comments (2)
pkg/services/cluster.go (1)
186-216: Preserve CreatedTime/LastTransitionTime across aggregations.These timestamps are reset to
nowon every aggregation, which discards historical transition data. Prefer carrying forwardCreatedTimeand only updatingLastTransitionTimewhen the condition’s status actually changes.docs/api-resources.md (1)
356-370: NodePool GET example still omits condition timestamps.
ResourceCondition includescreated_time,last_updated_time, andlast_transition_time, but the example omits them.📝 Suggested doc fix
{ "type": "Available", "status": "True", "reason": "ResourceAvailable", "message": "NodePool is accessible", - "observed_generation": 1 + "observed_generation": 1, + "created_time": "2025-01-01T00:00:00Z", + "last_updated_time": "2025-01-01T00:00:00Z", + "last_transition_time": "2025-01-01T00:00:00Z" }, { "type": "Ready", "status": "True", "reason": "ResourceReady", "message": "All adapters report ready at current generation", - "observed_generation": 1 + "observed_generation": 1, + "created_time": "2025-01-01T00:00:00Z", + "last_updated_time": "2025-01-01T00:00:00Z", + "last_transition_time": "2025-01-01T00:00:00Z" }
🧹 Nitpick comments (3)
test/integration/api_contract_test.go (1)
12-27: Test correctly validates condition status constant mappings.The test ensures domain constants (
api.ConditionTrue,api.ConditionFalse) stay synchronized with generated OpenAPI constants for bothAdapterConditionStatusandResourceConditionStatus.Consider adding validation for the
Unknownstatus as well, sinceAdapterConditionStatusUnknownis now used in the API (e.g., for startup pending scenarios):💡 Optional: Add Unknown constant validation
Expect(string(api.ConditionFalse)).To(Equal(string(openapi.ResourceConditionStatusFalse)), "api.ConditionFalse must match openapi.ResourceConditionStatusFalse") + // Verify Unknown status mapping for adapter conditions + Expect(string(api.ConditionUnknown)).To(Equal(string(openapi.AdapterConditionStatusUnknown)), + "api.ConditionUnknown must match openapi.AdapterConditionStatusUnknown")pkg/api/presenters/node_pool.go (1)
30-50: Align initial ObservedGeneration with initial generation.NodePool generation defaults to 1 (and Cluster init uses 1). Starting at 0 can make Available/Ready appear unobserved. Consider setting 1 or documenting the rationale.
♻️ Suggested change
- ObservedGeneration: 0, + ObservedGeneration: 1,- ObservedGeneration: 0, + ObservedGeneration: 1,openapi/openapi.yaml (1)
446-450: Align transition description with allowed AdapterConditionStatus values.
The enum allowsUnknown, but the text mentions only True/False.♻️ Suggested wording update
- Only updated when status changes (True/False), not when reason/message changes + Only updated when status changes (True/False/Unknown), not when reason/message changes
…onStatus types Defines separate condition status types: - ResourceConditionStatus (True/False) for Cluster/NodePool conditions - AdapterConditionStatus (True/False/Unknown) for adapter-reported conditions This separation ensures resource status is always definitive while adapters can report Unknown when status is indeterminate.
…Condition Replaces ComputePhase() with condition-based status aggregation: - ComputeReadyCondition(): True when all required adapters report Available=True at current spec.generation - ComputeAvailableCondition(): True when all required adapters report Available=True at any generation (last known good) Key behavior: After spec change, Ready becomes False immediately while Available stays True until adapters report failure.
Updates API response formatting: - New resources get default conditions: Available=False, Ready=False with reason AwaitingAdapters - Converts domain condition types to OpenAPI types for responses - Adapter status now uses AdapterConditionStatus type
Service layer changes for condition-based status: - UpdateConditionsFromAdapterStatus() updates Ready/Available conditions when adapter status is POSTed - Returns nil, nil for no-op when conditions don't change - Removes phase-related code from generic service - Adds comprehensive tests for condition update logic
Handler changes for condition-based status:
- Add handleCreateWithNoContent() framework helper for 204 responses
- POST /clusters/{id}/statuses returns:
- 201 Created when conditions changed
- 204 No Content when adapter report received but no condition change
- Same behavior for nodepool status endpoints
Database and search changes: - Add conditionsNodeConverter() for TSL to SQL translation - New search syntax: status.conditions.Ready='True' - Remove status_phase column from migrations - Add GIN index for efficient condition queries Query translation: status.conditions.Ready='True' becomes: jsonb_path_query_first(status_conditions, '$[*] ? (@.type == "Ready")') ->> 'status' = 'True'
Breaking API changes: - Remove status.phase field from Cluster and NodePool - Add status.conditions[] array with Ready/Available conditions - ResourceCondition.status enum: True, False (not Unknown) - AdapterCondition.status enum: True, False, Unknown
Removes obsolete phase-related code: - Remove phase validation from cluster types - Remove phase validation from node pool types - Remove associated phase tests
Updates test helpers to use conditions instead of phase: - Cluster factory creates default Ready/Available conditions - NodePool factory creates default Ready/Available conditions
End-to-end test coverage: - Tests for condition updates via adapter status POST - Updated contract tests for new response format - Tests for status.conditions.* search syntax
Documentation updates: - Updated API examples with condition-based status - Updated search syntax documentation - Documented condition semantics (Ready vs Available) - Updated agent configuration docs
9b786c7 to
db6bb14
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
pkg/services/status_aggregation.go (1)
61-130: InconsistentminObservedGenerationreturn values for "not available" cases.When no adapters or required adapters exist (line 67), the function returns
1. When adapters exist but none are available (line 129), it returns0. This inconsistency may confuse callers.🐛 Proposed fix for consistency
func ComputeAvailableCondition(adapterStatuses api.AdapterStatusList, requiredAdapters []string) (bool, int32) { if len(adapterStatuses) == 0 || len(requiredAdapters) == 0 { - return false, 1 + return false, 0 }Alternatively, document the intended semantics for the
minObservedGenerationreturn value in the "not available" case.test/integration/search_field_mapping_test.go (1)
97-152: TestSearchCombinedQuery no longer exercises a combined predicate.
The test name/comments say “combined,” but the query is labels‑only and the comment references a non‑existentwrongStatusCluster. Consider updating the query (or renaming) so intent matches behavior.✏️ Suggested fix to keep it “combined”
- // Query using combined AND condition with labels (labels search still works) - searchStr := "labels.region='us-east'" + // Query using combined AND condition: Ready=False AND labels.region='us-east' + searchStr := "status.conditions.Ready='False' AND labels.region='us-east'" @@ - // Should return matchCluster and wrongStatusCluster but not wrongRegionCluster + // Should return matchCluster only (Ready=False, region=us-east)
🤖 Fix all issues with AI agents
In `@pkg/services/cluster_test.go`:
- Around line 515-520: Update the explanatory comments describing the edge case
where an adapter reports gen=1 after gen=2 so they precisely reflect the
asserted behavior: state that because minObservedGeneration is not downgraded,
the cluster conditions (Ready and Available) must not be overwritten and
Available remains True in this test, even though that is a stale/incorrect
reality until an adapter reports gen=2; mention minObservedGeneration, the
gen=1→gen=2 sequence, and that the test asserts Available stays True (and Ready
stays False) to avoid misleading future maintainers.
In `@pkg/services/node_pool_test.go`:
- Around line 416-427: The comment is contradictory: update either the comment
or the assertions around the upsert("validation", api.AdapterConditionFalse, 1)
call so they match; specifically either change the comment "Stale False is more
restrictive and should override." to reflect that Available remains True with
ObservedGeneration 2, or change the Expect calls on available.Status /
available.ObservedGeneration (from getAvailable()) to assert that Available
becomes False and ObservedGeneration reflects the stale update; adjust the test
around the upsert, getAvailable, and Expect lines to be consistent.
In `@test/integration/api_contract_test.go`:
- Around line 12-27: The test TestAPIContract_ConditionStatusConstants is
missing an assertion for the 'Unknown' value — add an Expect comparing
api.AdapterConditionUnknown to openapi.AdapterConditionStatusUnknown (i.e. add
an assertion that string(api.AdapterConditionUnknown) equals
string(openapi.AdapterConditionStatusUnknown) with a clear failure message)
alongside the existing ConditionTrue/ConditionFalse assertions so the
AdapterConditionUnknown constant is validated against the OpenAPI constant.
♻️ Duplicate comments (3)
pkg/db/sql_helpers.go (1)
184-235: Qualifystatus_conditionswith the table name to avoid ambiguous column errors.Line 234 builds
jsonb_path_query_first(status_conditions, ...)without a qualifier; in joins (e.g., clusters ↔ node_pools) this can become ambiguous.ExtractConditionQueriesalready threadstableName, butconditionsNodeConverterignores it—please pass and use it when building the expression.Also applies to: 251-259
AGENTS.md (1)
259-263: Tighten wording to “last known good configuration.”Line 262 would read more precisely as “last known good configuration.”
✏️ Suggested wording tweak
- - Indicates the resource is running at some known good configuration + - Indicates the resource is running at the last known good configurationdocs/api-resources.md (1)
356-371: NodePool Get example still missing condition timestamps.The NodePool Get example at lines 356-371 omits the required
created_time,last_updated_time, andlast_transition_timefields from the conditions, while the Cluster Get example (lines 102-122) includes them. This inconsistency was flagged in a past review but appears unaddressed.📝 Suggested fix
{ "type": "Available", "status": "True", "reason": "ResourceAvailable", "message": "NodePool is accessible", - "observed_generation": 1 + "observed_generation": 1, + "created_time": "2025-01-01T00:00:00Z", + "last_updated_time": "2025-01-01T00:00:00Z", + "last_transition_time": "2025-01-01T00:00:00Z" }, { "type": "Ready", "status": "True", "reason": "ResourceReady", "message": "All adapters report ready at current generation", - "observed_generation": 1 + "observed_generation": 1, + "created_time": "2025-01-01T00:00:00Z", + "last_updated_time": "2025-01-01T00:00:00Z", + "last_transition_time": "2025-01-01T00:00:00Z" }
🧹 Nitpick comments (8)
test/factories/node_pools.go (1)
82-136: Prefer context-bound DB sessions for these updates.Line 126 creates a new session via
dbFactory.New(context.Background()). Consider threading a context into this helper and retrieving sessions from that context (e.g., viadb.NewContext(ctx)) to keep transaction lifecycles consistent. As per coding guidelines.test/factories/clusters.go (1)
145-170: Consider reusingNewClusterWithLabelsto reduce duplication.Lines 152-166 duplicate the label-update logic already present in
NewClusterWithLabels. This can be simplified by delegating to the existing helper.♻️ Proposed refactor
func NewClusterWithStatusAndLabels(f *Factories, dbFactory db.SessionFactory, id string, isAvailable, isReady bool, labels map[string]string) (*api.Cluster, error) { cluster, err := NewClusterWithStatus(f, dbFactory, id, isAvailable, isReady) if err != nil { return nil, err } - if labels != nil { - labelsJSON, err := json.Marshal(labels) - if err != nil { - return nil, err - } - - dbSession := dbFactory.New(context.Background()) - err = dbSession.Model(cluster).Update("labels", labelsJSON).Error - if err != nil { - return nil, err - } - - if err := reloadCluster(dbSession, cluster); err != nil { - return nil, err - } - } + if labels != nil && len(labels) > 0 { + labelsJSON, err := json.Marshal(labels) + if err != nil { + return nil, err + } + dbSession := dbFactory.New(context.Background()) + if err := dbSession.Model(cluster).Update("labels", labelsJSON).Error; err != nil { + return nil, err + } + if err := reloadCluster(dbSession, cluster); err != nil { + return nil, err + } + } return cluster, nil }Alternatively, you could extract a private
applyLabels(dbFactory, cluster, labels)helper to share betweenNewClusterWithLabelsandNewClusterWithStatusAndLabels.pkg/services/status_aggregation.go (1)
70-98: Consider extracting shared adapter map building logic.The adapter map construction (parsing conditions JSON, finding "Available" condition, building map) is duplicated between
ComputeAvailableConditionandComputeReadyCondition. Extracting a helper would reduce duplication and ensure consistent behavior.♻️ Suggested helper extraction
type adapterInfo struct { available string observedGeneration int32 } func buildAdapterAvailabilityMap(adapterStatuses api.AdapterStatusList) map[string]adapterInfo { adapterMap := make(map[string]adapterInfo) for _, adapterStatus := range adapterStatuses { var conditions []struct { Type string `json:"type"` Status string `json:"status"` } if len(adapterStatus.Conditions) > 0 { if err := json.Unmarshal(adapterStatus.Conditions, &conditions); err == nil { for _, cond := range conditions { if cond.Type == "Available" { adapterMap[adapterStatus.Adapter] = adapterInfo{ available: cond.Status, observedGeneration: adapterStatus.ObservedGeneration, } break } } } } } return adapterMap }Then both
ComputeAvailableConditionandComputeReadyConditioncan callbuildAdapterAvailabilityMap(adapterStatuses).Also applies to: 140-167
pkg/services/node_pool_test.go (1)
124-131: Consider using the NodePool test factory for fixtures.Inline fixtures here are fine, but using
test/factorieskeeps consistency and reduces drift with future schema changes. Based on learnings, please use test factories for NodePool fixtures when possible.pkg/services/cluster.go (2)
234-244: Fragile error detection using string matching.The check
strings.Contains(findErr.Error(), errors.CodeNotFoundGeneric)on line 237 is brittle. If the error message format changes, this will silently break. The code already checks forgorm.ErrRecordNotFoundon line 236, but the string-based check appears to handle a custom not-found error from the DAO layer.Consider using a typed error check or exporting a helper from the errors package:
♻️ Suggested approach
- if findErr != nil && !stderrors.Is(findErr, gorm.ErrRecordNotFound) { - if !strings.Contains(findErr.Error(), errors.CodeNotFoundGeneric) { - return nil, errors.GeneralError("Failed to get adapter status: %s", findErr) - } - } + if findErr != nil && !stderrors.Is(findErr, gorm.ErrRecordNotFound) && !errors.IsNotFound(findErr) { + return nil, errors.GeneralError("Failed to get adapter status: %s", findErr) + }This assumes adding an
IsNotFound(err error) boolhelper to the errors package that checks the error code properly.
241-244: Stale generation check returns (nil, nil) without logging.When discarding stale status updates (line 243), a debug/info log would aid troubleshooting adapter timing issues. Silent discards can make it hard to diagnose why an adapter's status isn't being applied.
📝 Suggested improvement
if existingStatus != nil && adapterStatus.ObservedGeneration < existingStatus.ObservedGeneration { // Discard stale status updates (older observed_generation). + logger.WithClusterID(ctx, clusterID).Debug("Discarding stale adapter status", + "adapter", adapterStatus.Adapter, + "incoming_generation", adapterStatus.ObservedGeneration, + "existing_generation", existingStatus.ObservedGeneration) return nil, nil }pkg/services/node_pool.go (2)
232-242: Same fragile error detection pattern as cluster.go.The string-based error check
strings.Contains(findErr.Error(), errors.CodeNotFoundGeneric)has the same brittleness concern as inpkg/services/cluster.go.♻️ Suggested approach
- if findErr != nil && !stderrors.Is(findErr, gorm.ErrRecordNotFound) { - if !strings.Contains(findErr.Error(), errors.CodeNotFoundGeneric) { - return nil, errors.GeneralError("Failed to get adapter status: %s", findErr) - } - } + if findErr != nil && !stderrors.Is(findErr, gorm.ErrRecordNotFound) && !errors.IsNotFound(findErr) { + return nil, errors.GeneralError("Failed to get adapter status: %s", findErr) + }
136-227: Significant code duplication with cluster.go.The
UpdateNodePoolStatusFromAdaptersandProcessAdapterStatusmethods are nearly identical to their cluster counterparts. Consider extracting common logic into a shared helper or generic function to reduce maintenance burden.This could be addressed in a follow-up PR to avoid scope creep, but worth tracking as tech debt.
Also applies to: 232-283
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: yasun1 The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
ef43dfc
into
openshift-hyperfleet:main
https://issues.redhat.com/browse/HYPERFLEET-386
Depends on:
This PR aligns the API with the changes in the openshift-hyperfleet/architecture#79
Removes from the root of resources state:
phaseobserved_generationlast_transition_timelast_updated_timeThe information on these fields is superseded by the information in the status.conditions
AvailableReadyIt introduces also a way to query by
status.conditionsAnd it creates a postgreSQL index for this (usage of the index has to be better tested when we do performance tests)
The PR is split in different commits to make it easier the reviewing
What follows is a guide to do the PR:
PR Review Guide: HYPERFLEET-386
Summary
This PR replaces the phase-based status model with Kubernetes-style conditions for Clusters and NodePools. The change provides:
Review Order
The PR begins with establishing the new API contract (81ea64b - feat: update OpenAPI spec for conditions-based status), defining the ResourceConditionStatus (True/False) and AdapterConditionStatus
(True/False/Unknown) types that form the foundation of the Kubernetes-style conditions model. With the contract in place, the next commit (823ac7e - refactor: remove phase validation from types) cleans up the
old phase-based logic from domain types, making room for the new condition semantics. The test infrastructure is then updated (5dd7445 - test: update test factories for conditions) to use conditions instead
of phases, ensuring factories produce valid test data under the new model. Comprehensive integration tests follow (d7cac7f - test: add integration tests for conditions), covering the full lifecycle of
Ready/Available transitions, adapter status processing, idempotency, and the 204 No Content behavior for Unknown statuses. Finally, the documentation is updated (db6bb14 - docs: update documentation for
conditions) to reflect the new search syntax (status.conditions.Ready='True'), condition semantics, and API response formats.
We recommend reviewing in this logical order, which groups related changes together:
1. Domain Types (Start Here)
Understanding the new condition model is key to reviewing everything else.
pkg/api/status_types.goResourceConditionStatus(True/False) vsAdapterConditionStatus(True/False/Unknown)pkg/api/status_types_test.goKey points:
TrueorFalse- neverUnknownUnknownwhen status is indeterminate2. Status Aggregation Logic
Core business logic for computing Ready/Available from adapter reports.
pkg/services/status_aggregation.goComputeReadyCondition(),ComputeAvailableCondition(), andInitializeConditions()replacingComputePhase()Key logic:
Available=Trueat currentspec.generationAvailable=Trueat ANY generation (last known good)InitializeConditions()creates empty conditions array for new resources3. Presenters (API Response Formatting)
How conditions are presented in API responses.
pkg/api/presenters/cluster.gopkg/api/presenters/cluster_test.gopkg/api/presenters/node_pool.gopkg/api/presenters/node_pool_test.gopkg/api/presenters/adapter_status.goAdapterConditionStatustypepkg/api/presenters/adapter_status_test.goKey points:
4. Service Layer Changes
Business logic for condition initialization and updates based on adapter reports.
pkg/services/cluster.goUpdateConditionsFromAdapterStatus()logicpkg/services/cluster_test.gopkg/services/node_pool.gopkg/services/node_pool_test.gopkg/services/generic.gopkg/services/generic_test.goKey behavior:
Available=False,Ready=False, reasonAwaitingAdapters(initialized by service layer)nil, nil(no-op)5. Handler Changes
HTTP layer changes.
pkg/handlers/framework.gohandleCreateWithNoContent()for 204 responsespkg/handlers/cluster_status.gopkg/handlers/nodepool_status.goKey behavior:
POST /clusters/{id}/statusesreturns:201 Created- status updated, conditions changed204 No Content- adapter report received but no condition change (no-op)6. Database & Search
Query support for condition-based filtering.
pkg/db/sql_helpers.goconditionsNodeConverter()for TSL→SQL translationpkg/db/sql_helpers_test.gopkg/db/migrations/202601210001_add_conditions_gin_index.gopkg/db/migrations/202511111044_add_clusters.gostatus_phasecolumnpkg/db/migrations/202511111055_add_node_pools.gostatus_phasecolumnpkg/db/migrations/migration_structs.goNew search syntax:
Query translation:
7. API Contract & OpenAPI
API specification updates.
openapi/openapi.yamlBreaking changes:
status.phasefieldstatus.conditions[]array with Ready/Available conditionsResourceCondition.statusenum:True,Falseonly (not Unknown)AdapterCondition.statusenum:True,False,Unknown8. Type Cleanup
Removed obsolete phase-related code.
pkg/api/cluster_types.gopkg/api/cluster_types_test.gopkg/api/node_pool_types.gopkg/api/node_pool_types_test.go9. Test Factories
Updated test helpers.
test/factories/clusters.gotest/factories/node_pools.go10. Integration Tests
End-to-end test coverage.
test/integration/adapter_status_test.gotest/integration/api_contract_test.gotest/integration/search_field_mapping_test.gostatus.conditions.*search11. Documentation
docs/api-resources.mdAGENTS.mdTest Coverage
pkg/api/status_types_test.go,pkg/services/*_test.gotest/integration/adapter_status_test.gotest/integration/search_field_mapping_test.goTest Scenarios (Cucumber Format)
Feature: Adapter Status Processing
Scenario: ProcessAdapterStatus returns nil for Unknown Available (unit test)
Scenario: POST with Unknown Available returns 204 (integration test)
Scenario: ProcessAdapterStatus stores status for Available=True (unit test)
Scenario: POST adapter status returns 201 Created (integration test)
Scenario: ProcessAdapterStatus stores status for Available=False (unit test)
Scenario: ProcessAdapterStatus returns nil when multiple conditions include Available=Unknown (unit test)
Scenario: POST with multiple conditions including Available=Unknown returns 204 (integration test)
Scenario: ProcessAdapterStatus stores status when no Available condition present (unit test)
Scenario: NodePool POST with Unknown Available returns 204 (integration test)
Feature: Available/Ready Condition Transitions
Scenario: Full lifecycle of condition transitions for Cluster (unit test)
Scenario: Full lifecycle of condition transitions for NodePool (unit test)
Feature: Adapter Status Idempotency
Scenario: Same adapter posts twice - upsert not duplicate (integration test)
Feature: Condition Search Query Translation
Scenario: Ready condition translates to correct SQL
Scenario: Available condition translates to correct SQL
Scenario: Invalid condition status returns error
Scenario: Lowercase condition type returns error
Scenario: ExtractConditionQueries extracts single condition
Scenario: ExtractConditionQueries extracts multiple conditions
Scenario: ExtractConditionQueries handles mixed queries
Scenario: ExtractConditionQueries handles no conditions
Feature: Condition Field Detection
Scenario: hasCondition detects condition fields
Feature: Condition Type Validation
Scenario Outline: conditionTypePattern matches valid PascalCase types
Scenario Outline: conditionTypePattern rejects invalid types
Feature: Condition Status Values
Scenario: ResourceConditionStatus constants have correct string values
Scenario: AdapterConditionStatus constants have correct string values
Scenario: AdapterConditionStatus string conversion
Scenario: ResourceConditionStatus string conversion
Scenario: validConditionStatuses map contains correct values
Feature: Condition JSON Serialization
Scenario: ResourceCondition serializes to JSON with all fields
Scenario: ResourceCondition deserializes from JSON
Scenario: ResourceCondition round-trip preserves data
Scenario: AdapterCondition serializes Unknown status
Scenario: AdapterCondition deserializes Unknown status
Scenario: AdapterCondition round-trip preserves data
Feature: Pagination
Scenario: Pagination returns correct page size (integration test)
Scenario: Page beyond total returns empty (integration test)
Scenario: Empty cluster returns zero total (integration test)
Scenario: Single item pagination (integration test)
Scenario: Pagination across pages returns all items without duplicates (integration test)
Checklist for Reviewer
Summary by CodeRabbit
New Features
status.conditions.<Type>instead ofstatus.phasefor searching and filtering.Documentation
✏️ Tip: You can customize this high-level summary in your review settings.