feat: OLMv1 lifecycle — fresh install tests + OLMv0→OLMv1 migration target#2160
weshayutin wants to merge 16 commits into
Signed-off-by: Wesley Hayutin <weshayutin@gmail.com>
Walkthrough
Adds OLMv1 end-to-end and migration tests under
Changes
OLMv1 tests, Makefile targets, and manifest tweaks
Sequence Diagram(s)
sequenceDiagram
participant Test as Test Suite
participant API as Kubernetes API
participant CC as ClusterCatalog
participant CE as ClusterExtension Controller
participant OLM as OLM Resolver
Test->>API: Create Namespace & ServiceAccount
Test->>API: Create ClusterRoleBinding (cluster-admin)
Test->>API: Create ClusterCatalog (from image)
Test->>CC: Poll until Serving=True
Test->>API: Create ClusterExtension (reference catalog)
CE->>OLM: Resolve bundle from catalog
OLM-->>CE: Return bundle manifest
CE->>API: Create operator deployment
Test->>API: Poll ClusterExtension.status (wait Installed=True)
API-->>Test: Installed condition reached
Test->>API: Verify controller-manager pods Running
Test->>API: Verify OADP/Velero CRDs exist
Test->>API: Check no Deprecated conditions
alt Upgrade scenario (if upgradeVersion set)
Test->>API: Patch ClusterExtension to new version
CE->>OLM: Resolve upgraded bundle
OLM-->>CE: Return new bundle manifest
CE->>API: Update operator deployment
Test->>API: Poll for new bundle version & Installed=True
Test->>API: Verify controller-manager Running again
end
Test->>API: Delete ClusterExtension
Test->>API: Wait for deletion
Test->>API: Delete ClusterCatalog
Test->>API: Delete ClusterRoleBinding & ServiceAccount
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
Pre-merge checks: ✅ 9 passed | ❌ 3 failed (3 warnings)
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: weshayutin
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@config/manager/kustomization.yaml`:
- Around line 7-8: The checked-in kustomization currently pins newName/newTag to
the ephemeral ttl.sh image (ttl.sh/oadp-operator-7e53a850:1h); replace this with
a stable, non-expiring default (e.g., the official oadp-operator image and a
permanent tag or digest) or remove the ttl.sh newName/newTag entries so the
repository manifest does not reference an expiring image; ensure any temporary
ttl.sh usage is moved into CI/deploy workflows that inject the test image
dynamically rather than committing it to kustomization.yaml.
In `@Makefile`:
- Around line 1052-1057: The cleanup target test-olmv1-cleanup unconditionally
deletes ClusterCatalog $(OLMV1_CATALOG); guard that deletion so we only remove a
catalog the tests created by checking the same creation condition or a creation
marker. Modify test-olmv1-cleanup to only run the $(OC_CLI) delete
clustercatalog $(OLMV1_CATALOG) line when OLMV1_CATALOG_IMAGE is set (or when a
persisted marker like OLMV1_CATALOG_CREATED file/env var exists), and update the
Catalog creation step (the rule that creates the catalog) to set that marker
(e.g., touch a file or export a flag) so cleanup can safely detect it before
deleting.
In `@tests/olmv1/olmv1_install_test.go`:
- Around line 107-122: The current gomega.Eventually loop only checks
pod.Status.Phase == corev1.PodRunning which can false-pass; update the check in
the Eventually closure (the block using kubeClient.CoreV1().Pods(...).List and
iterating pods.Items) to verify readiness instead: for each pod, inspect
pod.Status.Conditions for condition.Type == corev1.PodReady with Status ==
corev1.ConditionTrue (or alternatively fetch the Deployment via
kubeClient.AppsV1().Deployments(namespace).Get and assert the Deployment status
has an Available condition == True / status.AvailableReplicas > 0); apply the
same change to the other occurrence mentioned (lines 213-226) so tests assert
PodReady or Deployment Available rather than just PodRunning.
- Around line 167-181: The current code reads the ClusterExtension via
getClusterExtension and then calls
dynamicClient.Resource(clusterExtensionGVR).Update after changing catalogSpec
(using unstructuredNestedMap/unstructuredSetNestedMap), which can race
controller status updates and yield 409s; instead patch only the
spec.source.catalog fields (or wrap the update in retry.RetryOnConflict) rather
than updating the whole object: construct a minimal merge patch containing
spec.source.catalog.version and spec.source.catalog.upgradeConstraintPolicy and
call dynamicClient.Resource(clusterExtensionGVR).Patch with types.MergePatchType
(add import "k8s.io/apimachinery/pkg/types"), or if you prefer keep Update, wrap
the read/modify/write in retry.RetryOnConflict to retry on conflicts. Ensure
references to getClusterExtension, unstructuredNestedMap,
unstructuredSetNestedMap, and dynamicClient.Resource(clusterExtensionGVR) are
updated accordingly.
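The merge-patch alternative described above can be sketched with the standard library alone. This is a minimal illustration, not the PR's code: `buildCatalogPatch` is a hypothetical helper, and the bytes it returns are what one would hand to `dynamicClient.Resource(clusterExtensionGVR).Patch` together with `types.MergePatchType`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildCatalogPatch (hypothetical helper) returns a JSON merge patch that
// touches only spec.source.catalog.version and
// spec.source.catalog.upgradeConstraintPolicy, leaving status and every
// other spec field alone.
func buildCatalogPatch(version, policy string) ([]byte, error) {
	patch := map[string]any{
		"spec": map[string]any{
			"source": map[string]any{
				"catalog": map[string]any{
					"version":                 version,
					"upgradeConstraintPolicy": policy,
				},
			},
		},
	}
	return json.Marshal(patch)
}

func main() {
	patch, err := buildCatalogPatch("1.5.5", "SelfCertified")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(patch))
}
```

Because the patch carries no `metadata.resourceVersion`, it should not be rejected for a stale read the way a full `Update` can be when the controller writes status in between.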
In `@tests/olmv1/olmv1_suite_test.go`:
- Around line 110-126: The ClusterRoleBinding name in ensureClusterAdminBinding
only uses saName so it can collide across namespaces; change the naming or
reconcile existing bindings: either make bindingName include the namespace
(e.g., bindingName := saName + "-" + ns + "-cluster-admin") so it's unique per
namespace, or when Create returns AlreadyExists call
kubeClient.RbacV1().ClusterRoleBindings().Get to load the existing
ClusterRoleBinding and update its Subjects (add or replace the ServiceAccount
subject for {Name: saName, Namespace: ns}) and then call Update to persist the
corrected subjects; implement one of these approaches inside
ensureClusterAdminBinding.
- Around line 270-295: The cleanupOrphanedCRDs function is destructive on shared
clusters; change it to be gated behind an explicit opt-in (e.g., a test flag or
env var like TEST_DELETE_ORPHAN_CRDS) or a deterministic dedicated-cluster check
before calling dynamicClient.Resource(crdGVR).Delete, and after issuing Delete
for each CRD found by cleanupOrphanedCRDs poll/wait (using
dynamicClient.Resource(crdGVR).Get in a loop with backoff and timeout) until
apierrors.IsNotFound confirms the resource is fully removed before counting it
as deleted; ensure you still handle non-NotFound errors via logging and skip
deletion when the opt-in flag is not set.
- Around line 315-323: When Create on dynamicClient.Resource(clusterCatalogGVR)
returns apierrors.IsAlreadyExists(err), fetch the existing ClusterCatalog (using
dynamicClient.Resource(clusterCatalogGVR).Get with the same name and ctx) and
validate its image field against the requested image variable; if they differ,
fail the test (gomega.Expect/return error) or update/replace the catalog to
match the requested image instead of silently reusing it. Ensure the check
references the existing object's image path (e.g., status/spec field used for
image in ClusterCatalog) and only set createdCatalog = true and log "Created
ClusterCatalog" when you actually created or successfully reconciled the
resource to the desired image.
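The AlreadyExists handling requested for ClusterCatalog can be sketched as below. It is a simplified model under stated assumptions: `ensureCatalog`, `errAlreadyExists`, and the callback signatures are hypothetical stand-ins for the dynamic-client calls and `apierrors.IsAlreadyExists`, chosen so the sketch runs without client-go.

```go
package main

import (
	"errors"
	"fmt"
)

// errAlreadyExists stands in for an error matched by apierrors.IsAlreadyExists.
var errAlreadyExists = errors.New("already exists")

// ensureCatalog (hypothetical) tries to create the catalog. If it already
// exists, the existing image is compared with the requested one. The first
// return value is true only when this call actually created the catalog.
func ensureCatalog(want string, create func() error, getImage func() (string, error)) (bool, error) {
	createErr := create()
	if createErr == nil {
		return true, nil
	}
	if !errors.Is(createErr, errAlreadyExists) {
		return false, createErr
	}
	have, getErr := getImage()
	if getErr != nil {
		return false, fmt.Errorf("get existing ClusterCatalog: %w", getErr)
	}
	if have != want {
		return false, fmt.Errorf("ClusterCatalog image mismatch: have %q, want %q", have, want)
	}
	return false, nil // reused an existing catalog with the right image
}

func main() {
	exists := func() error { return errAlreadyExists }
	stale := func() (string, error) { return "example.test/catalog:old", nil }
	created, err := ensureCatalog("example.test/catalog:new", exists, stale)
	fmt.Println(created, err)
}
```

Whether a reused catalog with a matching image should count as owned by the test is a separate policy decision; this sketch reports it as not created.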
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 77bf8ce7-890a-4973-98c9-d65314a9d12c
📒 Files selected for processing (5)
Makefile, config/manager/kustomization.yaml, tests/olmv1/.gitignore, tests/olmv1/olmv1_install_test.go, tests/olmv1/olmv1_suite_test.go
Signed-off-by: Wesley Hayutin <weshayutin@gmail.com>
♻️ Duplicate comments (1)
Makefile (1)
1053-1055: ⚠️ Potential issue | 🟠 Major
Make ClusterCatalog deletion opt-in.
Line 1055 still deletes $(OLMV1_CATALOG) unconditionally. The suite itself only deletes a catalog after it knows it created it (tests/olmv1/olmv1_suite_test.go sets createdCatalog only on successful create, and tests/olmv1/olmv1_install_test.go checks that flag in AfterAll). Pointing this cleanup target at an existing/shared catalog will remove a resource the tests did not own.
🧹 Safer cleanup sketch
    +OLMV1_DELETE_CATALOG ?= false
    +
     test-olmv1-cleanup: login-required ## Cleanup resources created by OLMv1 tests.
         $(OC_CLI) delete clusterextension oadp-operator --ignore-not-found=true
    -    $(OC_CLI) delete clustercatalog $(OLMV1_CATALOG) --ignore-not-found=true
    +    @if [ "$(OLMV1_DELETE_CATALOG)" = "true" ]; then \
    +        $(OC_CLI) delete clustercatalog $(OLMV1_CATALOG) --ignore-not-found=true; \
    +    fi
         $(OC_CLI) delete clusterrolebinding $(OLMV1_SERVICE_ACCOUNT)-cluster-admin --ignore-not-found=true
         $(OC_CLI) delete sa $(OLMV1_SERVICE_ACCOUNT) -n $(OLMV1_NAMESPACE) --ignore-not-found=true
If you want parity with the suite’s ownership check, persist a creation marker and key catalog deletion off that instead of a plain name match.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Makefile` around lines 1053 - 1055, The Makefile target test-olmv1-cleanup currently unconditionally deletes the ClusterCatalog $(OLMV1_CATALOG); change it to be opt-in by checking a persisted "created" marker (e.g., touch a file when the suite successfully creates the catalog) before running the delete command in the test-olmv1-cleanup target. Specifically, modify the test-olmv1-cleanup target to only run `$(OC_CLI) delete clustercatalog $(OLMV1_CATALOG)` if the marker file exists, and ensure tests that create the catalog (tests/olmv1/olmv1_suite_test.go and olmv1_install_test.go flow) write/remove that marker so ownership is respected instead of deleting a shared catalog by name.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: b65495fa-dc13-49e1-a8c0-13056be93809
📒 Files selected for processing (1)
Makefile
Signed-off-by: Wesley Hayutin <weshayutin@gmail.com>
The issue — "Test OLMv0 to OLMv1 upgrade path/migration" — covers:
It's assigned to you and backlinked to PR |
"Add Makefile target to upgrade OLMv0 install to OLMv1 install" covers:
It's assigned to you and backlinked to PR |
|
@kaovilai taking this over :) |
…anifest generation, and implement migration tests Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
🧹 Nitpick comments (1)
Makefile (1)
1131-1132: 💤 Low value
xargs -r is GNU-specific and may fail on macOS.
The -r flag (don't run if stdin is empty) is a GNU extension not available in BSD xargs. On macOS, this will error unless GNU coreutils is installed.
Possible fix using shell conditional
    -    -$(OC_CLI) get crd -o name 2>/dev/null | grep -E '\.oadp\.openshift\.io|\.velero\.io' | \
    -        xargs -r $(OC_CLI) delete --ignore-not-found=true || true
    +    -CRDS=$$($(OC_CLI) get crd -o name 2>/dev/null | grep -E '\.oadp\.openshift\.io|\.velero\.io'); \
    +    if [ -n "$$CRDS" ]; then echo "$$CRDS" | xargs $(OC_CLI) delete --ignore-not-found=true; fi || true
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Makefile` around lines 1131 - 1132, The Makefile line uses GNU-only xargs -r which breaks on macOS; change the command to guard against empty stdin instead of relying on -r: capture the output of "$(OC_CLI) get crd -o name 2>/dev/null | grep -E '\.oadp\.openshift\.io|\.velero\.io'" into a variable or test it, and only pipe to "xargs $(OC_CLI) delete --ignore-not-found=true" when non-empty; update the invocation that currently contains "xargs -r" and the surrounding "$(OC_CLI) get crd -o name" pipeline accordingly so the delete step is skipped on empty input in a portable POSIX-compatible way.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 9043a314-cd22-4f4c-ab9c-d117aa57e4de
📒 Files selected for processing (5)
Makefile, bundle/manifests/oadp-operator.clusterserviceversion.yaml, config/manifests/bases/oadp-operator.clusterserviceversion.yaml, tests/olmv1/olmv1_migrate_test.go, tests/olmv1/olmv1_suite_test.go
OLMv1 cannot adopt resources created by OLMv0. Add cleanup step that deletes all olm.managed=true labeled resources (ServiceAccounts, Roles, RoleBindings, Deployments, ClusterRoles, ClusterRoleBindings) before ClusterExtension install. Also add CatalogSource image detection to migrate custom FBC catalogs to ClusterCatalog, and update Makefile migration target with matching phases. Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering) Co-Authored-By: Claude <noreply@anthropic.com> Co-Authored-By: Happy <yesreply@happy.engineering> Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
The upgrade-v0-to-olmv1 Makefile target was missing the OLMv0 remnant cleanup step. OLMv1 refuses to install when it finds pre-existing resources labeled olm.managed=true (ServiceAccounts, Roles, ClusterRoles, etc). Add Phase 3b to delete these before ClusterExtension creation. Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering) Co-Authored-By: Claude <noreply@anthropic.com> Co-Authored-By: Happy <yesreply@happy.engineering> Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@Makefile`:
- Around line 1141-1142: The Makefile currently runs $(OC_CLI) get crd ... |
xargs ... delete which removes OADP/Velero CRDs cluster-wide and thus destroys
all CR instances; change this to avoid deleting CRDs: either (A) remove the CRD
deletion entirely and only delete CR instances by listing and deleting
namespaced CRs (use $(OC_CLI) get <cr-kind> -n $(TARGET_NAMESPACE) ... and
delete those), or (B) gate the CRD deletion behind an explicit opt-in variable
(e.g. SKIP_CRD_DELETE or CONFIRM_DELETE_CRDS) and only run the existing
$(OC_CLI) delete command when that variable is set; target the unique symbols
$(OC_CLI), get crd, and the xargs ... delete pipeline to implement the safe
alternative.
In `@tests/olmv1/olmv1_migrate_test.go`:
- Around line 118-129: The cleanup currently swallows errors when listing or
deleting CatalogSources (calls using
dynamicClient.Resource(catalogSourceGVR).Namespace("openshift-marketplace").List/Delete
and the loop that uses isDefaultCatalogSource), causing flaky tests; change the
logic to assert on errors instead of ignoring them: check and fail the test
(using gomega/Expect or ginkgo.Fail) if the List returns an error and for each
Delete capture its error and assert it succeeded (or retry/collect and fail
after loop), and include the resource name in failure messages so failures are
deterministic and debuggable.
- Around line 82-85: The Eventually checks call
dynamicClient.Resource(subscriptionGVR).Namespace(namespace).List(...) and
ignore the returned error, which can cause nil derefs; update the lambda used in
gomega.Eventually (the anonymous func passed to gomega.Eventually at both
occurrences) to handle the error from List — if err != nil return a sentinel
value (e.g., -1) or otherwise surface the error so the Eventually assertion
won’t access list.Items on a nil list; replace the unconditional return
len(list.Items) in the lambda with logic that checks err and only returns
len(list.Items) when err == nil.
- Around line 170-183: The current deletion loops iterate all
ClusterRoles/ClusterRoleBindings matched by olmSelector and may remove non-OADP
resources; modify the logic in the blocks using
kubeClient.RbacV1().ClusterRoles().List / ClusterRoleBindings().List (variables
crs and crbs) to only delete items that are known OADP remnants by applying an
additional safety filter (e.g., check cr.Name or crb.Name for OADP-specific
prefixes like "oadp", "velero", or the OADP operator name, or check for a
specific label/annotation that identifies OADP resources such as "app=oadp" or
an OADP owner annotation) and skip/log all others instead of deleting them
unconditionally.
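The extra safety filter suggested for cluster-scoped RBAC cleanup can be sketched as a name check applied before any delete. The marker list and the use of substring matching are assumptions for illustration; `isOADPRemnant` and `partition` are hypothetical helpers, and a real filter might key off labels or owner annotations instead.

```go
package main

import (
	"fmt"
	"strings"
)

// oadpMarkers are assumed name fragments that identify OADP-owned resources.
var oadpMarkers = []string{"oadp", "velero"}

// isOADPRemnant reports whether a resource name contains one of the markers.
func isOADPRemnant(name string) bool {
	lower := strings.ToLower(name)
	for _, m := range oadpMarkers {
		if strings.Contains(lower, m) {
			return true
		}
	}
	return false
}

// partition splits olm.managed=true resource names into those that look like
// OADP remnants (safe to delete) and everything else (skip and log).
func partition(names []string) (remove, skip []string) {
	for _, n := range names {
		if isOADPRemnant(n) {
			remove = append(remove, n)
		} else {
			skip = append(skip, n)
		}
	}
	return remove, skip
}

func main() {
	remove, skip := partition([]string{
		"oadp-operator-manager-role",
		"velero-metrics-reader",
		"cert-manager-controller",
	})
	fmt.Println("delete:", remove)
	fmt.Println("skip:  ", skip)
}
```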
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: ec3a3efd-e2fd-4ef1-9512-a5174ce744ab
📒 Files selected for processing (2)
Makefile, tests/olmv1/olmv1_migrate_test.go
…ion test
- Fix CatalogSource cleanup to scan both openshift-marketplace and operator namespace (CI places CatalogSource in operator namespace via optional-operators-subscribe)
- Add OLMv0 CSV version capture before migration and verify same version installed after OLMv1 ClusterExtension install
- Verify installed bundle came from expected catalog, not community default
- Pin ClusterExtension to specific ClusterCatalog via selector.matchLabels
- Build fresh operator/bundle/catalog images in upgrade-v0-to-olmv1 to avoid expired ttl.sh images
- Add verbose diagnostics for ClusterCatalog failures (image ref, catalogd logs)
- Append test-upgrade-v0-to-olmv1 to test-e2e target for Prow presubmit
- Add usage comments for local and CI/Prow workflows
Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering)
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering) Co-Authored-By: Claude <noreply@anthropic.com> Co-Authored-By: Happy <yesreply@happy.engineering> Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
Actionable comments posted: 2
🧹 Nitpick comments (1)
tests/olmv1/olmv1_migrate_test.go (1)
187-214: 💤 Low value
List errors in cleanup are silently ignored.
All List calls for namespace-scoped cleanup (ServiceAccounts, Roles, RoleBindings, Deployments) ignore errors. If listing fails, cleanup silently skips those resources, potentially leaving orphaned OLMv0 remnants and causing flaky migration tests.
Consider at minimum logging the errors:
Suggested improvement
    -    sas, _ := kubeClient.CoreV1().ServiceAccounts(namespace).List(ctx, olmSelector)
    +    sas, err := kubeClient.CoreV1().ServiceAccounts(namespace).List(ctx, olmSelector)
    +    if err != nil {
    +        log.Printf("Warning: failed to list ServiceAccounts: %v", err)
    +    }
         if sas != nil {
Apply similar pattern to Roles, RoleBindings, and Deployments list calls.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/olmv1/olmv1_migrate_test.go` around lines 187 - 214, The List calls for namespace cleanup (kubeClient.CoreV1().ServiceAccounts(...).List, kubeClient.RbacV1().Roles(...).List, kubeClient.RbacV1().RoleBindings(...).List, kubeClient.AppsV1().Deployments(...).List) are ignoring returned errors; capture each error (e.g., sasErr, rolesErr, rbsErr, deployErr) and if non-nil log it (log.Printf or t.Errorf) so failures are visible during tests, and decide whether to continue cleanup or fail the test based on the error; apply this pattern to ServiceAccounts, Roles, RoleBindings and Deployments List calls.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@Makefile`:
- Around line 1164-1166: The Makefile emits a scalar `channel:` but the
ClusterExtension API expects `spec.source.catalog.channels` to be an array;
update the generation logic that references OLMV1_CHANNEL and writes to
OLMV1_MANIFEST to output a YAML array field `channels:` with the channel value
inside (e.g., channels: [ '$(OLMV1_CHANNEL)' ]) instead of `channel:` so the
manifest matches the Go tests and schema.
- Around line 1002-1020: The recipe currently reads EXIT_CODE from
/tmp/oadp-e2e-exit-code in a new shell, causing a lost/zeroed exit when the
earlier ginkgo block sets EXIT_CODE; instead ensure the exit-code write/read
happen in the same recipe shell by moving the read/exit logic into the same
shell block that assigns EXIT_CODE (or by chaining commands with backslashes so
they run in one shell), or alternatively export and use the variable before Make
starts a new shell; specifically update the Makefile recipe so that the value
written to /tmp/oadp-e2e-exit-code (EXIT_CODE) and the final rm -f
/tmp/oadp-e2e-exit-code; exit $$E2E_EXIT logic execute in the same shell that
runs the ginkgo block and the call to test-upgrade-v0-to-olmv1, ensuring
EXIT_CODE is preserved for the final exit.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 929f30dd-b0f5-4f8a-911d-248df73e1d7b
📒 Files selected for processing (9)
.gitignore, .golangci.yaml, Makefile, bundle/manifests/oadp-operator.clusterserviceversion.yaml, config/manifests/bases/oadp-operator.clusterserviceversion.yaml, tests/olmv1/.gitignore, tests/olmv1/olmv1_install_test.go, tests/olmv1/olmv1_migrate_test.go, tests/olmv1/olmv1_suite_test.go
…or catalog target
- Fix generate-olmv1-manifest: channel (singular) → channels (array) per ClusterExtension API schema
- Scope olm.managed=true cluster-wide RBAC cleanup to OADP-related resources only, avoiding deletion of other operators' resources in shared clusters
- Replace Update with MergePatch for ClusterExtension upgrade to avoid 409 Conflict races with the controller
- Validate existing ClusterCatalog image on AlreadyExists instead of silently returning
- Add deploy-olmv1-mirror-catalog target for testing with productized index images when current OCP version lacks redhat-oadp-operator (e.g., 4.22)
Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering)
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
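The channel-to-channels fix is easiest to see by serializing the field. The struct below is a hedged sketch that mirrors only the `spec.source.catalog` fields discussed in this thread; `catalogSource` and `render` are hypothetical names, not the operator-controller API types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// catalogSource mirrors a subset of spec.source.catalog for illustration.
type catalogSource struct {
	PackageName string   `json:"packageName"`
	Channels    []string `json:"channels,omitempty"` // array, not a scalar "channel"
	Version     string   `json:"version,omitempty"`
}

// render serializes the catalog stanza, adding the optional channel and
// version only when they are set, as the Makefile target does.
func render(pkg, channel, version string) (string, error) {
	c := catalogSource{PackageName: pkg, Version: version}
	if channel != "" {
		c.Channels = []string{channel}
	}
	b, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := render("oadp-operator", "stable", "")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

A scalar `channel:` key would not map onto the array-typed `channels` field, which is the mismatch the review flagged between the generated manifest and the Go tests.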
|
/test all |
|
"flake" fix #2198 /retest |
|
@weshayutin: all tests passed! |
Pull request overview
Adds OLMv1 (ClusterExtension/ClusterCatalog) lifecycle coverage and migration tooling so OADP can be installed fresh via OLMv1 and migrated from an existing OLMv0 (Subscription/CSV) install, with CI-friendly Makefile targets and Ginkgo suites.
Changes:
- Introduces new Ginkgo suites under tests/olmv1/ for fresh OLMv1 install/upgrade and OLMv0→OLMv1 migration verification.
- Extends Makefile with test-olmv1, test-upgrade-v0-to-olmv1, upgrade-v0-to-olmv1, and manifest-generation targets; appends migration test execution after test-e2e.
- Enables SingleNamespace install mode in both generated and base CSV manifests.
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| Makefile | Adds OLMv1 test/migration targets, manifest generation, and runs migration test after test-e2e. |
| tests/olmv1/olmv1_suite_test.go | New shared helpers for OLMv1 tests (ClusterExtension/ClusterCatalog operations, CRD cleanup, condition logging). |
| tests/olmv1/olmv1_install_test.go | New fresh-install/upgrade validation suite for OLMv1. |
| tests/olmv1/olmv1_migrate_test.go | New OLMv0→OLMv1 migration suite with version/catalog validation and cleanup steps. |
| tests/olmv1/.gitignore | Ignores test temp output directory. |
| config/manifests/bases/oadp-operator.clusterserviceversion.yaml | Enables SingleNamespace install mode in base CSV. |
| bundle/manifests/oadp-operator.clusterserviceversion.yaml | Enables SingleNamespace install mode in bundle CSV. |
| .golangci.yaml | Excludes tests/olmv1 from golangci-lint. |
| .gitignore | Ignores generated oadp-olmv1-manifest.yaml. |
|
Thank you for your reviews! |
|
/hold |
xargs -r (--no-run-if-empty) is a GNU coreutils extension not available on macOS BSD xargs. Use shell variable + conditional instead. Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering) Co-Authored-By: Claude <noreply@anthropic.com> Co-Authored-By: Happy <yesreply@happy.engineering> Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
The Eventually closures for Subscription and CSV deletion verification ignored List errors. If List fails (transient API error), the nil list causes a panic on list.Items access. Use gomega.Gomega parameter to properly assert both error and emptiness. Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering) Co-Authored-By: Claude <noreply@anthropic.com> Co-Authored-By: Happy <yesreply@happy.engineering> Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
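The nil-list panic described here can be illustrated with a small stand-alone sketch. Note that this shows the sentinel-value variant from the earlier review comment, not the `gomega.Gomega`-parameter form the commit adopted; `list` and `remaining` are hypothetical stand-ins for the unstructured list and the polling closure.

```go
package main

import (
	"errors"
	"fmt"
)

// list is a stand-in for the unstructured list returned by the dynamic client.
type list struct{ Items []string }

// remaining returns the number of items still present, or -1 when the List
// call failed, so a polling assertion waiting for 0 keeps retrying instead of
// dereferencing a nil list.
func remaining(listFn func() (*list, error)) int {
	l, err := listFn()
	if err != nil || l == nil {
		return -1
	}
	return len(l.Items)
}

func main() {
	failing := func() (*list, error) { return nil, errors.New("transient API error") }
	empty := func() (*list, error) { return &list{}, nil }
	fmt.Println(remaining(failing), remaining(empty)) // prints: -1 0
}
```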
Cleanup List calls for ServiceAccounts, Roles, RoleBindings, Deployments, ClusterRoles, and ClusterRoleBindings were silently discarding errors. Now log warnings on failure and skip iteration to prevent nil panics. Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering) Co-Authored-By: Claude <noreply@anthropic.com> Co-Authored-By: Happy <yesreply@happy.engineering> Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
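The log-and-skip pattern this commit describes might look roughly like the following. It is a sketch under assumptions: `cleanupKind` and its callback signatures are hypothetical, standing in for the per-kind client-go `List` and `Delete` calls.

```go
package main

import (
	"errors"
	"log"
)

// cleanupKind lists resources of one kind and deletes each of them. A List
// failure is logged and iteration is skipped, so a nil result is never
// dereferenced. Delete failures are logged and do not stop the loop.
// It returns the number of successful deletions.
func cleanupKind(kind string, list func() ([]string, error), del func(string) error) int {
	names, err := list()
	if err != nil {
		log.Printf("Warning: failed to list %s: %v", kind, err)
		return 0
	}
	deleted := 0
	for _, n := range names {
		if err := del(n); err != nil {
			log.Printf("Warning: failed to delete %s %q: %v", kind, n, err)
			continue
		}
		deleted++
	}
	return deleted
}

func main() {
	n := cleanupKind("ServiceAccounts",
		func() ([]string, error) { return nil, errors.New("forbidden") },
		func(string) error { return nil })
	log.Printf("deleted %d ServiceAccounts", n)
}
```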
When ensureClusterCatalog finds an AlreadyExists catalog with a matching image, it returned without setting createdCatalog=true. This caused AfterAll to skip cleanup, leaving the catalog on the cluster after repeated test runs. Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering) Co-Authored-By: Claude <noreply@anthropic.com> Co-Authored-By: Happy <yesreply@happy.engineering> Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
CRD deletion in Kubernetes is asynchronous — finalizers run and existing CRs get garbage collected. If the install test creates a ClusterExtension before old CRDs are fully gone, OLMv1 may hit ownership conflicts. Now polls until each deleted CRD is confirmed gone. Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering) Co-Authored-By: Claude <noreply@anthropic.com> Co-Authored-By: Happy <yesreply@happy.engineering> Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
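Polling until a deleted CRD is confirmed gone can be sketched as a small loop. This is an illustration only: `waitGone` is a hypothetical helper, `errNotFound` stands in for an error matched by `apierrors.IsNotFound`, and unlike the test suite this sketch returns unexpected errors rather than logging them.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotFound stands in for an error matched by apierrors.IsNotFound.
var errNotFound = errors.New("not found")

// waitGone calls get repeatedly until it reports errNotFound (the resource is
// fully removed), an unexpected error occurs, or the timeout elapses.
func waitGone(get func() error, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		err := get()
		if errors.Is(err, errNotFound) {
			return nil
		}
		if err != nil {
			return fmt.Errorf("unexpected error while waiting for deletion: %w", err)
		}
		if time.Now().After(deadline) {
			return errors.New("timed out waiting for deletion")
		}
		time.Sleep(interval)
	}
}

func main() {
	calls := 0
	get := func() error {
		calls++
		if calls < 3 {
			return nil // still present, e.g. finalizers running
		}
		return errNotFound
	}
	err := waitGone(get, 10*time.Millisecond, time.Second)
	fmt.Println(calls, err)
}
```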
Both install and migration tests failed with AlreadyExists when a previous run left a ClusterExtension behind. Now delete any existing CE before creating. Also remove hardcoded clusterExtensionName const in favor of the configurable packageName flag, aligning CE name with test-olmv1-cleanup target which deletes $(OLMV1_PACKAGE). Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering) Co-Authored-By: Claude <noreply@anthropic.com> Co-Authored-By: Happy <yesreply@happy.engineering> Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
…cted buildClusterExtension only set the catalog selector when the global catalogImage flag was non-empty. In migration, the catalog image is auto-detected from the Subscription (migratedCatalogImage) while catalogImage stays empty, so the selector was never set and OLMv1 could pick a wrong default/community catalog. Add withCatalogSelector option function so the migration test can explicitly pin to the created ClusterCatalog. Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering) Co-Authored-By: Claude <noreply@anthropic.com> Co-Authored-By: Happy <yesreply@happy.engineering> Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
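The option-function approach named in this commit follows Go's functional-options pattern. The sketch below is a simplified model: `extensionSpec`, the signatures of `buildClusterExtension` and `withCatalogSelector`, the label key, and the catalog name `oadp-catalog` are all assumptions for illustration, since the real helpers operate on an unstructured ClusterExtension.

```go
package main

import "fmt"

// extensionSpec is a simplified stand-in for the ClusterExtension the suite builds.
type extensionSpec struct {
	PackageName string
	MatchLabels map[string]string // models spec.source.catalog.selector.matchLabels
}

// extensionOption mutates the spec before it is returned.
type extensionOption func(*extensionSpec)

// withCatalogSelector pins resolution to catalogs carrying the given label.
func withCatalogSelector(key, value string) extensionOption {
	return func(s *extensionSpec) {
		if s.MatchLabels == nil {
			s.MatchLabels = map[string]string{}
		}
		s.MatchLabels[key] = value
	}
}

// buildClusterExtension applies options in order, so the caller decides
// whether a selector is set instead of a global flag deciding it.
func buildClusterExtension(pkg string, opts ...extensionOption) extensionSpec {
	s := extensionSpec{PackageName: pkg}
	for _, opt := range opts {
		opt(&s)
	}
	return s
}

func main() {
	pinned := buildClusterExtension("oadp-operator",
		withCatalogSelector("olm.operatorframework.io/metadata.name", "oadp-catalog"))
	unpinned := buildClusterExtension("oadp-operator")
	fmt.Println(pinned.MatchLabels, unpinned.MatchLabels)
}
```

With this shape the migration test can pass the selector explicitly for the catalog it created, while the fresh-install path keeps its existing behavior by passing no option.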
Please check/approve openshift/release#79152 to unblock testing here
Overview
Enable OADP users to install and migrate to OLMv1 (ClusterExtension-based) management. This PR covers:
- OLMv1 tests (tests/olmv1/)
- make test-olmv1 / make test-olmv1-cleanup Makefile targets
- make generate-olmv1-manifest: generates OLMv1 install manifest per OCPSTRAT-2268 adoption template
- make upgrade-v0-to-olmv1 Makefile target for migrating existing OLMv0 installs
- make test-upgrade-v0-to-olmv1: Ginkgo migration test with version verification
- installModes: enable SingleNamespace=true (OADP-4051)
- Migration test run from make test-e2e
Fixes #2194
Fixes #2193
Related Issues & Epics
upgrade-v0-to-olmv1 Makefile target
Progress
Fresh Install: test-olmv1 (OADP-7235)
- OLMV1_* Makefile variables (OLMV1_PACKAGE, OLMV1_NAMESPACE, OLMV1_CHANNEL, OLMV1_VERSION, OLMV1_UPGRADE_VERSION, OLMV1_CATALOG, OLMV1_CATALOG_IMAGE, OLMV1_SERVICE_ACCOUNT, OLMV1_INSTALLER_BINDING)
- make test-olmv1 target: runs Ginkgo suite against cluster
- make test-olmv1-cleanup target: deletes ClusterExtension, ClusterCatalog, SA, CRB
- ClusterExtension (waits for Installed=True, fail-fast on InvalidConfiguration/Failed)
- controller-manager pods Running
- No deprecation conditions (Deprecated, PackageDeprecated, ChannelDeprecated, BundleDeprecated all False)
- Upgrade with upgradeConstraintPolicy: SelfCertified (skipped when OLMV1_UPGRADE_VERSION unset)
- cleanupOrphanedCRDs: removes *.oadp.openshift.io / *.velero.io CRDs left by OLMv0 (OLMv1 cannot adopt CRDs it didn't create)
- ensureClusterCatalog / waitForClusterCatalogServing for custom catalog images
- Test run (OLMV1_PACKAGE=redhat-oadp-operator): PASSED using redhat-operator-index:v4.21 as ClusterCatalog (v4.22 index does not include OADP yet). Installed redhat-oadp-operator v1.5.5 from registry.redhat.io/oadp/oadp-operator-bundle, controller-manager Running, 6 CRDs created, no deprecation warnings
OLMv1 Manifest: generate-olmv1-manifest (OCPSTRAT-2268)
- make generate-olmv1-manifest: generates oadp-olmv1-manifest.yaml per official adoption template
- SA=<pkg>-installer, CRB=<pkg>-installer-binding
- OLMV1_CHANNEL / OLMV1_VERSION appended when set
- OLMV1_PIN_CATALOG: adds selector.matchLabels to pin ClusterExtension to a specific ClusterCatalog
- .gitignore (oadp-olmv1-manifest.yaml)
OLMv0 → OLMv1 Migration: upgrade-v0-to-olmv1 (#2194)
- make upgrade-v0-to-olmv1 target:
  - OLMV1_CATALOG_IMAGE=<image>
  - Subscription, OperatorGroup, CSV, CatalogSource (scans both openshift-marketplace and operator namespace for CI compatibility)
  - *.oadp.openshift.io and *.velero.io CRDs
  - olm.managed=true labeled remnants (SA, Roles, RoleBindings, Deployments, ClusterRoles, ClusterRoleBindings)
  - ClusterCatalog with verbose error diagnostics (image ref, catalogd logs, ttl.sh expiry hint)
  - ClusterExtension pinned to specific ClusterCatalog via selector.matchLabels
  - condition=Installed=True
Migration Test: test-upgrade-v0-to-olmv1 (#2193)
- Ginkgo test (olmv1_migrate_test.go) with structured assertions
- Catalog image auto-detected from the Subscription (spec.source + spec.sourceNamespace)
- CatalogSources cleaned up in openshift-marketplace and operator namespace (CI creates CatalogSource in operator namespace via optional-operators-subscribe)
- olm.managed=true remnant resources (ServiceAccounts, Roles, Deployments, ClusterRoles, etc.)
- make test-e2e: runs after e2e tests complete, migration failure does not mask e2e exit code
- JUnit report: $(ARTIFACT_DIR)/junit_olmv1_report.xml
CSV / Bundle (OADP-4051)
- SingleNamespace: true in installModes: Velero already supports single-namespace via WATCH_NAMESPACE; OwnNamespace vs SingleNamespace only differs in whether watched namespace equals install namespace
Design Notes
Why spec.config.inline.watchNamespace (not an annotation)
OADP's CSV declares OwnNamespace install mode. OLMv1 requires spec.config.inline.watchNamespace set to the install namespace; without it the install fails with InvalidConfiguration. The metadata annotation olm.operatorframework.io/watch-namespace is not read by operator-controller; only the spec field matters.
Why delete CRDs before migrating from OLMv0
OLMv1 takes ownership of CRDs it creates. CRDs already present on the cluster (owned by OLMv0 or manually created) cannot be adopted: the install will proceed, but CRD lifecycle management is broken. cleanupOrphanedCRDs handles this by deleting *.oadp.openshift.io and *.velero.io CRDs before install.
Why pin ClusterExtension to a specific ClusterCatalog
Without catalog.selector.matchLabels, OLMv1 resolves from all available ClusterCatalogs. On clusters with default catalogs (e.g., openshift-community-operators), the community OADP package (v0.5.6) may be selected instead of the custom build. Pinning via olm.operatorframework.io/metadata.name: <catalog> ensures resolution from the correct catalog.
CI/Prow compatibility
Prow's optional-operators-subscribe step creates CatalogSource in OO_INSTALL_NAMESPACE (typically openshift-adp), not openshift-marketplace. The migration test reads spec.sourceNamespace from the Subscription to find CatalogSources in any namespace. Both Makefile and Go test clean up CatalogSources from both locations.
Why enable SingleNamespace installMode (false→true)
OLMv1 does not distinguish between OwnNamespace and SingleNamespace: both mean the operator watches a single namespace. OLMv1's spec.config.inline.watchNamespace field controls which namespace, regardless of installMode type. With SingleNamespace: false, OLMv1 rejects installs where the watched namespace differs from the install namespace, even though the OADP controller handles this identically (it reads WATCH_NAMESPACE either way). Enabling SingleNamespace gives OLMv1 users the flexibility to watch a namespace other than the install namespace without any operator code changes. The operator's WATCH_NAMESPACE env var works the same in both modes.
No operator Go code changes needed
WATCH_NAMESPACE in config/manager/manager.yaml is set from fieldRef: metadata.namespace. OLMv1 propagates spec.config.inline.watchNamespace as the WATCH_NAMESPACE env var into the deployed manager, overriding the fieldRef. The controller reads WATCH_NAMESPACE identically regardless of who set it.
Platform requirement
OCP 4.20+ with OLMv1 enabled (operator-controller and catalogd running). OPRUN-4131 (GA Single/OwnNamespace support, which removes the TechPreviewNoUpgrade feature gate requirement) is Release Pending, so no feature gate is needed on 4.20+.
Usage
How to Test (Reviewer Guide)
Prerequisites: OCP 4.20+ cluster with OLMv1 enabled (operator-controller + catalogd running). Logged in via oc login.
1. Fresh OLMv1 install (no prior OADP)
2. OLMv0 → OLMv1 migration (shell-based)
3. OLMv0 → OLMv1 migration test (Ginkgo with assertions)
4. Manifest generation
5. SingleNamespace installMode
Common issues:
- ttl.sh images expire after TTL_DURATION (default 1h): rebuild if tests fail with image pull errors
- Without OLMv1 enabled, the ClusterExtension CRD will not exist
- Check that OLMV1_CATALOG_IMAGE is set or CatalogSource auto-detection found the right one
Files Changed
- Makefile: test-e2e
- tests/olmv1/olmv1_suite_test.go
- tests/olmv1/olmv1_install_test.go
- tests/olmv1/olmv1_migrate_test.go
- tests/olmv1/.gitignore
- .gitignore: oadp-olmv1-manifest.yaml
- .golangci.yaml: excludes tests/olmv1/ from linting
- bundle/manifests/oadp-operator.clusterserviceversion.yaml: SingleNamespace installMode
- config/manifests/bases/oadp-operator.clusterserviceversion.yaml: SingleNamespace installMode
Note
Responses generated with Claude
Summary by CodeRabbit
New Features
Tests