@openshift-cherrypick-robot: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/4.22-e2e-test-kubevirt-aws | b739e83 | link | true | /test 4.22-e2e-test-kubevirt-aws |
Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Originally posted by @openshift-ci[bot] in #2182 (comment)
Claude analysis
OADP E2E Test Failure Analysis
Generated by Claude via Vertex AI on 2026-04-30 17:40:00 UTC
Executive Summary
- Total Tests: 50
- Failed Tests: 1
- Known Flakes: 0 (failure does not match known flake patterns)
- Critical Issues: 0 (real bugs requiring immediate attention)
- Environmental Issues: 1 (DataVolume provisioning delay)
Failed Tests Analysis
1. todolist CSI backup and restore, in a Fedora VM [ENVIRONMENTAL]
Root Cause: DataVolume provisioning timeout - VM failed to start within the 10-minute test timeout due to slow PVC cloning
Evidence:
junit_report.xml: "context deadline exceeded" (test timeout at 17:38:18, duration: 648.028s ~10.8 minutes)
CDI logs (openshift-cnv/cdi-deployment):
- 17:28:18: DataVolume "fedora-todolist-disk" created, cloning from openshift-virtualization-os-images/fedora-1217dcc8c58d scheduled
- 17:28:18: PVC "fedora-todolist-disk" in "Pending" state (not bound)
virt-handler logs (openshift-cnv/virt-handler):
- 17:38:21: VMI fedora-todolist first appeared - "VMI is in phase: Scheduled | Domain does not exist"
- 17:38:22: VMI reached Running state - "VMI is in phase: Running | Domain status: Running, reason: Unknown"
Test code (virt_backup_restore_suite_test.go:101-105):
- wait.PollUntilContextTimeout with 10-minute timeout waiting for VM to reach "Running" status
- Failure at line 105: gomega.Expect(err).ToNot(gomega.HaveOccurred())
Diagnosis:
The test created the Fedora VM at ~17:28:18, which required cloning a 30Gi DataVolume from the source fedora-1217dcc8c58d in the openshift-virtualization-os-images namespace. The test code waits 10 minutes for the VM to reach "Running" status (virt_backup_restore_suite_test.go:101-105).
The PVC fedora-todolist-disk remained in "Pending" state and wasn't bound quickly enough. The VM launcher pod could not start until the DataVolume was fully provisioned. The VM actually reached "Running" state at 17:38:22, which was 4 seconds after the test timed out at 17:38:18.
This is a timing issue where:
- DataVolume cloning took just over 10 minutes (likely due to the 30Gi size and cluster storage performance)
- The VM reached "Running" 4 seconds after the test timed out
- Test timeout (10 minutes) was insufficient for the DataVolume provisioning operation
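The timing argument above can be checked directly from the quoted timestamps. The following is a minimal sketch, not part of the test suite; the helper name `elapsed` is hypothetical, and the three clock times are copied from the CDI log, junit report, and virt-handler log excerpts in the Evidence section.

```go
package main

import (
	"fmt"
	"time"
)

// elapsed parses two same-day "HH:MM:SS" clock times and returns end - start.
// Hypothetical helper for illustration only.
func elapsed(start, end string) (time.Duration, error) {
	const layout = "15:04:05"
	s, err := time.Parse(layout, start)
	if err != nil {
		return 0, err
	}
	e, err := time.Parse(layout, end)
	if err != nil {
		return 0, err
	}
	return e.Sub(s), nil
}

func main() {
	created := "17:28:18"  // DataVolume created (CDI log)
	timedOut := "17:38:18" // test timeout (junit_report.xml)
	running := "17:38:22"  // VMI reached Running (virt-handler log)

	budget, err := elapsed(created, timedOut)
	if err != nil {
		panic(err)
	}
	actual, err := elapsed(created, running)
	if err != nil {
		panic(err)
	}

	fmt.Println("poll budget:    ", budget)        // 10m0s
	fmt.Println("time to Running:", actual)        // 10m4s
	fmt.Println("missed by:      ", actual-budget) // 4s
}
```

The 4-second margin is what makes this look environmental rather than functional: the VM did start, just outside the poll window.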
Likely Cause: Environmental - Slow cluster storage performance causing DataVolume clone operation to exceed the 10-minute test timeout. The VM successfully started immediately after the DataVolume completed, indicating no functional issue.
Recommended Actions:
- Increase timeout - The test already has a 45-minute BackupTimeout, but the VM startup poll is hardcoded to 10 minutes. Consider increasing the VM readiness timeout to 15-20 minutes for tests involving large DataVolume cloning operations (virt_backup_restore_suite_test.go:101).
- Add DataVolume readiness check - Before starting the VM startup wait, explicitly wait for the DataVolume to reach "Succeeded" phase. This would provide better error messages when DataVolume provisioning is slow.
- Investigate cluster storage performance - The 30Gi clone taking >10 minutes suggests potential storage backend slowness. Review AWS EBS performance metrics for the test cluster.
- Consider smaller test images - If feasible, use a smaller Fedora VM image for E2E tests to reduce provisioning time.
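The first two recommendations could be combined into a two-stage wait, sketched below under stated assumptions. This is not the code in virt_backup_restore_suite_test.go: `pollUntil` is a standard-library stand-in for `wait.PollUntilContextTimeout`, and `waitForVM`, `simulate`, and the `dvPhase`/`vmStatus` callbacks are hypothetical names. In the real suite the callbacks would query the cluster; here they are faked so the sketch runs on its own with millisecond timeouts.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// pollUntil calls check every interval until it reports done, returns an
// error, or the timeout elapses. Stand-in for wait.PollUntilContextTimeout.
func pollUntil(ctx context.Context, interval, timeout time.Duration,
	check func(context.Context) (bool, error)) error {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		if err := ctx.Err(); err != nil {
			return err
		}
		done, err := check(ctx)
		if err != nil {
			return err
		}
		if done {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}

// waitForVM waits for the DataVolume to report "Succeeded" and only then for
// the VM to report "Running", so a failure names the stage that was slow.
func waitForVM(ctx context.Context, interval, dvTimeout, vmTimeout time.Duration,
	dvPhase, vmStatus func(context.Context) (string, error)) error {
	err := pollUntil(ctx, interval, dvTimeout, func(ctx context.Context) (bool, error) {
		phase, err := dvPhase(ctx)
		return phase == "Succeeded", err
	})
	if err != nil {
		return fmt.Errorf("DataVolume did not reach Succeeded within %s: %w", dvTimeout, err)
	}
	err = pollUntil(ctx, interval, vmTimeout, func(ctx context.Context) (bool, error) {
		status, err := vmStatus(ctx)
		return status == "Running", err
	})
	if err != nil {
		return fmt.Errorf("VM did not reach Running within %s: %w", vmTimeout, err)
	}
	return nil
}

// simulate runs waitForVM against a fake DataVolume that reports "Succeeded"
// on poll number dvReadyAfterPolls, using scaled-down timeouts.
func simulate(dvReadyAfterPolls int) error {
	polls := 0
	dvPhase := func(context.Context) (string, error) {
		polls++
		if polls >= dvReadyAfterPolls {
			return "Succeeded", nil
		}
		return "Pending", nil
	}
	vmStatus := func(context.Context) (string, error) { return "Running", nil }
	return waitForVM(context.Background(), 5*time.Millisecond,
		50*time.Millisecond, 50*time.Millisecond, dvPhase, vmStatus)
}

func main() {
	fmt.Println("fast clone:", simulate(3))
	err := simulate(1000) // clone never finishes inside the budget
	fmt.Println("slow clone:", err)
	fmt.Println("deadline exceeded:", errors.Is(err, context.DeadlineExceeded))
}
```

Separating the two waits means a slow clone would surface as a DataVolume error rather than a generic "context deadline exceeded" on the VM poll, and each stage's timeout can be tuned independently (for example, a longer budget for cloning than for VM boot).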