B/R e2e: add velero restic DC workaround, describe, namespace events, wait for CSI snapshot to be ready, IsDCReady wait for builds, azure-rg #654
Welp. Looks like I opened a new can of worms here. Fixing.
Browsing the Azure container for the backup shows a resource group issue coming from #582. It looks like it is not getting the value from $AZURE_RESOURCE_FILE.
/retest
Once all tests except Azure pass, I will merge #659 into this PR and close that one out to verify. I also want to see whether the Azure resource group errors (in the velero backup logs) now show up properly. The Velero deployment label change caused failure logs to not show up in the e2e logs.
The resource group error (#658) now shows up in the e2e log: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_oadp-operator/654/pull-ci-openshift-oadp-operator-master-4.10-operator-e2e-azure/1520908535485960192#1:build-log.txt%3A400
This is useful for debugging pods not starting up or stuck. Example event that would come up before openshift#650 is merged
```
Event: Error: couldn't find key access_key in Secret openshift-adp/oadp-ts-example-velero-1-aws-registry-secret, Src: kubelet, Reason: Failed
```
* fix awk for azure resource group
* newline proof awk for resourcegroup
@kaovilai: all tests passed!


Adds several enhancements to the B/R test suite.
-rg resource group #658
Restic DC restore workaround
Allows us to add data verification in the future. Currently we only verify that the app is running and responsive.
To add it, we would need a step such as PreBackupDataEntry (or similar) after PreBackupVerification, which checks that the app is running.
Namespace Events
This is useful for debugging pods not starting up or stuck.
An example is the event that could come up before #650 is merged if the secret contains a carriage return, which would be hard to dig out of the artifacts.
B/R describe
Helps diagnose B/R failures when there is nothing in the restore logs. This may be removed after velero-io/velero#4743.
Wait for CSI snapshot to be ready (uncovered by the Namespace Events enhancement)
On restore, it is possible that the snapshot is not yet ready to be used as a DataSource for the PVC, as shown at #654 (comment).
We should wait for the CSI snapshot to be ready before uninstalling the application.
Example output
Update IsDCReady to wait for BuildConfig.
We should wait for all builds to complete before considering the DC ready for the backup, disaster simulation, and restore steps.
Fix #646
Example output
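A rough sketch of the readiness rule described in this section: the DC only counts as ready once every build has completed and its replicas are available. The types and the `isDCReady` helper below are hypothetical and self-contained; the phase strings are modelled on OpenShift build phases, and the real IsDCReady in the suite may check different fields.

```go
package main

import "fmt"

// BuildPhase is a local stand-in for a build's phase.
type BuildPhase string

const (
	BuildRunning  BuildPhase = "Running"
	BuildComplete BuildPhase = "Complete"
	BuildFailed   BuildPhase = "Failed"
)

// DCStatus is a hypothetical, trimmed-down view of a DeploymentConfig's status.
type DCStatus struct {
	Replicas          int32
	AvailableReplicas int32
}

// isDCReady returns true only when all builds are complete and every replica
// is available. The second value explains why the DC is not ready, so a
// polling caller can log it and try again.
func isDCReady(dc DCStatus, builds []BuildPhase) (bool, string) {
	for i, phase := range builds {
		if phase != BuildComplete {
			return false, fmt.Sprintf("build %d is in phase %s", i, phase)
		}
	}
	if dc.AvailableReplicas != dc.Replicas {
		return false, fmt.Sprintf("%d/%d replicas available", dc.AvailableReplicas, dc.Replicas)
	}
	return true, ""
}

func main() {
	dc := DCStatus{Replicas: 1, AvailableReplicas: 1}
	ready, reason := isDCReady(dc, []BuildPhase{BuildComplete, BuildRunning})
	fmt.Println(ready, reason)
}
```

Checking builds first means a DC whose pods are up but whose image is still being rebuilt is not treated as ready.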
Failure logs from backup/restore are now fetched via downloadrequest.
This prevents pod errors unrelated to the backup or restore from showing up.