From 46606925c2924f552ee063d20b488158443cde03 Mon Sep 17 00:00:00 2001
From: Alejandro Gullon
Date: Fri, 24 Apr 2026 12:50:25 +0200
Subject: [PATCH] NO-JIRA: fix flaky observability and telemetry test teardown

Add retry logic to Loki queries in observability tests to handle
the race condition between OTEL collector restart and Loki data
ingestion.

Add healthcheck to telemetry suite teardown to prevent vg-manager
CrashLoopBackOff from accumulating across rapid MicroShift restarts,
which caused intermittent storage test failures in shared scenarios.

Co-Authored-By: Claude Opus 4.6
pre-commit.check-secrets: ENABLED
---
 test/suites/optional/observability.robot | 6 ++++--
 test/suites/telemetry/telemetry.robot    | 2 ++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/test/suites/optional/observability.robot b/test/suites/optional/observability.robot
index ec9ffbcd48..b60f002891 100644
--- a/test/suites/optional/observability.robot
+++ b/test/suites/optional/observability.robot
@@ -46,12 +46,14 @@ Kube Metrics Are Exported

 Journald Logs Are Exported
     [Documentation]    The opentelemetry-collector should be able to export journald logs.
-    Check Loki Query    ${LOKI_HOST}    ${LOKI_PORT}    {service_name="journald"}
+    Wait Until Keyword Succeeds    10x    5s
+    ...    Check Loki Query    ${LOKI_HOST}    ${LOKI_PORT}    {service_name="journald"}

 Kube Events Logs Are Exported
     [Documentation]    The opentelemetry-collector should be able to export Kubernetes events.
-    Check Loki Query    ${LOKI_HOST}    ${LOKI_PORT}    {service_name="kube_events"}
+    Wait Until Keyword Succeeds    10x    5s
+    ...    Check Loki Query    ${LOKI_HOST}    ${LOKI_PORT}    {service_name="kube_events"}

 Logs Should Not Contain Receiver Errors
     [Documentation]    Internal receiver errors are not treated as fatal. Typically these are due to a misconfiguration
diff --git a/test/suites/telemetry/telemetry.robot b/test/suites/telemetry/telemetry.robot
index 0d9be3f0f1..e4ab2be172 100644
--- a/test/suites/telemetry/telemetry.robot
+++ b/test/suites/telemetry/telemetry.robot
@@ -6,6 +6,7 @@ Resource            ../../resources/microshift-host.resource
 Resource            ../../resources/microshift-config.resource
 Resource            ../../resources/microshift-process.resource
 Resource            ../../resources/observability.resource
+Resource            ../../resources/ostree-health.resource
 Library             ../../resources/journalctl.py
 Library             ../../resources/prometheus.py
 Library             ../../resources/ProxyLibrary.py
@@ -110,6 +111,7 @@ Setup

 Teardown
     [Documentation]    Test suite teardown
+    Wait For MicroShift Healthcheck Success
     Logout MicroShift Host
     Remove Kubeconfig