🏥 CI Failure Investigation - Run #35690
Summary
Lint and integration test suites now fail because the new Docker validation checks surfaced two issues: staticcheck trips over the capitalized error strings introduced in validateDockerImage, and TestValidateContainerImages/valid_container_image hard-fails when the runner’s Docker daemon is unreachable.
Failure Details
- Run: 22019644467
- Commit: ca31b65
- Trigger: push
Root Cause Analysis
validateDockerImage now returns early when Docker is missing or the daemon is down, but the associated fmt.Errorf strings still begin with a capital D, which violates staticcheck ST1005 and makes lint-go fail. TestValidateContainerImages relies on Docker working, yet the new daemon check surfaces a failure in the GitHub runner because the daemon isn’t responsive. The tests skip only when the CLI is missing and so now report the daemon error instead of skipping.
Failed Jobs and Errors
- lint-go: staticcheck reported ST1005 on pkg/workflow/docker_validation.go:95/103 because the error strings start with a capitalized Docker.
- Integration: Workflow Actions & Containers: TestValidateContainerImages/valid_container_image now returns "Docker daemon not running - could not validate container image 'alpine:latest'" because validateDockerImage fails early when isDockerDaemonRunning() is false.
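A minimal sketch of the lowercase form that ST1005 expects, using the daemon message quoted above. The helper names daemonNotRunningError and startsLowercase are hypothetical and only illustrate the rule; the actual code in pkg/workflow/docker_validation.go may construct its errors differently.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// daemonNotRunningError is a hypothetical helper (not taken from the repo).
// It builds the daemon error with a lowercase first letter, which is the
// form staticcheck ST1005 accepts.
func daemonNotRunningError(image string) error {
	return fmt.Errorf("docker daemon not running - could not validate container image '%s'", image)
}

// startsLowercase reports whether a message is non-empty and its first rune
// is not uppercase. It mirrors only the capitalization part of ST1005.
func startsLowercase(msg string) bool {
	r, _ := utf8.DecodeRuneInString(msg)
	return r != utf8.RuneError && !unicode.IsUpper(r)
}
```

Any test that asserts on the exact message text would need the same lowercase change, since the string in the failing subtest output starts with a capital D.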
Investigation Findings
- The lint failure comes directly from the new docker validation guard; both error return paths now emit capitalized messages that staticcheck flags. Lowercasing those strings clears the lint error.
- The integration failure occurs because the runner exposes the docker CLI but the daemon isn’t responsive, so validateDockerImage reports the daemon error and the subtest fails rather than skipping.
- Running go test -tags integration ./pkg/workflow -run TestValidateContainerImages locally was blocked: the environment tried to download Go 1.25.0 (forbidden) and the local toolchain is 1.24.13, so the command could not finish.
Recommended Actions
- Lowercase the fmt.Errorf messages in pkg/workflow/docker_validation.go so staticcheck ST1005 no longer fails the lint job.
- Guard TestValidateContainerImages (or the tt.skipIfNoDocker path) with an isDockerDaemonRunning() check so tests skip when the daemon isn’t responsive instead of failing.
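The skip guard recommended above could look roughly like the sketch below. The names dockerSkipReason, daemonResponds, and skipIfDockerUnavailable are hypothetical. daemonResponds stands in for the repository's isDockerDaemonRunning(), whose implementation is not shown in this report, and it assumes `docker info` exits non-zero when the daemon is unreachable.

```go
package main

import (
	"context"
	"os/exec"
	"testing"
	"time"
)

// dockerSkipReason returns a non-empty reason when Docker-backed tests should
// be skipped. The probes are injected so the decision logic can be exercised
// on machines without Docker.
func dockerSkipReason(lookPath func(string) (string, error), daemonRunning func() bool) string {
	if _, err := lookPath("docker"); err != nil {
		return "docker CLI not found in PATH"
	}
	if !daemonRunning() {
		return "docker daemon not responsive"
	}
	return ""
}

// daemonResponds is a hypothetical stand-in for isDockerDaemonRunning().
// Assumption: `docker info` fails when it cannot reach the daemon.
func daemonResponds() bool {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	return exec.CommandContext(ctx, "docker", "info").Run() == nil
}

// skipIfDockerUnavailable skips the calling test when either the CLI is
// missing or the daemon does not answer, instead of letting it fail.
func skipIfDockerUnavailable(t *testing.T) {
	t.Helper()
	if reason := dockerSkipReason(exec.LookPath, daemonResponds); reason != "" {
		t.Skip(reason)
	}
}
```

Calling skipIfDockerUnavailable(t) at the top of each Docker-dependent subtest would treat the "CLI present, daemon down" runner state as a skip. Checking the CLI first avoids spawning a process that cannot exist.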
Prevention Strategies
Document that any integration tests hitting Docker should check both CLI availability and daemon health before asserting success, and run staticcheck locally after touching error strings to avoid uppercase violations.
AI Team Self-Improvement
- Before landing Docker validation changes, ensure error messages start lowercase to satisfy staticcheck.
- When adding integration coverage that relies on Docker, gate the tests with both exec.LookPath("docker") and isDockerDaemonRunning() so missing daemons are treated as skips.
Historical Context
No existing [CI Failure Doctor] issue referenced run #35690; this appears to be a new failure pattern introduced by the recent Docker validation perf change.
🩺 Diagnosis provided by CI Failure Doctor
To install this workflow, run
gh aw add githubnext/agentics/workflows/ci-doctor.md@ea350161ad5dcc9624cf510f134c6a9e39a6f94d. View source at https://github.com/githubnext/agentics/tree/ea350161ad5dcc9624cf510f134c6a9e39a6f94d/workflows/ci-doctor.md.
- expires on Feb 15, 2026, 3:32 PM UTC