Closed
Summary
When running gh aw compile --actionlint (particularly in strict mode), the step can report "0 errors" (or display "No issues found" in its summary) but still exit with a nonzero status. This causes CI and pre-commit validations to fail despite clean workflows, and is inconsistent with direct actionlint usage. Manual reruns using direct actionlint often pass, indicating the bug is in gh-aw's integration, not the workflows themselves.
Root Cause Analysis
Most likely root cause:
- gh aw maintains its own aggregate stats for actionlint findings, displaying "No issues found" if the count of errors and warnings is zero (see ActionlintStats and displayActionlintSummary in pkg/cli/actionlint.go).
- However, orchestration logic (see compile_orchestration.go) also checks the error return from runBatchActionlint(...), which encapsulates not just linter findings, but also subprocess failures, JSON parsing errors, file access issues, or Docker invocation errors.
- Therefore, it is possible for the summary of results (based on parsed linter findings) to be clean, but for the overall process to still exit with failure if:
  - Docker execution fails,
  - actionlint emits unrecognized output,
  - JSON output is truncated, malformed, or missing,
  - file system or path handling fails for a subset of batch files,
  - other integration errors occur.
Consequence:
- Users are told their workflows pass linting ("0 errors"), but their CI/pipeline (or pre-commit hook) fails, creating confusion and churn.
Reproduction Steps
- Run: gh aw compile --actionlint (optionally with --strict or as part of pre-commit/CI)
- Observe: Output contains "No issues found" or "0 errors found" in the summary
- But: The command exits nonzero
- Run: Directly invoke actionlint on each generated .lock.yml (e.g. actionlint .github/workflows/*.lock.yml)
- Observe: Direct actionlint passes with exit code 0
Likely underlying mechanisms
- Partial batch failure: If one file in a batch has an invocation issue but the rest parse fine, the summary counts only the parsed subset, while the process still exits with failure
- Subprocess error handling: runBatchActionlint(...) may return an error on tool or parsing issues, rather than just on findings
- Docker or path quirks: Dockerized actionlint may fail if paths are not mounted/translated correctly, even though the file compiles and actionlint parses it fine when run locally
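The partial batch failure case can be sketched as follows. This is an illustration under stated assumptions, not gh-aw's implementation: the finding struct, the batchResult type, the aggregate function, and the idea of one JSON array of findings per file are all hypothetical. It shows how a truncated payload for one file leaves the finding count at zero while still recording a failure.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// finding holds a subset of fields from an assumed JSON finding format.
type finding struct {
	Message  string `json:"message"`
	Filepath string `json:"filepath"`
}

// batchResult separates what was counted from what could not be processed.
type batchResult struct {
	Findings int
	Failed   []string // files whose output could not be parsed
}

// aggregate counts findings only for files whose raw output is valid JSON
// and records the remaining files as failed.
func aggregate(outputs map[string][]byte) batchResult {
	var res batchResult
	for file, raw := range outputs {
		var parsed []finding
		if err := json.Unmarshal(raw, &parsed); err != nil {
			res.Failed = append(res.Failed, file)
			continue
		}
		res.Findings += len(parsed)
	}
	return res
}

func main() {
	outputs := map[string][]byte{
		"a.lock.yml": []byte(`[]`),                   // clean file
		"b.lock.yml": []byte(`[{"message": "trunc`), // truncated output
	}
	res := aggregate(outputs)
	fmt.Printf("findings=%d failed=%v\n", res.Findings, res.Failed)
}
```

A summary built only from Findings would report a clean run here, even though one file was never actually linted.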
Remediation Proposal
- In pkg/cli/actionlint.go and the orchestration path, make clear distinctions between:
  - Lint findings (errors/warnings)
  - Tooling/integration failures (subprocess, Docker, parsing, etc.)
- If lint findings are zero but tooling integration fails, display an explicit message:
  - e.g. "No issues found, but actionlint invocation failed. This likely indicates a tooling or integration error, not a workflow problem."
- Never display "No issues found" in the same run as a nonzero exit code unless this is a real workflow validation failure
- Consider returning a custom error type or status code to distinguish linter failures from integration/tooling failures in orchestration.
- Add regression test coverage: zero findings combined with an integration failure must emit a distinct error, so users can tell a tooling failure apart from a genuinely clean lint result.
Impact
- Prevents churn and confusion in PR check failure root-cause analysis
- Aligns CI, local validation, and direct actionlint runs
- Speeds up identifying real regressions when failures actually occur
References
- pkg/cli/actionlint.go
- pkg/cli/compile_orchestration.go
- [Suggested direct test: actionlint .github/workflows/*.lock.yml]
NOTE: This issue was prepared following CONTRIBUTING.md agentic analysis guidelines. If more reproduction detail or implementation steps are needed, please request follow-up.