feat(bake-oci-manifests): composite action for OCI manifests bake #5
Eight microservices (auth-internal, auth-jwt, campaigns, db-maintenance,
db-migrator, mcp, messaging, mta-renderer) each ship a ~200-line
.github/workflows/release.yml that bakes their kustomization/ tree into an
OCI artifact at ${ECR_REGISTRY}/<service>-manifests:<tag>. The bodies are
near-identical — every release.yml does:
1. resolve+validate the bare-semver tag
2. checkout at the tag
3. mint Teleport workload-identity JWT
4. assume the teleport-image-push IAM role
5. ECR login
6. verify per-service images exist for the tag (else refuse)
7. ensure the manifests ECR repo exists with IMMUTABLE tags
8. install flux CLI
9. sed -i $APP_PACKAGE_VERSION_TO_BE_REPLACED -> $tag (and guard the sed)
10. flux push artifact + cosign keyless sign + verify pullable
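Step 9 can be sketched as follows. This is a minimal illustration, not the action's actual code: the function name bake_substitute is hypothetical, the placeholder is assumed to appear literally as ${APP_PACKAGE_VERSION_TO_BE_REPLACED} in YAML files, the two guards shown are one possible reading of "guard the sed", and GNU sed, grep, and xargs are assumed (as on an ubuntu-latest runner).

```shell
#!/usr/bin/env bash
# Hypothetical helper: substitute the version placeholder under a directory
# and refuse to continue if the substitution was a no-op or incomplete.
bake_substitute() {
  local dir="$1" tag="$2"
  local placeholder='${APP_PACKAGE_VERSION_TO_BE_REPLACED}'

  # Guard 1: a tree with no placeholder at all is probably a mistake.
  if ! grep -rqF --include='*.yaml' --include='*.yml' "$placeholder" "$dir"; then
    echo "::error::no placeholder found under $dir" >&2
    return 1
  fi

  # Replace every occurrence in YAML files (GNU sed in-place edit).
  # "\$" is a literal dollar sign; braces are literal in basic regex.
  find "$dir" -type f \( -name '*.yaml' -o -name '*.yml' \) -print0 |
    xargs -0 -r sed -i 's|\${APP_PACKAGE_VERSION_TO_BE_REPLACED}|'"$tag"'|g'

  # Guard 2: fail if any placeholder survived the substitution.
  if grep -rqF --include='*.yaml' --include='*.yml' "$placeholder" "$dir"; then
    echo "::error::placeholder still present after substitution" >&2
    return 1
  fi
}
```

The tag is interpolated into the sed expression unescaped, which is only safe because step 1 has already restricted it to digits and dots.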
The per-service variation is just:
- service name (kebab-case) -> derives TELEPORT_TOKEN, IMAGE_MANIFESTS
- which image basenames to verify exist before baking
This composite action parameterizes on those two values and absorbs the
rest. A migrated caller release.yml drops from ~200 lines to ~17. See
README §"Bake OCI manifests artifact" for the full usage shape.
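How the two inputs might be normalized can be sketched in shell. This is an assumption-laden illustration: the function names derive_manifests_repo and split_images are hypothetical, the repository name follows the <service>-manifests pattern quoted above, and the derivation of TELEPORT_TOKEN is left out because the description does not spell it out.

```shell
#!/usr/bin/env bash
# Hypothetical helper: validate a kebab-case service name and derive the
# manifests repository name ("<service>-manifests").
derive_manifests_repo() {
  local service="$1"
  local kebab='^[a-z0-9]+(-[a-z0-9]+)*$'
  if ! [[ "$service" =~ $kebab ]]; then
    echo "::error::service '$service' is not kebab-case" >&2
    return 1
  fi
  printf '%s-manifests\n' "$service"
}

# Hypothetical helper: split a comma-separated image list into one
# basename per line, dropping whitespace and empty entries.
split_images() {
  local raw="$1" item
  local -a items
  IFS=',' read -ra items <<< "$raw"
  for item in "${items[@]}"; do
    item="${item//[[:space:]]/}"
    if [ -n "$item" ]; then
      printf '%s\n' "$item"
    fi
  done
}
```

A caller could then loop over the output of split_images to check each image before baking.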
Outputs:
- version: resolved semver tag the artifact was baked at
- digest: sha256:... digest of the pushed artifact
Inputs documented in the action.yml; defaults match the production
omega/omicron environment (eu-central-1, account 515260921971,
teleport.maestra.io:443).
No caller migration in this PR — first caller (db-migrator) will land
as a separate change so a rollback is a single revert.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Walkthrough
This PR introduces a new composite GitHub Action for baking Kubernetes manifests into versioned OCI artifacts. The action orchestrates version resolution, Teleport-based authentication to AWS/ECR, Flux-powered artifact creation, Cosign signing, and verification, and ships with comprehensive README documentation.
Changes
OCI Manifests Baking Action
Action Documentation
Sequence Diagram
sequenceDiagram
participant Workflow as GitHub Workflow
participant Action as Bake Action
participant Teleport
participant AWSSTS as AWS STS
participant ECR
participant Flux
participant Cosign
Workflow->>Action: Trigger with service, images
Action->>Action: Resolve version
Action->>Teleport: Setup & query version
Teleport->>Teleport: Generate workload JWT
Action->>AWSSTS: Assume role with JWT
AWSSTS->>Action: AWS credentials
Action->>ECR: Login & validate images
Action->>ECR: Create manifests repo
Action->>Flux: Install Flux CLI
Action->>Action: Substitute placeholders
Action->>Flux: Push OCI artifact
Flux->>ECR: Upload manifest artifact
Action->>Cosign: Sign artifact by digest
Cosign->>ECR: Upload signature
Action->>Flux: Pull artifact back
Action->>Action: Verify no placeholders
Action->>Workflow: Return version + digest
Estimated Code Review Effort: 3 (Moderate) | ~22 minutes
Pre-merge checks: 5 passed
Actionable comments posted: 3
🧹 Nitpick comments (4)
bake-oci-manifests/action.yml (4)
59-62: Semver validation is permissive and may accept non-standard tags. (Low value)
The shell glob pattern [0-9]*.[0-9]*.[0-9]* will accept tags like 1.2.3-rc1, 1.2.3.4, or 1.2.3abc. If strict semver (X.Y.Z only) is required, consider a stricter check:
if ! [[ "$tag" =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
  echo "::error::tag '$tag' is not bare semver (e.g. 1.0.648)"
  exit 1
fi
If prerelease suffixes are intentionally allowed, the current pattern is acceptable.
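The difference between the two checks can be seen side by side in a small sketch. The function names matches_loose_glob and is_bare_semver are hypothetical; the glob is the one quoted in the finding and the regex is the one from the suggested check.

```shell
#!/usr/bin/env bash
# Loose check: shell glob as quoted in the finding. Each "*" matches any
# run of characters, so suffixes and extra components slip through.
matches_loose_glob() {
  case "$1" in
    [0-9]*.[0-9]*.[0-9]*) return 0 ;;
    *) return 1 ;;
  esac
}

# Strict check: exactly three dot-separated runs of digits, nothing else.
is_bare_semver() {
  local re='^[0-9]+\.[0-9]+\.[0-9]+$'
  [[ "$1" =~ $re ]]
}
```

Both accept 1.0.648, but only the glob accepts 1.2.3-rc1, 1.2.3.4, and 1.2.3abc.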
116-118: Hardcoded registry ID inconsistent with configurable ecr-registry input. (Low value)
The registries value is hardcoded to "515260921971", but ecr-registry is configurable. If a caller overrides ecr-registry to a different account, ECR login will still authenticate to the wrong account.
Consider extracting the account ID from the input or adding a separate aws-account-id input:
+  aws-account-id:
+    description: 'AWS account ID for ECR login.'
+    required: false
+    default: '515260921971'
   - uses: aws-actions/amazon-ecr-login@v2
     with:
-      registries: "515260921971"
+      registries: ${{ inputs.aws-account-id }}
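The first option, extracting the account ID from the input, could look roughly like this. It is a sketch under the assumption that ecr-registry holds a standard private ECR hostname of the form <account>.dkr.ecr.<region>.amazonaws.com; the function name ecr_account_id is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pull the 12-digit AWS account ID out of a private
# ECR registry hostname, failing on anything that does not look like one.
ecr_account_id() {
  local registry="$1"
  local re='^([0-9]{12})\.dkr\.ecr\.[a-z0-9-]+\.amazonaws\.com$'
  if ! [[ "$registry" =~ $re ]]; then
    echo "::error::unexpected ECR registry host '$registry'" >&2
    return 1
  fi
  printf '%s\n' "${BASH_REMATCH[1]}"
}
```

A preceding step could write the result to a step output and pass it to the registries field of the login step, which keeps a single source of truth for the account.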
164-167: Consider verifying Flux CLI download integrity. (Poor tradeoff)
The Flux binary is downloaded and extracted without checksum verification. For supply chain security, consider verifying the SHA256 checksum:
curl -sLO "https://github.com/fluxcd/flux2/releases/download/v${FLUX_VERSION}/flux_${FLUX_VERSION}_checksums.txt"
curl -sLO "https://github.com/fluxcd/flux2/releases/download/v${FLUX_VERSION}/flux_${FLUX_VERSION}_linux_amd64.tar.gz"
sha256sum --check --ignore-missing flux_${FLUX_VERSION}_checksums.txt
sudo tar -xzf flux_${FLUX_VERSION}_linux_amd64.tar.gz -C /usr/local/bin flux
Alternatively, use the official Flux setup action (fluxcd/flux2/action) if it fits your workflow.
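The checksum gate on its own can be sketched without any network access. The function name verify_download is hypothetical; it assumes GNU coreutils sha256sum and a checksums file in the usual "<hash>  <filename>" layout, and it fails closed when the artifact is not listed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify one downloaded file against a checksums file
# that lives in the same directory. Returns non-zero on mismatch or when
# the file is not listed at all.
verify_download() {
  local dir="$1" checksums="$2" artifact="$3"
  (
    cd "$dir" || exit 1
    # Select only the line for this artifact (text or binary-mode entry),
    # then let sha256sum check it from standard input.
    awk -v f="$artifact" '$2 == f || $2 == ("*" f) { print }' "$checksums" |
      sha256sum --check --status
  )
}
```

Extraction would then run only after verify_download succeeds, for example by chaining the tar command with &&.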
209-214: COSIGN_EXPERIMENTAL is unnecessary in modern Cosign versions. (Low value)
The COSIGN_EXPERIMENTAL environment variable is no longer required for keyless signing as of Cosign v2.0, where keyless signing is the default behavior. While setting this variable is harmless, it can be removed for cleaner configuration.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@bake-oci-manifests/action.yml`:
- Around line 107-108: The step masks tbot failures by appending "|| true" which
hides authentication errors; remove the "|| true" so the job fails when tbot
exits non-zero (targeting the invocation "tbot start -c /tmp/tbot.yaml"), and
strengthen validation of the produced file "/tmp/tbot-output/jwt_svid" by
checking its contents (not just existence) — e.g., assert the file is non-empty
and follows JWT structure (three dot-separated base64 segments) or is parseable;
if "|| true" must remain, add a comment explaining the reason and still add the
stronger content checks after "test -f /tmp/tbot-output/jwt_svid".
In `@README.md`:
- Line 125: The on.push.tags entry uses GitHub Actions filter-pattern syntax,
not regex. In that syntax "+" matches one or more of the preceding character
and "." is a literal dot, so "[0-9]+.[0-9]+.[0-9]+" already matches bare semver
tags such as 1.2.3; leave the on.push.tags array (symbol: on.push.tags)
unchanged. Do not add a job-level guard of the form if: ${{ github.ref_name =~
'^[0-9]+\.[0-9]+\.[0-9]+$' }}, because GitHub Actions expressions have no =~
operator; strict X.Y.Z validation belongs in a shell step.
- Line 141: Replace the mutable action ref "uses:
maestra-io/github-actions/bake-oci-manifests@main" with a full-length commit SHA
for the maestra-io/github-actions/bake-oci-manifests action; locate the line
containing uses: maestra-io/github-actions/bake-oci-manifests@main in the README
and update it to uses:
maestra-io/github-actions/bake-oci-manifests@<full-commit-sha> (you can obtain
the SHA from the upstream repo commit history) and consider configuring
Dependabot/Renovate to keep that SHA updated automatically.
---
Nitpick comments:
In `@bake-oci-manifests/action.yml`:
- Around line 59-62: The semver check using the case on variable "tag" is too
permissive (matches things like 1.2.3-rc1 or 1.2.3abc); replace the case-based
glob with a strict regex test against "tag" that requires exactly three numeric
dot-separated components (major.minor.patch) and fail with the same error
message if it does not match; update the validation logic around the existing
"tag" handling (replace the case "$tag" in ... esac block) to use the regex
check so only bare semver X.Y.Z passes.
- Around line 116-118: The workflow currently hardcodes the registries value for
the aws-actions/amazon-ecr-login@v2 step to "515260921971", which ignores the
configurable ecr-registry input; update the step to derive the registry/account
ID from the action input (ecr-registry) instead of the hardcoded literal (or add
a new aws-account-id input and use that) so the registries field dynamically
uses the provided value (reference the existing input name ecr-registry and the
step using aws-actions/amazon-ecr-login@v2 and its registries property).
- Around line 164-167: The current run step downloads and extracts the Flux
binary without verifying integrity; update the run block that uses FLUX_VERSION
to first download the matching flux_${FLUX_VERSION}_checksums.txt and the
tarball, run sha256sum --check --ignore-missing against
flux_${FLUX_VERSION}_checksums.txt and abort on failure, and only then extract
flux_${FLUX_VERSION}_linux_amd64.tar.gz to /usr/local/bin (or switch to the
official fluxcd/flux2/action setup action); ensure the commands reference the exact
filenames flux_${FLUX_VERSION}_checksums.txt and
flux_${FLUX_VERSION}_linux_amd64.tar.gz and that the tar extraction is
conditional on a successful checksum check.
- Around line 209-214: Remove the now-unnecessary COSIGN_EXPERIMENTAL env var
from the env block in the action (it’s set alongside ECR_REGISTRY,
IMAGE_MANIFESTS, and DIGEST) and leave the cosign invocation (the run: cosign
sign --yes "${ECR_REGISTRY}/${IMAGE_MANIFESTS}@${DIGEST}") unchanged;
specifically delete the COSIGN_EXPERIMENTAL: "true" line so only ECR_REGISTRY,
IMAGE_MANIFESTS, and DIGEST remain exported for the cosign sign step.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b2d5dfce-ba83-4957-b33f-795eb4ecf3d6
📒 Files selected for processing (2)
README.md
bake-oci-manifests/action.yml
bake-oci-manifests/action.yml, lines 107-108:
tbot start -c /tmp/tbot.yaml || true
test -f /tmp/tbot-output/jwt_svid
Masking tbot exit code may hide authentication failures.
The || true causes the step to succeed even if tbot fails (network issues, invalid token, Teleport misconfiguration). The subsequent test -f only verifies the file exists, not that it contains a valid JWT.
Consider validating the JWT was actually produced:
- tbot start -c /tmp/tbot.yaml || true
- test -f /tmp/tbot-output/jwt_svid
+ tbot start -c /tmp/tbot.yaml
+ if [[ ! -s /tmp/tbot-output/jwt_svid ]]; then
+ echo "::error::tbot did not produce a valid JWT"
+ exit 1
+ fi
If || true is intentional due to known tbot exit code quirks, add a comment explaining why and strengthen the file validation.
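A stronger content check than test -s could look like the sketch below. The function name looks_like_jwt is hypothetical, and the check is structural only (three non-empty base64url segments separated by dots); it does not verify the signature or any claims.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the file is non-empty and its
# content has the compact JWT shape header.payload.signature.
looks_like_jwt() {
  local file="$1" token
  local re='^[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+$'
  [ -s "$file" ] || return 1
  # Strip a trailing newline or carriage return before matching.
  token="$(tr -d '\r\n' < "$file")"
  [[ "$token" =~ $re ]]
}
```

In the step this would replace the bare existence test, for example: looks_like_jwt /tmp/tbot-output/jwt_svid || { echo "::error::tbot did not produce a valid JWT"; exit 1; }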
README.md, line 125:
name: release
on:
  push:
    tags: ["[0-9]+.[0-9]+.[0-9]+"]
🧩 Analysis chain
🌐 Web query:
Does GitHub Actions on.push.tags use glob patterns or regular expressions, and would the pattern "[0-9]+.[0-9]+.[0-9]+" match a tag like "1.2.3"?
💡 Result:
GitHub Actions on.push.tags uses GitHub's glob-style filter patterns, not regular expressions [1][2]. In that syntax * matches zero or more characters other than /, [] matches one character from a set or range, ? matches zero or one of the preceding character, and + matches one or more of the preceding character; . has no special meaning and matches a literal dot [1]. The pattern "[0-9]+.[0-9]+.[0-9]+" therefore matches tags made of three dot-separated runs of digits, including "1.2.3" [1]. A looser glob such as *.*.* would also match "1.2.3", along with any other tag that contains two dots [1]. Workflow expressions do not provide a regex match operator, so stricter validation of the tag name has to happen inside a step, for example with a shell regex test.
Citations:
- 1: https://docs.github.com/actions/using-workflows/workflow-syntax-for-github-actions
- 2: https://stackoverflow.com/questions/68078004/why-doesnt-my-github-action-trigger-on-a-regex-tag
- 3: https://stackoverflow.com/questions/61891328/trigger-github-action-only-on-new-tags
- 4: https://stackoverflow.com/questions/66458906/how-to-conditionally-run-github-workflow-action-only-for-tags-made-on-specific-b
Tag trigger pattern: regex-like syntax that is still a valid filter pattern.
At Line 125, on.push.tags takes GitHub filter patterns, not regular expressions. In that syntax [0-9] matches a single digit, + means one or more of the preceding character, and . is a literal dot, so "[0-9]+.[0-9]+.[0-9]+" matches semver tags like 1.2.3 and the release workflow triggers on them as intended. No change to the pattern is needed; replacing it with "*.*.*" would only loosen the trigger to any tag that contains two dots.
For stricter validation that tags follow the X.Y.Z format, check github.ref_name inside a step with a shell regex test rather than a job-level if:, because GitHub Actions expressions have no =~ operator.
README.md, line 141:
  bake:
    runs-on: ubuntu-latest
    steps:
      - uses: maestra-io/github-actions/bake-oci-manifests@main
🧩 Analysis chain
🌐 Web query:
What does GitHub recommend for pinning third-party GitHub Actions in workflows: branch refs, tags, or full-length commit SHAs?
💡 Result:
GitHub explicitly recommends pinning third-party GitHub Actions to a full-length commit SHA to ensure supply chain security [1][2]. While using tags (e.g., @v1) or branch references (e.g., @main) is more convenient, they are mutable and can be moved, deleted, or hijacked by malicious actors to point to compromised code [3][1][4]. Pinning to a full-length commit SHA provides an immutable reference, ensuring that the exact code you have audited and trusted is executed every time, regardless of any subsequent changes to the action's repository [1][5]. To maintain security while using SHA pinning, GitHub and security best practices suggest the following: 1. Use full-length commit SHAs: Always use the complete SHA, not a shortened version, to ensure immutability [6][1][2]. 2. Include version comments: When pinning to a SHA, it is common practice to add a comment in your workflow file indicating the original tag or version the SHA corresponds to (e.g., uses: actions/checkout@abc123... # v4.1.1) [7][8][4]. This makes it easier to track which version you are using and simplifies future updates. 3. Automate updates: Because SHA pinning prevents automatic updates, use tools like Dependabot or Renovate to automatically monitor for new versions and propose pull requests to update your pinned SHAs [7][4][5]. 4. Enforce policies: Administrators can configure repository or organization-level policies to require that all actions be pinned to a full-length commit SHA, causing workflows to fail if they use unpinned actions [1][2]. If you choose to use tags instead of SHAs, GitHub advises doing so only when you fully trust the action's creator, noting that even trusted authors can have their accounts or repositories compromised [1].
Citations:
- 1: https://docs.github.com/en/actions/reference/security/secure-use
- 2: https://github.blog/changelog/2025-08-15-github-actions-policy-now-supports-blocking-and-sha-pinning-actions/
- 3: https://dev.to/kanywst/hacking-github-from-tag-rewrites-to-dangling-commits-where-the-git-protocol-trusts-you-without-2o4h
- 4: Pin GitHub Actions to commit SHAs in release workflow github/gh-aw#10021
- 5: https://www.stepsecurity.io/blog/pinning-github-actions-for-enhanced-security-a-complete-guide
- 6: https://stackoverflow.com/questions/78903499/how-do-i-pin-an-action-to-a-specific-sha
- 7: Pin all GitHub Actions to immutable SHA commits for supply chain security github/gh-aw#5691
- 8: https://github.com/mheap/pin-github-action
Pin the action reference to a full-length commit SHA instead of @main.
At Line 141, using a mutable ref like @main (or even tags like @v1) reduces reproducibility and weakens supply-chain security. GitHub explicitly recommends pinning to a full-length commit SHA to ensure the exact audited code is executed and prevent hijacking of mutable references.
Suggested doc fix
- - uses: maestra-io/github-actions/bake-oci-manifests@main
+ - uses: maestra-io/github-actions/bake-oci-manifests@abc123def456... # v1
Consider using Dependabot or Renovate to automate SHA updates as new versions are released.
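One way to keep mutable refs from creeping back in is a small lint over workflow files. This is a sketch only: the function name find_unpinned_actions is hypothetical, it treats any uses: ref that is not followed by a 40-character lowercase hex SHA as unpinned, and it ignores local (./path) actions because they carry no @ref.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every "uses:" line whose ref is not a
# full-length commit SHA; return 1 if at least one such line exists.
find_unpinned_actions() {
  local file="$1" unpinned
  unpinned="$(
    grep -E '^[[:space:]]*(-[[:space:]]*)?uses:[[:space:]]*[^[:space:]]+@' "$file" |
      grep -Ev '@[0-9a-f]{40}([[:space:]]|$)' || true
  )"
  if [ -n "$unpinned" ]; then
    printf '%s\n' "$unpinned"
    return 1
  fi
}
```

Run against the README snippet or a caller's release.yml, it would flag the @main reference and pass once the ref is a full SHA with an optional trailing version comment.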
Summary
Adds a bake-oci-manifests composite action that absorbs the ~200-line release.yml workflow currently duplicated across 8 source repos (auth-internal, auth-jwt, campaigns, db-maintenance, db-migrator, mcp, messaging, mta-renderer). A migrated caller drops from ~200 lines to ~17.
Per-service variation reduces to two inputs:
- service: kebab-case service name (derives TELEPORT_TOKEN and IMAGE_MANIFESTS).
- images: comma-separated list of image basenames to verify exist in ECR before baking.
Static defaults (region, ECR registry, Teleport host, IAM role) live in the action and can be overridden by inputs.
Caller shape (post-migration)
Migration plan
This PR adds the action only. Caller migrations land as 8 follow-up PRs (one per source repo), starting with db-migrator as the canary so a rollback is a single revert.
Test plan
keyless OIDC; verify with cosign verify --certificate-identity-regexp ...).
🤖 Generated with Claude Code
Changes Overview
This PR adds a new composite GitHub Action, bake-oci-manifests, that consolidates duplicated release workflow logic currently spread across eight repositories (auth-internal, auth-jwt, campaigns, db-maintenance, db-migrator, mcp, messaging, mta-renderer). The changes include:
Files Changed:
- README.md: added documentation section describing the action, its usage, inputs, outputs, and prerequisites (+57/-1)
- bake-oci-manifests/action.yml: new composite action implementing the OCI manifest baking workflow (+235/-0)
Key Features
The composite action:
${APP_PACKAGE_VERSION_TO_BE_REPLACED} placeholders in kustomization/ YAML files
Inputs:
- service (required): kebab-case service name
- images (required): comma-separated image basenames to verify in ECR
- flux-version, aws-region, ecr-registry, teleport-fqdn, aws-role-arn: configurable with production defaults
Outputs:
- version: resolved semver tag
- digest: SHA256 of pushed artifact
Breaking Changes
None. This is a purely additive feature that centralizes existing workflow patterns without affecting current repositories until migration PRs are applied.
Migration Path
The PR adds only the action itself. Eight follow-up PRs will migrate individual repositories, starting with db-migrator as a canary to enable single-revert rollback if issues arise.