Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 6 additions & 2 deletions ci-operator/config/openshift/os/openshift-os-c9s.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -45,9 +45,13 @@ resources:
memory: 3Gi
tests:
- as: validate
commands: ./ci/validate.sh
commands: |
#!/bin/bash
set -xeuo
./ci/validate.sh
container:
from: src
from: coreos_coreos-assembler_latest
skip_if_only_changed: ^docs/|\.md$|^(?:.*/)?(?:\.gitignore|OWNERS|PROJECT|LICENSE)$
- as: build-test-qemu
commands: /src/ci/build-test-qemu.sh
container:
Expand Down
7 changes: 5 additions & 2 deletions ci-operator/config/openshift/os/openshift-os-master.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -139,9 +139,12 @@ resources:
memory: 3Gi
tests:
- as: validate
commands: ./ci/validate.sh
commands: |
#!/bin/bash
set -xeuo
./ci/validate.sh
container:
from: src
from: coreos_coreos-assembler_latest
Copy link
Copy Markdown
Member

@cheesesashimi cheesesashimi May 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing this to coreos_coreos-assembler_latest would probably break because src points to what's defined under .images.build_root and I'm not sure if the COSA image has a ./ci/validate.sh script inside it.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this and re-reading your description of this change, we may want to change this to build-test-qemu-img instead since that image will be the result of https://github.com/openshift/os/blob/master/ci/Dockerfile, which is the COSA image and the contents of the openshift/os repo. Doing that will include both the ./ci/vaildate.sh and ./ci/set-openshift-user.sh scripts.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing this to coreos_coreos-assembler_latest would probably break because src points to what's defined under .images.build_root and I'm not sure if the COSA image has a ./ci/validate.sh script inside it.

I'm confused about this, since the pj-rehearse jobs appear to have passed. If you look at the history for the job on this PR, you can see some failures where during some iterations the scripts weren't found due to incorrect paths, etc. From what I can tell it looks like the openshift/os repo is present (magic!) and the ci/validate.sh script is able to be executed successfully.

See the latest job where I stuck some debug output as part of the job - https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_release/28790/rehearse-28790-pull-ci-openshift-os-master-validate/1529165858037829632/artifacts/test/build-log.txt

Copy link
Copy Markdown
Member

@cheesesashimi cheesesashimi May 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I'm a little confused as well! But I think I have an understanding of what's going on here and can explain the source of my confusion (and hopefully make you less confused as well!):

  • The Dockerfile for openshift/os is in ci/Dockerfile and is what becomes build-test-qemu-img. We probably should make this the .build_root part of the CI config, but that's a separate concern and not relevant right now. The resulting image is what the COSA build scripts use to run. It has the scripts and the layering test binary from openshift/os layered on top of the coreos-assembler image. This is why I thought we should use that image to run ./ci/validate.sh.
  • The src image is registry.ci.openshift.org/coreos/fcos-buildroot:testing-devel, produced by the coreos/fedora-coreos-config repo. Interestingly, the validate step is the only part of the openshift/os CI config that uses this image. None of the other image builds or tests directly consume or use this image.
  • The "magic" is that before the validate test runs, a bunch of init containers are run. Amongst those is a cloneRefs step ($ curl -s 'https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_release/28790/rehearse-28790-pull-ci-openshift-os-master-validate/1529165858037829632/artifacts/ci-operator-step-graph.json' | jq -r '.[1].manifests[0].spec.initContainers[].name') which clones openshift/os (and the fedora-coreos-config submodule).

So in a nutshell, what's happening in the validate step is we're taking registry.ci.openshift.org/coreos/coreos-assembler:latest, cloning openshift/os (and the fedora-coreos-config submodule) into it and running ./ci/validate.sh. My confusion came from forgetting about the cloneRefs step and thinking that since coreos-assembler:latest doesn't have the ./ci/validate.sh script present that it would fail.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed information, Zack! It certainly helps improve my understanding of the flow of the jobs.

Interestingly, the validate step is the only part of the openshift/os CI config that uses this image. None of the other image builds or tests directly consume or use this image.

Should we drop the use of the fcos-buildroot as part of this PR? Would we need to reconfigure build_root to point to another image?

Is the cloneRefs step logged in any of the test artifacts?

If you think the PR is good to go as is, please drop an /approve if you can.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we drop the use of the fcos-buildroot as part of this PR? Would we need to reconfigure build_root to point to another image?

We could configure it to point to ci/Dockerfile and build that. We'd then need to replace build-test-qemu-img with src. There's no rush in doing that, so if we want to do that, let's handle that as a separate PR.

Is the cloneRefs step logged in any of the test artifacts?

Unfortunately it's not. From what I can tell, the container used purposely does not create logs (although I don't know why). You might be able to catch it while the job is running if you click through to the console for the CI namespace. The only place I was really able to find it was in the job test steps artifact and even then, I had to know that it was an init container.

If you think the PR is good to go as is, please drop an /approve if you can.

Will do.

skip_if_only_changed: ^docs/|\.md$|^(?:.*/)?(?:\.gitignore|OWNERS|PROJECT|LICENSE)$
- as: validate-built-image
commands: cat /etc/os-release
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -140,9 +140,12 @@ resources:
memory: 3Gi
tests:
- as: validate
commands: ./ci/validate.sh
commands: |
#!/bin/bash
set -xeuo
./ci/validate.sh
container:
from: src
from: coreos_coreos-assembler_latest
cron: '@daily'
- as: validate-built-image
commands: cat /etc/os-release
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -100,7 +100,7 @@ presubmits:
secretName: result-aggregator
trigger: (?m)^/test( | .* )images,?($|\s.*)
- agent: kubernetes
always_run: true
always_run: false
branches:
- ^c9s$
- ^c9s-
Expand All @@ -114,6 +114,7 @@ presubmits:
pj-rehearse.openshift.io/can-be-rehearsed: "true"
name: pull-ci-openshift-os-c9s-validate
rerun_command: /test validate
skip_if_only_changed: ^docs/|\.md$|^(?:.*/)?(?:\.gitignore|OWNERS|PROJECT|LICENSE)$
spec:
containers:
- args:
Expand Down