Skip to content

COS-2902: Add C10s variant#1498

Merged
openshift-merge-bot[bot] merged 14 commits intoopenshift:masterfrom
travier:c10s
Apr 5, 2025
Merged

COS-2902: Add C10s variant#1498
openshift-merge-bot[bot] merged 14 commits intoopenshift:masterfrom
travier:c10s

Conversation

@travier
Copy link
Copy Markdown
Member

@travier travier commented Apr 30, 2024

c9s: Point image-c9s to image-rhel-9.4


c9s: Use RPMs from 9.4 RHAOS repo


Revert "c9s.repo: temporarily use mirrored repos"

This reverts commit 88e41a0.


DoNotMerge: CI changes to test C10S builds


manifests: Add initial c10s based variant

See: #1466


kola-denylist.yaml: Update for c10s variant


c10s.repo: Skip GPG checks for now


Corresponding Fedora CoreOS changes: coreos/fedora-coreos-config#3015

@travier
Copy link
Copy Markdown
Member Author

travier commented Apr 30, 2024

Do not merge as it includes changes that we do not want but that are here to test in CI (see commits).
We'll have to update the Prow CI config once ready.
/hold

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 30, 2024
@openshift-ci openshift-ci Bot requested review from cverna and jlebon April 30, 2024 16:11
@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 30, 2024
@travier
Copy link
Copy Markdown
Member Author

travier commented May 2, 2024

/retest

2 similar comments
@travier
Copy link
Copy Markdown
Member Author

travier commented May 3, 2024

/retest

@travier
Copy link
Copy Markdown
Member Author

travier commented May 6, 2024

/retest

@jlebon
Copy link
Copy Markdown
Member

jlebon commented May 6, 2024

Hmm, almost seems like the logs are truncated... This is probably fallout from coreos/coreos-assembler#3785.

jlebon added a commit to jlebon/coreos-assembler that referenced this pull request May 6, 2024
I can't reproduce this locally, but I have a suspicion that `tail` can
exit too quickly in some circumstances, causing truncated output:

openshift/os#1498 (comment)
coreos#3785 (comment)

Rather than having an unconditional `sleep`, let's make it easier to
test that theory by having an env var we can use to make it optional.
Then we'll test that in CI.

Mid-term, I'd like to revert 79b15c8 soon so we can go back to
virtio-serial which is just so much cleaner.
@jlebon
Copy link
Copy Markdown
Member

jlebon commented May 6, 2024

Let's see if coreos/coreos-assembler#3792 helps (then here, we'd export TAIL_SLEEP=3 or something).

jlebon added a commit to coreos/coreos-assembler that referenced this pull request May 6, 2024
I can't reproduce this locally, but I have a suspicion that `tail` can
exit too quickly in some circumstances, causing truncated output:

openshift/os#1498 (comment)
#3785 (comment)

Rather than having an unconditional `sleep`, let's make it easier to
test that theory by having an env var we can use to make it optional.
Then we'll test that in CI.

Mid-term, I'd like to revert 79b15c8 soon so we can go back to
virtio-serial which is just so much cleaner.
@jlebon
Copy link
Copy Markdown
Member

jlebon commented May 7, 2024

/retest

@jlebon
Copy link
Copy Markdown
Member

jlebon commented May 7, 2024

Ahh and indeed now we clearly see the error for all of them which means we were missing output. Yuck OK, so we need to fix the tail handling. Though I'll see if I can bang on the virtio-serial bug again.

@travier
Copy link
Copy Markdown
Member Author

travier commented May 7, 2024

Ahh and indeed now we clearly see the error for all of them which means we were missing output. Yuck OK, so we need to fix the tail handling. Though I'll see if I can bang on the virtio-serial bug again.

Thanks! I indeed had a missing change in those commits.

@travier
Copy link
Copy Markdown
Member Author

travier commented May 7, 2024

Hum, the workaround is not that ugly and only impacts CI here so maybe we should merge it for now until we've fixed this COSA.

@travier
Copy link
Copy Markdown
Member Author

travier commented May 7, 2024

CI fixes in openshift/release#51750

@jlebon
Copy link
Copy Markdown
Member

jlebon commented May 7, 2024

and only impacts CI here

So far. :) It's racy so I don't see why it couldn't happen in the prod pipeline. I'd hate for someone to be debugging a failure there and working with incomplete output thinking the error is happening somewhere other than it really is.

I guess though a major hack is to just add a sleep 1 at least on the cosa side.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 10, 2024
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 13, 2024
@travier
Copy link
Copy Markdown
Member Author

travier commented May 13, 2024

Failing on missing the teamd package.

@travier
Copy link
Copy Markdown
Member Author

travier commented May 13, 2024

Comment thread extensions-okd-c9s.yaml
@Prashanth684
Copy link
Copy Markdown
Contributor

will there be a way to switch between building c9s and c10s, or will we have a follow up moving to c10s (in packages-openshift.yaml and other places) after this PR?

@dustymabe
Copy link
Copy Markdown
Member

I'm looking into the failures here.

jlebon and others added 9 commits April 4, 2025 20:02
When building the node image, we want it to be able to use the repos
defined in the git repo itself. For the local developer case, this is
implicitly done by `cosa podman-build`:

https://github.com/coreos/coreos-assembler/blob/325ca2be9fc349ba329f49fab65ea207ba338d19/src/cmd-podman-build#L34

But nothing does that in the OpenShift CI case.

So do it.

We should then be able to delete that line from `cosa podman-build` to
avoid duplicate definitions.

Note that in the CentOS Stream case, the canonical repos live in the
node image already so we could use that, but it's cleaner I think to
ensure we're consistently using the same repo definition files whether
we're building the base image, the node image, or the extensions (e.g.
the repo IDs are different, and there are subtleties between using the
compose vs mirror repos).

While we're here, also make sure that we delete the `okd.repo` file we
injected; we don't want that in the final image.
We group the el9 manifest imports and move the glusterfs-fuse
and containernetworking-plugins packages to el9-shared one as
there are not yet shipped for EL9.
The CentOS Stream Storage SIG does not yet provide GlusterFS [1]
for EL10 whereas it does for EL9 [2]. containernetworking-plugins
was dropped in EL10 as per [3][4].

[1] https://mirror.stream.centos.org/SIGs/10-stream/storage/x86_64/
[2] https://mirror.stream.centos.org/SIGs/9-stream/storage/x86_64/
[3] https://gitlab.com/redhat/centos-stream/rpms/containernetworking-plugins/-/commit/56377da48755ece6ddd7e3f6c38b6f7f7db66625
[4] https://issues.redhat.com/browse/CS-2264
This will make it easier to diff with c9s-mirror.repo file.

Also add a link to the issue that is causing us to hardcode x86_64
so that we can remove that hardcoding one day.
- The c9s-sig-virtualization isn't available on all architectures
  so drop it from the global list at the top.
- Drop the repos: appstream from the commented out wasm extension
  since that repo is in the global list at the top it won't need
  to be specified there.
- Add comments in repos used inside extension definitions about
  why they are being named versus being in the global repo list
  at the top.
The openvswitch RPM from the ipsec extension is pulled from this repo.
@dustymabe
Copy link
Copy Markdown
Member

Pushed up a few fixes.. still working on some other things.

@dustymabe
Copy link
Copy Markdown
Member

ok a few more fixes now. I think (hope) the only thing left failing is the okd-scos-images

@dustymabe
Copy link
Copy Markdown
Member

Please everyone refrain from pushing to this PR unless I ask you to do so.

@jlebon
Copy link
Copy Markdown
Member

jlebon commented Apr 5, 2025

will there be a way to switch between building c9s and c10s, or will we have a follow up moving to c10s (in packages-openshift.yaml and other places) after this PR?

We'll maintain both c9s and c10s variants. The repos in packages-openshift.yaml should be adapted so it's conditional on the version as well as centos vs rhel.

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Apr 5, 2025

@travier: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@jbtrystram
Copy link
Copy Markdown
Contributor

jbtrystram commented Apr 5, 2025

/retitle COS-2902: Add C10s variant

@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented Apr 5, 2025

@travier: This pull request references COS-2902 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set.

Details

In response to this:

c9s: Point image-c9s to image-rhel-9.4


c9s: Use RPMs from 9.4 RHAOS repo


Revert "c9s.repo: temporarily use mirrored repos"

This reverts commit 88e41a0.


DoNotMerge: CI changes to test C10S builds


manifests: Add initial c10s based variant

See: #1466


kola-denylist.yaml: Update for c10s variant


c10s.repo: Skip GPG checks for now


Corresponding Fedora CoreOS changes: coreos/fedora-coreos-config#3015

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@jbtrystram
Copy link
Copy Markdown
Contributor

/lgtm

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Apr 5, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jbtrystram, jlebon, travier

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [jbtrystram,jlebon,travier]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@joelcapitao
Copy link
Copy Markdown
Contributor

will there be a way to switch between building c9s and c10s, or will we have a follow up moving to c10s (in packages-openshift.yaml and other places) after this PR?

@Prashanth684 there is an on-going follow-up patch #1786 which implements what you are asking for (specially the commit e621356)

@travier travier mentioned this pull request Jun 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants