NE-1802: Bump Golang, k8s.io, OpenShift API, and controller-runtime#225
Conversation
|
@alebedev87: This pull request references NE-1802 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
db88266 to
3f92d3b
Compare
|
|
|
/retest |
3f92d3b to
e9c00b9
Compare
|
@alebedev87: This pull request references NE-1802 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Commits reorganized. |
|
/retest Azure installation failed. |
|
/assign @gcs278 |
|
@alebedev87: This pull request references NE-1802 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
| name: release | ||
| namespace: openshift | ||
| tag: rhel-8-release-golang-1.19-openshift-4.12 | ||
| tag: rhel-8-release-golang-1.22-openshift-4.17 |
There was a problem hiding this comment.
Should you be using rhel-9 in the name instead of rhel-8 ?
There was a problem hiding this comment.
I thought of postponing the migration to RHEL9 to not put all the eggs into one basket (rebase from upstream is huge plus this bump turned into a big one). Let's see if we'll have enough time to do RHEL9 migration too. But anyway it will be in a dedicated PR.
|
/retest |
|
@alebedev87, a nit pick, but your commands: Doesn't produce the same versions that you have bumped to. I get Sorry, I am one of those people that like to reproduce the bump as a sanity check 😄 I think you need to pin your versions. |
gcs278
left a comment
There was a problem hiding this comment.
No major issue, just questions.
There was a problem hiding this comment.
Sorry if I have missed a conversation about this, but doesn't splitting the bump & the vendor into two separate commits make the bump commit "broken"?
It could be just my personal philosophy, but I thought best practice was to avoid creating commits that don't compile. Was the motivation just to make reviews easier? I'm open to differing opinions, feel free to disagree, just curious.
There was a problem hiding this comment.
Reminded me of this discussion we had in C-I-O. I think I keep trying to stick to what I said in the discussion: put the vendor dir into a dedicated commit if either is true:
- it's more convenient for the reviewer
- the PR will be backported (vendored files may cause a lot of cherry-pick conflicts)
Was the motivation just to make reviews easier?
Yes. Will rebase the PR (fixup the vendor commit) once ready to move on.
There was a problem hiding this comment.
There might be a small miscommunication, in the CIO discussion, I was referring to splitting out the vendor & bump commit as one together. I referred to that as the "vendor commit" there.
But in this PR, we have a "bump commit" and "vendor dir commit" separate. By bump commit, I mean the go get ... && go mod tidy commands (b871828). By "vendor dir commit", I mean go mod vendor (67ec0f2).
Will rebase the PR (fixup the vendor commit) once ready to move on.
What does that mean exactly? Are you saying your going to combine your b871828 67ec0f2 commits before merging?
There was a problem hiding this comment.
There might be a small miscommunication, in the openshift/cluster-ingress-operator#1097 (comment), I was referring to splitting out the vendor & bump commit as one together. I referred to that as the "vendor commit" there.
But in this PR, we have a "bump commit" and "vendor dir commit" separate. By bump commit, I mean the go get ... && go mod tidy commands (b871828). By "vendor dir commit", I mean go mod vendor (67ec0f2).
Got it now. Yes, I understood the vendor commit as a commit with vendor dir only.
Will rebase the PR (fixup the vendor commit) once ready to move on.
What does that mean exactly? Are you saying your going to combine your b871828 67ec0f2 commits before merging?
Yes, will leave the PR will 2 commits: "bump+vendor dir" and "golangci-lint commit".
There was a problem hiding this comment.
Okay I think we are on the same page now. Thanks!
| RUN make build-operator | ||
|
|
||
| FROM registry.ci.openshift.org/ocp/4.12:base | ||
| FROM registry.ci.openshift.org/ocp/4.17:base |
There was a problem hiding this comment.
Can you remind me how this works? I think we still substitute all of the base image no per the release repo?
Regardless, I think at one point we discussed trying to get this hooked up with the Art bot so it stays in sync. I think this won't be very sustainable to keep updating.
There was a problem hiding this comment.
As of now external-dns-operator repo controls the base, builder and build root images from the repository. So when we need to bump golang or base image we submit a PR only into this repo. That means less automation from ART indeed. But we disabled it to get rid of openshift release branches created automatically. That was causing a confusion about the adherence of ExternalDNS Operator to OCP lifecycle.
As a matter of fact, Dockerfile.openshift is not even used anymore. I updated Dockerfile.openshift just to be consistent but it can be removed.
There was a problem hiding this comment.
As a matter of fact, Dockerfile.openshift is not even used anymore. I updated Dockerfile.openshift just to be consistent but it can be removed.
Oh, so this Dockerfile isn't even used? That makes more sense why this is super outdated. Now I see the regular Dockerfile uses it's own non-OpenShift images, which makes my question about the ART bumps irrelevant.
Should we just delete it then or does it serve a purpose still?
There was a problem hiding this comment.
Should we just delete it then or does it serve a purpose still?
No it doesn't. Let me remove it.
Upd: done.
|
@alebedev87: This pull request references NE-1802 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@alebedev87: This pull request references NE-1802 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@gcs278 : Pinned the |
|
e2e TestExternalDNSRecordLifecycle is failing. Let's see if that is intermittent?
|
| .PHONY: lint | ||
| ## Checks the code with golangci-lint | ||
| lint: $(GOLANGCI_LINT_BIN) | ||
| $(GOLANGCI_LINT_BIN) run -c .golangci.yaml --deadline=30m |
There was a problem hiding this comment.
Question - is there another way to give this a time limit, now that the deadline flag is deprecated?
There was a problem hiding this comment.
Yes. It's already in the golangci-lint config. I don't see the reason to limit in 2 places, so I removed the flag completely. Also, 30m was an overkill, this test should not take more than a couple of seconds.
Upd: maybe a "couple of seconds" was a wrong estimation (failed job). Increased to 10 minutes.
| KUSTOMIZE := go run sigs.k8s.io/kustomize/kustomize/v4 | ||
| K8S_ENVTEST_VERSION := 1.21.4 | ||
| KUSTOMIZE := go run sigs.k8s.io/kustomize/kustomize/v5 | ||
| K8S_ENVTEST_VERSION := 1.30.0 |
There was a problem hiding this comment.
Any reason to use 1.30.0 instead of 1.30.3 as is used in the go modules?
There was a problem hiding this comment.
Yes. sigs.k8s.io/controller-runtime/tools/setup-envtest gets the artifacts from kubebuilder-tools which used to be published from the kubebuilder repository. Here is the link to all the kubebuilder tools releases available which I found in their latest release. There is no 1.30.3 release, that why I'm getting an error when I'm trying to use 1.30.3:
$ make test
go run sigs.k8s.io/controller-tools/cmd/controller-gen "crd:preserveUnknownFields=false" rbac:roleName=external-dns-operator webhook paths="./..." output:crd:artifacts:config=config/crd/bases
go run sigs.k8s.io/controller-tools/cmd/controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./..."
go fmt ./...
go vet ./...
unable to fetch hash for requested version: unable fetch metadata for kubebuilder-tools-1.30.3-linux-amd64.tar.gz -- got status "404 Not Found" from GCS
I think that we'll have to upgrade to the latest version of sigs.k8s.io/controller-runtime/tools/setup-envtest and start fetching the envtest binaries from the new location. The problem is that we cannot do it right now as the latest version uses 1.31.z k8s.io dependencies. So we have to postpone it to the next API bump.
| source.Kind[client.Object](operatorCache, &corev1.ConfigMap{}, | ||
| &handler.EnqueueRequestForObject{}, | ||
| predicate.And(predicate.NewPredicateFuncs(ctrlutils.InNamespace(config.SourceNamespace)), predicate.NewPredicateFuncs(ctrlutils.HasName(config.CAConfigMapName))), |
There was a problem hiding this comment.
nit: was this manual or automatic? If source.Kind changed this much, it might be better to put all the signature parameters on the same line.
There was a problem hiding this comment.
was this manual or automatic?
Manual, I had to do it myself. As far as I know, no migration tools were provided by controller-runtime.
If source.Kind changed this much, it might be better to put all the signature parameters on the same line.
Yes, source.Kind is a generic function now. Wouldn't it be too lengthy on a single line?
The failure here is probably due to a too small DnsPollingTimeout. 3 Minutes isn't enough to provision the LB, much less resolve the DNS name. I saw in Loki that the GCP CCM took 5 minutes to provision the LB. That's quite long, probably an anomaly with the GCP API being slow. It's a flake to keep an eye on. |
|
@gcs278 : 3 minutes may be too short indeed. Some time ago, I decreased it from 15 minutes to reduce the elapsed time needed for a failed run. I'll try to do the happy medium and will increase it to 7 minutes. |
7ab2709 to
34958da
Compare
|
Increased the golangci-lint timeout to |
|
/label px-approved No TE is needed for this feature, only release notes. |
|
/label docs-approved |
No release note for this specific change as we do not make release notes for bumping kubernetes dependencies. |
|
All Azure tests failed to get DNS records, not much can be found in Loki. Let's see whether it's a reproducible issue. /retest |
6 similar comments
- Bump Golang to 1.22: - Update go.mod - Update Dockerfiles - Add `CGO_ENABLED=1` to `go test -race` (golang/go#51235). - Bump k8s.io/* modules to 0.30.3 and OpenShift API to the latest. - Bump Controller runtime bumped to 0.18.5: - Controller's `Watch` function now has a single generic source parameter (kubernetes-sigs/controller-runtime#2783). - Manager's `CertDir` option removed, now using the dedicated webhook server option (kubernetes-sigs/controller-runtime#2422). - Cache's `Namespaces` option replaced by `DefaultNamespaces` (kubernetes-sigs/controller-runtime#2421). - Regenerate CRD and bundle manifests: - ExternalDNS API uses `metav1.LabelSelector` for the label filtering. It was updated with `+listType=atomic` marker which resulted in the addition of `x-kubernetes-list-type: atomic` to CRD. - Bump `kustomize` to v5 to fix a conflict caused by k8s.io bumps: - `kyaml` unable to use the bumped `github.com/google/gnostic-models/openapiv2` package.
- Update deprecated fields in `.golangci.yaml`. - Remove deprecated `--deadline` flag. - Replace deprecated `k8s.io/utils/pointer` with new generic `k8s.io/utils/ptr` package.
CI and Makefile use default Dockerfile.
Increase the DNS polling timeout to 7 minutes to prevent false positives on providers with slower load balancer provisioning, such as GCP, where it may take up to 5 minutes.
The RHEL9 migration (openshift/external-dns#61) reset the latest tag for external-dns repository in quay.io.
e506cf5 to
f1b2155
Compare
|
The problems seen in Azure were related to removed operand image. The external-dns instances could not start. I blame the changes in the image mirroring in release repository (initial script was preserving all unique images) as well as my unsolicited tag I used for the tests (unlikely though). Anyway the operand was migrated to RHEL9 so I rebump the operand image. |
|
/lgtm |
|
@alebedev87: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
CGO_ENABLED=1togo test -race(doc: document race detector cgo requirements golang/go#51235).Watchfunction now has a single generic source parameter(⚠️ Source, Event, Predicate, Handler: Add generics support kubernetes-sigs/controller-runtime#2783).
CertDiroption removed, now using the dedicated webhook server option(⚠ Remove deprecated manager, webhook and cluster options kubernetes-sigs/controller-runtime#2422).
Namespacesoption replaced byDefaultNamespaces(⚠️ Allow configuring more granular cache filtering kubernetes-sigs/controller-runtime#2421).
kustomizeto v5 to fix a conflict caused by k8s.io bumps:kyamlunable to use the bumpedgithub.com/google/gnostic-models/openapiv2package.metav1.LabelSelectorfor the label filtering.It was updated with
+listType=atomicmarker which resulted inthe addition of
x-kubernetes-list-type: atomicto CRD.Commands to reproduce the bump: