OCPBUGS-15934: use *resource.Quantity to not automatically set 0 #3815

Closed
QiWang19 wants to merge 2 commits into openshift:master from QiWang19:zeroquantity

Conversation

@QiWang19
Member

@QiWang19 QiWang19 commented Jul 21, 2023

close: https://issues.redhat.com/browse/OCPBUGS-15934
- What I did
Change the type of OverlaySize and LogSizeMax from resource.Quantity to the pointer type *resource.Quantity.
As a plain struct value, an unset field is automatically set to 0 when the ctrcfg object is retrieved.

I am uncertain whether we can make this update to the current ContainerRuntimeConfig API. I propose this change because the ContainerRuntimeConfig API is currently MCO-internal.
- How to verify it

# apply the containerruntimeconfig CR
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
 name: pidlimit
spec:
 machineConfigPoolSelector:
   matchLabels:
     pools.operator.machineconfiguration.openshift.io/worker: '' 
 containerRuntimeConfig:
   pidsLimit: 4096 
   logLevel: debug

# current result:
$ oc get containerruntimeconfig  pidlimit -o json | jq '.spec.containerRuntimeConfig'
{
  "logLevel": "debug",
  "logSizeMax": "0",
  "overlaySize": "0",
  "pidsLimit": 4096
}

# after this patch, it will be 
$ oc get containerruntimeconfig  pidlimit -o json | jq '.spec.containerRuntimeConfig'
{
  "logLevel": "debug",
  "pidsLimit": 4096
}
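
The difference between the two outputs above can be reproduced with the standard library alone. The following is a minimal sketch, not the MCO code: `Quantity` is a hypothetical stand-in for resource.Quantity (a struct whose custom marshaler renders the zero value as "0"), and `byValue` / `byPointer` are illustrative types that are not part of the ContainerRuntimeConfig API. It assumes Go's `encoding/json`, where `omitempty` never omits a struct value but does omit a nil pointer.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Quantity is a hypothetical stand-in for resource.Quantity: a struct with a
// custom marshaler that renders its zero value as the string "0".
type Quantity struct {
	value int64
}

// MarshalJSON renders the quantity as a quoted decimal string.
func (q Quantity) MarshalJSON() ([]byte, error) {
	return json.Marshal(fmt.Sprintf("%d", q.value))
}

// byValue mirrors the old field shape: omitempty is set, but the field is a
// struct value, which encoding/json never treats as empty.
type byValue struct {
	LogLevel    string   `json:"logLevel,omitempty"`
	OverlaySize Quantity `json:"overlaySize,omitempty"`
}

// byPointer mirrors the proposed field shape: a nil pointer counts as empty.
type byPointer struct {
	LogLevel    string    `json:"logLevel,omitempty"`
	OverlaySize *Quantity `json:"overlaySize,omitempty"`
}

// marshal returns the JSON encoding of v as a string.
func marshal(v interface{}) string {
	data, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(data)
}

func main() {
	fmt.Println(marshal(byValue{LogLevel: "debug"}))   // {"logLevel":"debug","overlaySize":"0"}
	fmt.Println(marshal(byPointer{LogLevel: "debug"})) // {"logLevel":"debug"}
}
```

With the value field, the unset quantity is still emitted as "0"; with the pointer field, the key disappears from the output.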

- Description for the changelog

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 21, 2023
@openshift-ci
Contributor

openshift-ci Bot commented Jul 21, 2023

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@QiWang19 QiWang19 force-pushed the zeroquantity branch 3 times, most recently from 8e0f93a to 763d4d6 Compare July 23, 2023 00:48
@QiWang19
Member Author

/test all

@QiWang19
Member Author

/test e2e-aws-ovn

@QiWang19 QiWang19 changed the title from "[WIP] check empty resource.Quantity" to "OCPBUGS-15934: use *resource.Quantity to not automatically set 0" Jul 24, 2023
@openshift-ci-robot openshift-ci-robot added jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Jul 24, 2023
@openshift-ci-robot
Contributor

@QiWang19: This pull request references Jira Issue OCPBUGS-15934, which is invalid:

  • expected the bug to target the "4.14.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.


@QiWang19
Member Author

/test all

@QiWang19 QiWang19 marked this pull request as ready for review August 1, 2023 16:43
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 1, 2023
@openshift-ci openshift-ci Bot requested review from dkhater-redhat and mtrmac August 1, 2023 16:43
@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Aug 1, 2023
@openshift-ci-robot
Contributor

@QiWang19: This pull request references Jira Issue OCPBUGS-15934, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.14.0) matches configured target version for branch (4.14.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

No GitHub users were found matching the public email listed for the QA contact in Jira (schoudha@redhat.com), skipping review request.


@QiWang19
Member Author

QiWang19 commented Aug 1, 2023

@mtrmac could you take a look?

Contributor

@mtrmac mtrmac left a comment


I’m afraid I’m quite unfamiliar with this code.

Looking at the history, #2494 supposedly fixed the same bug just by adding the omitempty annotations. Is it known what has changed that this is no longer sufficient? (I can’t, from a quick check, notice anything in the history of resource.Quantity changing the semantics.)

Or did that never work?

 var configFileList []generatedConfigFile
 ctrcfg := cfg.Spec.ContainerRuntimeConfig
-if !ctrcfg.OverlaySize.IsZero() {
+if ctrcfg.OverlaySize != nil {
Contributor

Applies to all changes:

Doesn’t this change the semantics of previously-created CRs? If I understand the situation correctly, it’s possible that users create the CRs with these fields missing, and something (patchContainerRuntimeConfigs??) updates them and adds "0" field values.

In the old version, those 0 values were treated as missing, i.e. the system did exactly what the users wanted, just in a confusing way; with this PR, wouldn’t the 0 values actually start being applied?

I suppose one way to tell would be to add (unit? e2e?) tests with CRs of that kind, and ensure that both empty values and "0" values are treated as missing. (Or maybe those tests already exist? I didn’t find them through a quick grep, but I didn’t spend much time looking.)
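
The concern about previously created CRs can be illustrated by decoding stored JSON into a pointer field. This is a sketch under stated assumptions, not the MCO implementation: `Quantity` is a hypothetical stand-in for resource.Quantity that is encoded as a quoted decimal string, and `config` and `describe` are illustrative names. It shows that a missing key decodes to a nil pointer while a stored "0" decodes to a non-nil zero value, so a bare nil check distinguishes the two cases.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// Quantity is a hypothetical stand-in for resource.Quantity that is encoded
// as a quoted decimal string such as "0" or "10".
type Quantity struct {
	value int64
}

// UnmarshalJSON parses a quoted decimal string into the quantity.
func (q *Quantity) UnmarshalJSON(data []byte) error {
	var s string
	if err := json.Unmarshal(data, &s); err != nil {
		return err
	}
	v, err := strconv.ParseInt(s, 10, 64)
	if err != nil {
		return err
	}
	q.value = v
	return nil
}

// IsZero reports whether the quantity equals zero.
func (q Quantity) IsZero() bool { return q.value == 0 }

// config is an illustrative type with a single optional pointer field.
type config struct {
	OverlaySize *Quantity `json:"overlaySize,omitempty"`
}

// describe classifies the overlaySize field of a JSON document as
// "unset", "zero", or "non-zero".
func describe(doc string) string {
	var c config
	if err := json.Unmarshal([]byte(doc), &c); err != nil {
		panic(err)
	}
	switch {
	case c.OverlaySize == nil:
		return "unset"
	case c.OverlaySize.IsZero():
		return "zero"
	default:
		return "non-zero"
	}
}

func main() {
	fmt.Println(describe(`{}`))                    // unset
	fmt.Println(describe(`{"overlaySize":"0"}`))   // zero
	fmt.Println(describe(`{"overlaySize":"10"}`))  // non-zero
}
```

A test of the kind suggested above would feed both the first and the second document through the controller logic and check that neither results in an applied setting.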

Member Author

Contributor

Oops, my mistake, I didn’t notice that logic.

OverlaySize will, AFAICS, still cause mergeConfigChanges to be called, though with no edits. I guess that doesn’t make a difference.


Aesthetically I’d prefer for the code to consistently use a single condition, so that we just have “set to a relevant value / not set to a relevant value”, not “unset / set to zero / set to non-zero” to track.

But that’s a weak code style preference, to be decided by MCO maintainers.
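
One way to read the "single condition" preference is a small predicate that folds the three states into two. The sketch below is only an illustration: `Quantity` is a hypothetical stand-in for resource.Quantity, and `isSet` is a made-up helper name that does not appear in the PR.

```go
package main

import "fmt"

// Quantity is a hypothetical stand-in for resource.Quantity.
type Quantity struct {
	value int64
}

// IsZero reports whether the quantity equals zero.
func (q Quantity) IsZero() bool { return q.value == 0 }

// isSet is a hypothetical helper that collapses "unset", "set to zero", and
// "set to non-zero" into a single question: is there a relevant value?
func isSet(q *Quantity) bool {
	return q != nil && !q.IsZero()
}

func main() {
	fmt.Println(isSet(nil))                   // false: field missing
	fmt.Println(isSet(&Quantity{}))           // false: explicit zero
	fmt.Println(isSet(&Quantity{value: 10}))  // true: relevant value
}
```

Call sites would then test `isSet(ctrcfg.OverlaySize)` everywhere instead of mixing nil checks with IsZero checks.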

if err != nil {
	t.Errorf("%s: failed with %v. should have succeeded", test.name, err)
}
require.NotContains(t, string(data), "\"overlaySize\"", "\"overlaySize\"")
Contributor

Testing LogSizeMax might be worthwhile here.
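
A check that covers both fields could look roughly like the following. This is a standalone sketch using only the standard library rather than the testify assertion in the PR; `Quantity`, `config`, and `presentKeys` are hypothetical names introduced for illustration. It parses the marshaled bytes and reports which of the given keys are present, which avoids matching the key name inside a string value.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// Quantity is a hypothetical stand-in for resource.Quantity.
type Quantity struct {
	value int64
}

// MarshalJSON renders the quantity as a quoted decimal string.
func (q Quantity) MarshalJSON() ([]byte, error) {
	return json.Marshal(strconv.FormatInt(q.value, 10))
}

// config is an illustrative type with both optional quantity fields.
type config struct {
	LogLevel    string    `json:"logLevel,omitempty"`
	LogSizeMax  *Quantity `json:"logSizeMax,omitempty"`
	OverlaySize *Quantity `json:"overlaySize,omitempty"`
}

// presentKeys returns the subset of keys that exist as top-level keys in the
// JSON object encoded in data.
func presentKeys(data []byte, keys ...string) []string {
	var fields map[string]json.RawMessage
	if err := json.Unmarshal(data, &fields); err != nil {
		panic(err)
	}
	var found []string
	for _, k := range keys {
		if _, ok := fields[k]; ok {
			found = append(found, k)
		}
	}
	return found
}

func main() {
	data, err := json.Marshal(config{LogLevel: "debug"})
	if err != nil {
		panic(err)
	}
	// Neither optional field was set, so no key should be reported.
	fmt.Println(presentKeys(data, "overlaySize", "logSizeMax"))
}
```

In a real test, a non-empty result for either key would be reported as a failure.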

Signed-off-by: Qi Wang <qiwan@redhat.com>
@QiWang19
Member Author

QiWang19 commented Aug 2, 2023

Updated: ignore this comment. I was testing on a build based on this PR.
The zero values are set automatically only on some builds. I suspect this is caused by different JSON packages being used somewhere.

4.14.0-ec.4

$ oc version
Client Version: v4.2.0-alpha.0-1974-g13225e0
Kustomize Version: v5.0.1
Server Version: 4.14.0-ec.4
Kubernetes Version: v1.27.3+4aaeaec
$ oc get containerruntimeconfig.machineconfiguration.openshift.io/pidlimit -o json | jq '.spec.containerRuntimeConfig'
{
  "logLevel": "debug",
  "logSizeMax": "0",
  "overlaySize": "0",
  "pidsLimit": 4096
}

4.14.0-0.ci.:

$ oc version
Client Version: v4.2.0-alpha.0-1974-g13225e0
Kustomize Version: v5.0.1
Server Version: 4.14.0-0.ci.test-2023-08-02-171351-ci-ln-q32sdzk-latest
Kubernetes Version: v1.27.1-3201+ba1825544533d2-dirty
$ oc get containerruntimeconfig.machineconfiguration.openshift.io/pidlimit -o json | jq '.spec.containerRuntimeConfig'
{
  "logLevel": "debug",
  "pidsLimit": 4096
}

@mtrmac
Contributor

mtrmac commented Aug 2, 2023

That’s basically the thing that worries me — if we don’t understand how the bug arose, are we certain that we fixed it?

E.g. what if the ContainerRuntimeConfig object in question is created by something not in the MCO codebase? It’s quite possible that we need the MCO change anyway (if the old bug fix was just not working), but to actually fix the reporter’s problem we might need to find and fix that other non-MCO writer.

@QiWang19
Member Author

QiWang19 commented Aug 2, 2023

That’s basically the thing that worries me — if we don’t understand how the bug arose, are we certain that we fixed it?

E.g. what if the ContainerRuntimeConfig object in question is created by something not in the MCO codebase? It’s quite possible that we need the MCO change anyway (if the old bug fix was just not working), but to actually fix the reporter’s problem we might need to find and fix that other non-MCO writer.

Forget about my last comment; I was testing on a cluster that was built from this patch.
But you are right about the possibility that the ContainerRuntimeConfig object in question is created by something not in the MCO codebase.

@QiWang19
Member Author

QiWang19 commented Aug 3, 2023

patchContainerRuntimeConfigs can add the 0 values. I verified this by checking the bytes produced by github.com/clarketm/json.Marshal:
https://github.com/openshift/machine-config-operator/pull/3836/files#diff-f8f04fec2662cf966bb0a38e978bcf575bcfe7d3630caf7e2a8089d7ec7e6a70R491. This happens even with the fix from #2494.

@mtrmac
Contributor

mtrmac commented Aug 3, 2023

OK, so #2494 was not actually sufficient. Thanks!

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Aug 3, 2023
@QiWang19
Member Author

QiWang19 commented Aug 8, 2023

/retest-required

@openshift-ci
Contributor

openshift-ci Bot commented Aug 8, 2023

@QiWang19: all tests passed!


@QiWang19
Member Author

QiWang19 commented Aug 8, 2023

@yuqi-zhang could you approve?

Contributor

@yuqi-zhang yuqi-zhang left a comment


logically seems sound. cc @jkyros in case this affects the api move at all. Also doing a

/hold

in case we want to pre-merge verify, but feel free to unhold if we feel that is not necessary

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 9, 2023
@openshift-ci
Contributor

openshift-ci Bot commented Aug 9, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mtrmac, QiWang19, yuqi-zhang


@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 9, 2023
@openshift-bot
Contributor

/jira refresh

The requirements for Jira bugs have changed (Jira issues linked to PRs on main branch need to target different OCP), recalculating validity.

@openshift-ci-robot openshift-ci-robot added jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. and removed jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Sep 8, 2023
@openshift-ci-robot
Contributor

@openshift-bot: This pull request references Jira Issue OCPBUGS-15934, which is invalid:

  • expected the bug to target the "4.15.0" version, but it targets "4.14.0" instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.


@yuqi-zhang
Contributor

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Sep 29, 2023
@openshift-ci-robot
Contributor

@yuqi-zhang: This pull request references Jira Issue OCPBUGS-15934, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.15.0) matches configured target version for branch (4.15.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

No GitHub users were found matching the public email listed for the QA contact in Jira (schoudha@redhat.com), skipping review request.


@yuqi-zhang
Contributor

Sorry for the delay. With openshift/api#1453 and #3747, this will first require a change to openshift/api, which then has to be brought back in here.

@QiWang19 QiWang19 closed this Oct 13, 2023
@openshift-ci-robot
Contributor

@QiWang19: This pull request references Jira Issue OCPBUGS-15934. The bug has been updated to no longer refer to the pull request using the external bug tracker. All external bug links have been closed. The bug has been moved to the NEW state.


Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.

5 participants