Skip to content

Allow resources in Revision#2117

Closed
MissingRoberto wants to merge 1 commit intoknative:masterfrom
MissingRoberto:allow-resources
Closed

Allow resources in Revision#2117
MissingRoberto wants to merge 1 commit intoknative:masterfrom
MissingRoberto:allow-resources

Conversation

@MissingRoberto
Copy link
Copy Markdown

[fixes #2099]

Proposed Changes

  • Allow resources defined by the user
  • If resources is empty, configure the user-container with the recommended values

@knative-prow-robot knative-prow-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Oct 1, 2018
@MissingRoberto
Copy link
Copy Markdown
Author

/assign @mattmoor

@MissingRoberto
Copy link
Copy Markdown
Author

Let me know if it's necessary to run ./hack/update-codegen.sh please

@mattmoor
Copy link
Copy Markdown
Member

mattmoor commented Oct 1, 2018

/assign @evankanderson

Would it be possible to add an e2e test that sets a memory limit and OOMs?

@knative-prow-robot knative-prow-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Oct 2, 2018
@knative-prow-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: jszroberto
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: evankanderson

If they are not already assigned, you can assign the PR to them by writing /assign @evankanderson in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@MissingRoberto
Copy link
Copy Markdown
Author

/hold

@knative-prow-robot knative-prow-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 2, 2018
Copy link
Copy Markdown
Member

@evankanderson evankanderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating the spec and tests at the same time.

I don't know if you want to extend helloworld to add an endpoint for the memory allocation, but it would be nice to actually test that the supplied limits are enforced.

}
}

func TestAutoscaleExceedsLimits(t *testing.T) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to test in the autoscaler; the resource limits supplied are per-pod, so autoscaling will just make more pods.

At some point, we'll need to put in a circuit-breaker that prevents scaling up if existing pods are crash-looping. @josephburnett to track.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, my intention was to test the ScaleUpAndDown behavior if some resources would be freed or allocated. If pods would be created or destroyed. But that's probably not the responsibility of the autoscaler itself.


func isQuotaReached() func(d *v1beta1.Deployment) (bool, error) {
return func(d *v1beta1.Deployment) (bool, error) {
// TODO Remove this line
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this TODONE?

}
}

func TestQuotaExceeded(t *testing.T) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand how this works. If I wanted to test this for memory, I would:

  1. Write a small server (Go, C++ or Python) which uses a POST request parameter to allocates and fills X MB of RAM (you may need to actually touch each memory page to get Linux to allocate the bytes), then frees the memory and reports memory stats in the HTTP response.
  2. Set a quota limit of 500MB. Send requests for 100, 200, 800, and make sure that the first two succeed, and the third fails.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense

@MissingRoberto MissingRoberto changed the title Allow resources in Revision WIP: Allow resources in Revision Oct 4, 2018
@knative-prow-robot knative-prow-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 4, 2018
@MissingRoberto
Copy link
Copy Markdown
Author

MissingRoberto commented Oct 4, 2018

@evankanderson thank you for the review of WIP.

I am pushing wip commits because I am not able to get e2e tests to run on minikube.

Let's do NOT merge this yet, because I found something unexpected.

When I set up LimitRanges or QuotaLimits, running the services fail with the following error (even if I specify defaults): failed quota: quota-lite: must specify limits.memory,requests.memory.

kubectl doesn't this problem.

@MissingRoberto
Copy link
Copy Markdown
Author

/test pull-knative-serving-integration-tests

@evankanderson
Copy link
Copy Markdown
Member

Is this working now, or still WIP? (Ping me with a @ mention when you want another review.)

@MissingRoberto
Copy link
Copy Markdown
Author

/test pull-knative-serving-integration-tests

3 similar comments
@MissingRoberto
Copy link
Copy Markdown
Author

/test pull-knative-serving-integration-tests

@MissingRoberto
Copy link
Copy Markdown
Author

/test pull-knative-serving-integration-tests

@MissingRoberto
Copy link
Copy Markdown
Author

/test pull-knative-serving-integration-tests

@MissingRoberto
Copy link
Copy Markdown
Author

/unhold

@evankanderson Finally I got it working. Can you review it, please?

@MissingRoberto MissingRoberto changed the title WIP: Allow resources in Revision Allow resources in Revision Oct 16, 2018
@knative-prow-robot knative-prow-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 16, 2018
@MissingRoberto
Copy link
Copy Markdown
Author

/hold cancel

@knative-prow-robot knative-prow-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 16, 2018
Lifecycle: &corev1.Lifecycle{},
},
want: apis.ErrDisallowedFields("name", "resources", "ports", "volumeMounts", "lifecycle"),
want: apis.ErrDisallowedFields("name", "ports", "volumeMounts", "lifecycle"),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we can remove resources from the resource definition too in this case

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

userContainer.Resources = userResources

if equality.Semantic.DeepEqual(userContainer.Resources, corev1.ResourceRequirements{}) {
userContainer.Resources = userResources
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should do a deep merge on CPU here? IIUC if I specify only a memory resource here I will get an implicit undefinition of CPU resources.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes lot of sense. I thought about it as well.

@MissingRoberto
Copy link
Copy Markdown
Author

/test pull-knative-serving-integration-tests

@MissingRoberto
Copy link
Copy Markdown
Author

@evankanderson @greghaynes can you review again, please?

@mattmoor
Copy link
Copy Markdown
Member

@jszroberto Can you add unit test coverage of resource limits for applyDefaultResources? Why do you have to manually copy these?

@mattmoor
Copy link
Copy Markdown
Member

Also, needs a rebase.

@mattmoor
Copy link
Copy Markdown
Member

@jszroberto Any chance for a rebase, so we can close this one out?

@knative-metrics-robot
Copy link
Copy Markdown

The following is the coverage report on pkg/.
Say /test pull-knative-serving-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
test/test_images/bloatingcow/bloatingcow.go Do not exist 0.0%

@mattmoor
Copy link
Copy Markdown
Member

mattmoor commented Nov 2, 2018

@jszroberto Looks like it's the new test that's failing. Generally when adding new e2e, we should try to run them 10x or so to make sure we're not introducing new flakes. You should be able to do this with -count=10 -run=TestCustomResourcesLimits on your go test command.

@mattmoor
Copy link
Copy Markdown
Member

mattmoor commented Nov 6, 2018

I ran this 10x locally without fail, so...

/test pull-knative-serving-integration-tests

@knative-prow-robot
Copy link
Copy Markdown
Contributor

@jszroberto: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-knative-serving-integration-tests 17da6c1 link /test pull-knative-serving-integration-tests

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@mattmoor
Copy link
Copy Markdown
Member

mattmoor commented Nov 6, 2018

@adrcunha it's not clear to me why this is failing in Prow, but not for me locally (I run it 10x).

@bbrowning
Copy link
Copy Markdown
Contributor

@mattmoor If a Container allocates more memory than its limit, the Container becomes a candidate for termination. If the Container continues to consume memory beyond its limit, the Container is terminated. - https://kubernetes.io/docs/tasks/configure-pod-container/assign-memory-resource/#exceed-a-container-s-memory-limit

The test assumes the container is immediately terminated as soon as it exceeds the memory limit. But, in reality, it takes a bit of time for Kubernetes to OOMKill the pod.

@mattmoor
Copy link
Copy Markdown
Member

mattmoor commented Nov 9, 2018

@bbrowning Interesting that this passed 10x for me then... I guess I got lucky. Let me look at the way we check this to see if I can suggest anything.


b := make([]byte, mb*1024*1024)
b[0] = 1
b[len(b)-1] = 1
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps initialize the whole thing for good measure?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm worried about the allocation happening virtually and then getting filled in through page faults as the pages are actually consumed. I doubt first and last is good enough for all sizes.

@mattmoor
Copy link
Copy Markdown
Member

@jszroberto Curious if you've had a time to explore this at all?

@MissingRoberto
Copy link
Copy Markdown
Author

@mattmoor no, I didn't. I am still out of office.

@mattmoor
Copy link
Copy Markdown
Member

@jszroberto Any chance you are back? I'd love to get this in.

@mattmoor mattmoor mentioned this pull request Nov 30, 2018
@mattmoor
Copy link
Copy Markdown
Member

I have a version of this in the linked PR that has passed at least one round of integration testing. I'll run it a few more times, but if it is consistent, I'll move ahead unless @jszroberto comes back and wants to incorporate the extra changes in my PR.

@ellistarn
Copy link
Copy Markdown

ellistarn commented Nov 30, 2018 via email

mattmoor added a commit that referenced this pull request Nov 30, 2018
@mattmoor
Copy link
Copy Markdown
Member

I just landed my PR based on this. thanks @jszroberto for implementing this!

@mattmoor mattmoor closed this Nov 30, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cannot set "resources" on a container revision spec

9 participants