Allow resources in Revision #2117
MissingRoberto wants to merge 1 commit into knative:master from MissingRoberto:allow-resources
Conversation
/assign @mattmoor

Let me know if it's necessary to run

/assign @evankanderson Would it be possible to add an e2e test that sets a memory limit and OOMs?

[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by: jszroberto. If they are not already assigned, you can assign the PR to them by writing. The full list of commands accepted by this bot can be found here. The pull request process is described here. Details: Needs approval from an approver in each of these files. Approvers can indicate their approval by writing.

/hold
evankanderson left a comment
Thanks for updating the spec and tests at the same time.
I don't know if you want to extend helloworld to add an endpoint for the memory allocation, but it would be nice to actually test that the supplied limits are enforced.
test/e2e/autoscale_test.go (Outdated)

    	}
    }

    func TestAutoscaleExceedsLimits(t *testing.T) {
I don't think we need to test in the autoscaler; the resource limits supplied are per-pod, so autoscaling will just make more pods.
At some point, we'll need to put in a circuit-breaker that prevents scaling up if existing pods are crash-looping. @josephburnett to track.
To be honest, my intention was to test the ScaleUpAndDown behavior when resources are freed or allocated, i.e. whether pods get created or destroyed. But that's probably not the responsibility of the autoscaler itself.
test/e2e/resources_test.go (Outdated)

    func isQuotaReached() func(d *v1beta1.Deployment) (bool, error) {
    	return func(d *v1beta1.Deployment) (bool, error) {
    		// TODO Remove this line
test/e2e/resources_test.go (Outdated)

    	}
    }

    func TestQuotaExceeded(t *testing.T) {
I don't quite understand how this works. If I wanted to test this for memory, I would:
- Write a small server (Go, C++, or Python) which uses a POST request parameter to allocate and fill X MB of RAM (you may need to actually touch each memory page to get Linux to allocate the bytes), then frees the memory and reports memory stats in the HTTP response.
- Set a quota limit of 500MB. Send requests for 100, 200, and 800 MB, and make sure that the first two succeed and the third fails.
Here is an example of consuming memory: https://github.com/knative/docs/blob/6d21cd89cbc8d5ab1c0d11f53343ed494d0980dc/serving/samples/autoscale-go/autoscale.go#L76
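The suggested endpoint might be sketched as below, assuming Go and the standard library. The `bloat` helper and the `bloat` query parameter are illustrative names, not the linked sample's actual API; note the loop that writes to every page so the kernel commits the memory rather than leaving it as an untouched virtual allocation.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// bloat allocates mb megabytes and writes to every page so Linux
// backs the virtual allocation with real memory (a lazily allocated
// page only becomes resident once it is touched).
func bloat(mb int) []byte {
	const pageSize = 4096
	b := make([]byte, mb*1024*1024)
	for i := 0; i < len(b); i += pageSize {
		b[i] = 1
	}
	return b
}

// handler reads a "bloat" query parameter (in MB), allocates that
// much memory, reports the size, then lets the slice go out of
// scope so the garbage collector can free it.
func handler(w http.ResponseWriter, r *http.Request) {
	mb, err := strconv.Atoi(r.URL.Query().Get("bloat"))
	if err != nil || mb < 0 {
		http.Error(w, "bad bloat parameter", http.StatusBadRequest)
		return
	}
	b := bloat(mb)
	fmt.Fprintf(w, "allocated %d MB", len(b)/(1024*1024))
}

func main() {
	// Exercise the handler against an in-process test server.
	srv := httptest.NewServer(http.HandlerFunc(handler))
	defer srv.Close()

	resp, _ := http.Get(srv.URL + "?bloat=2")
	body, _ := io.ReadAll(resp.Body)
	resp.Body.Close()
	fmt.Println(string(body)) // prints "allocated 2 MB"
}
```

An e2e test could then drive this endpoint with increasing sizes against a Revision whose memory limit is set, and assert which requests succeed.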
@evankanderson thank you for the review of the WIP. I am pushing WIP commits because I am not able to get the e2e tests to run on minikube. Let's NOT merge this yet, because I found something unexpected. When I set things up with kubectl, I don't see this problem.
/test pull-knative-serving-integration-tests

Is this working now, or still WIP? (Ping me with an @-mention when you want another review.)

/test pull-knative-serving-integration-tests

3 similar comments

/test pull-knative-serving-integration-tests

/test pull-knative-serving-integration-tests

/test pull-knative-serving-integration-tests

/unhold
@evankanderson Finally I got it working. Can you review it, please?

/hold cancel
    	Lifecycle: &corev1.Lifecycle{},
    },
    -	want: apis.ErrDisallowedFields("name", "resources", "ports", "volumeMounts", "lifecycle"),
    +	want: apis.ErrDisallowedFields("name", "ports", "volumeMounts", "lifecycle"),
nit: we can remove resources from the resource definition too in this case
    -	userContainer.Resources = userResources
    +	if equality.Semantic.DeepEqual(userContainer.Resources, corev1.ResourceRequirements{}) {
    +		userContainer.Resources = userResources
I wonder if we should do a deep merge on CPU here? IIUC, if I specify only a memory resource here, the default CPU resources will be implicitly dropped.
That makes a lot of sense. I thought about it as well.
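A deep merge that only fills fields the user left unset could be sketched roughly as follows. The `ResourceList` and `ResourceRequirements` types here are simplified stand-ins for the Kubernetes API types, and `mergeResources` is a hypothetical helper, not code from this PR.

```go
package main

import "fmt"

// Simplified stand-ins for the corev1 resource types; the real code
// would operate on corev1.ResourceRequirements and resource.Quantity.
type ResourceList map[string]string

type ResourceRequirements struct {
	Limits   ResourceList
	Requests ResourceList
}

// mergeResources starts from the defaults and overlays only the
// entries the user actually set, so specifying a memory request
// does not implicitly drop the default CPU request.
func mergeResources(user, defaults ResourceRequirements) ResourceRequirements {
	merged := ResourceRequirements{
		Limits:   ResourceList{},
		Requests: ResourceList{},
	}
	for k, v := range defaults.Requests {
		merged.Requests[k] = v
	}
	for k, v := range user.Requests {
		merged.Requests[k] = v
	}
	for k, v := range defaults.Limits {
		merged.Limits[k] = v
	}
	for k, v := range user.Limits {
		merged.Limits[k] = v
	}
	return merged
}

func main() {
	defaults := ResourceRequirements{Requests: ResourceList{"cpu": "400m"}}
	user := ResourceRequirements{Requests: ResourceList{"memory": "128Mi"}}
	merged := mergeResources(user, defaults)
	fmt.Println(merged.Requests["cpu"], merged.Requests["memory"]) // prints "400m 128Mi"
}
```

The current all-or-nothing `DeepEqual` check, by contrast, keeps the defaults only when the user sets no resources at all.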
/test pull-knative-serving-integration-tests

@evankanderson @greghaynes can you review again, please?

@jszroberto Can you add unit test coverage of resource limits for

Also, needs a rebase.

@jszroberto Any chance for a rebase, so we can close this one out?
[knative/serving#2099]
The following is the coverage report on pkg/.
@jszroberto Looks like it's the new test that's failing. Generally when adding new e2e, we should try to run them 10x or so to make sure we're not introducing new flakes. You should be able to do this with |
|
I ran this 10x locally without fail, so... /test pull-knative-serving-integration-tests |
|
@jszroberto: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
|
@adrcunha it's not clear to me why this is failing in Prow, but not for me locally (I run it 10x). |
|
@mattmoor The test assumes the container is immediately terminated as soon as it exceeds the memory limit. But, in reality, it takes a bit of time for Kubernetes to OOMKill the pod. |
@bbrowning Interesting that this passed 10x for me then... I guess I got lucky. Let me look at the way we check this to see if I can suggest anything.
    b := make([]byte, mb*1024*1024)
    b[0] = 1
    b[len(b)-1] = 1
Perhaps initialize the whole thing for good measure?
I'm worried about the allocation happening virtually and then getting filled in through page faults as the pages are actually consumed. I doubt first and last is good enough for all sizes.
@jszroberto Curious if you've had time to explore this at all?

@mattmoor no, I didn't. I am still out of office.

@jszroberto Any chance you are back? I'd love to get this in.

I have a version of this in the linked PR that has passed at least one round of integration testing. I'll run it a few more times, but if it is consistent, I'll move ahead unless @jszroberto comes back and wants to incorporate the extra changes in my PR.
Awesome!
Matt, this still won't enable specific GPU resources, correct? Node selector is GKE's hook for this, but my understanding is that it won't pass through a Knative spec.
I just landed my PR based on this. Thanks @jszroberto for implementing this!
[fixes #2099]
Proposed Changes
- Allow resources defined by the user
- If resources is empty, configure the user-container with the recommended values