Skip to content

Do not modify config-map in autoscale test.#2144

Merged
knative-prow-robot merged 3 commits into
knative:masterfrom
josephburnett:noconfig
Oct 4, 2018
Merged

Do not modify config-map in autoscale test.#2144
knative-prow-robot merged 3 commits into
knative:masterfrom
josephburnett:noconfig

Conversation

@josephburnett
Copy link
Copy Markdown
Contributor

This is a step in removing the cluster-level flakiness that the autoscale test seems to introduce. One theory is that reducing the scale-to-zero-threshold is causing the blue-green test to scale down and flake (either not becoming ready or getting the wrong split from the activator).

Proposed Changes

  • Remove config map modification from the autoscale test.

Release Note

@knative-prow-robot knative-prow-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Oct 3, 2018
@josephburnett josephburnett changed the title Do not modify config-map. Do not modify config-map in autoscale test. Oct 3, 2018
@josephburnett
Copy link
Copy Markdown
Contributor Author

/assign @adrcunha

Copy link
Copy Markdown
Contributor

@bbrowning bbrowning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a couple of questions and noticed an erroneous log message, but nothing that would block this from being merged if it's a priority to get the flakes fixed asap.

isDeploymentScaledUp(),
"DeploymentIsScaledUp")
"DeploymentIsScaledUp",
2*time.Minute)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is reducing the time we wait for initial scaleup from the old timeout of 6 minutes down to 2 minutes. Do we know from previous test runs that this always happens within 2 minutes, so that we're not introducing another potential flake?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty certain that if scale up doesn't happen within 2 minutes, it never will. I saw this while developing these changes.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I poked through several previous test runs and the scaling up tens to happen in less than one second. I didn't see any cases where it even approached 2 minutes, so this seems fine.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, what makes it okay is that we aren't starting the time until we have 200 responses from all the requests we sent. So processing time can't affect this. It's just purely metrics pipeline and autoscaler latency.

@@ -249,7 +227,8 @@ func TestAutoscaleUpDownUp(t *testing.T) {
clients.KubeClient,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The log message about "Manually setting ScaleToZeroThreshold" a few lines above this is no longer relevant.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

isDeploymentScaledUp(),
"DeploymentScaledUp")
"DeploymentScaledUp",
2*time.Minute)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment with regards to the 2 minute timeout as the previous scaleup block. Lower is better, as long as the tests don't flake with the lower value.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, same deal. All the requests have succeeded. Now we're just giving the autoscaler time to respond. The metrics pipeline has a 60 second window, so 2 minutes is definitely enough to scale up.

@bbrowning
Copy link
Copy Markdown
Contributor

/lgtm

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 4, 2018
Comment thread test/e2e/autoscale_test.go Outdated
cause the test to time out. Failing fast instead. %v`, err)
t.Fatalf("Unable to parse scale-to-zero-threshold as duration: %v", err)
}
scaleToZeroThreshold = threshold
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] No need of another var. can be
scaleToZeroThreshold, err := time.ParseDuration(configMap.Data["scale-to-zero-threshold"])

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, good catch. Removed.

@srinivashegde86
Copy link
Copy Markdown
Contributor

/lgtm

@knative-prow-robot knative-prow-robot removed the lgtm Indicates that a PR is ready to be merged. label Oct 4, 2018
@srinivashegde86
Copy link
Copy Markdown
Contributor

/lgtm
/approve

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 4, 2018
@knative-prow-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: josephburnett, srinivashegde86

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow-robot knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 4, 2018
@knative-prow-robot knative-prow-robot merged commit efbbc3a into knative:master Oct 4, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants