CI Enhancement Proposal #32
Conversation
jmccormick2001 left a comment:
I'd be curious what could be accomplished in the short term, say within 30 days, to improve CI performance; some items are longer term, as you suggest. Maybe add a breakdown of short-term versus longer-term work?
Another question: would it be possible to simulate the performance gains from breaking the CI into separate repos?

What are your thoughts on measuring CI performance? If we move to GitLab, is there a way to compare Travis performance against what we get in GitLab for the same workloads?
> Goveralls requires rebasing

It is not a problem any more. See: operator-framework/operator-sdk#3158
> Travis passes but GitHub does not allow merging

This happened a few times, but I think it is important to emphasize that this problem is intermittent and not frequent.
> Extremely slow TravisCI runs

Could we add the items using Markdown list item markers (e.g. `- Extremely slow TravisCI runs`)? Also, can we remove the blank line between each item?
> Increased unit testing would also allow us to alleviate some of the e2e testing load

The item above describes a solution, so it should not be listed under "The primary issues are the following:".
estroz left a comment:
If this is going to be in the enhancements repo, it needs to follow the template. I'm not 100% sure it does need to exist here since it only really affects the development process of the SDK; these changes don't really pertain to anything user-facing.
I believe we should wait on separating the repos (Ansible, Helm, and Go) before enhancing CI, as the e2e tests can be separated out as well.
> # Proposed Changes
>
> The first item to examine is our CI platform. After basic research I found that GitLab CI has a high bandwidth option available for free for open source applications. This would allow us to run more tests concurrently and avoid some of the bottlenecks found in Travis. Additionally, each concurrent process has higher compute power available to it. My suggestion would be to dry run GitLab CI for 30 days (within a trial period) to gain knowledge on whether or not our testing speed is faster and/or more reliable. This period would also allow us time to further improve the CI on the software side.

Could you please provide a technical comparison between the two (Travis vs. GitLab)? That is, how much does Travis provide and how much does GitLab provide?
Also, which of the problems that we have now would be solved by this? Could we describe which items from the list would be sorted out?
> My suggestion would be to dry run GitLab CI for 30 days (within a trial period)

Can we just switch to GitLab CI? Wouldn't it require a specific syntax/setup? Also, if the setup is easy/low effort, could we not just configure a fork, push a PR, and check how long it takes in order to compare?

Does moving to GitLab CI not mean leaving GitHub? Wouldn't it mean moving the repo from github.com to gitlab.com? If yes, then I think we need to consider other aspects as well.
> Random websites which are called during docs checks are down and fail entire runs

IMO the problem we have here is intermittent timeout issues in the doc checks.

Note that if the doc or sanity checks fail we do not run the other tests, to optimize the usage of Travis resources, since a failure means we need to change the code to fix it (e.g. a broken link, a missing licence header in a new file, lint issues such as dead code).
Some ideas that might help:

- Performance issue: note that it can take upwards of 30 minutes for a single set of e2e tests to execute.
- Unit test coverage of the project is low: it may be possible to decrease the quantity of e2e tests, which are the root cause of the performance issues and the reason the tests take so long to execute.
- E2E tests are done with shell scripts: since the e2e tests are driven from the shell they are hard to troubleshoot, and because of this we cannot use coveralls to check what is or is not covered by them. NOTE: open question here; by using envtest and the modules "github.com/onsi/ginkgo" and "github.com/onsi/gomega", would we be able to troubleshoot/debug the e2e tests?
- Low maintainability: the tests do not follow a single standard, which makes it harder to keep them maintained.
- General flakes and failures (intermittent and not frequent):
  - Travis passes but GitHub does not allow merging
  - Timeout issues and 404 errors during docs checks
> Currently many tests (sometimes unrelated) are jammed together in a single script, which makes it difficult to identify problem areas

I think we can be a little more specific here. It might happen naturally by following @estroz's suggestion to use the template. 👍
> # Motivations
>
> The primary motivation for these enhancements is to improve developer productivity. Much time is lost in slow CI runs, inconsistent flakes and failures in CI, and a general difficulty in debugging said CI runs.

I would note that another big step toward improving productivity would be to make it easy, or at least possible, to run specific tests or sets of tests in a way that accurately mimics the CI environment.
You say that below, but I still think it's a major motivation.
> Testing is stitched together across a number of bash scripts

And in those scripts infra setup and actual test execution are completely muddled together, which makes it a nightmare to run tests against a specific, existing cluster.
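To make that concrete, here is a minimal Go sketch of one way to separate infra setup from test execution. Both `setupLocalCluster` and the `USE_EXISTING_CLUSTER` variable are hypothetical, not anything the SDK defines today; with the variable set, the tests would simply run against whatever cluster the current kubeconfig points at.

```go
package e2e_test

import (
	"os"
	"testing"
)

// setupLocalCluster is a hypothetical helper that would create a throwaway
// cluster (e.g. with kind) and return a teardown function.
func setupLocalCluster() (teardown func(), err error) {
	return func() {}, nil
}

// TestMain decides whether to provision infrastructure before running the
// tests, keeping cluster setup out of the tests themselves.
func TestMain(m *testing.M) {
	os.Exit(run(m))
}

func run(m *testing.M) int {
	if os.Getenv("USE_EXISTING_CLUSTER") == "" {
		teardown, err := setupLocalCluster()
		if err != nil {
			return 1
		}
		defer teardown()
	}
	return m.Run()
}
```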
> The second change to implement would be increasing unit test coverage of the SDK CLI and scaffolding. The more of the SDK we can unit test, the fewer pathways need to be checked in e2e testing. In general we are able to execute unit tests much faster than e2e tests. There is already some unit testing in place and the action item for this would be to simply ensure we have 100% coverage of all of our possible CLI code paths.

I agree with this to a point, but I think we shouldn't make the e2e tests cover less. What would make sense is to have good unit tests and gate the e2e tests on their success, so that builds we know early are bad waste less time.
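As a rough sketch of what broader CLI unit coverage could look like, here is a table-driven test against a cobra command tree. `newRootCmd` is a hypothetical stand-in for the SDK's real command constructor, not its actual API; the point is that CLI code paths can be exercised quickly without a cluster.

```go
package cli_test

import (
	"bytes"
	"testing"

	"github.com/spf13/cobra"
)

// newRootCmd is a hypothetical stand-in for the SDK's root command constructor.
func newRootCmd() *cobra.Command {
	root := &cobra.Command{Use: "operator-sdk"}
	root.AddCommand(&cobra.Command{
		Use:  "new",
		Args: cobra.ExactArgs(1),
		RunE: func(_ *cobra.Command, _ []string) error { return nil },
	})
	return root
}

// TestCLIPaths runs each argument combination and checks the returned error.
func TestCLIPaths(t *testing.T) {
	cases := []struct {
		name    string
		args    []string
		wantErr bool
	}{
		{"new with a project name", []string{"new", "memcached-operator"}, false},
		{"new without arguments", []string{"new"}, true},
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			cmd := newRootCmd()
			cmd.SetOut(new(bytes.Buffer))
			cmd.SetErr(new(bytes.Buffer))
			cmd.SetArgs(tc.args)
			if err := cmd.Execute(); (err != nil) != tc.wantErr {
				t.Fatalf("args %v: got err=%v, wantErr=%v", tc.args, err, tc.wantErr)
			}
		})
	}
}
```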
> Allows for a single e2e test format

Due to the range of technologies we support, I'd just want to make sure this isn't too prescriptive. Rewriting Ansible molecule tests to support an arbitrary format could be non-trivial, and testing Ansible without molecule would be a lot more non-trivial.
I understand that we also need to define the common case (with the fewer exceptions described by @fabianvf above): all tests would be written with ginkgo and gomega, and the e2e tests would use the kubebuilder utils (envtest), like the tests for the new layout: https://github.com/operator-framework/operator-sdk/tree/master/test/e2e-new
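For reference, a minimal sketch of what a ginkgo/gomega suite backed by envtest could look like; this is a generic suite skeleton under those assumptions, not the actual contents of test/e2e-new.

```go
package e2e_test

import (
	"testing"

	. "github.com/onsi/ginkgo"
	. "github.com/onsi/gomega"
	"k8s.io/client-go/rest"
	"sigs.k8s.io/controller-runtime/pkg/envtest"
)

var (
	cfg     *rest.Config
	testEnv *envtest.Environment
)

// TestE2E wires the ginkgo suite into `go test`.
func TestE2E(t *testing.T) {
	RegisterFailHandler(Fail)
	RunSpecs(t, "Operator e2e suite")
}

var _ = BeforeSuite(func() {
	// Start a local control plane (etcd + kube-apiserver) via envtest.
	testEnv = &envtest.Environment{}
	var err error
	cfg, err = testEnv.Start()
	Expect(err).NotTo(HaveOccurred())
	Expect(cfg).NotTo(BeNil())
})

var _ = AfterSuite(func() {
	Expect(testEnv.Stop()).To(Succeed())
})

var _ = Describe("operator scaffolding", func() {
	It("exposes a usable REST config", func() {
		// Real specs would create resources against cfg and assert on the results.
		Expect(cfg.Host).NotTo(BeEmpty())
	})
})
```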