Skip to content

Conversation

@Kami
Copy link
Member

@Kami Kami commented Jan 19, 2019

This pull request updates travis config to use ubuntu xenial instead of (deprecated a long time ago) precise.
Update: Decided to stick to Precise due to the speed reasons, but add Xenial compatibility in TravisCI config for future.

I'm just experimenting to see if this will affect tons of intermediate build failures we have been seeing in the last couple of days.

@Kami Kami added this to the 3.0.0 milestone Jan 19, 2019
@Kami
Copy link
Member Author

Kami commented Jan 19, 2019

As mentioned on Slack - we need to figure out a path going forward with Travis CI.

It has simply been too slow and unstable for us for the last half a year or so.

[success] 10.60% tests.unit.test_scheduler.ActionExecutionSchedulingQueueItemDBTest.test_no_processing_of_non_requested_actions: 194.7348s

The same test case takes 14 seconds locally. We haven't done anything major in the code which should affect build times in such a negative way. Build times on Travis have been slowly growing from 15 minutes (spent a lot of time getting it to that number a half a year or so ago) to 50+ minutes now.

We don't have much visibility on what is going on Travis, but I would imagine they changed VM size to some tiny one for old VM / sudo required builds or something similar.

EDIT: I some how missed the memo - https://blog.travis-ci.com/2018-10-04-combining-linux-infrastructures.

We never used container based infrastructure, but it looks like this change could some how affect build timings in a negative way.

Copy link
Member

@arm4b arm4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would like to see how infrastructure type change in TravisCI affects the build timings, keeping any other changes in tests minimal for fair comparison.
Previously switching to container-based infra slowed down the build times x1.5-x2.

It's good to experiment here.
I'm 👍 to reassess if timing improves significantly.

@Kami
Copy link
Member Author

Kami commented Jan 19, 2019

Alright, I made changes to ./scripts/travis/prepare-integration.sh so the integration build now passes.

There are still issues with intermediate failures and slow build times for unit tests which seem to be related to Travis environment (aka nothing we can do at the moment).

Kami added 3 commits January 21, 2019 12:32
It turns out that on AMQP publish on Travis, "Channel.open: (504)
CHANNEL_ERROR - second 'channel.open' seen" error is returned almost on
every publish which causes our connection retry wrapper to sleep for 10
seconds and slowing down the tests.

I haven't been able to track down the root cause if this issue yet, but
this workaround should work fine and cause no issues on production.
@Kami Kami requested a review from arm4b January 22, 2019 12:57
Copy link
Member

@arm4b arm4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RE precise: https://docs.travis-ci.com/user/reference/precise/.
They're not removing it completely, just don't update with new/security.

Talking about timing. New setup increases overall build timing significantly from 22 to 29mins. I'd say anything that takes 30mins+ in CI tests is counter-productive.

You want to keep the parallel tasks in CI job aligned in timing. The longest task will make the most vote in entire build time aka wait time. This way having new 29min as longest task is much worse comparing to old 22min.

Even if other tasks went faster, you have to wait for longest 2:

And you'll keep adding more and more Unit tests in future which will increase "overall" timing unproportionally high.
By these reasons previously I split Unit tests into 2 pieces to keep the overall build timing under 10mins and set it to precise as fastest Travis Unit tests performer.

Overall, this looks like a tradeoff: being up to date and slower VS outdated and faster.
As said before, I don't think distro makes much sense for Unit tests, keeping in mind that all you need is isolated pinned python version.

Is there any other important point I'm missing?


I'll keep this information here for consideration.

I think more 👀 will be beneficial in this PR, especially someone who works with the repo itself in day-to-day basis like @bigmstone and @m4dcoder.

@arm4b arm4b requested review from bigmstone and m4dcoder January 22, 2019 13:56
@Kami
Copy link
Member Author

Kami commented Jan 22, 2019

@armab I agree.

As long as it's still supported / available on Travis I'm fine with using Precise if the builds indeed finish faster (as you have pointed out, we would still need to wait for the longest build).

There are still issues with using old version of various components and dependencies (e.g. OpenSSL which ships in Precise won't be supported in newer version of cryptography library anymore, etc).

Good news also is that now our code and Travis config also supports Xenial (just need to uncomment and change a couple of lines in Travis config) so if we are ever forced to switch to it, the switch will be easy and fast.

On a related note - I haven't been able to figure out why unit tests are slower but integration ones are faster on Xenial. VMs should have the same amount of resources available to them (RAM and CPU).

@Kami Kami changed the title Use Ubuntu Xenial base VM image on Travis Various fixes so our code also works with Travis Xenial VM image Jan 24, 2019
Copy link
Member

@arm4b arm4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding new Travis infra compatibility works for me, while being on fastest but oldest infra.

Probably need to edit PR description as well to reflect updated change status?

@Kami
Copy link
Member Author

Kami commented Jan 28, 2019

@armab I will just add a comment here about us deciding to stick to Precise due to the speed reasons.

For transparency reasons I usually try to leave the original description and discussion alone and evolve the PR via comments.

@Kami Kami merged commit 6c2550b into master Jan 28, 2019
@Kami Kami deleted the build_changes branch January 28, 2019 12:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants