Skip to content

Conversation

@adoroszlai
Copy link
Contributor

@adoroszlai adoroszlai commented Jul 30, 2021

What changes were proposed in this pull request?

Introduce a new job (build-info) in the CI workflow to determine which other jobs should be run. The goal is to skip checks which are not applicable to a specific change.

  1. All tests are run for push and scheduled builds (no change here).
  2. For PRs the incoming (merge) commit's list of changed files is tested against various patterns:
    • README changes trigger no tests at all
    • rat is run for all other changes
    • docs is run for changes in hadoop-hdds/docs
    • author is triggered for Java changes
    • checkstyle and findbugs are also triggered for POM changes
    • dependency is only triggered by POM changes
    • acceptance, integration and kubernetes tests are triggered by changes in the corresponding test files
  3. If any uncategorized file is changed (e.g. any non-test Java file), all tests are triggered.
  4. Trigger all tests if specific label (full tests needed) is present in the PR. If it is added later, a new empty commit is needed to re-trigger CI.
  5. Skip all tests for draft PR, same as previously, but without hard-coding in workflow.

Big thanks to Apache Airflow for the libraries and large chunks of the script.

https://issues.apache.org/jira/browse/HDDS-5000

How was this patch tested?

Created test pull requests in my own fork:

Main one with various incremental changes to verify the tests being triggered:
adoroszlai#22

One with full tests needed label, triggers all tests:
adoroszlai#23

Robot test change (minor improvement over the main PR):
adoroszlai#24

Regular CI:
https://github.com/adoroszlai/hadoop-ozone/actions/runs/1082718710

I have also added a bats-based test suite. It is not run as part of CI currently, because it uses optional bats libraries. I haven't figured out how best to integrate in a developer-friendly way (HDDS-5559). But at least it can be run manually and I found it saves lots of time (not having to open lots of PRs in my fork).

@GeorgeJahad
Copy link
Contributor

I like the new bats script a lot. A couple of questions:

  1. The new bats script doesn't get invoked by hadoop-ozone/dev-support/checks/bats.sh.
    Should it be?

  2. When I ran it, I got these errors:

dev-support/ci/selective_ci_checks.sh: line 174: PR_DRAFT: unbound variable
dev-support/ci/bats-support/load.bash.bash does not exist

which I fixed like so:

-load bats-support/load.bash
-load bats-assert/load.bash
+export PR_DRAFT=false
+load bats-support/load
+load bats-assert/load

Was I running it wrong?

  1. Is it worth adding a comment like this?:

# This script confirms that selective_ci_checks.sh works correctly against known sha's.
# To run it you must clone and then install:
# https://github.com/sstephenson/bats.git

# Then clone the following repos into dev-support/ci:
# git clone https://github.com/ztombol/bats-assert
# git clone https://github.com/ztombol/bats-support

Everything else looks good. I like how the 3 functions were unified.

@adoroszlai
Copy link
Contributor Author

  1. The new bats script doesn't get invoked by hadoop-ozone/dev-support/checks/bats.sh.
    Should it be?

Ideally it should, but I know it currently does not. Filed HDDS-5559 for follow-up.

  1. When I ran it, I got these errors:
    ...
    Was I running it wrong?

Thank you for trying it. The only difference is that I used a bit newer version of bats (and its libraries).

  1. Is it worth adding a comment like this?:

Thanks, added with updated URLs.

@fapifta
Copy link
Contributor

fapifta commented Aug 11, 2021

Hi @adoroszlai

this PR seems to be very useful to reduce our build burden on available resources in github actions. So thank you very much to taking this on.

One question... do we have any general documentation about our CI flow, and how the different parts are working together based on what rules? If not, then it would be a nice follow up to document it together with how this part of the CI system is working in the project, if we already have something it would be great to extend with some docs for new selective checks.

@GeorgeJahad
Copy link
Contributor

it would be a nice follow up to document it together with how this
part of the CI system is working in the project, if we already have
something it would be great to extend with some docs for new
selective checks.

@adoroszlai @fapifta I can write something up if you want, but won't have time for a week or so. I would create a separate PR for that. Let me know if that would be helpful.

@fapifta
Copy link
Contributor

fapifta commented Aug 11, 2021

@GeorgeJahad sounds good to me unless Attila insists on doing it :) Would you mind create a separate JIRA to track the effort, and link it to HDDS-5000 as well?

@adoroszlai
Copy link
Contributor Author

Thanks @GeorgeJahad for the offer. Please feel free to take HDDS-5609.

Copy link
Contributor

@fapifta fapifta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GeorgeJahad as I believe your questions were addressed just as mine, I am +1 to commit the changes. Please let Attila know if anything is missig still.

@vivekratnavel
Copy link
Contributor

Great work @adoroszlai and @GeorgeJahad! I am waiting for this PR to be committed.

@GeorgeJahad
Copy link
Contributor

GeorgeJahad commented Aug 11, 2021

Please let Attila know if anything is missig still.

I've checked all the most recent changes. Everything looks good to me too!
Thank you @adoroszlai !

@adoroszlai adoroszlai merged commit ee929dc into apache:master Aug 11, 2021
@adoroszlai
Copy link
Contributor Author

Thanks @fapifta, @GeorgeJahad and @vivekratnavel for the reviews.

errose28 added a commit to errose28/ozone that referenced this pull request Aug 12, 2021
* master:
  HDDS-5358. Incorrect cache entry invalidation causes intermittent failure in testGetS3SecretAndRevokeS3Secret (apache#2518)
  HDDS-5608. Fix wrong command in ugrade doc (apache#2524)
  HDDS-5000. Run CI checks selectively (apache#2479)
  HDDS-4929. Select target datanodes and containers to move for Container Balancer (apache#2441)
  HDDS-5283. getStorageSize cast to int can cause issue (apache#2303)
  HDDS-5449 Recon namespace summary 'du' information should return replicated size of a key (apache#2489)
  HDDS-5558. vUnit invocation unit() may produce NPE (apache#2513)
  HDDS-5531. For Link Buckets avoid showing metadata. (apache#2502)
  HDDS-5549. Add 1.1 to supported versions in security policy (apache#2519)
  HDDS-5555. remove pipeline manager v1 code (apache#2511)
  HDDS-5546.OM Service ID change causes OM startup failure. (apache#2512)
  HDDS-5360. DN failed to process all delete block commands in one heartbeat interval (apache#2420)
  HDDS-5021. dev-support Dockerfile is badly outdated (apache#2480)
errose28 added a commit to errose28/ozone that referenced this pull request Aug 12, 2021
* master:
  HDDS-5358. Incorrect cache entry invalidation causes intermittent failure in testGetS3SecretAndRevokeS3Secret (apache#2518)
  HDDS-5608. Fix wrong command in ugrade doc (apache#2524)
  HDDS-5000. Run CI checks selectively (apache#2479)
  HDDS-4929. Select target datanodes and containers to move for Container Balancer (apache#2441)
  HDDS-5283. getStorageSize cast to int can cause issue (apache#2303)
  HDDS-5449 Recon namespace summary 'du' information should return replicated size of a key (apache#2489)
  HDDS-5558. vUnit invocation unit() may produce NPE (apache#2513)
  HDDS-5531. For Link Buckets avoid showing metadata. (apache#2502)
  HDDS-5549. Add 1.1 to supported versions in security policy (apache#2519)
  HDDS-5555. remove pipeline manager v1 code (apache#2511)
  HDDS-5546.OM Service ID change causes OM startup failure. (apache#2512)
  HDDS-5360. DN failed to process all delete block commands in one heartbeat interval (apache#2420)
  HDDS-5021. dev-support Dockerfile is badly outdated (apache#2480)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

build Pull request that modifies the build process enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants