From 058257460ff480ef1151269c6a6be30cb5012e58 Mon Sep 17 00:00:00 2001
From: Qingping Hou
Date: Sat, 14 Aug 2021 01:08:49 -0700
Subject: [PATCH 1/6] document release process

---
 dev/release/README.md                       | 301 ++++++++++++++++++++
 dev/release/update_change_log-ballista.sh   |   2 +-
 dev/release/update_change_log-datafusion.sh |   2 +-
 dev/release/update_change_log-python.sh     |   2 +-
 4 files changed, 304 insertions(+), 3 deletions(-)
 create mode 100644 dev/release/README.md

diff --git a/dev/release/README.md b/dev/release/README.md
new file mode 100644
index 0000000000000..62aca440efc77
--- /dev/null
+++ b/dev/release/README.md
@@ -0,0 +1,301 @@
+
+
+# Release Process
+
+## Sub-projects
+
+The Datafusion repo contains 3 different releasable sub-projects: Datafusion, Ballista and the Datafusion python binding.
+
+We use the Datafusion release to drive the releases of the other sub-projects. As a
+result, a Datafusion version bump is required for every release while version
+bumps for the Python binding and Ballista are optional. In other words, we can
+release a new version of Datafusion without releasing a new version of the
+Python binding or Ballista. On the other hand, releasing a new version of the
+Python binding or Ballista always requires a new Datafusion version release.
+
+## Branching
+
+Datafusion currently only releases from the `master` branch. Given the project
+is still in an early development state, we are not maintaining an active stable
+release backport branch.
+
+## Prerequisites
+
+- Have the upstream git repo `git@github.com:apache/arrow-datafusion.git` added as git remote `apache`.
+- Create a personal access token in Github for the changelog automation script.
+  - The Github PAT should be created with `repo` access
+- Make sure your signing key is added to the following files in SVN:
+  - https://dist.apache.org/repos/dist/dev/arrow/KEYS
+  - https://dist.apache.org/repos/dist/release/arrow/KEYS
+
+## Process Overview
+
+As part of the Apache governance model, official releases consist of signed
+source tarballs approved by the PMC.
+
+We then use the code in the approved source tarball to release to crates.io and
+PyPI.
+
+### Change Log
+
+We maintain `CHANGELOG.md` for each sub-project so our users know what has been
+changed between releases.
+
+The CHANGELOG is managed automatically using
+[update_change_log.sh](https://github.com/apache/arrow-datafusion/blob/master/dev/release/update_change_log.sh).
+
+This script creates a changelog using github PRs and issues based on the labels
+associated with them.
+
+## Prepare release commits and PR
+
+Prepare a PR to update `CHANGELOG.md` and versions to reflect the planned
+release.
+
+See [#801](https://github.com/apache/arrow-datafusion/pull/801) for an example.
+
+Here are the commands that could be used to prepare the `5.1.0` release:
+
+### Update Version
+
+Check out the master commit to be released:
+
+```
+git fetch apache
+git checkout apache/master
+```
+
+Update datafusion version in `datafusion/Cargo.toml` to `5.1.0`.
+
+If there is new ballista release, update versions in ballista Cargo.tomls, run
+
+```
+./dev/update_ballista_versions.py 0.5.0
+```
+
+If there is new datafusion python binding release, update versions in
+`./python/Cargo.toml`.
+
+Lastly commit the version change:
+
+```
+git commit -a -m 'Update version'
+```
+
+### Update CHANGELOG.md
+
+Create local release rc tags:
+
+```
+git tag -f 5.1.0-rc-local
+# if there is ballista release
+git tag -f ballista-0.5.0-rc-local
+# if there is python binding release
+git tag -f python-0.3.0-rc-local
+```
+
+Manually edit the previous release version tag in
+`dev/release/update_change_log-{ballista,datafusion,python}.sh`. Commits
+between the previous version tag and the new rc tag will be used to
+populate the changelog content.
+
+```bash
+# create the changelog
+CHANGELOG_GITHUB_TOKEN=<TOKEN> ./dev/release/update_change_log-all.sh
+# review change log / edit issues and labels if needed, rerun until you are happy with the result
+git commit -a -m 'Create changelog for release'
+```
+
+Note that when reviewing the change log, rather than editing the
+`CHANGELOG.md`, it is preferred to update the issues and their labels.
+
+You can add the `invalid` or `development-process` label to exclude items from
+release notes. Add `datafusion`, `ballista` and `python` labels to group items
+into each sub-project's change log.
+
+Send a PR to get these changes merged into `master` branch. If new commits that
+could change the change log content landed in the `master` branch before you
+could merge the PR, you need to rerun the changelog update script to regenerate
+the changelog and update the PR accordingly.
+
+## Prepare release candidate tarball
+
+After the PR gets merged, you are ready to create a release tarball from the
+merged commit.
+
+(Note you need to be a committer to run these scripts as they upload to the apache svn distribution servers)
+
+### Pick a Release Candidate (RC) number
+
+Pick numbers in sequential order, with `0` for `rc0`, `1` for `rc1`, etc.
+
+### Create git tag for the release:
+
+While the official release artifact is a signed tarball, we also tag the commit it was created from, for convenience and code archaeology.
+
+Using a string such as `5.1.0` as the `<version>`, create and push the tag thusly:
+
+```shell
+git fetch apache
+git tag <version>-<rc> apache/master
+# push tag to Github remote
+git push apache <version>-<rc>
+```
+
+### Create, sign, and upload tarball
+
+Run `create-tarball.sh` with the `<version>` tag and `<rc>` number you found in the previous steps:
+
+```shell
+./dev/release/create-tarball.sh 5.1.0 0
+```
+
+The `create-tarball.sh` script
+
+1. creates and uploads a release candidate tarball to the [arrow
+   dev](https://dist.apache.org/repos/dist/dev/arrow) location on the
+   apache distribution svn server
+
+2. provides you an email template to
+   send to dev@arrow.apache.org for release voting.
+
+### Vote on Release Candidate tarball
+
+Send the email output from the script to dev@arrow.apache.org. The email should look like
+
+```
+To: dev@arrow.apache.org
+Subject: [VOTE][RUST][Datafusion] Release Apache Arrow Datafusion 5.1.0 RC0
+
+Hi,
+
+I would like to propose a release of Apache Arrow Datafusion Implementation,
+version 5.1.0.
+
+This release candidate is based on commit: a5dd428f57e62db20a945e8b1895de91405958c4 [1]
+The proposed release tarball and signatures are hosted at [2].
+The changelog is located at [3].
+
+Please download, verify checksums and signatures, run the unit tests,
+and vote on the release.
+
+The vote will be open for at least 72 hours.
+
+[ ] +1 Release this as Apache Arrow Datafusion 5.1.0
+[ ] +0
+[ ] -1 Do not release this as Apache Arrow Datafusion 5.1.0 because...
+
+[1]: https://github.com/apache/arrow-datafusion/tree/a5dd428f57e62db20a945e8b1895de91405958c4
+[2]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-datafusion-5.1.0
+[3]: https://github.com/apache/arrow-datafusion/blob/a5dd428f57e62db20a945e8b1895de91405958c4/CHANGELOG.md
+```
+
+For the release to become "official" it needs at least three PMC members to vote +1 on it.
+
+### Verifying Release Candidates
+
+The `dev/release/verify-release-candidate.sh` is a script in this repository that can assist in the verification process. Run it like:
+
+```
+./dev/release/verify-release-candidate.sh 5.1.0 0
+```
+
+#### If the release is not approved
+
+If the release is not approved, fix whatever the problem is and try again with the next RC number
+
+### If the release is approved,
+
+Move tarball to the release location in SVN, e.g.
+https://dist.apache.org/repos/dist/release/arrow/arrow-datafusion-4.1.0/, using
+the `release-tarball.sh` script:
+
+```shell
+./dev/release/release-tarball.sh 5.1.0 0
+```
+
+Congratulations! The release is now official!
+
+## Finalize the release
+
+### Create release git tags
+
+Tag the same release candidate commit with the final release tag
+
+```
+git checkout apache/5.1.0-RC0
+git tag 5.1.0
+git push apache 5.1.0
+```
+
+If there is ballista release, also push the ballista tag
+
+```
+git tag ballista-0.5.0
+git push apache ballista-0.5.0
+```
+
+If there is datafusion python binding release, also push the python tag
+
+```
+git tag python-0.3.0
+git push apache python-0.3.0
+```
+
+### Publish on Crates.io
+
+Only approved releases of the tarball should be published to
+crates.io, in order to conform to Apache Software Foundation
+governance standards.
+
+After an official project release has been made, an Arrow committer can publish
+the crates to crates.io using the following instructions.
+
+Follow [these
+instructions](https://doc.rust-lang.org/cargo/reference/publishing.html) to
+create an account and login to crates.io before asking to be added as an owner
+of the following crates:
+
+- [datafusion](https://crates.io/crates/datafusion)
+- [ballista-core](https://crates.io/crates/ballista-core)
+- [ballista-executor](https://crates.io/crates/ballista-executor)
+- [ballista-scheduler](https://crates.io/crates/ballista-scheduler)
+
+Download and unpack the official release tarball
+
+Verify that the Cargo.toml in the tarball contains the correct version
+(e.g. `version = "5.1.0"`) and then publish the crate with the
+following commands
+
+```shell
+(cd datafusion && cargo publish)
+```
+
+If there is ballista release, run
+
+```shell
+(cd ballista/rust/core && cargo publish)
+(cd ballista/rust/executor && cargo publish)
+(cd ballista/rust/scheduler && cargo publish)
+```
+
+### Publish on PyPI
+
+TODO
diff --git a/dev/release/update_change_log-ballista.sh b/dev/release/update_change_log-ballista.sh
index 68193156622a2..05c5f6fe69849 100755
--- a/dev/release/update_change_log-ballista.sh
+++ b/dev/release/update_change_log-ballista.sh
@@ -25,4 +25,4 @@ SOURCE_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 SOURCE_TOP_DIR="$(cd "${SOURCE_DIR}/../../" && pwd)"
 
 CURRENT_VER=$(grep version "${SOURCE_TOP_DIR}/ballista/rust/client/Cargo.toml" | head -n 1 | awk '{print $3}' | tr -d '"')
-${SOURCE_DIR}/update_change_log.sh ballista 4.0.0 "ballista-${CURRENT_VER}"
+${SOURCE_DIR}/update_change_log.sh ballista 4.0.0 "ballista-${CURRENT_VER}-rc-local"
diff --git a/dev/release/update_change_log-datafusion.sh b/dev/release/update_change_log-datafusion.sh
index f0f455ad1c9b5..1570c91252756 100755
--- a/dev/release/update_change_log-datafusion.sh
+++ b/dev/release/update_change_log-datafusion.sh
@@ -25,4 +25,4 @@ SOURCE_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 SOURCE_TOP_DIR="$(cd "${SOURCE_DIR}/../../" && pwd)"
 
 CURRENT_VER=$(grep version "${SOURCE_TOP_DIR}/datafusion/Cargo.toml" | head -n 1 | awk '{print $3}' | tr -d '"')
-${SOURCE_DIR}/update_change_log.sh datafusion 4.0.0 "${CURRENT_VER}"
+${SOURCE_DIR}/update_change_log.sh datafusion 4.0.0 "${CURRENT_VER}-rc-local"
diff --git a/dev/release/update_change_log-python.sh b/dev/release/update_change_log-python.sh
index a48a5b657c5f3..6b864f9be1b2e 100755
--- a/dev/release/update_change_log-python.sh
+++ b/dev/release/update_change_log-python.sh
@@ -25,4 +25,4 @@ SOURCE_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 SOURCE_TOP_DIR="$(cd "${SOURCE_DIR}/../../" && pwd)"
 
 CURRENT_VER=$(grep version "${SOURCE_TOP_DIR}/python/Cargo.toml" | head -n 1 | awk '{print $3}' | tr -d '"')
-${SOURCE_DIR}/update_change_log.sh python 4.0.0 "python-${CURRENT_VER}"
+${SOURCE_DIR}/update_change_log.sh python 4.0.0 "python-${CURRENT_VER}-rc-local"

From 29a6691f61bce5bf0e7b7ec0f07c5d7f9d1556ea Mon Sep 17 00:00:00 2001
From: Qingping Hou
Date: Sat, 14 Aug 2021 01:17:59 -0700
Subject: [PATCH 2/6] update sections

---
 dev/release/README.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/dev/release/README.md b/dev/release/README.md
index 62aca440efc77..6f4444a06c193 100644
--- a/dev/release/README.md
+++ b/dev/release/README.md
@@ -221,10 +221,12 @@ The `dev/release/verify-release-candidate.sh` is a script in this repository tha
 
 If the release is not approved, fix whatever the problem is and try again with the next RC number
 
-### If the release is approved,
+## Finalize the release
+
+### After the release is approved
 
 Move tarball to the release location in SVN, e.g.
-https://dist.apache.org/repos/dist/release/arrow/arrow-datafusion-4.1.0/, using
+https://dist.apache.org/repos/dist/release/arrow/arrow-datafusion-5.1.0/, using
 the `release-tarball.sh` script:
 
 ```shell
@@ -233,8 +235,6 @@ the `release-tarball.sh` script:
 
 Congratulations! The release is now official!
 
-## Finalize the release
-
 ### Create release git tags
 
 Tag the same release candidate commit with the final release tag

From cdfa78fcf03e521c6403f245243b45587f0a260c Mon Sep 17 00:00:00 2001
From: Qingping Hou
Date: Sat, 14 Aug 2021 11:24:41 -0700
Subject: [PATCH 3/6] cover ballista-client in releaes doc

---
 dev/release/README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/dev/release/README.md b/dev/release/README.md
index 6f4444a06c193..9a0bd5ddd5813 100644
--- a/dev/release/README.md
+++ b/dev/release/README.md
@@ -274,6 +274,7 @@ create an account and login to crates.io before asking to be added as an owner
 of the following crates:
 
 - [datafusion](https://crates.io/crates/datafusion)
+- [ballista](https://crates.io/crates/ballista)
 - [ballista-core](https://crates.io/crates/ballista-core)
 - [ballista-executor](https://crates.io/crates/ballista-executor)
 - [ballista-scheduler](https://crates.io/crates/ballista-scheduler)
@@ -291,6 +292,7 @@ following commands
 If there is ballista release, run
 
 ```shell
+(cd ballista/rust/client && cargo publish)
 (cd ballista/rust/core && cargo publish)
 (cd ballista/rust/executor && cargo publish)
 (cd ballista/rust/scheduler && cargo publish)

From c98b283450c8688662fcf9a5743f36d2422b678a Mon Sep 17 00:00:00 2001
From: Qingping Hou
Date: Sat, 14 Aug 2021 13:04:51 -0700
Subject: [PATCH 4/6] add step to call the release vote

---
 dev/release/README.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/dev/release/README.md b/dev/release/README.md
index 9a0bd5ddd5813..319d8a24f4e00 100644
--- a/dev/release/README.md
+++ b/dev/release/README.md
@@ -301,3 +301,11 @@ If there is ballista release, run
 ### Publish on PyPI
 
 TODO
+
+### Call the vote
+
+Call the vote on the Arrow dev list by replying to the RC voting thread. The
+reply should have a new subject constructed by adding `RESULT ` prefix to the
+old subject line.
+
+TODO: add example mail

From 71dcd2231aa2f7b4854d77135db39a9db29097ec Mon Sep 17 00:00:00 2001
From: Qingping Hou
Date: Sat, 14 Aug 2021 17:11:16 -0700
Subject: [PATCH 5/6] wrap RESULT prefix with brackets

---
 dev/release/README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/dev/release/README.md b/dev/release/README.md
index 319d8a24f4e00..279c5385c7fd6 100644
--- a/dev/release/README.md
+++ b/dev/release/README.md
@@ -182,7 +182,7 @@ Send the email output from the script to dev@arrow.apache.org. The email should
 
 ```
 To: dev@arrow.apache.org
-Subject: [VOTE][RUST][Datafusion] Release Apache Arrow Datafusion 5.1.0 RC0
+Subject: [VOTE][Datafusion] Release Apache Arrow Datafusion 5.1.0 RC0
 
 Hi,
 
@@ -305,7 +305,7 @@ TODO
 ### Call the vote
 
 Call the vote on the Arrow dev list by replying to the RC voting thread. The
-reply should have a new subject constructed by adding `RESULT ` prefix to the
+reply should have a new subject constructed by adding `[RESULT]` prefix to the
 old subject line.
 
 TODO: add example mail

From 3fc3da3dbd49096f4b0b5861375b0c6dacc90f2b Mon Sep 17 00:00:00 2001
From: Qingping Hou
Date: Sat, 14 Aug 2021 17:28:46 -0700
Subject: [PATCH 6/6] update instructions for how to handle rejected vote

---
 dev/release/README.md         | 13 +++++++------
 dev/release/create-tarball.sh |  5 +++++
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/dev/release/README.md b/dev/release/README.md
index 279c5385c7fd6..7a515732df652 100644
--- a/dev/release/README.md
+++ b/dev/release/README.md
@@ -84,13 +84,13 @@ git checkout apache/master
 
 Update datafusion version in `datafusion/Cargo.toml` to `5.1.0`.
 
-If there is new ballista release, update versions in ballista Cargo.tomls, run
+If there is a ballista release, update versions in ballista Cargo.tomls, run
 
 ```
 ./dev/update_ballista_versions.py 0.5.0
 ```
 
-If there is new datafusion python binding release, update versions in
+If there is a datafusion python binding release, update versions in
 `./python/Cargo.toml`.
 
 Lastly commit the version change:
@@ -219,7 +219,8 @@ The `dev/release/verify-release-candidate.sh` is a script in this repository tha
 
 #### If the release is not approved
 
-If the release is not approved, fix whatever the problem is and try again with the next RC number
+If the release is not approved, fix whatever the problem is, merge changelog
+changes into master if there are any, and try again with the next RC number.
 
 ## Finalize the release
 
@@ -245,14 +246,14 @@ git tag 5.1.0
 git push apache 5.1.0
 ```
 
-If there is ballista release, also push the ballista tag
+If there is a ballista release, also push the ballista tag
 
 ```
 git tag ballista-0.5.0
 git push apache ballista-0.5.0
 ```
 
-If there is datafusion python binding release, also push the python tag
+If there is a datafusion python binding release, also push the python tag
 
 ```
 git tag python-0.3.0
@@ -289,7 +290,7 @@ following commands
 (cd datafusion && cargo publish)
 ```
 
-If there is ballista release, run
+If there is a ballista release, run
 
 ```shell
 (cd ballista/rust/client && cargo publish)
diff --git a/dev/release/create-tarball.sh b/dev/release/create-tarball.sh
index ffcb430b5c7c1..94318d0777700 100755
--- a/dev/release/create-tarball.sh
+++ b/dev/release/create-tarball.sh
@@ -86,6 +86,11 @@ The changelog is located at [3].
 Please download, verify checksums and signatures, run the unit tests, and vote
 on the release. The vote will be open for at least 72 hours.
 
+Only votes from PMC members are binding, but all members of the community are
+encouraged to test the release and vote with "(non-binding)".
+
+The standard verification procedure is documented at https://github.com/apache/arrow-datafusion/blob/master/dev/release/README.md#verifying-release-candidates.
+
 [ ] +1 Release this as Apache Arrow Datafusion ${version}
 [ ] +0
 [ ] -1 Do not release this as Apache Arrow Datafusion ${version} because...
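
Note on the `update_change_log-*.sh` changes in patch 1/6: the scripts read the crate version from a `Cargo.toml` and, after this series, append `-rc-local` to form the tag that bounds the changelog. The following is a minimal, self-contained sketch of that derivation, not code from the series. The throwaway manifest and the `rc_local_tag` helper are hypothetical and exist only for illustration; only the `grep | head | awk | tr` pipeline is taken from the scripts.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper (not part of the patch series): build the local RC tag
# name for a sub-project from the first "version" line of its manifest.
#   $1 = tag prefix ("" for datafusion, "ballista-", "python-")
#   $2 = path to Cargo.toml
rc_local_tag() {
  local prefix="$1" manifest="$2" current_ver
  # Same pipeline as the scripts: first line containing "version", third
  # whitespace-separated field, with the surrounding quotes removed.
  current_ver="$(grep version "${manifest}" | head -n 1 | awk '{print $3}' | tr -d '"')"
  echo "${prefix}${current_ver}-rc-local"
}

# Throwaway manifest used only for this illustration.
tmp_dir="$(mktemp -d)"
trap 'rm -rf "${tmp_dir}"' EXIT
cat > "${tmp_dir}/Cargo.toml" <<'EOF'
[package]
name = "datafusion"
version = "5.1.0"
EOF

rc_local_tag "" "${tmp_dir}/Cargo.toml"           # prints 5.1.0-rc-local
rc_local_tag "ballista-" "${tmp_dir}/Cargo.toml"  # prints ballista-5.1.0-rc-local
```

Because the pipeline takes the first line that contains `version`, it relies on the package version appearing before any dependency entry that also mentions a version. The derived names match the `git tag -f ...-rc-local` commands in the README, so the local tags have to exist before the changelog script runs.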