document release process #875
Changes from all commits
0582574
29a6691
cdfa78f
c98b283
71dcd22
3fc3da3
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,312 @@ | ||
| <!--- | ||
| Licensed to the Apache Software Foundation (ASF) under one | ||
| or more contributor license agreements. See the NOTICE file | ||
| distributed with this work for additional information | ||
| regarding copyright ownership. The ASF licenses this file | ||
| to you under the Apache License, Version 2.0 (the | ||
| "License"); you may not use this file except in compliance | ||
| with the License. You may obtain a copy of the License at | ||
|
|
||
| http://www.apache.org/licenses/LICENSE-2.0 | ||
|
|
||
| Unless required by applicable law or agreed to in writing, | ||
| software distributed under the License is distributed on an | ||
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| KIND, either express or implied. See the License for the | ||
| specific language governing permissions and limitations | ||
| under the License. | ||
| --> | ||
|
|
||
| # Release Process | ||
|
|
||
| ## Sub-projects | ||
|
|
||
| The Datafusion repo contains three releasable sub-projects: Datafusion, Ballista, and the Datafusion Python binding. | ||
|
|
||
| We use the Datafusion release to drive the releases of the other sub-projects. As a | ||
| result, a Datafusion version bump is required for every release, while version | ||
| bumps for the Python binding and Ballista are optional. In other words, we can | ||
| release a new version of Datafusion without releasing a new version of the | ||
| Python binding or Ballista. On the other hand, releasing a new version of the | ||
| Python binding or Ballista always requires a new Datafusion version release. | ||
|
|
||
| ## Branching | ||
|
|
||
| Datafusion currently only releases from the `master` branch. Because the project | ||
| is still in an early stage of development, we do not maintain an active stable | ||
| release backport branch. | ||
|
|
||
| ## Prerequisite | ||
|
|
||
| - Have upstream git repo `git@github.com:apache/arrow-datafusion.git` added as git remote `apache`. | ||
| - Create a personal access token (PAT) in GitHub for the changelog automation script. | ||
| - The GitHub PAT should be created with `repo` access. | ||
| - Make sure your signing key is added to the following files in SVN: | ||
| - https://dist.apache.org/repos/dist/dev/arrow/KEYS | ||
| - https://dist.apache.org/repos/dist/release/arrow/KEYS | ||
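The first prerequisite above can be checked mechanically. The following is only a sketch: the helper name `has_apache_remote` is hypothetical and not part of the repository's tooling, and it assumes the SSH form of the upstream URL quoted in the list.

```shell
# Hypothetical helper (not part of the repo): reads `git remote -v` output on
# stdin and succeeds if a remote named `apache` points at the upstream repo.
has_apache_remote() {
  grep -Eq '^apache[[:space:]]+git@github\.com:apache/arrow-datafusion\.git[[:space:]]'
}
```

In a checkout it could be used as `git remote -v | has_apache_remote || echo "missing apache remote"`.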
|
|
||
| ## Process Overview | ||
|
|
||
| As part of the Apache governance model, official releases consist of signed | ||
| source tarballs approved by the PMC. | ||
|
|
||
| We then use the code in the approved source tarball to release to crates.io and | ||
| PyPI. | ||
|
|
||
| ### Change Log | ||
|
|
||
| We maintain a `CHANGELOG.md` for each sub-project so our users know what has | ||
| changed between releases. | ||
|
|
||
| The CHANGELOG is managed automatically using | ||
| [update_change_log.sh](https://github.com/apache/arrow-datafusion/blob/master/dev/release/update_change_log.sh) | ||
|
|
||
| This script creates a changelog from GitHub PRs and issues, based on the labels | ||
| associated with them. | ||
|
|
||
| ## Prepare release commits and PR | ||
|
|
||
| Prepare a PR to update `CHANGELOG.md` and versions to reflect the planned | ||
| release. | ||
|
|
||
| See [#801](https://github.com/apache/arrow-datafusion/pull/801) for an example. | ||
|
|
||
| Here are the commands that could be used to prepare the `5.1.0` release: | ||
|
|
||
| ### Update Version | ||
|
|
||
| Check out the master commit to be released: | ||
|
|
||
| ``` | ||
| git fetch apache | ||
| git checkout apache/master | ||
| ``` | ||
|
|
||
| Update datafusion version in `datafusion/Cargo.toml` to `5.1.0`. | ||
|
|
||
| If there is a ballista release, update the versions in the ballista `Cargo.toml` files by running: | ||
|
|
||
| ``` | ||
| ./dev/update_ballista_versions.py 0.5.0 | ||
| ``` | ||
|
|
||
| If there is a datafusion python binding release, update versions in | ||
| `./python/Cargo.toml`. | ||
|
|
||
| Lastly, commit the version change: | ||
|
|
||
| ``` | ||
| git commit -a -m 'Update version' | ||
| ``` | ||
|
|
||
| ### Update CHANGELOG.md | ||
|
|
||
| Create local release rc tags: | ||
|
|
||
| ``` | ||
| git tag -f 5.1.0-rc-local | ||
| # if there is ballista release | ||
| git tag -f ballista-0.5.0-rc-local | ||
| # if there is python binding release | ||
| git tag -f python-0.3.0-rc-local | ||
| ``` | ||
|
|
||
| Manually edit the previous release version tag in | ||
| `dev/release/update_change_log-{ballista,datafusion,python}.sh`. Commits | ||
| between the previous version tag and the new rc tag will be used to | ||
| populate the changelog content. | ||
|
|
||
| ```bash | ||
| # create the changelog | ||
| CHANGELOG_GITHUB_TOKEN=<TOKEN> ./dev/release/update_change_log-all.sh | ||
| # review change log / edit issues and labels if needed, rerun until you are happy with the result | ||
| git commit -a -m 'Create changelog for release' | ||
| ``` | ||
|
|
||
| Note that when reviewing the change log, rather than editing the | ||
| `CHANGELOG.md`, it is preferred to update the issues and their labels. | ||
|
|
||
| You can add `invalid` or `development-process` label to exclude items from | ||
| release notes. Add `datafusion`, `ballista` and `python` labels to group items | ||
| into each sub-project's change log. | ||
|
|
||
| Send a PR to get these changes merged into `master` branch. If new commits that | ||
| could change the change log content landed in the `master` branch before you | ||
| could merge the PR, you need to rerun the changelog update script to regenerate | ||
| the changelog and update the PR accordingly. | ||
|
|
||
| ## Prepare release candidate tarball | ||
|
|
||
| After the PR gets merged, you are ready to create a release tarball from the | ||
| merged commit. | ||
|
|
||
| (Note you need to be a committer to run these scripts as they upload to the apache svn distribution servers) | ||
|
|
||
| ### Pick a Release Candidate (RC) number | ||
|
|
||
| Pick numbers in sequential order, with `0` for `rc0`, `1` for `rc1`, etc. | ||
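Picking the next number can be derived from the RC tags that already exist. This is a minimal sketch, not part of the release scripts: the function name `next_rc` is hypothetical, and it assumes RC tags are named `<version>-rc<N>` (for example `5.1.0-rc0`), which this guide does not spell out.

```shell
# Hypothetical helper: reads tag names on stdin (one per line) and prints the
# next RC number for the given version.
# Assumption: RC tags look like `<version>-rc<N>`.
next_rc() {
  version=$1
  awk -v prefix="${version}-rc" '
    BEGIN { max = -1 }
    # Only consider tags that start with "<version>-rc".
    index($0, prefix) == 1 {
      n = substr($0, length(prefix) + 1)
      # Ignore suffixes that are not plain numbers (e.g. "-local").
      if (n ~ /^[0-9]+$/ && n + 0 > max) max = n + 0
    }
    # With no matching tags this prints 0, i.e. the first candidate.
    END { print max + 1 }
  '
}
```

In a checkout, something like `git tag --list '5.1.0-rc*' | next_rc 5.1.0` would feed it the existing tags.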
|
|
||
| ### Create git tag for the release: | ||
|
|
||
| While the official release artifact is a signed tarball, we also tag the commit it was created from, for convenience and code archaeology. | ||
|
|
||
| Using a string such as `5.1.0` as the `<version>`, create and push the tag thusly: | ||
|
|
||
| ```shell | ||
| git fetch apache | ||
| git tag <version>-<rc> apache/master | ||
| # push tag to Github remote | ||
| git push apache <version>-<rc> | ||
| ``` | ||
|
|
||
| ### Create, sign, and upload tarball | ||
|
|
||
| Run `create-tarball.sh` with the `<version>` tag and the `<rc>` number you picked in the previous steps: | ||
|
|
||
| ```shell | ||
| ./dev/release/create-tarball.sh 5.1.0 0 | ||
| ``` | ||
|
|
||
| The `create-tarball.sh` script | ||
|
|
||
| 1. creates and uploads a release candidate tarball to the [arrow | ||
| dev](https://dist.apache.org/repos/dist/dev/arrow) location on the | ||
| apache distribution svn server | ||
|
|
||
| 2. provides you with an email template to | ||
| send to dev@arrow.apache.org for release voting. | ||
|
|
||
| ### Vote on Release Candidate tarball | ||
|
|
||
| Send the email output from the script to dev@arrow.apache.org. The email should look like | ||
|
|
||
| ``` | ||
| To: dev@arrow.apache.org | ||
| Subject: [VOTE][Datafusion] Release Apache Arrow Datafusion 5.1.0 RC0 | ||
|
|
||
| Hi, | ||
|
|
||
| I would like to propose a release of Apache Arrow Datafusion Implementation, | ||
| version 5.1.0. | ||
|
|
||
| This release candidate is based on commit: a5dd428f57e62db20a945e8b1895de91405958c4 [1] | ||
| The proposed release tarball and signatures are hosted at [2]. | ||
| The changelog is located at [3]. | ||
|
|
||
| Please download, verify checksums and signatures, run the unit tests, | ||
| and vote on the release. | ||
|
|
||
| The vote will be open for at least 72 hours. | ||
|
|
||
| [ ] +1 Release this as Apache Arrow Datafusion 5.1.0 | ||
| [ ] +0 | ||
| [ ] -1 Do not release this as Apache Arrow Datafusion 5.1.0 because... | ||
|
|
||
| [1]: https://github.com/apache/arrow-datafusion/tree/a5dd428f57e62db20a945e8b1895de91405958c4 | ||
| [2]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-datafusion-5.1.0 | ||
| [3]: https://github.com/apache/arrow-datafusion/blob/a5dd428f57e62db20a945e8b1895de91405958c4/CHANGELOG.md | ||
| ``` | ||
|
|
||
| For the release to become "official" it needs at least three PMC members to vote +1 on it. | ||
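The approval threshold above can be expressed as a small check. This is a sketch under stated assumptions: the function name and the one-vote-per-line input format (`+1 binding`, `+1 non-binding`, `-1 binding`, ...) are hypothetical, and collecting the votes from the mailing-list thread is still a manual step.

```shell
# Hypothetical helper: reads one vote per line on stdin in an assumed
# "<vote> <binding|non-binding>" format and succeeds only if at least three
# binding (PMC) +1 votes are present.
has_enough_binding_votes() {
  # grep -c prints 0 and exits non-zero when nothing matches, hence `|| true`.
  count=$(grep -c -E '^\+1[[:space:]]+binding$' || true)
  [ "$count" -ge 3 ]
}
```

It only encodes the "at least three PMC +1 votes" condition stated in this guide, nothing more.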
|
|
||
| ### Verifying Release Candidates | ||
|
|
||
| The `dev/release/verify-release-candidate.sh` script in this repository can assist in the verification process. Run it like this: | ||
|
|
||
| ``` | ||
| ./dev/release/verify-release-candidate.sh 5.1.0 0 | ||
| ``` | ||
|
|
||
| #### If the release is not approved | ||
|
|
||
| If the release is not approved, fix whatever the problem is, merge changelog | ||
| changes into master if there are any, and try again with the next RC number. | ||
|
|
||
| ## Finalize the release | ||
|
|
||
| ### After the release is approved | ||
|
|
||
| Move tarball to the release location in SVN, e.g. | ||
| https://dist.apache.org/repos/dist/release/arrow/arrow-datafusion-5.1.0/, using | ||
| the `release-tarball.sh` script: | ||
|
|
||
| ```shell | ||
| ./dev/release/release-tarball.sh 5.1.0 0 | ||
| ``` | ||
|
|
||
| Congratulations! The release is now official! | ||
|
|
||
| ### Create release git tags | ||
|
|
||
| Tag the same release candidate commit with the final release tag: | ||
|
|
||
| ``` | ||
| git checkout 5.1.0-rc0 | ||
| git tag 5.1.0 | ||
| git push apache 5.1.0 | ||
| ``` | ||
|
|
||
| If there is a ballista release, also push the ballista tag: | ||
|
|
||
| ``` | ||
| git tag ballista-0.5.0 | ||
| git push apache ballista-0.5.0 | ||
| ``` | ||
|
|
||
| If there is a datafusion python binding release, also push the python tag: | ||
|
|
||
| ``` | ||
| git tag python-0.3.0 | ||
| git push apache python-0.3.0 | ||
| ``` | ||
|
|
||
| ### Publish on Crates.io | ||
|
|
||
| Only approved releases of the tarball should be published to | ||
| crates.io, in order to conform to Apache Software Foundation | ||
| governance standards. | ||
|
|
||
| After an official project release has been made, an Arrow committer can | ||
| publish the crates to crates.io using the following instructions. | ||
|
|
||
| Follow [these | ||
| instructions](https://doc.rust-lang.org/cargo/reference/publishing.html) to | ||
| create an account and login to crates.io before asking to be added as an owner | ||
| of the following crates: | ||
|
|
||
| - [datafusion](https://crates.io/crates/datafusion) | ||
| - [ballista](https://crates.io/crates/ballista) | ||
| - [ballista-core](https://crates.io/crates/ballista-core) | ||
| - [ballista-executor](https://crates.io/crates/ballista-executor) | ||
| - [ballista-scheduler](https://crates.io/crates/ballista-scheduler) | ||
|
|
||
| Download and unpack the official release tarball. | ||
|
|
||
| Verify that the Cargo.toml in the tarball contains the correct version | ||
| (e.g. `version = "5.1.0"`) and then publish the crate with the | ||
| following commands | ||
|
|
||
| ```shell | ||
| (cd datafusion && cargo publish) | ||
| ``` | ||
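The version check described above can be scripted before publishing. The helper below is a sketch, not part of the repository: the name `cargo_toml_version` is hypothetical, and it assumes the `version` key sits directly in the `[package]` table as a plain quoted string.

```shell
# Hypothetical helper: prints the `version` value from the [package] table of
# the Cargo.toml file given as the first argument.
# Assumption: the line looks like `version = "5.1.0"`.
cargo_toml_version() {
  awk -F'"' '
    /^\[package\]/ { in_package = 1; next }   # entering [package]
    /^\[/          { in_package = 0 }         # any other table ends it
    in_package && /^version *=/ { print $2; exit }
  ' "$1"
}
```

For example, `[ "$(cargo_toml_version datafusion/Cargo.toml)" = "5.1.0" ]` could guard the `cargo publish` step.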
|
|
||
| If there is a ballista release, run: | ||
|
|
||
| ```shell | ||
| # publish ballista-core first: the other ballista crates depend on it | ||
| (cd ballista/rust/core && cargo publish) | ||
| (cd ballista/rust/executor && cargo publish) | ||
| (cd ballista/rust/scheduler && cargo publish) | ||
| (cd ballista/rust/client && cargo publish) | ||
| ``` | ||
|
||
|
|
||
| ### Publish on PyPI | ||
|
|
||
| TODO | ||
|
Member
Author
@jorgecarleitao Are you using
Member
the actual job is still present in the repo here. It uses a github action (based on twine I believe).
Member
driven by tags. AFAIK currently we can't release binaries without first voting on them (as they constitute artifacts beyond source code). This means that we also need to sign and push them to the apache server, i.e. we need to at least download them from github, sign them, upload them, and then upload them to pypi? In summary, it is difficult to automate.
A way to go here would be for Apache to offer a shared key pair that we could place in github secrets that would be used to sign artifacts. However, this requires restrictions on who can push tags to the repos to avoid anyone from releasing.
Another issue is that the binary contains software that is not necessarily Apache 2.0 licensed, as the binary is compiled using all dependencies of the crates (e.g. Tokio is only MIT). I think that the same applies for Ballista docker images (not the Dockerfile), since the image now contains software beyond our direct licensing control (thinking about this notice in apache/arrow repo) (admittedly, I did not think about this when I donated python-datafusion)
Member
Author
Ha, ok, let me think more about how to best handle pypi release tomorrow. I totally missed the binary release part as well :(
Member
Author
I believe this is not too bad to get started for now. Including the wheel binaries as part of the datafusion release artifacts for voting won't further complicate the current release process. The only difference is we will need to update
I will also look into whether it's possible to automate the signing with Github Action by provisioning a Github Action key. Today we are already giving committer access to create signed release tarballs. I believe only committer and PMC member have access to create tags today? If so, tag push based release signing should work.
I will prepare a NOTICE file similar to what arrow has tomorrow for the python binding.
|
||
|
|
||
| ### Call the vote | ||
|
|
||
| Call the vote on the Arrow dev list by replying to the RC voting thread. The | ||
| reply should have a new subject constructed by adding `[RESULT]` prefix to the | ||
| old subject line. | ||
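Until an example mail is added, the subject rule above can be sketched as follows. The function name is hypothetical, and placing `[RESULT]` directly in front of the old subject with no separating space is an assumption; this guide only says to add the prefix.

```shell
# Hypothetical helper: builds the result-thread subject by adding a
# `[RESULT]` prefix to the original vote subject.
# Assumption: no space between the prefix and the old subject.
result_subject() {
  printf '[RESULT]%s\n' "$1"
}
```

Calling it with the vote subject from the template in this guide, `result_subject '[VOTE][Datafusion] Release Apache Arrow Datafusion 5.1.0 RC0'`, prints the same subject with `[RESULT]` in front.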
|
|
||
| TODO: add example mail | ||