Closed
48 commits
74906f8
Fake commit
raulcd Oct 19, 2022
b0c8cbd
Dask and Pandas repositories now use default branches named 'main'.
lafiona Aug 31, 2022
09eb082
Replace 'master' with 'default' in comment about Travis CI default be…
kevingurney Aug 31, 2022
363c7b3
Remove mention of "master" from help text for --arrow-branch crossbow…
kevingurney Aug 31, 2022
d5fa3d4
Pandas repository uses 'main' as the default branch.
lafiona Aug 31, 2022
4a4f987
Add base_branch property to Release object and modify commits_to_pick…
lafiona Aug 31, 2022
ef4342d
Modify 'archery' command line interface to reference the mainline dev…
lafiona Sep 1, 2022
0fcd902
Dynamically compute the default branch name for archery and crossbow …
lafiona Sep 8, 2022
7cc70a7
Performed python linting
lafiona Sep 8, 2022
20d6c22
remove duplicate code
lafiona Sep 13, 2022
f850d52
Print debugging info for default_branch_name Repo class function
lafiona Sep 13, 2022
5ff60d7
Print more debugging info
lafiona Sep 13, 2022
ec4bdf7
Remove resolve_refish print command
lafiona Sep 13, 2022
8dd439e
Add new line between reference object details
lafiona Sep 13, 2022
7732c59
Print branches
lafiona Sep 13, 2022
3081853
Use environment variable, DEFAULT_BRANCH, that is set in the yml file.
lafiona Sep 14, 2022
fd62996
Remove string concatenation, types incompatible
lafiona Sep 14, 2022
eb36ffb
Enable both CI workflows and local repository workflows for getting d…
lafiona Sep 14, 2022
0e8c8ea
Enable both CI workflows and local repository workflows for getting d…
lafiona Sep 14, 2022
c0d5ea2
Add DEFAULT_BRANCH environment variable to archery.yml test step for …
lafiona Sep 14, 2022
07226f9
Set workflow-wide environment variable, DEFAULT_BRANCH, for archery.yml
lafiona Sep 14, 2022
99938f7
Print reason for skipping tests
lafiona Sep 14, 2022
63ecf94
Add 'enable-integration' flag to ensure crossbowcli tests run
lafiona Sep 14, 2022
1f0138e
Factor out GitFixup step DEFAULT_BRANCH value
lafiona Sep 14, 2022
2e07401
Run python linting
lafiona Sep 15, 2022
96e6535
Address bare except and line lengths
lafiona Sep 19, 2022
c66f3f1
Add context to error message when obtaining default branch name.
lafiona Sep 19, 2022
09d898f
add debugging print statement in archery/archery/release/core.py comm…
lafiona Sep 20, 2022
873cd8c
Debugging statements for DefaultBranchName constructor
lafiona Sep 20, 2022
9b94745
Remove base_branch property of Release, instead add default_branch_pr…
lafiona Sep 20, 2022
a6edc7f
Use separate function for computing default branch.
lafiona Sep 20, 2022
ba7fc6f
Refactor the default branch code to be calculated within Release class
lafiona Sep 20, 2022
c097998
Add DEFAULT_BRANCH environment variable to Execute Docker Build step …
lafiona Sep 21, 2022
4139d82
In integration.yml, merge edits from default branch and current feature
lafiona Sep 21, 2022
7d6ab62
Add DEFAULT_BRANCH env var for archery docker run command in .travis.yml
lafiona Sep 22, 2022
7e1d68d
Use git command to get default branch name in .travis.yml
lafiona Sep 23, 2022
d587402
Fix integration.yml merge
lafiona Sep 23, 2022
d944930
Set and export the DEFAULT_BRANCH env var for the archery command.
lafiona Sep 23, 2022
6a2dccd
Remove computation for default branch name from module loading step i…
lafiona Oct 12, 2022
dc6e939
Removing error if default branch cannot be determined, default to
lafiona Oct 12, 2022
3bd8a2e
Alphabetize the standard library imports in dev/archery/archery/relea…
lafiona Oct 12, 2022
c1670e6
Remove error in the case that the default branch name could not be de…
lafiona Oct 12, 2022
a7793a0
Remame DEFAULT_BRANCH env var to ARCHERY_DEFAULT_RBANCH
lafiona Oct 12, 2022
40c6390
Reuse arrow Repo object for getting the default branch name, if needed
lafiona Oct 13, 2022
3775c65
Update the dask and pandas install scripts to use default branch comp…
lafiona Oct 13, 2022
669a14e
Change the flag for indicating upstream development version of Pandas…
lafiona Oct 14, 2022
ddce92e
Run python linting
lafiona Oct 14, 2022
f884601
Update Dask and Pandas version flag in tasks.yml and dev/archery/arch…
lafiona Oct 19, 2022
7 changes: 4 additions & 3 deletions .github/workflows/archery.yml
Expand Up @@ -31,6 +31,9 @@ on:
- 'dev/tasks/**'
- 'docker-compose.yml'

env:
ARCHERY_DEFAULT_BRANCH: ${{ github.event.repository.default_branch }}

concurrency:
group: ${{ github.repository }}-${{ github.head_ref || github.sha }}-${{ github.workflow }}
cancel-in-progress: true
Expand All @@ -52,9 +55,7 @@ jobs:
fetch-depth: 0
- name: Git Fixup
shell: bash
run: |
DEFAULT_BRANCH=${{ github.event.repository.default_branch }}
git branch $DEFAULT_BRANCH origin/$DEFAULT_BRANCH || true
run: git branch $ARCHERY_DEFAULT_BRANCH origin/$ARCHERY_DEFAULT_BRANCH || true
- name: Setup Python
uses: actions/setup-python@v4
with:
Expand Down
16 changes: 15 additions & 1 deletion .github/workflows/comment_bot.yml
Expand Up @@ -31,15 +31,29 @@ permissions:
jobs:
crossbow:
name: Listen!
if: startsWith(github.event.comment.body, '@github-actions crossbow')
if: ${{ github.event.issue.pull_request && startsWith(github.event.comment.body, '@github-actions crossbow')}}
runs-on: ubuntu-latest
steps:
- name: Get PR SHA
id: sha
uses: actions/github-script@v4
with:
result-encoding: string
script: |
const { owner, repo, number } = context.issue;
const pr = await github.pulls.get({
owner,
repo,
pull_number: number,
});
return pr.data.head.sha
- name: Checkout Arrow
uses: actions/checkout@v3
with:
path: arrow
# fetch the tags for version number generation
fetch-depth: 0
ref: ${{ steps.sha.outputs.result }}
- name: Set up Python
uses: actions/setup-python@v4
with:
Expand Down
5 changes: 4 additions & 1 deletion .github/workflows/integration.yml
Expand Up @@ -85,7 +85,10 @@ jobs:
env:
ARCHERY_DOCKER_USER: ${{ secrets.DOCKERHUB_USER }}
ARCHERY_DOCKER_PASSWORD: ${{ secrets.DOCKERHUB_TOKEN }}
run: archery docker run -e ARCHERY_INTEGRATION_WITH_RUST=1 conda-integration
run: >
archery docker run -e ARCHERY_INTEGRATION_WITH_RUST=1 -e
ARCHERY_DEFAULT_BRANCH=${{ github.event.repository.default_branch }}
conda-integration
- name: Docker Push
if: success() && github.event_name == 'push' && github.repository == 'apache/arrow'
env:
Expand Down
1 change: 1 addition & 0 deletions .travis.yml
Expand Up @@ -184,6 +184,7 @@ install:
- sudo -H pip3 install -e dev/archery[docker]

script:
- export ARCHERY_DEFAULT_BRANCH=$(git rev-parse --abbrev-ref origin/HEAD | sed s@origin/@@)
- |
archery docker run \
${DOCKER_RUN_ARGS} \
Expand Down
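The `sed` expression in the `.travis.yml` change strips the `origin/` prefix from the short name that `git rev-parse --abbrev-ref origin/HEAD` prints. A minimal Python sketch of the same string handling follows; `strip_remote_prefix` is a hypothetical helper written for illustration and is not part of archery.

```python
def strip_remote_prefix(ref_name, remote="origin"):
    """Return the branch part of a short remote ref such as 'origin/main'.

    Hypothetical helper for illustration only; not part of archery.
    """
    prefix = remote + "/"
    if ref_name.startswith(prefix):
        # Drop only the leading "<remote>/" and keep the rest untouched.
        return ref_name[len(prefix):]
    return ref_name


print(strip_remote_prefix("origin/main"))
print(strip_remote_prefix("origin/release/1.x"))
```

Stripping the prefix keeps branch names that themselves contain a slash, whereas taking the last `/`-separated token (as the Python properties in this PR do) would reduce `release/1.x` to `1.x`.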
2 changes: 1 addition & 1 deletion ci/scripts/install_dask.sh
Expand Up @@ -26,7 +26,7 @@ fi

dask=$1

if [ "${dask}" = "master" ]; then
if [ "${dask}" = "upstream_devel" ]; then
pip install https://github.com/dask/dask/archive/main.tar.gz#egg=dask[dataframe]
elif [ "${dask}" = "latest" ]; then
pip install dask[dataframe]
Expand Down
2 changes: 1 addition & 1 deletion ci/scripts/install_pandas.sh
Expand Up @@ -35,7 +35,7 @@ else
pip install numpy==${numpy}
fi

if [ "${pandas}" = "master" ]; then
if [ "${pandas}" = "upstream_devel" ]; then
pip install git+https://github.com/pandas-dev/pandas.git --no-build-isolation
elif [ "${pandas}" = "nightly" ]; then
pip install --extra-index-url https://pypi.anaconda.org/scipy-wheels-nightly/simple --pre pandas
Expand Down
13 changes: 8 additions & 5 deletions dev/archery/archery/cli.py
Expand Up @@ -529,7 +529,7 @@ def benchmark_run(ctx, rev_or_path, src, preserve, output, cmake_extras,
help="Hide counters field in diff report.")
@click.argument("contender", metavar="[<contender>",
default=ArrowSources.WORKSPACE, required=False)
@click.argument("baseline", metavar="[<baseline>]]", default="origin/master",
@click.argument("baseline", metavar="[<baseline>]]", default="origin/HEAD",
required=False)
@click.pass_context
def benchmark_diff(ctx, src, preserve, output, language, cmake_extras,
Expand All @@ -542,7 +542,8 @@ def benchmark_diff(ctx, src, preserve, output, language, cmake_extras,

The caller can optionally specify both the contender and the baseline. If
unspecified, the contender will default to the current workspace (like git)
and the baseline will default to master.
and the baseline will default to the mainline development branch (i.e.
default git branch).

Each target (contender or baseline) can either be a git revision
(commit, tag, special values like HEAD) or a cmake build directory. This
Expand All @@ -559,16 +560,18 @@ def benchmark_diff(ctx, src, preserve, output, language, cmake_extras,
Examples:

\b
# Compare workspace (contender) with master (baseline)
# Compare workspace (contender) against the mainline development branch
# (baseline)
\b
archery benchmark diff

\b
# Compare master (contender) with latest version (baseline)
# Compare the mainline development branch (contender) against the latest
# version (baseline)
\b
export LAST=$(git tag -l "apache-arrow-[0-9]*" | sort -rV | head -1)
\b
archery benchmark diff master "$LAST"
archery benchmark diff <default-branch> "$LAST"

\b
# Compare g++7 (contender) with clang++-8 (baseline) builds
Expand Down
12 changes: 9 additions & 3 deletions dev/archery/archery/crossbow/cli.py
Expand Up @@ -96,7 +96,7 @@ def check_config(obj, config_path):
'locally. Examples: https://github.com/apache/arrow or '
'https://github.com/kszucs/arrow.')
@click.option('--arrow-branch', '-b', default=None,
help='Give the branch name explicitly, e.g. master, ARROW-1949.')
help='Give the branch name explicitly, e.g. ARROW-1949.')
@click.option('--arrow-sha', '-t', default=None,
help='Set commit SHA or Tag name explicitly, e.g. f67a515, '
'apache-arrow-0.11.1.')
Expand Down Expand Up @@ -157,7 +157,7 @@ def submit(obj, tasks, groups, params, job_prefix, config_path, arrow_version,


@crossbow.command()
@click.option('--base-branch', default="master",
@click.option('--base-branch', default=None,
help='Set base branch for the PR.')
@click.option('--create-pr', is_flag=True, default=False,
help='Create GitHub Pull Request')
Expand Down Expand Up @@ -192,6 +192,12 @@ def verify_release_candidate(obj, base_branch, create_pr,

# Redefine Arrow repo to use the correct arrow remote.
arrow = Repo(path=obj['arrow'].path, remote_url=remote)

# Default value for base_branch is the repository's default branch name
if base_branch is None:
# Get the default branch name from the repository
base_branch = arrow.default_branch_name

response = arrow.github_pr(title=pr_title, head=head_branch,
base=base_branch, body=pr_body,
github_token=obj['queue'].github_token,
Expand Down Expand Up @@ -225,7 +231,7 @@ def verify_release_candidate(obj, base_branch, create_pr,
'locally. Examples: https://github.com/apache/arrow or '
'https://github.com/kszucs/arrow.')
@click.option('--arrow-branch', '-b', default=None,
help='Give the branch name explicitly, e.g. master, ARROW-1949.')
help='Give the branch name explicitly, e.g. ARROW-1949.')
@click.option('--arrow-sha', '-t', default=None,
help='Set commit SHA or Tag name explicitly, e.g. f67a515, '
'apache-arrow-0.11.1.')
Expand Down
38 changes: 34 additions & 4 deletions dev/archery/archery/crossbow/core.py
Expand Up @@ -28,6 +28,7 @@
from io import StringIO
from pathlib import Path
from datetime import date
import warnings

import jinja2
from ruamel.yaml import YAML
Expand Down Expand Up @@ -133,7 +134,7 @@ def format_all(items, pattern):

# configurations for setting up branch skipping
# - appveyor has a feature to skip builds without an appveyor.yml
# - travis reads from the master branch and applies the rules
# - travis reads from the default branch and applies the rules
# - circle requires the configuration to be present on all branch, even ones
# that are configured to be skipped
# - azure skips branches without azure-pipelines.yml by default
Expand Down Expand Up @@ -361,6 +362,29 @@ def signature(self):
return pygit2.Signature(self.user_name, self.user_email,
int(time.time()))

@property
def default_branch_name(self):
default_branch_name = os.getenv("ARCHERY_DEFAULT_BRANCH")

if default_branch_name is None:
try:
ref_obj = self.repo.references["refs/remotes/origin/HEAD"]
target_name = ref_obj.target
target_name_tokenized = target_name.split("/")
default_branch_name = target_name_tokenized[-1]
except KeyError:
# TODO: ARROW-18011 to track changing the hard coded default
# value from "master" to "main".
default_branch_name = "master"
warnings.warn('Unable to determine default branch name: '
'ARCHERY_DEFAULT_BRANCH environment variable is '
'not set. Git repository does not contain a '
'\'refs/remotes/origin/HEAD\' reference. Setting '
'the default branch name to ' +
default_branch_name, RuntimeWarning)

return default_branch_name

def create_tree(self, files):
builder = self.repo.TreeBuilder()

Expand All @@ -382,7 +406,7 @@ def create_commit(self, files, parents=None, message='',
if parents is None:
# by default use the main branch as the base of the new branch
# required to reuse github actions cache across crossbow tasks
commit, _ = self.repo.resolve_refish("master")
commit, _ = self.repo.resolve_refish(self.default_branch_name)
parents = [commit.id]
tree_id = self.create_tree(files)

Expand Down Expand Up @@ -546,8 +570,10 @@ def github_overwrite_release_assets(self, tag_name, target_commitish,
'Unsupported upload method {}'.format(method)
)

def github_pr(self, title, head=None, base="master", body=None,
def github_pr(self, title, head=None, base=None, body=None,
github_token=None, create=False):
# Default value for base is the repository's default branch name
# (default_branch_name is a property, so it must not be called)
base = self.default_branch_name if base is None else base
github_token = github_token or self.github_token
repo = self.as_github_repo(github_token=github_token)
if create:
Expand Down Expand Up @@ -1289,11 +1315,15 @@ def validate(self):
'is: `{}`'.format(task_name, str(e))
)

# Get the default branch name from the repository
arrow_source_dir = ArrowSources.find()
repo = Repo(arrow_source_dir.path)

# validate that the defined tasks are renderable, in order to do that
# define the required object with dummy data
target = Target(
head='e279a7e06e61c14868ca7d71dea795420aea6539',
branch='master',
branch=repo.default_branch_name,
remote='https://github.com/apache/arrow',
version='1.0.0dev123',
r_version='0.13.0.100000123',
Expand Down
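The lookup order that `default_branch_name` follows (environment variable first, then the `refs/remotes/origin/HEAD` reference, then a hard-coded fallback with a warning) can be sketched without pygit2. In the sketch below, `resolve_default_branch` and its `references`/`environ` parameters are hypothetical names, and a plain dict stands in for the repository's reference table; this is an illustration under those assumptions, not the archery implementation.

```python
import os
import warnings


def resolve_default_branch(references, environ=None, fallback="master"):
    """Pick a default branch name: env var, then origin/HEAD, then fallback.

    `references` is a plain dict (symbolic ref name -> target ref name)
    standing in for a real repository's reference table. That stand-in is
    an assumption made so the sketch runs without pygit2.
    """
    environ = os.environ if environ is None else environ

    # 1. An explicit override wins.
    name = environ.get("ARCHERY_DEFAULT_BRANCH")
    if name is not None:
        return name

    # 2. Otherwise follow origin/HEAD, if the clone has it.
    try:
        target = references["refs/remotes/origin/HEAD"]
    except KeyError:
        # 3. Neither source is available: warn and use the fallback.
        warnings.warn(
            "Unable to determine default branch name; "
            "falling back to " + fallback, RuntimeWarning)
        return fallback

    # e.g. "refs/remotes/origin/main" -> "main"
    return target.split("/")[-1]


refs = {"refs/remotes/origin/HEAD": "refs/remotes/origin/main"}
print(resolve_default_branch(refs, environ={}))
print(resolve_default_branch(refs, environ={"ARCHERY_DEFAULT_BRANCH": "dev"}))
```

Passing `environ` explicitly keeps the sketch easy to exercise without touching the real process environment.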
3 changes: 2 additions & 1 deletion dev/archery/archery/docker/cli.py
Expand Up @@ -217,7 +217,8 @@ def docker_run(obj, image, command, *, env, user, force_pull, force_build,
PYTHON=3.8 archery docker run conda-python

# disable the cache only for the leaf image
PANDAS=master archery docker run --no-leaf-cache conda-python-pandas
PANDAS=upstream_devel archery docker run --no-leaf-cache
conda-python-pandas

# entirely skip building the image
archery docker run --no-pull --no-build conda-python
Expand Down
8 changes: 4 additions & 4 deletions dev/archery/archery/docker/tests/test_docker.py
Expand Up @@ -259,12 +259,12 @@ def test_arrow_example_validation_passes(arrow_compose_path):
def test_compose_default_params_and_env(arrow_compose_path):
compose = DockerCompose(arrow_compose_path, params=dict(
UBUNTU='18.04',
DASK='master'
DASK='upstream_devel'
))
assert compose.config.dotenv == arrow_compose_env
assert compose.config.params == {
'UBUNTU': '18.04',
'DASK': 'master',
'DASK': 'upstream_devel',
}


Expand Down Expand Up @@ -492,7 +492,7 @@ def test_compose_push(arrow_compose_path):
def test_compose_error(arrow_compose_path):
compose = DockerCompose(arrow_compose_path, params=dict(
PYTHON='3.8',
PANDAS='master'
PANDAS='upstream_devel'
))

error = subprocess.CalledProcessError(99, [])
Expand All @@ -503,7 +503,7 @@ def test_compose_error(arrow_compose_path):
exception_message = str(exc.value)
assert "exited with a non-zero exit code 99" in exception_message
assert "PANDAS: latest" in exception_message
assert "export PANDAS=master" in exception_message
assert "export PANDAS=upstream_devel" in exception_message


def test_image_with_gpu(arrow_compose_path):
Expand Down
48 changes: 44 additions & 4 deletions dev/archery/archery/release/core.py
Expand Up @@ -18,8 +18,9 @@
from abc import abstractmethod
from collections import defaultdict
import functools
import re
import os
import pathlib
import re
import shelve
import warnings

Expand Down Expand Up @@ -361,6 +362,45 @@ def commits(self):
commit_range = f"{lower}..{upper}"
return list(map(Commit, self.repo.iter_commits(commit_range)))

@cached_property
def default_branch(self):
default_branch_name = os.getenv("ARCHERY_DEFAULT_BRANCH")

if default_branch_name is None:
try:
# Set up repo object
arrow = ArrowSources.find()
repo = Repo(arrow.path)
origin = repo.remotes["origin"]
origin_refs = origin.refs

# Get git.RemoteReference object to origin/HEAD
origin_head = origin_refs["HEAD"]

# Get git.RemoteReference object to origin/main or
# origin/master
origin_head_reference = origin_head.reference

# Get string value of remote head reference, should return
# "origin/main" or "origin/master"
origin_head_name = origin_head_reference.name
origin_head_name_tokenized = origin_head_name.split("/")

# The last token is the default branch name
default_branch_name = origin_head_name_tokenized[-1]
except KeyError:
# TODO: ARROW-18011 to track changing the hard coded default
# value from "master" to "main".
default_branch_name = "master"
warnings.warn('Unable to determine default branch name: '
'ARCHERY_DEFAULT_BRANCH environment variable is '
'not set. Git repository does not contain a '
'\'refs/remotes/origin/HEAD\' reference. Setting '
'the default branch name to ' +
default_branch_name, RuntimeWarning)

return default_branch_name

def curate(self, minimal=False):
# handle commits with parquet issue key specially and query them from
# jira and add it to the issues
Expand Down Expand Up @@ -422,9 +462,9 @@ def changelog(self):
return JiraChangelog(release=self, categories=categories)

def commits_to_pick(self, exclude_already_applied=True):
# collect commits applied on the main branch since the root of the
# collect commits applied on the default branch since the root of the
# maintenance branch (the previous major release)
commit_range = f"{self.previous.tag}..master"
commit_range = f"{self.previous.tag}..{self.default_branch}"

# keeping the original order of the commits helps to minimize the merge
# conflicts during cherry-picks
Expand Down Expand Up @@ -476,7 +516,7 @@ def branch(self):

@property
def base_branch(self):
return "master"
return self.default_branch

@cached_property
def siblings(self):
Expand Down
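Because `Release.default_branch` is declared as a cached property, the branch name is computed once per instance and then reused by `commits_to_pick` and `base_branch`. The sketch below illustrates that behaviour with the standard library's `functools.cached_property` (Python 3.8+), on the assumption that it behaves like the `cached_property` used in this file. `ReleaseSketch`, its `commit_range` method, and the tag value are hypothetical and exist only for illustration.

```python
import functools
import os


class ReleaseSketch:
    """Hypothetical stand-in for the Release class; illustration only."""

    def __init__(self, previous_tag):
        self.previous_tag = previous_tag

    @functools.cached_property
    def default_branch(self):
        # Computed on first access, then stored on the instance.
        return os.environ.get("ARCHERY_DEFAULT_BRANCH", "master")

    def commit_range(self):
        # Same shape as the range built in commits_to_pick.
        return f"{self.previous_tag}..{self.default_branch}"


os.environ["ARCHERY_DEFAULT_BRANCH"] = "main"
release = ReleaseSketch("apache-arrow-9.0.0")
print(release.commit_range())

# A later change to the environment is not picked up by this instance.
os.environ["ARCHERY_DEFAULT_BRANCH"] = "other"
print(release.commit_range())
```

Both calls print the same range, since the second one reads the value cached on `release`; a newly created instance would see the updated environment variable.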
4 changes: 2 additions & 2 deletions dev/tasks/tasks.yml
Expand Up @@ -1492,7 +1492,7 @@ tasks:
("3.7", "latest", "latest", False),
("3.8", "latest", "latest", False),
("3.8", "nightly", "nightly", False),
("3.9", "master", "nightly", False)] %}
("3.9", "upstream_devel", "nightly", False)] %}
test-conda-python-{{ python_version }}-pandas-{{ pandas_version }}:
ci: github
template: docker-tests/github.linux.yml
Expand All @@ -1512,7 +1512,7 @@ tasks:
image: conda-python-pandas
{% endfor %}

{% for dask_version in ["latest", "master"] %}
{% for dask_version in ["latest", "upstream_devel"] %}
test-conda-python-3.9-dask-{{ dask_version }}:
ci: github
template: docker-tests/github.linux.yml
Expand Down