Fall back to most recent known osx version for bootstrap binaries#6681
Conversation
…ries used in the bootstrap phase
|
There's an open question of how we test this fix (and anything around the rust bootstrap scripts) in CI. |
It's crude, but nuke the appropriate Travis caches, which include the Rust bootstrapping cache dirs. |
jsirois
left a comment
There was a problem hiding this comment.
There is too much code in the heredoc, should be lifted into python source code and possibly tested, but this quick POC works and leverages all existing BinaryUtil tests:
diff --git a/build-support/bin/download_binary.sh b/build-support/bin/download_binary.sh
index 268b29113..c226ec7c2 100755
--- a/build-support/bin/download_binary.sh
+++ b/build-support/bin/download_binary.sh
@@ -1,4 +1,4 @@
-#!/bin/bash -eu
+#!/bin/bash -e
# Downloads the specified binary at the specified version from the specified binary host if it's not already present.
# Uses a reimplementation of the BinaryUtils mechanism.
@@ -11,40 +11,42 @@ REPO_ROOT=$(cd $(dirname "${BASH_SOURCE[0]}") && cd ../.. && pwd -P)
# Defines:
# + CACHE_ROOT: The pants cache directory, ie: ~/.cache/pants.
-source "${REPO_ROOT}/build-support/common.sh"
+source "${REPO_ROOT}/build-support/pants_venv"
-if (( $# != 3 && $# != 4 )); then
+if (( $# != 2 && $# != 3 )); then
die "$(cat << USAGE
Usage: $0 host util_name version [filename]
Example: $0 binaries.pantsbuild.org go 1.7.3 go.tar.gz
USAGE
)"
fi
-readonly host="$1"
-readonly util_name="$2"
-readonly version="$3"
-readonly filename="${4:-${util_name}}"
-
-os=$("${REPO_ROOT}/build-support/bin/get_os.sh")
-readonly path="bin/${util_name}/${os}/${version}/${filename}"
-readonly cache_path="${CACHE_ROOT}/${path}"
-
-if [[ ! -f "${cache_path}" ]]; then
- mkdir -p "$(dirname "${cache_path}")"
-
- readonly binary_url="https://${host}/${path}"
- echo >&2 "Downloading ${binary_url} ..."
- curl --fail "${binary_url}" > "${cache_path}".tmp
- mv "${cache_path}"{.tmp,}
- if [[ "${filename}" == *tar.gz ]]; then
- tar -C "$(dirname "${cache_path}")" -xzf "${cache_path}"
- else
- chmod 0755 "${cache_path}"
- fi
-fi
-
-to_output="${cache_path}"
-if [[ "${filename}" == *tar.gz ]]; then
- to_output="$(dirname "${to_output}")"
-fi
-echo "${to_output}"
+readonly util_name="$1"
+readonly version="$2"
+readonly filename="${3:-${util_name}}"
+
+activate_pants_venv 1>&2
+PYTHONPATH="${REPO_ROOT}/src/python" python << SCRIPT
+import os
+from functools import reduce
+
+from pants.binaries.binary_util import BinaryUtil
+from pants.option.global_options import GlobalOptionsRegistrar
+from pants.option.options_bootstrapper import OptionsBootstrapper
+from pants.subsystem.subsystem import Subsystem
+
+
+options_bootstrapper = OptionsBootstrapper()
+subsystems = (GlobalOptionsRegistrar, BinaryUtil.Factory)
+known_scope_infos = reduce(set.union, (ss.known_scope_infos() for ss in subsystems), set())
+options = options_bootstrapper.get_full_options(known_scope_infos)
+for subsystem in subsystems:
+ subsystem.register_options_on_scope(options)
+Subsystem.set_options(options)
+
+binary_util = BinaryUtil.Factory.create()
+path = binary_util.select_binary(supportdir='bin/${util_name}',
+ version='${version}',
+ name='${filename}')
+
+print(os.path.dirname(path) if path.endswith('.tar.gz') else path)
+SCRIPT
diff --git a/build-support/bin/native/cargo.sh b/build-support/bin/native/cargo.sh
index 2bf4dcfb5..69bd03493 100755
--- a/build-support/bin/native/cargo.sh
+++ b/build-support/bin/native/cargo.sh
@@ -16,11 +16,11 @@ download_binary="${REPO_ROOT}/build-support/bin/download_binary.sh"
# The following is needed by grpcio-sys and we have no better way to hook its build.rs than this;
# ie: wrapping cargo.
-cmakeroot="$("${download_binary}" "binaries.pantsbuild.org" "cmake" "3.9.5" "cmake.tar.gz")"
-goroot="$("${download_binary}" "binaries.pantsbuild.org" "go" "1.7.3" "go.tar.gz")/go"
+cmakeroot="$("${download_binary}" "cmake" "3.9.5" "cmake.tar.gz")"
+goroot="$("${download_binary}" "go" "1.7.3" "go.tar.gz")/go"
# Code generation in the bazel_protos crate needs to be able to find protoc on the PATH.
-protoc="$("${download_binary}" "binaries.pantsbuild.org" "protobuf" "3.4.1" "protoc")"
+protoc="$("${download_binary}" "protobuf" "3.4.1" "protoc")"
export GOROOT="${goroot}"
export PATH="${cmakeroot}/bin:${goroot}/bin:${CARGO_HOME}/bin:$(dirname "${protoc}"):${PATH}"
|
Oh now this is much much better. I'll drop the python into a separate file and look at possible testing. |
Eric-Arellano
left a comment
There was a problem hiding this comment.
Thank you for fixing this 🙌
build-support/bin/download_binary.sh
Outdated
|
|
||
| # Use this version as a fallback for platform-specific binaries to download if we are on a version | ||
| # of OSX that we don't explicitly have binaries for yet. | ||
| # TODO: Keep this up to date automatically somehow! |
There was a problem hiding this comment.
Should probably link to a GitHub issue here
There was a problem hiding this comment.
Yes! This is obviated with the new python approach but this is correct.
build-support/bin/download_binary.sh
Outdated
| readonly binary_url="https://${host}/${path}" | ||
| echo >&2 "Downloading ${binary_url} ..." | ||
| curl --fail "${binary_url}" > "${cache_path}".tmp | ||
| readonly output_tmp_file="${cache_path}.tmp" |
There was a problem hiding this comment.
Woah didn't know about readonly #FunctionalBashScripts 😎
There was a problem hiding this comment.
There are some downsides which actually make it incredibly hard to use readonly correctly because bash is a horrible piece of software -- see build-support/bin/check_shell.sh, which runs in the pre-commit hook. zsh has never had any of these issues.
build-support/bin/download_binary.sh
Outdated
|
|
||
| curl_output_file_with_fail "${binary_url}" "$output_tmp_file" \ | ||
| 2> >(tee >&2 "$curl_err_tmp_file") \ | ||
| || if grep -q -F 'The requested URL returned error: 404' "$curl_err_tmp_file" \ |
There was a problem hiding this comment.
Do -q and -F have long flags?
There was a problem hiding this comment.
They do (I think --quiet/--silent and --fixed-strings?). They should absolutely have been used here, as in any script in this repo.
build-support/bin/download_binary.sh
Outdated
| || if grep -q -F 'The requested URL returned error: 404' "$curl_err_tmp_file" \ | ||
| && [[ -n "$maybe_alternate_os" ]]; then | ||
| echo >&2 "Failed initial curl, trying an alternate url..." | ||
| alternate_intermediate_path="$(make_download_path_for_os "$maybe_alternate_os")" |
There was a problem hiding this comment.
Should this also be readonly?
| alternate_intermediate_path="$(make_download_path_for_os "$maybe_alternate_os")" | |
| readonly alternate_intermediate_path="$(make_download_path_for_os "$maybe_alternate_os")" |
There was a problem hiding this comment.
This is one of the things check_shell.sh disallows and leads to very strange correctness issues if you do. Like I said, bash is a horrible piece of software because it doesn't care about bugs like this (and there are 1300 versions in play at once).
build-support/bin/download_binary.sh
Outdated
| path="$(make_download_path_for_os "$("${REPO_ROOT}/build-support/bin/get_os.sh")")" | ||
| readonly cache_path="${CACHE_ROOT}/${path}" | ||
|
|
||
| maybe_alternate_os="$([[ "$(uname)" == 'Darwin' ]] \ |
There was a problem hiding this comment.
This function only adds compatibility for macOS, right? It wouldn't help with Windows or some Linux distro? If so, consider renaming to something like maybe_macos to be more explicit.
There was a problem hiding this comment.
This is absolutely a correct comment. In the commit I just pushed, we use BinaryUtil to avoid having to duplicate this regardless, but this is definitely an edit I would have made.
|
@Eric-Arellano I'll reply in full to your review because bash is strange and nasty but will be pushing a commit which uses @jsirois's approach in his earlier review which delegates to BinaryUtil, obviating some of the issues you pointed out. |
Eric-Arellano
left a comment
There was a problem hiding this comment.
This python code makes me much happier haha 🐍
| path = binary_util.select(binary_request) | ||
|
|
||
| print(path) | ||
|
|
There was a problem hiding this comment.
Is this supposed to be here, or leftover from debugging? Maybe print a message before the path to explain what it’s for.
There was a problem hiding this comment.
This is invoked by a shell script to read the output! There could be a comment here -- I will add that to make it clear.
|
|
||
|
|
||
| def main(): | ||
| util_name, version, filename = tuple(sys.argv[1:]) |
There was a problem hiding this comment.
argparse is much more robust and has the benefit of allowing —help. It’s fine for what we have now, but will make it harder to do things like extend this script down the road
Have you worked with it before? If not, I can give the code to get it working! (I’m on phone atm)
There was a problem hiding this comment.
I have used it once years ago and not long after I ended up fixing the coffeescript argument parsing because I had been thinking about argument parsing so much: jashkenas/coffeescript#3946. I would love the code to do so -- that feels like it's probably half of the point of making this a python script. No rush to do that.
There was a problem hiding this comment.
Talk to your team-mates, in particular Yi: Fire.fire(a_method_with_the_obvious_signature)
There was a problem hiding this comment.
Argparse is one of my favorite parts of Python. It's so easy to make high quality scripts with it.
Below is a demo of how it works so that you can see things like optional arguments, which it looks like we don't actually use. Should be easy to convert to this actual program, which I'm happy to help with also.
import argparse
def create_parser():
parser = argparse.create_parser(description="My script now has fancy CLI options!")
parser.add_argument("util_name", help="...")
parser.add_argument("-o", "--optional", help="...")
return parser
def main():
args = create_parser().parse_args()
supportdir = 'bin/{}'.format(args.util_name)If you don't like this line args = create_parser().parse_args(), you can have create_parser() instead be parse_args() and simply return the args. I only set it up this way to make create_parse() a pure function & keep side effects in main().
jsirois
left a comment
There was a problem hiding this comment.
If bootstrap_binary_util.py was housed under src/python/pants/binaries/ - perhaps just at the bottom of binary_util.py - actually testing the script would be that much easier.
|
|
||
|
|
||
| def main(): | ||
| util_name, version, filename = tuple(sys.argv[1:]) |
There was a problem hiding this comment.
Talk to your team-mates, in particular Yi: Fire.fire(a_method_with_the_obvious_signature)
| # BinaryRequest requires the `name` field to be provided without an extension, as it appends the | ||
| # archiver's extension if one is provided, so we have to remove it here. | ||
| filename = re.sub( | ||
| r'\.{}\Z'.format(re.escape(archiver_for_current_binary.extension)), |
There was a problem hiding this comment.
I had to look up \Z, which was fun, I did not know $ had a synonym or vice-versa - but, if you got an acrhiver back, you know you can just lop off the filename verbatim - perhaps filename[:-(len(extension)+1)] or filename.rsplit('.' + extension, 1)[0]?
There was a problem hiding this comment.
Defaults for multiline modes differ across languages and other languages such as emacs lisp have even more customized extensions for e.g. symbol boundaries so it's easier for me to be explicit than to wonder whether a stray $ is going to slurp a whole file in, or worse, make repeated regex applications start at increasingly more incorrect positions (realizing now that's also more commonly performed in elisp than elsewhere).
This particular regex replace, however, was definitely a 3-4 second solution -- either of what you suggested seem more appropriate and I will make that fix.
There was a problem hiding this comment.
I don’t fully follow this conversation, but if you go with regex consider a very brief comment explaining what Z means
There was a problem hiding this comment.
Sorry -- "being explicit" refers to using \Z compulsively instead of the more recognizable $ (although if we wish to be truly pedantic, filenames can have newlines, so this should really also have a TODO to replace all such filename manipulations with methods in util or something so we can contain the pedantry).
|
Moved the main method to the bottom of |
Eric-Arellano
left a comment
There was a problem hiding this comment.
Argparse code looks great! I like the nargs part. If you haven’t already, run the script with —help 😎
I would take credit but it's straight from https://docs.python.org/3/library/argparse.html#usage! Love python. |
) ### Problem See the end of #6591 -- while we fall back to the most recent OSX version to download binaries for use with e.g. NativeTool, this fix does not apply for the binaries we fetch in shell scripts to bootstrap the pants rust code, leading to the following on OSX Mojave (10.14) in the pants repo: ``` Downloading https://binaries.pantsbuild.org/bin/cmake/mac/10.14/3.9.5/cmake.tar.gz ... % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0 curl: (22) The requested URL returned error: 404 ``` ### Solution - Append a `main()` method at the bottom of `binary_util.py` which can be invoked in the bootstrap phase by `download_binary.sh`. ### Result I can run `./pants goals` on my OSX 10.14 laptop in the pants repo after an `rm -rf ~/.cache/pants`.
|
See #7009 for issue arising out of this change. |
Problem
See the end of #6591 -- while we fall back to the most recent OSX version to download binaries for use with e.g. NativeTool, this fix does not apply for the binaries we fetch in shell scripts to bootstrap the pants rust code, leading to the following on OSX Mojave (10.14) in the pants repo:
Solution
main()method at the bottom ofbinary_util.pywhich can be invoked in the bootstrap phase bydownload_binary.sh.Result
I can run
./pants goalson my OSX 10.14 laptop in the pants repo after anrm -rf ~/.cache/pants.