4 changes: 1 addition & 3 deletions ci/travis_script_manylinux.sh
@@ -21,9 +21,7 @@
set -ex

pushd python/manylinux1
- git clone ../../ arrow
- docker build -t arrow-base-x86_64 -f Dockerfile-x86_64 .
- docker run --shm-size=2g --rm -e PYARROW_PARALLEL=3 -v $PWD:/io arrow-base-x86_64 /io/build_arrow.sh
+ docker run --shm-size=2g --rm -e PYARROW_PARALLEL=3 -v $PWD:/io -v $PWD/../../:/arrow quay.io/xhochy/arrow_manylinux1_x86_64_base:latest /io/build_arrow.sh
Member

Does this mean build products will spill into the local checkout?

Member Author

Yes, Python products do.

Member

Thanks. That sounds ok to me.

Member

How about ADDing arrow to the docker image instead of mounting?

Member Author

We no longer run a `docker build` step here, and in my experience ADDing the sources takes quite some time (at least on OSX, where the files need to be sent to the Docker daemon).

Member @kszucs, Sep 27, 2018

Ok, I'll try to add the manylinux image to #2572.
Another question: could we rely on CMake's third-party toolchain instead of building the dependencies in additional scripts?

Member Author

We build them separately because we apply some tricks, such as the Boost namespacing, but it is also a matter of caching. The Docker image is pre-built and is only pulled in the Travis job, so the third-party dependencies are not rebuilt on every CI run.

Member

Perhaps we could define a target to build the third-party dependencies and cache them similarly. Thanks for the clarification!
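
The spill question raised earlier in this thread can be checked directly after a run. The following is a minimal sketch, not part of this PR: `list_spilled` is a hypothetical helper name, and it assumes the mounted checkout is a git work tree. It lists untracked and ignored paths, which is where build products written through the `/arrow` bind mount would show up.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the Arrow repository).
# Lists untracked ("??") and ignored ("!!") paths in a git work tree,
# i.e. files that a container writing through a bind mount left behind.
list_spilled() {
  local checkout="${1:-.}"
  git -C "$checkout" status --porcelain --ignored
}

# Example (assumed layout): run from python/manylinux1 after the docker run step.
# list_spilled ../..
```

An empty result would indicate that the container left nothing behind in the checkout.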


# Testing for https://issues.apache.org/jira/browse/ARROW-2657
# These tests cannot be run inside of the docker container, since TensorFlow
38 changes: 0 additions & 38 deletions python/manylinux1/Dockerfile-x86_64

This file was deleted.

21 changes: 6 additions & 15 deletions python/manylinux1/README.md
@@ -32,22 +32,17 @@ for all supported Python versions and place them in the `dist` folder.
### Build instructions

```bash
- # Create a clean copy of the arrow source tree
- git clone ../../ arrow
- # Build the native baseimage
- docker build -t arrow-base-x86_64 -f Dockerfile-x86_64 .
# Build the python packages
- docker run --shm-size=2g --rm -t -i -v $PWD:/io arrow-base-x86_64 /io/build_arrow.sh
+ docker run --shm-size=2g --rm -t -i -v $PWD:/io -v $PWD/../../:/arrow quay.io/xhochy/arrow_manylinux1_x86_64_base:latest /io/build_arrow.sh
# Now the new packages are located in the dist/ folder
ls -l dist/
```
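
The `docker run` line above hard-codes the published base image. As a minimal sketch (the `IMAGE` variable and the `manylinux_run_cmd` function are hypothetical names, not part of this README or the Arrow repository), the image name could be made overridable so that a locally rebuilt base image can be substituted without editing the command. The sketch only prints the command; it does not invoke Docker.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: IMAGE and manylinux_run_cmd are illustrative names.
# Default to the published base image; override IMAGE to use a local build.
IMAGE="${IMAGE:-quay.io/xhochy/arrow_manylinux1_x86_64_base:latest}"

# Print the docker invocation from the build instructions for a given image.
# The current directory is mounted at /io and the repository root at /arrow.
manylinux_run_cmd() {
  local image="$1"
  printf 'docker run --shm-size=2g --rm -t -i -v %s:/io -v %s/../../:/arrow %s /io/build_arrow.sh\n' \
    "$PWD" "$PWD" "$image"
}

manylinux_run_cmd "$IMAGE"
```

Setting `IMAGE=arrow_manylinux1_x86_64_base` before running the sketch would print the command for an image built locally under that tag.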

### Updating the build environment

- In addition to the docker images that contains the Arrow C++ and Parquet C++
- builds, we also have another base image that only contains their dependencies.
- This image is less often updated. In the case we want to update a dependency to
- a new version, we also need to adjust it. You can rebuild this image using
+ The base docker image is less often updated. In the case we want to update
+ a dependency to a new version, we also need to adjust it. You can rebuild
+ this image using

```bash
docker build -t arrow_manylinux1_x86_64_base -f Dockerfile-x86_64_base .
@@ -57,9 +52,5 @@ For each dependency, we have a bash script in the directory `scripts/` that
downloads the sources, builds and installs them. At the end of each dependency
build the sources are removed again so that only the binary installation of a
dependency is persisted in the docker image. When you do local adjustments to
- this image, you need to change the `FROM` line in `Dockerfile-x86_64` to pick up
- the new image:
-
- ```
- FROM arrow_manylinux1_x86_64_base
- ```
+ this image, you need to change the name of the docker image in the `docker run`
+ command.
8 changes: 6 additions & 2 deletions python/manylinux1/build_arrow.sh
@@ -65,7 +65,7 @@ for PYTHON_TUPLE in ${PYTHON_VERSIONS}; do
fi

echo "=== (${PYTHON}) Building Arrow C++ libraries ==="
- ARROW_BUILD_DIR=/arrow/cpp/build-PY${PYTHON}-${U_WIDTH}
+ ARROW_BUILD_DIR=/tmp/build-PY${PYTHON}-${U_WIDTH}
mkdir -p "${ARROW_BUILD_DIR}"
pushd "${ARROW_BUILD_DIR}"
PATH="${CPYTHON_PATH}/bin:$PATH" cmake -DCMAKE_BUILD_TYPE=Release \
@@ -77,16 +77,20 @@ for PYTHON_TUPLE in ${PYTHON_VERSIONS}; do
-DARROW_JEMALLOC=ON \
-DARROW_RPATH_ORIGIN=ON \
-DARROW_PYTHON=ON \
+ -DARROW_PARQUET=ON \
-DPythonInterp_FIND_VERSION=${PYTHON} \
-DARROW_PLASMA=ON \
-DARROW_TENSORFLOW=ON \
-DARROW_ORC=ON \
-DBoost_NAMESPACE=arrow_boost \
-DBOOST_ROOT=/arrow_boost_dist \
- -GNinja ..
+ -GNinja /arrow/cpp
ninja install
popd

+ # Check that we don't expose any unwanted symbols
+ /io/scripts/check_arrow_visibility.sh

# Clear output directory
rm -rf dist/
echo "=== (${PYTHON}) Building wheel ==="