From 986a3bdfa0ad9353758b58bd79a4402a91196456 Mon Sep 17 00:00:00 2001
From: Alenka Frim
Date: Tue, 4 Oct 2022 10:14:27 +0200
Subject: [PATCH 01/10] Use Python 3.10 in the instructions

---
 docs/source/developers/python.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/source/developers/python.rst b/docs/source/developers/python.rst
index 3f33f2e68a7..9281ae3f656 100644
--- a/docs/source/developers/python.rst
+++ b/docs/source/developers/python.rst
@@ -213,7 +213,7 @@ dependencies for Arrow C++ and PyArrow as pre-built binaries, which can make
 Arrow development easier and faster.
 
 Let's create a conda environment with all the C++ build and Python dependencies
-from conda-forge, targeting development for Python 3.9:
+from conda-forge, targeting development for Python 3.10:
 
 On Linux and macOS:
 
@@ -225,7 +225,7 @@ On Linux and macOS:
          --file arrow/ci/conda_env_python.txt \
          --file arrow/ci/conda_env_gandiva.txt \
          compilers \
-         python=3.9 \
+         python=3.10 \
          pandas
 
 As of January 2019, the ``compilers`` package is needed on many Linux
@@ -509,7 +509,7 @@ First, starting from a fresh clone of Apache Arrow:
          --file arrow\ci\conda_env_cpp.txt ^
          --file arrow\ci\conda_env_python.txt ^
          --file arrow\ci\conda_env_gandiva.txt ^
-         python=3.9
+         python=3.10
    $ conda activate pyarrow-dev
 
 Now, we build and install Arrow C++ libraries.

From d0d1dd05e2478f062a3fbf6ffe8772b12bea0dd2 Mon Sep 17 00:00:00 2001
From: Alenka Frim
Date: Tue, 4 Oct 2022 10:24:04 +0200
Subject: [PATCH 02/10] Remove subsection for Python integration tests - no longer needed as tests are run with pytest

---
 docs/source/developers/python.rst | 46 -------------------------------
 1 file changed, 46 deletions(-)

diff --git a/docs/source/developers/python.rst b/docs/source/developers/python.rst
index 9281ae3f656..ad3e0ad33f6 100644
--- a/docs/source/developers/python.rst
+++ b/docs/source/developers/python.rst
@@ -598,52 +598,6 @@ Then run the unit tests with:
    incompatibilities when ``pyarrow`` is later built without
    ``--bundle-arrow-cpp``.
 
-Running C++ unit tests for Python integration
----------------------------------------------
-
-Running C++ unit tests should not be necessary for most developers. If you do
-want to run them, you need to pass ``-DARROW_BUILD_TESTS=ON`` during
-configuration of the Arrow C++ library build:
-
-.. code-block::
-
-   $ mkdir arrow\cpp\build
-   $ pushd arrow\cpp\build
-   $ cmake -G "%PYARROW_CMAKE_GENERATOR%" ^
-         -DCMAKE_INSTALL_PREFIX=%ARROW_HOME% ^
-         -DARROW_BUILD_TESTS=ON ^
-         -DARROW_COMPUTE=ON ^
-         -DARROW_CSV=ON ^
-         -DARROW_CXXFLAGS="/WX /MP" ^
-         -DARROW_DATASET=ON ^
-         -DARROW_FILESYSTEM=ON ^
-         -DARROW_HDFS=ON ^
-         -DARROW_JSON=ON ^
-         -DARROW_PARQUET=ON ^
-         ..
-   $ cmake --build . --target INSTALL --config Release
-   $ popd
-
-Getting ``arrow-python-test.exe`` (C++ unit tests for python integration) to
-run is a bit tricky because your ``%PYTHONHOME%`` must be configured to point
-to the active conda environment:
-
-.. code-block::
-
-   $ set PYTHONHOME=%CONDA_PREFIX%
-   $ pushd arrow\cpp\build\release\Release
-   $ arrow-python-test.exe
-   $ popd
-
-To run all tests of the Arrow C++ library, you can also run ``ctest``:
-
-.. code-block::
-
-   $ set PYTHONHOME=%CONDA_PREFIX%
-   $ pushd arrow\cpp\build
-   $ ctest
-   $ popd
-
 Caveats
 -------
 

From 952d3ab9e3d4cc61a4423dde32bdc212eb95a286 Mon Sep 17 00:00:00 2001
From: Alenka Frim
Date: Tue, 4 Oct 2022 10:42:47 +0200
Subject: [PATCH 03/10] Change INSTALL to install

---
 docs/source/developers/python.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/developers/python.rst b/docs/source/developers/python.rst
index ad3e0ad33f6..c8c59720e1e 100644
--- a/docs/source/developers/python.rst
+++ b/docs/source/developers/python.rst
@@ -549,7 +549,7 @@ Let's configure, build and install the Arrow C++ libraries:
          -DARROW_WITH_ZLIB=ON ^
          -DARROW_WITH_ZSTD=ON ^
          ..
-   $ cmake --build . --target INSTALL --config Release
+   $ cmake --build . --target install --config Release
    $ popd
 
 Now, we can build pyarrow:

From a056680e3a5f106e4db16388a289ed6743459c89 Mon Sep 17 00:00:00 2001
From: Alenka Frim
Date: Sat, 8 Oct 2022 06:34:00 +0200
Subject: [PATCH 04/10] Remove setting of PATH - not searched by Python anymore

---
 docs/source/developers/python.rst | 2 --
 1 file changed, 2 deletions(-)

diff --git a/docs/source/developers/python.rst b/docs/source/developers/python.rst
index c8c59720e1e..63f06e03785 100644
--- a/docs/source/developers/python.rst
+++ b/docs/source/developers/python.rst
@@ -518,13 +518,11 @@ We set a number of environment variables:
 
 - the path of the installation directory of the Arrow C++ libraries as
   ``ARROW_HOME``
-- add the path of installed DLL libraries to ``PATH``
 - and the CMake generator to be used as ``PYARROW_CMAKE_GENERATOR``
 
 .. code-block::
 
    $ set ARROW_HOME=%cd%\arrow-dist
-   $ set PATH=%ARROW_HOME%\bin;%PATH%
    $ set PYARROW_CMAKE_GENERATOR=Visual Studio 15 2017 Win64
 
 Let's configure, build and install the Arrow C++ libraries:

From 0376c7e93e7c8c4032103178819d75926a36eddf Mon Sep 17 00:00:00 2001
From: Alenka Frim
Date: Sat, 8 Oct 2022 06:40:36 +0200
Subject: [PATCH 05/10] Use CONDA_PREFIX

---
 docs/source/developers/python.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/developers/python.rst b/docs/source/developers/python.rst
index 63f06e03785..c19bb161eb8 100644
--- a/docs/source/developers/python.rst
+++ b/docs/source/developers/python.rst
@@ -522,7 +522,7 @@ We set a number of environment variables:
 
 .. code-block::
 
-   $ set ARROW_HOME=%cd%\arrow-dist
+   $ set ARROW_HOME=%CONDA_PREFIX%
    $ set PYARROW_CMAKE_GENERATOR=Visual Studio 15 2017 Win64
 
 Let's configure, build and install the Arrow C++ libraries:

From 58ef6f15e259a09d30457b0dbd1697b678ee5616 Mon Sep 17 00:00:00 2001
From: Alenka Frim
Date: Sat, 8 Oct 2022 06:43:09 +0200
Subject: [PATCH 06/10] Update note about PATH

---
 docs/source/developers/python.rst | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/docs/source/developers/python.rst b/docs/source/developers/python.rst
index c19bb161eb8..1da889e72db 100644
--- a/docs/source/developers/python.rst
+++ b/docs/source/developers/python.rst
@@ -574,14 +574,6 @@ Then run the unit tests with:
 
 .. note::
 
-   With the above instructions the Arrow C++ libraries are not bundled with
-   the Python extension. This is recommended for development as it allows the
-   C++ libraries to be re-built separately.
-
-   As a consequence however, ``python setup.py install`` will also not install
-   the Arrow C++ libraries. Therefore, to use ``pyarrow`` in python, ``PATH``
-   must contain the directory with the Arrow .dll-files.
-
    If you want to bundle the Arrow C++ libraries with ``pyarrow``, add
    the ``--bundle-arrow-cpp`` option when building:
 
@@ -592,7 +584,7 @@ Then run the unit tests with:
    Important: If you combine ``--bundle-arrow-cpp`` with ``--inplace`` the
    Arrow C++ libraries get copied to the source tree and are not cleared
    by ``python setup.py clean``. They remain in place and will take precedence
-   over any later Arrow C++ libraries contained in ``PATH``. This can lead to
+   over any later Arrow C++ libraries contained in ``CONDA_PREFIX``. This can lead to
    incompatibilities when ``pyarrow`` is later built without
    ``--bundle-arrow-cpp``.
 

From f4c4cf07269f0f468ab36cb5f4eb9a795434dd70 Mon Sep 17 00:00:00 2001
From: Alenka Frim
Date: Thu, 13 Oct 2022 16:12:41 +0200
Subject: [PATCH 07/10] Add info about CONDA_PREFIX

---
 docs/source/developers/python.rst | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/docs/source/developers/python.rst b/docs/source/developers/python.rst
index 1da889e72db..2b6d5ba78e4 100644
--- a/docs/source/developers/python.rst
+++ b/docs/source/developers/python.rst
@@ -517,7 +517,10 @@ Now, we build and install Arrow C++ libraries.
 We set a number of environment variables:
 
 - the path of the installation directory of the Arrow C++ libraries as
-  ``ARROW_HOME``
+  ``ARROW_HOME``. When using a conda environment, Arrow C++ is installed
+  in the environment directory, which path is saved in the
+  `CONDA_PREFIX `_
+  environment variable.
 - and the CMake generator to be used as ``PYARROW_CMAKE_GENERATOR``
 
 .. code-block::

From 55cfea76d08b5bd6571cd28eb4ef8d363f29b829 Mon Sep 17 00:00:00 2001
From: Alenka Frim
Date: Thu, 13 Oct 2022 16:35:18 +0200
Subject: [PATCH 08/10] Change to use Ninja

---
 docs/source/developers/python.rst | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/docs/source/developers/python.rst b/docs/source/developers/python.rst
index 2b6d5ba78e4..7eff3d156bb 100644
--- a/docs/source/developers/python.rst
+++ b/docs/source/developers/python.rst
@@ -514,19 +514,15 @@ First, starting from a fresh clone of Apache Arrow:
 
 Now, we build and install Arrow C++ libraries.
 
-We set a number of environment variables:
-
-- the path of the installation directory of the Arrow C++ libraries as
-  ``ARROW_HOME``. When using a conda environment, Arrow C++ is installed
-  in the environment directory, which path is saved in the
-  `CONDA_PREFIX `_
-  environment variable.
-- and the CMake generator to be used as ``PYARROW_CMAKE_GENERATOR``
+We set the path of the installation directory of the Arrow C++ libraries as
+``ARROW_HOME``. When using a conda environment, Arrow C++ is installed
+in the environment directory, which path is saved in the
+`CONDA_PREFIX `_
+environment variable.
 
 .. code-block::
 
    $ set ARROW_HOME=%CONDA_PREFIX%
-   $ set PYARROW_CMAKE_GENERATOR=Visual Studio 15 2017 Win64
 
 Let's configure, build and install the Arrow C++ libraries:
 
@@ -534,7 +530,7 @@ Let's configure, build and install the Arrow C++ libraries:
 
    $ mkdir arrow\cpp\build
    $ pushd arrow\cpp\build
-   $ cmake -G "%PYARROW_CMAKE_GENERATOR%" ^
+   $ cmake -G "Ninja" ^
          -DCMAKE_INSTALL_PREFIX=%ARROW_HOME% ^
          -DCMAKE_UNITY_BUILD=ON ^
          -DARROW_COMPUTE=ON ^

From f76b6cd58ae3d96ee821533b44df95b8c5d18df4 Mon Sep 17 00:00:00 2001
From: Alenka Frim
Date: Thu, 13 Oct 2022 17:14:01 +0200
Subject: [PATCH 09/10] Add back the info about Arrow C++ not being bundled for ease of development

---
 docs/source/developers/python.rst | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/docs/source/developers/python.rst b/docs/source/developers/python.rst
index 7eff3d156bb..373c63d2c19 100644
--- a/docs/source/developers/python.rst
+++ b/docs/source/developers/python.rst
@@ -573,6 +573,10 @@ Then run the unit tests with:
 
 .. note::
 
+   With the above instructions the Arrow C++ libraries are not bundled with
+   the Python extension. This is recommended for development as it allows the
+   C++ libraries to be re-built separately.
+
    If you want to bundle the Arrow C++ libraries with ``pyarrow``, add
    the ``--bundle-arrow-cpp`` option when building:
 

From b8cc22df222e48ffa18c3ceb25284f7d1f179d24 Mon Sep 17 00:00:00 2001
From: Alenka Frim
Date: Wed, 19 Oct 2022 09:03:39 +0200
Subject: [PATCH 10/10] Use ARROW_HOME=%CONDA_PREFIX%\Library in the instructions

---
 docs/source/developers/python.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/developers/python.rst b/docs/source/developers/python.rst
index 373c63d2c19..4ff9dad1fb8 100644
--- a/docs/source/developers/python.rst
+++ b/docs/source/developers/python.rst
@@ -522,7 +522,7 @@ environment variable.
 
 .. code-block::
 
-   $ set ARROW_HOME=%CONDA_PREFIX%
+   $ set ARROW_HOME=%CONDA_PREFIX%\Library
 
 Let's configure, build and install the Arrow C++ libraries:
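
The ``ARROW_HOME`` value that PATCH 10/10 settles on can be illustrated with a minimal Python sketch. The helper name ``arrow_home_from_conda`` and the example prefix are hypothetical and not part of this series. The sketch assumes that conda on Windows keeps native libraries under the ``Library`` subdirectory of the environment, and that other platforms use the prefix itself.

```python
import ntpath


def arrow_home_from_conda(conda_prefix: str, windows: bool) -> str:
    """Return a candidate ARROW_HOME for an active conda environment.

    Hypothetical helper for illustration only. Assumption: on Windows the
    Arrow C++ install prefix is the Library subdirectory of the conda
    environment (as in PATCH 10/10); elsewhere it is the prefix itself.
    """
    if windows:
        # ntpath joins with backslashes regardless of the host platform.
        return ntpath.join(conda_prefix, "Library")
    return conda_prefix


if __name__ == "__main__":
    # Example prefix is made up; a real one comes from the CONDA_PREFIX variable.
    print(arrow_home_from_conda(r"C:\conda\envs\pyarrow-dev", windows=True))
```

In the documented workflow this corresponds to ``set ARROW_HOME=%CONDA_PREFIX%\Library``, which is then passed to CMake as ``-DCMAKE_INSTALL_PREFIX=%ARROW_HOME%``.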