ARROW-6465: [Python] Improvement to Windows build instructions #5294
Changes from all commits
cd8e093
b6714fd
927c6cc
121faff
f2ea51b
a57ea7d
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -379,8 +379,19 @@ debugging a C++ unitttest, for example: | |
| Building on Windows | ||
| =================== | ||
|
|
||
| First, we bootstrap a conda environment similar to above, but skipping some of | ||
| the Linux/macOS-only packages: | ||
| Building on Windows requires one of the following compilers to be installed: | ||
|
|
||
| - `Build Tools for Visual Studio 2017 <https://download.visualstudio.microsoft.com/download/pr/3e542575-929e-4297-b6c6-bef34d0ee648/639c868e1219c651793aff537a1d3b77/vs_buildtools.exe>`_ | ||
| - `Microsoft Build Tools 2015 <http://download.microsoft.com/download/5/F/7/5F7ACAEB-8363-451F-9425-68A90F98B238/visualcppbuildtools_full.exe>`_ | ||
| - Visual Studio 2015 | ||
| - Visual Studio 2017 | ||
|
|
||
| During the setup of Build Tools ensure at least one Windows SDK is selected. | ||
|
|
||
| Visual Studio 2019 and its build tools are currently not supported. | ||
|
|
||
| We bootstrap a conda environment similar to above, but skipping some of the | ||
| Linux/macOS-only packages: | ||
|
|
||
| First, starting from fresh clones of Apache Arrow: | ||
|
|
||
|
|
@@ -390,60 +401,140 @@ First, starting from fresh clones of Apache Arrow: | |
|
|
||
| .. code-block:: shell | ||
|
|
||
| conda create -y -n pyarrow-dev -c conda-forge ^ | ||
| --file arrow\ci\conda_env_cpp.yml ^ | ||
| --file arrow\ci\conda_env_python.yml ^ | ||
| python=3.7 | ||
| conda create -y -n pyarrow-dev -c conda-forge ^ | ||
| --file arrow\ci\conda_env_cpp.yml ^ | ||
| --file arrow\ci\conda_env_python.yml ^ | ||
| --file arrow\ci\conda_env_gandiva.yml ^ | ||
| python=3.7 | ||
| conda activate pyarrow-dev | ||
|
|
||
| Now, we build and install Arrow C++ libraries | ||
| Now, we build and install Arrow C++ libraries. | ||
|
|
||
| We set a number of environment variables: | ||
|
|
||
| - the path of the installation directory of the Arrow C++ libraries as | ||
| ``ARROW_HOME`` | ||
| - add the path of installed DLL libraries to ``PATH`` | ||
| - and choose the compiler to be used | ||
|
|
||
| .. code-block:: shell | ||
|
|
||
| mkdir cpp\build | ||
| cd cpp\build | ||
| set ARROW_HOME=C:\thirdparty | ||
| cmake -G "Visual Studio 14 2015 Win64" ^ | ||
| -DCMAKE_INSTALL_PREFIX=%ARROW_HOME% ^ | ||
| -DARROW_CXXFLAGS="/WX /MP" ^ | ||
| -DARROW_GANDIVA=on ^ | ||
| -DARROW_PARQUET=on ^ | ||
| -DARROW_PYTHON=on .. | ||
| cmake --build . --target INSTALL --config Release | ||
| cd ..\.. | ||
| set ARROW_HOME=%cd%\arrow-dist | ||
| set PATH=%ARROW_HOME%\bin;%PATH% | ||
| set PYARROW_CMAKE_GENERATOR=Visual Studio 15 2017 Win64 | ||
|
|
||
| After that, we must put the install directory's bin path in our ``%PATH%``: | ||
| This assumes Visual Studio 2017 or its build tools are used. For Visual Studio | ||
| 2015 and its build tools use the following instead: | ||
|
|
||
| .. code-block:: shell | ||
|
|
||
| set PATH=%ARROW_HOME%\bin;%PATH% | ||
| set PYARROW_CMAKE_GENERATOR=Visual Studio 14 2015 Win64 | ||
|
|
||
| Let's configure, build and install the Arrow C++ libraries: | ||
|
|
||
| .. code-block:: shell | ||
|
|
||
| mkdir arrow\cpp\build | ||
| pushd arrow\cpp\build | ||
| cmake -G "%PYARROW_CMAKE_GENERATOR%" ^ | ||
| -DCMAKE_INSTALL_PREFIX=%ARROW_HOME% ^ | ||
| -DARROW_CXXFLAGS="/WX /MP" ^ | ||
| -DARROW_GANDIVA=on ^ | ||
| -DARROW_PARQUET=on ^ | ||
| -DARROW_PYTHON=on ^ | ||
| .. | ||
| cmake --build . --target INSTALL --config Release | ||
| popd | ||
|
|
||
| Now, we can build pyarrow: | ||
|
|
||
| .. code-block:: shell | ||
|
|
||
| cd python | ||
| pushd arrow\python | ||
| set PYARROW_WITH_GANDIVA=1 | ||
| set PYARROW_WITH_PARQUET=1 | ||
| python setup.py build_ext --inplace | ||
| popd | ||
|
|
||
| .. note:: | ||
|
|
||
| For building pyarrow, the environment variables defined above also need to | ||
| be set. Remember this if you want to rebuild ``pyarrow`` after your initial build. | ||
|
|
||
| Then run the unit tests with: | ||
|
|
||
| .. code-block:: shell | ||
|
|
||
| pushd arrow\python | ||
| py.test pyarrow -v | ||
| popd | ||
|
Member
At line 474, I cannot execute
Contributor
Author
I had not tried this yet. I can confirm your observations. Could this be an obsolete part of the instructions? In the Linux instructions neither
Member
You need to pass
Contributor
Author
@wesm Thanks. Also, I can find a
Contributor
Author
@wesm Apologies. I forgot the
Member
Thank you. This change also works well for my environment.
Member
It's not necessary for most developers to run the C++-based unit tests, so we should put this off in a separate Advanced section.
Member
I see this is already done. |
||
|
|
||
| .. note:: | ||
|
|
||
| With the above instructions the Arrow C++ libraries are not bundled with | ||
| the Python extension. This is recommended for development as it allows the | ||
| C++ libraries to be re-built separately. | ||
|
|
||
| As a consequence, however, ``python setup.py install`` will also not install | ||
| the Arrow C++ libraries. Therefore, to use ``pyarrow`` in Python, ``PATH`` | ||
| must contain the directory with the Arrow DLL files. | ||
|
|
||
| If you want to bundle the Arrow C++ libraries with ``pyarrow`` add | ||
| ``--bundle-arrow-cpp`` as build parameter: | ||
|
|
||
| ``python setup.py build_ext --bundle-arrow-cpp`` | ||
|
|
||
| Important: If you combine ``--bundle-arrow-cpp`` with ``--inplace`` the | ||
| Arrow C++ libraries get copied to the python source tree and are not cleared | ||
| by ``python setup.py clean``. They remain in place and will take precedence | ||
| over any later Arrow C++ libraries contained in ``PATH``. This can lead to | ||
| incompatibilities when ``pyarrow`` is later built without | ||
| ``--bundle-arrow-cpp``. | ||
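
The note above says that, without bundling, ``PATH`` must contain the directory holding the Arrow DLLs. A minimal sketch of a pre-import check follows, assuming the ``%ARROW_HOME%\bin`` layout from the build recipe. The helper name `arrow_bin_on_path` is hypothetical and is not part of pyarrow.

```python
import ntpath
import os


def arrow_bin_on_path(environ=None):
    """Return True if %ARROW_HOME%\\bin appears as an entry of PATH.

    Hypothetical helper for illustration only. ntpath is used so that
    Windows path rules apply regardless of where the check runs.
    """
    environ = os.environ if environ is None else environ
    arrow_home = environ.get("ARROW_HOME")
    if not arrow_home:
        return False

    def norm(path):
        # Normalize separators and case the way Windows compares paths.
        return ntpath.normcase(ntpath.normpath(path))

    wanted = norm(ntpath.join(arrow_home, "bin"))
    entries = environ.get("PATH", "").split(";")
    return any(norm(entry) == wanted for entry in entries if entry)


if __name__ == "__main__":
    if not arrow_bin_on_path():
        print("ARROW_HOME\\bin is not on PATH; importing pyarrow may fail")
```

Running this in the same shell session used for the build is one way to confirm the variables are still set before a rebuild.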
|
|
||
| Running C++ unit tests for Python integration | ||
| --------------------------------------------- | ||
|
|
||
| Getting ``python-test.exe`` to run is a bit tricky because your | ||
| ``%PYTHONHOME%`` must be configured to point to the active conda environment: | ||
| Running C++ unit tests should not be necessary for most developers. If you do | ||
| want to run them, you need to pass ``-DARROW_BUILD_TESTS=ON`` during | ||
| configuration of the Arrow C++ library build: | ||
|
|
||
| .. code-block:: shell | ||
|
|
||
| mkdir arrow\cpp\build | ||
| pushd arrow\cpp\build | ||
|
Member
nit: Is it better to add
Contributor
Author
I think you are quite right. Just in case somebody jumps directly to this section. |
||
| cmake -G "%PYARROW_CMAKE_GENERATOR%" ^ | ||
| -DCMAKE_INSTALL_PREFIX=%ARROW_HOME% ^ | ||
| -DARROW_CXXFLAGS="/WX /MP" ^ | ||
| -DARROW_GANDIVA=on ^ | ||
| -DARROW_PARQUET=on ^ | ||
| -DARROW_PYTHON=on ^ | ||
| -DARROW_BUILD_TESTS=ON ^ | ||
| .. | ||
| cmake --build . --target INSTALL --config Release | ||
| popd | ||
|
|
||
|
|
||
| Getting ``arrow-python-test.exe`` (C++ unit tests for Python integration) to | ||
| run is a bit tricky because your ``%PYTHONHOME%`` must be configured to point | ||
| to the active conda environment: | ||
|
|
||
| .. code-block:: shell | ||
|
|
||
| set PYTHONHOME=%CONDA_PREFIX% | ||
| pushd arrow\cpp\build\release\Release | ||
| arrow-python-test.exe | ||
| popd | ||
|
|
||
| To run all tests of the Arrow C++ library, you can also run ``ctest``: | ||
|
|
||
| .. code-block:: shell | ||
|
|
||
| set PYTHONHOME=%CONDA_PREFIX% | ||
| pushd arrow\cpp\build | ||
| ctest | ||
| popd | ||
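
Both test commands above depend on ``PYTHONHOME`` pointing at the active conda environment. As a rough sketch, the same setup could be prepared from Python before launching the tests. The function name `ctest_environment` is hypothetical, and the build directory in the usage comment is simply the one used in the recipe.

```python
import os


def ctest_environment(environ=None):
    """Return a copy of the environment with PYTHONHOME set to CONDA_PREFIX.

    Hypothetical helper for illustration only. It mirrors
    ``set PYTHONHOME=%CONDA_PREFIX%`` from the instructions.
    """
    env = dict(os.environ if environ is None else environ)
    prefix = env.get("CONDA_PREFIX")
    if not prefix:
        raise RuntimeError("CONDA_PREFIX is unset; activate pyarrow-dev first")
    env["PYTHONHOME"] = prefix
    return env


# Possible usage (not executed here):
#   import subprocess
#   subprocess.run(["ctest"], cwd=r"arrow\cpp\build",
#                  env=ctest_environment(), check=True)
```

Failing early when ``CONDA_PREFIX`` is missing makes the most likely cause explicit: the conda environment was not activated in the current shell.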
|
|
||
|
|
||
| Now ``python-test.exe`` or simply ``ctest`` (to run all tests) should work. | ||
|
|
||
| Windows Caveats | ||
| --------------- | ||
|
|
||
I followed these steps on a fresh Windows 10 instance. Most of them work well.
An issue is that when I used Build Tools for Visual Studio 2017, this command shows the attached error. When I update this line to `self.cmake_generator = 'Visual Studio 15 2017 Win64'`, it works well.
Should we update this document or should we update `setup.py`?
I have stumbled across this error as well since my last commit.
From what I can tell, this does not happen, when one follows the recipe in one go: i.e. configure arrow library, build arrow library then build extension.
I encountered this error when I wanted to rebuild the Python extension without reconfiguring and rebuilding the Arrow library. For this I ran `python setup.py clean`. After that, when I run `python setup.py build_ext --inplace`, the issue you described pops up.
Simpler than changing the line you referenced would be to introduce this step into the build recipe:
`set PYARROW_CMAKE_GENERATOR=Visual Studio 15 2017 Win64`
One could then change the `cmake` line to use this variable once it is defined, like so:
`cmake -G "%PYARROW_CMAKE_GENERATOR%" ^`
What do you think?
Thank you for your update. I agree with your proposal. This change also works for my environment.
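
For readers following the thread, here is a minimal sketch of the idea being agreed on: a build script takes the generator from `PYARROW_CMAKE_GENERATOR` and only falls back to a built-in default when the variable is unset. This is an illustration under assumptions, not the code in `setup.py`. The function name `resolve_cmake_generator` and the chosen default are hypothetical.

```python
import os

# Hypothetical fallback for this sketch; the thread uses this generator
# name with Build Tools for Visual Studio 2017.
DEFAULT_GENERATOR = "Visual Studio 15 2017 Win64"


def resolve_cmake_generator(environ=None):
    """Return the CMake generator, preferring PYARROW_CMAKE_GENERATOR."""
    environ = os.environ if environ is None else environ
    value = environ.get("PYARROW_CMAKE_GENERATOR", "").strip()
    # An empty or whitespace-only value is treated the same as unset.
    return value or DEFAULT_GENERATOR


if __name__ == "__main__":
    print(resolve_cmake_generator())
```

Because the variable is set once in the shell, the C++ configure step and a later `python setup.py build_ext --inplace` both see the same generator, which is the mismatch the thread describes after `python setup.py clean`.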