-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-6465: [Python] Improvement to Windows build instructions #5294
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The current instructions for building the pyarrow python extension are incomplete. Problems include: - missing re2, llvm, clang prerequisites - missing info on which MSVC toolsets are supported - missing info on how the build commands to different MSVC toolsets - missing warning about currently broken Windows build config This amends the python developer documentation with above.
docs/source/developers/python.rst
Outdated
| the Linux/macOS-only packages: | ||
| .. warning:: | ||
| Building on Windows is currently broken. The issue is being worked on | ||
| under the `Github PR #5247 <https://github.com/apache/arrow/pull/5247>`_. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove this. Now that https://issues.apache.org/jira/browse/ARROW-6457 is merged it's building fine -- since our CI has been continuously green the issue seems to be localized to certain build configurations and systems
Codecov Report
@@ Coverage Diff @@
## master #5294 +/- ##
=========================================
Coverage ? 65.16%
=========================================
Files ? 497
Lines ? 68417
Branches ? 0
=========================================
Hits ? 44581
Misses ? 23836
Partials ? 0Continue to review full report at Codecov.
|
| .. code-block:: shell | ||
| pushd arrow\python | ||
| py.test pyarrow -v |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I followed these steps on a fresh Windows 10 instance. Most of them work well.
An issue is that when I used Build Tools for Visual Studio 2017, this command shows the attached error. When I update this line as `self.cmake_generator = 'Visual Studio 15 2017 Win64', it works well.
Should we update this document or should we update the setup.py?
running build_ext
creating build
creating build\temp.win-amd64-3.7
creating build\temp.win-amd64-3.7\Release
-- Running cmake for pyarrow
C:\Users\abc\Miniconda3\envs\pyarrow-dev\Library\bin\cmake.exe -DPYTHON_EXECUTABLE=C:\Users\abc\Miniconda3\envs\pyarrow-dev\python.exe -G "Visual Studio 14 2015 Win64" -DPYARROW_BUILD_PARQUET=on -DPYARROW_BOOST_USE_SHARED=on -DPYARROW_BUILD_GANDIVA=on -DCMAKE_BUILD_TYPE=release C:\arrow\arrow\python
-- Selecting Windows SDK version to target Windows 10.0.15063.
CMake Error at CMakeLists.txt:22 (project):
Failed to run MSBuild command:
MSBuild.exe
to get the value of VCTargetsPath:
Microsoft (R) Build Engine version 15.9.21+g9802d43bc3 for .NET Framework
Copyright (C) Microsoft Corporation. All rights reserved.
Build started 9/6/2019 9:55:02 AM.
Project "C:\arrow\arrow\python\build\temp.win-amd64-3.7\Release\CMakeFiles\3.15.3\VCTargetsPath.vcxproj" on node 1 (default targets).
C:\Program Files (x86)\Microsoft Visual Studio\2017\BuildTools\Common7\IDE\VC\VCTargets\Microsoft.Cpp.Platform.targets(67,5): error MSB8020: The build tools for v140 (Platform Toolset = 'v140') cannot be found. To build using the v140 build tools, please install v140 build tools. Alternatively, you may upgrade to the current Visual Studio tools by selecting the Project menu or right-click the solution, and then selecting "Retarget solution". [C:\arrow\arrow\python\build\temp.win-amd64-3.7\Release\CMakeFiles\3.15.3\VCTargetsPath.vcxproj]
Done Building Project "C:\arrow\arrow\python\build\temp.win-amd64-3.7\Release\CMakeFiles\3.15.3\VCTargetsPath.vcxproj" (default targets) -- FAILED.
Build FAILED.
"C:\arrow\arrow\python\build\temp.win-amd64-3.7\Release\CMakeFiles\3.15.3\VCTargetsPath.vcxproj" (default target) (1) ->
(PlatformPrepareForBuild target) ->
C:\Program Files (x86)\Microsoft Visual Studio\2017\BuildTools\Common7\IDE\VC\VCTargets\Microsoft.Cpp.Platform.targets(67,5): error MSB8020: The build tools for v140 (Platform Toolset = 'v140') cannot be found. To build using the v140 build tools, please install v140 build tools. Alternatively, you may upgrade to the current Visual Studio tools by selecting the Project menu or right-click the solution, and then selecting "Retarget solution". [C:\arrow\arrow\python\build\temp.win-amd64-3.7\Release\CMakeFiles\3.15.3\VCTargetsPath.vcxproj]
0 Warning(s)
1 Error(s)
Time Elapsed 00:00:00.94
Exit code: 1
-- Configuring incomplete, errors occurred!
See also "C:/arrow/arrow/python/build/temp.win-amd64-3.7/Release/CMakeFiles/CMakeOutput.log".
error: command 'C:\\Users\\abc\\Miniconda3\\envs\\pyarrow-dev\\Library\\bin\\cmake.exe' failed with exit status 1
(pyarrow-dev) C:\arrow\arrow\python>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have stumbled across this error as well since my last commit.
From what I can tell, this does not happen, when one follows the recipe in one go: i.e. configure arrow library, build arrow library then build extension.
I encountered this error when I wanted to rebuild the python extension without reconfiguring and re-building the arrow library. For this I did python setup.py clean. After that when I run python setup.py build_ext --inplace the issue you described pops up.
Simpler than changing the line you referenced would be to introduce into the build recipe the step:
set PYARROW_CMAKE_GENERATOR=Visual Studio 17 2017 Win64
One could then change the cmake line to use this once defined variable like so:
cmake -G "%PYARROW_CMAKE_GENERATOR%" ^
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your update. I agree with your proposal. This change also works for my environment.
| pushd arrow\python | ||
| py.test pyarrow -v | ||
| popd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At line 474, I cannot execute python-test.exe or ctest at arrow\python. I may miss something to build python-test.exe
(pyarrow-dev) C:\arrow\arrow\python>ctest
*********************************
No test configuration file found!
*********************************
Usage
ctest [options]
(pyarrow-dev) C:\arrow\arrow\python>python-test
'python-test' is not recognized as an internal or external command,
operable program or batch file.
(pyarrow-dev) C:\arrow\arrow\python>dir python-test*
Volume in drive C is OS
Volume Serial Number is 7884-B5DB
Directory of C:\arrow\arrow\python
File Not Found
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had not tried this yet. Can confirm your observations.
Could this be an obsolete part of the instructions? In the linux instructions neither python-test nor ctest seem to make an appearance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to pass -DARROW_BUILD_TESTS=ON to build arrow-python-test.exe, the unit tests are now toggled off by default. I don't think most developers need to build this executable but this should go in a separate section for completeness
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wesm Thanks. ctest runs now but it must be run from the cpp/build directory.
Also, arrow-python-test fails:
Start 58: arrow-python-test
58/83 Test #58: arrow-python-test .........................Exit code 0xc0000409
***Exception: 0.51 sec
I can find a arrow-python-test.exe in cpp/build/release/Release. Unfortunately running it directly is not working either.
(pyarrow-dev) Z:\dev\arrow\cpp\build\release\Release>arrow-python-test.exe
Fatal Python error: initfsencoding: unable to load the file system codec
ModuleNotFoundError: No module named 'encodings'
Current thread 0x00002390 (most recent call first):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wesm Apologies. ctest and arrow-python-test.exe do run ok.
I forgot the set PYTHONHOME=%CONDA_PREFIX%...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. This change also works well for my environment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not necessary for most developers to run the C++-based unit test, so we should put this off in a separate Advanced section
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this is already done
|
@kiszk I am encountering further issues: Issue 1: Issue 2: To fix this, I again need to |
It may or may not be self-evident depending on the person. Probably a good idea to mention it explicitly in the document.
In general yes this is the recommendation for development. You can try using You can mention both options |
|
@wesm I updated my local copy from #5297 to the current master and am now encountering new problems with the build: I cherry-picked 314e9f0 onto 2164e3b (common ancestor of master and #5297) and that works ok. So somewhere since 2164e3b an issue seems to have cropped up. Can you guess where this was introduced? Or shall I try building at different points in between 2164e3b and I thought, that maybe a dll was no longer getting built, but that does not seem to be the issue. With 314e9f0: With #5239: |
|
|
||
| .. code-block:: shell | ||
| pushd arrow\cpp\build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Is it better to add mkdir arrow\cpp\build?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are quite right. Just in case somebody jumps directly to this section.
Never mind, I finally found the culprit: When I tested In my later builds without I added a note to this effect in the updated instructions. |
pitrou
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fine to me. Were you planning to do other changes or can this be merged?
docs/source/developers/python.rst
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: "libraries".
wesm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1. I fixed a couple typos. Merging this, thank you!
The current instructions for building the pyarrow python extension are incomplete.
Problems include:
This amends the python developer documentation with above.